Thread: calling assembly routine from C program

  1. #1
    Registered User
    Join Date
    Feb 2015
    Posts
    22

    calling assembly routine from C program

    I'm trying to call a simple assembly routine, add(int a, int b), assembled with 64-bit NASM, from a simple C program compiled with 64-bit GCC. The example I'm working from used C++, not C; I converted it to C. The linker error I get is:

    undefined reference to `add'

    Here's the assembly:

    ;command line: nasm -felf64 add1.asm -F dwarf -o add1_ref.o

    Code:
    global add
    global _start
    
    add:
    _start:
    
    push rbp
    mov rbp, rsp
    mov rax, [rbp+16]
    mov rbx, [rbp+24]
    add rax, rbx
    leave
    ret
    Here's the C program:

    // gcc add1.c -o add1.o && ld add1_ref.o add1.o -o add1

    Code:
    #include <stdio.h>
    
    extern int add(int a, int b);
    
    int ans1=0;
    
    int main()
    {
       ans1 = add(40, 2);
       printf("\n the output is: %i \n", ans1);
    
       return 0;
    }
    The ld linker can't find the label add. Does anyone know what I should do? TIA Bill S.

  2. #2
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,787
    Hmm. Well, it really depends on the compiler, but if you google for "NASM cdecl" (without the quotes) you'll probably find your answer.

    If you're using C++, then google 'extern "C"'.

  3. #3
    Registered User
    Join Date
    Feb 2015
    Posts
    22
    Thanks for the suggestion.

  4. #4
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Could you put an add function in the .c file and have GCC output assembly code to see what the syntax should be?

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    The 'nm' command will tell you what the compiler-generated name is.
    E.g.
    Code:
    $ gcc -c bar.c
    $ nm -u bar.o
                     U add
                     U printf
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Registered User
    Join Date
    Feb 2015
    Posts
    22
    Quote Originally Posted by rcgldr View Post
    Could you put an add function in the .C file and have GCC output assembly code to see what the syntax should be?
    I made this C program:
    Code:
    // addd.c
    
    #include <stdio.h>
    
    int c;
    int add(int x, int y);
    
    int main(void){
    int a=3, b=5;
    
    c = add(a, b);
    printf("\nsum1 = %i \n",c);
    }
    
    int add(x,y){
    c = x + y;
    return 0;
    }
    Then I used this command line:
    gcc -S addd.c -o addd.asm

    Which gave me this :
    Code:
        .file    "addd.c"
        .comm    c,4,4
        .section    .rodata
    .LC0:
        .string    "\nsum1 = %i \n"
        .text
        .globl    main
        .type    main, @function
    main:
    .LFB0:
        .cfi_startproc
        pushq    %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $16, %rsp
        movl    $3, -8(%rbp)
        movl    $5, -4(%rbp)
        movl    -4(%rbp), %edx
        movl    -8(%rbp), %eax
        movl    %edx, %esi
        movl    %eax, %edi
        call    add
        movl    %eax, c(%rip)
        movl    c(%rip), %eax
        movl    %eax, %esi
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFE0:
        .size    main, .-main
        .globl    add
        .type    add, @function
    add:
    .LFB1:
        .cfi_startproc
        pushq    %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    %edi, -4(%rbp)
        movl    %esi, -8(%rbp)
        movl    -8(%rbp), %eax
        movl    -4(%rbp), %edx
        addl    %edx, %eax
        movl    %eax, c(%rip)
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFE1:
        .size    add, .-add
        .ident    "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4"
        .section    .note.GNU-stack,"",@progbits
    But there's too much extraneous stuff and the assembly instructions are weird. I don't know what part of it to use.

  7. #7
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,787
    Ok, if you're going to try this route, the first thing to do is simplify your "program". Don't have main()... just add() and don't include any headers. Something like:

    Code:
    int add(int a, int b)
    {
        return a + b;
    }
    Should be enough.

    Try: gcc -S -masm=intel addd.c -o addd.asm

    This should, I think, give you asm syntax you're more used to. (edit: I'd make sure myself instead of relying on memory but I don't have a compiler on this computer )

    After that you have to identify parts of the code that are the function prologue and epilogue.

    Edit: In a nutshell, what we're discussing here, and what you need, is the C calling convention for the compiler and OS in question. Generally, it shows up in two parts of a function: (a) the function prologue; and (b) the function epilogue. Your code goes after the prologue and before the epilogue. The prologue is responsible for saving the state of registers and whatever else the calling convention requires. The epilogue is responsible for restoring that state and whatever else is required, with the return value (if any) left in the correct place (it doesn't have to be a register, but it usually is).

    Edit 2: The calling convention will also specify where/how function arguments are stored. On 32-bit x86 they're usually on the stack, in the order specified by the calling convention. They don't have to be on the stack, though: the calling convention might specify that some or all of them are in registers. That is the case on 64-bit Linux, where the first six integer arguments are passed in rdi, rsi, rdx, rcx, r8 and r9.

    See also: c++ - Function Prologue and Epilogue in C - Stack Overflow
    Last edited by Hodor; 11-01-2015 at 03:37 AM.
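For the 64-bit Linux case in this thread, the relevant convention is the System V AMD64 ABI, which passes the first six integer or pointer arguments in registers rather than on the stack. The sketch below only records that ordering as a lookup table; the names sysv_int_arg_regs, sysv_int_arg_reg and sysv_int_arg_index are made up for illustration and are not part of any library. It ignores floating-point and struct arguments and does not apply to Windows x64.

```c
#include <stddef.h>
#include <string.h>

/* Integer/pointer argument registers, in order, under the System V AMD64 ABI
   (assumption: 64-bit Linux with GCC; Windows x64 uses a different order). */
static const char *const sysv_int_arg_regs[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

#define SYSV_INT_ARG_REG_COUNT \
    (sizeof sysv_int_arg_regs / sizeof sysv_int_arg_regs[0])

/* Register that carries the index-th (0-based) integer argument,
   or NULL when that argument is passed on the stack instead. */
const char *sysv_int_arg_reg(size_t index)
{
    return index < SYSV_INT_ARG_REG_COUNT ? sysv_int_arg_regs[index] : NULL;
}

/* Reverse lookup: argument position carried by a register name,
   or -1 if the register is not used for integer arguments. */
int sysv_int_arg_index(const char *reg)
{
    for (size_t i = 0; i < SYSV_INT_ARG_REG_COUNT; i++) {
        if (strcmp(sysv_int_arg_regs[i], reg) == 0)
            return (int)i;
    }
    return -1;
}
```

For add(int a, int b), sysv_int_arg_reg(0) and sysv_int_arg_reg(1) give "rdi" and "rsi": a arrives in edi and b in esi (the low 32 bits of those registers), and the int result goes back in eax. A routine that reads its arguments from [rbp+16] and [rbp+24] is following the 32-bit stack-based convention instead.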

  8. #8
    Registered User
    Join Date
    Feb 2015
    Posts
    22
    Here's what I got:
    Code:
    add:
        push    rbp
        mov    rbp, rsp
        mov    DWORD PTR [rbp-4], edi
        mov    DWORD PTR [rbp-8], esi
        mov    eax, DWORD PTR [rbp-8]
        mov    edx, DWORD PTR [rbp-4]
        add    eax, edx
        pop    rbp
        ret
    The prologue and epilogue are in 64-bit and the rest is in 32-bit. I must need a different command line.

  9. #9
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    The add function can be simplified to:

    Code:
    add:
            mov     eax,edi
            add     eax,esi
            ret
    Last edited by rcgldr; 11-01-2015 at 05:32 AM.

  10. #10
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,787
    Is that a problem?

    add.asm
    Code:
    global  add
    
    add:
        push    rbp
        mov    rbp, rsp
        mov    DWORD [rbp-4], edi
        mov    DWORD [rbp-8], esi
        mov    eax, DWORD [rbp-8]
        mov    edx, DWORD [rbp-4]
        add    eax, edx
        pop    rbp
        ret
    test.c
    Code:
    #include <stdio.h>
    
    extern int add(int a, int b);
    
    int main(void)
    {
        printf("%d\n", add(40, 20));
        return 0;
    }

    nasm -felf64 add.asm
    gcc test.c add.o
    ./a.out
    60
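The mix of 64-bit and 32-bit registers is expected rather than a problem. rbp and rsp hold addresses, which are 64 bits wide in a 64-bit process, while int is 32 bits wide with GCC on x86-64 Linux, so the arguments and the result travel in the 32-bit halves edi, esi and eax. A minimal C sketch of the same idea follows; the helper names int_bits, pointer_bits and add32_wrapping are hypothetical, chosen only for this illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Width of int in bits on the current target (32 with GCC on x86-64 Linux). */
int int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Width of a data pointer in bits (64 in a 64-bit process). */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Models what `add eax, edx` computes: a 32-bit sum that wraps modulo 2^32.
   Unsigned types are used because signed overflow is undefined in C. */
uint32_t add32_wrapping(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

On the setup used in this thread, int_bits() is 32 and pointer_bits() is 64, which is the same split the generated assembly shows: 64-bit registers for the stack frame, 32-bit registers for the int values.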

  11. #11
    Registered User
    Join Date
    Feb 2015
    Posts
    22
    I'm back to the original problem: compiling ad2.c gives the error: undefined reference to `ad2'.
    Note that I changed add to ad2. Here's the assembly:
    Code:
    ;  nasm -felf64 ad2.asm -o ad2_ref.o
    
    global  ad2
     
    ad2:
        push    rbp
        mov    rbp, rsp
        mov    DWORD [rbp-4], edi
        mov    DWORD [rbp-8], esi
        mov    eax, DWORD [rbp-8]
        mov    edx, DWORD [rbp-4]
        add    eax, edx
        pop    rbp
        ret
    Here's the C that 'calls' the assembly:
    Code:
    //    gcc ad2.c -o ad2.o
    
    #include <stdio.h>
     
    extern int ad2(int a, int b);
     
    int main(void)
    {
        printf("%d\n", ad2(40, 20));
        return 0;
    }
    Compiling ad2.c gives error: undefined reference to `ad2'.

  12. #12
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    Quote Originally Posted by bilsch02 View Post
    // gcc ad2.c -o ad2.o
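    There's your problem. Without -c, that command tries to compile and link ad2.c into an executable named ad2.o, and the link step fails because ad2 is defined only in the assembly object, which isn't on the command line. Use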
    Code:
    nasm -felf64 ad2.asm -o ad2_ref.o
    gcc -Wall -O2 -m64 -c ad2.c
    gcc -Wall -O2 -m64 ad2.o ad2_ref.o -o prog

  13. #13
    misoturbutc Hodor's Avatar
    Join Date
    Nov 2013
    Posts
    1,787
    Oh, gosh. I wonder if that was what the problem was in the first place. No harm done I guess... at least calling conventions and stuff were introduced and some handy skills (like examining what gcc -S produces). nasm has macros (directives?) that make the prologue and epilogue much easier to implement as well (not that it's hard anyway, but perhaps for future proofing using what nasm provides is better: NASM Manual)

  14. #14
    Registered User
    Join Date
    Feb 2015
    Posts
    22
    I'm getting error: undefined reference to `printf'.
    Here's the assembly:
    Code:
    ;  nasm -felf64 ad2.asm -o ad2_ref.o
    
    global  _start
    global  ad2
    
    _start: 
    ad2:
        push   rbp
        mov    rbp, rsp
        mov    DWORD [rbp-4], edi
        mov    DWORD [rbp-8], esi
        mov    eax, DWORD [rbp-8]
        mov    edx, DWORD [rbp-4]
        add    eax, edx
        pop    rbp
        ret
    Here's the C that uses the assembly:
    Code:
    //    gcc ad2.c -c && ld ad2.o ad2_ref.o -o ad2
    
    #include <stdio.h>
     
    extern int ad2(int a, int b);
     
    int main(void)
    {
        printf("%d\n", ad2(40, 20));
        return 0;
    }
    I reinstalled gcc to try to fix this, but that didn't work.
    I'm getting error: undefined reference to `printf'.

  15. #15
    Ticked and off
    Join Date
    Oct 2011
    Location
    La-la land
    Posts
    1,728
    You're consistently using the compilation commands wrong.

    Computers are stupid, you have to be precise, or they'll try to do stupid stuff.

    Quote Originally Posted by bilsch02 View Post
    nasm -felf64 ad2.asm -o ad2_ref.o
    gcc ad2.c -c
    ld ad2.o ad2_ref.o -o ad2
    You tried to use ld to link your object files. That won't work as written, because ld on its own does not add the C startup code or the C library (which GCC links dynamically by default), so printf is never found. You need to use GCC for the linking, too, or supply (the pretty complicated) options to ld to get it to link against the C library.

    Because GCC uses ld internally to link anyway, you do not save anything by not linking using GCC.

    So, to recap. The correct compilation options are
    Code:
    nasm -felf64 ad2.asm -o ad2_ref.o
    gcc -Wall -c ad2.c
    gcc -Wall ad2.o ad2_ref.o -o ad2
    I deliberately added the -Wall option, because you should always use it, and act on the warnings it gives. There are certain cases where warnings are useless, but they're not that common. It's better practice to write code that quells the warnings instead.

    I also tend to use -O2, because I want GCC to produce relatively optimized code, it does not do anything particularly dangerous, and since it's what I use in production, I'd rather have my coding skills and debugging done at that optimization level from the get go.
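Once the link step works, a small self-check can replace eyeballing the printf output. The following is a sketch under stated assumptions: HAVE_AD2_ASM is a hypothetical macro you would define yourself when linking against ad2_ref.o, and check_ad2 is a made-up helper name. Without the macro, a plain C stand-in for ad2 is used so the file still builds on its own.

```c
#include <stddef.h>

#ifdef HAVE_AD2_ASM
/* Real routine: defined in ad2.asm, so ad2_ref.o must be linked in. */
extern int ad2(int a, int b);
#else
/* C stand-in (assumption for illustration) used when the assembly
   object is not part of the build. */
static int ad2(int a, int b)
{
    return a + b;
}
#endif

/* Returns the number of cases where ad2 disagrees with the expected sum. */
int check_ad2(void)
{
    static const struct { int a, b, want; } cases[] = {
        { 40, 20, 60 },
        { 40,  2, 42 },
        { -5,  5,  0 },
        {  0,  0,  0 },
    };
    int failures = 0;

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        if (ad2(cases[i].a, cases[i].b) != cases[i].want)
            failures++;
    }
    return failures;
}
```

To exercise the assembly version, call check_ad2() from main, compile this file with -c and -DHAVE_AD2_ASM, and link the resulting object together with ad2_ref.o using gcc, as in the commands above. A return value of 0 means every case matched.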
