Thread: How would I know which registers are used for the returned value of a register?

  1. #1
    Registered User
    Join Date
    Oct 2021
    Posts
    138

    How would I know which registers are used for the returned value of a register?

    Again I want to use inline assembly with GCC. In my previous question I asked about clobbers; now I want to ask about the values returned from system calls.

    So let's consider the "write" system call on 64-bit Linux. The RAX register takes the number of the system call (classic), the RDI register takes the file descriptor to write to, RSI takes the pointer to the data (string) to write, and RDX takes the number of bytes to write. The code is the following (in GCC inline assembly):

    Code:
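    #include <stdio.h>
    
    void write(int fd, const char* buf, unsigned long len) {
      asm volatile (
           "syscall"
           : : "a" (1), "D" (fd), "S" (buf), "d" (len)  // 1 = write; fd is passed through, not hard-coded
           : "memory", "rcx", "r11"  // the syscall instruction overwrites rcx and r11
      );
    }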
    
    int main() {   
      const char* hello  = "Hello world!\n";
      write(1, hello, 13);
      return 0;
    }
    This will write "Hello world!" to stdout. I happened to be playing a little with DDD, which is a (very outdated) graphical frontend for GDB. There, I used breakpoints to check the values of the registers after each line. To my surprise, I found that the value of RAX had changed after the system call, and to be honest, at the time I didn't know why. I had always wondered how system calls return values and where they place them. As you can see, I'm not so smart and I can't connect the pieces so easily, but at least I'm trying...

    As you probably know, "unistd.h" declares the functions that expose system calls to the C programming language. The actual definitions are in libc. I was looking at the "write" system call and saw that it returns an "ssize_t" value. That is when I realized that the value RAX held after the system call was probably the system call's return value, so this is how system calls return values (again, I'm not so smart). I checked, and it is indeed true! Let's now consider the following code:

    Code:
    #include <stdio.h>
    
    // Returns whatever the kernel leaves in RAX: the byte count, or a negative errno code.
    long write(int fd, const char* buf, unsigned long len) {
      long val;
      asm volatile (
           "syscall"
           : "=a"(val) // Output
           : "a" (1), "D" (fd), "S" (buf), "d" (len) // Input
           : "memory", "rcx", "r11" // the syscall instruction overwrites rcx and r11
      );
      
      return val;
    }
    
    int main() {   
      const char* hello  = "Hello world!\n";
      const char* name = "John\n";
      long n;
    
      n = write(1, hello, 13);
      printf("System call done! You have written %ld characters!\n\n", n);
    
      n = write(1, name, 5);
      printf("System call done! You have written %ld characters!\n", n);
    
      return 0;
    }
    Compiling and executing this code, we get output saying that the first call has written 13 characters and the second has written 5 characters.

    Now my question: how would I know which value each system call returns? Of course I can use the debugger, but it is not available on my main home machine and this is tedious. Everything I see seems to talk about "unistd.h" and the libc interface rather than the system call itself. Is there any official documentation? For 64-bit Linux at least...
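
    One way to look at what comes back in RAX without a debugger is to return it as a signed value. On x86-64 Linux a failed system call leaves a small negative number there (the negated errno code), which an unsigned return type would hide. The following is only a sketch under that assumption; raw_write and raw_is_error are hypothetical names, not part of libc or the kernel headers.

```c
/* Sketch for x86-64 Linux only. Hypothetical helper names. */

/* Invoke write(2) directly and hand back RAX exactly as the kernel left it. */
long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                                    /* RAX after the call */
        : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)   /* 1 = write on x86-64 */
        : "rcx", "r11", "memory"                       /* syscall overwrites rcx and r11 */
    );
    return ret;
}

/* Values in [-4095, -1] are negated errno codes; anything else is a result. */
int raw_is_error(long ret)
{
    return ret < 0 && ret >= -4095;
}
```

    With this, raw_write(1, "Hello world!\n", 13) should give back 13, while a write to an invalid descriptor such as -1 should give back -9, which is -EBADF.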

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > How would I know which registers are used for the returned value of a register?
    The same way you figure out which registers are used to pass in parameters.

    > So let's consider the "write" system call on 64bit Linux.
    And if your machine happens to be 64-bit ARM rather than 64-bit X86, all bets are off.
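
    To make the architecture dependence concrete, here is a rough sketch of the same three-argument call for x86-64 and for 64-bit ARM Linux. On x86-64 the number goes in rax, the arguments in rdi, rsi and rdx, and the result comes back in rax; on aarch64 the number goes in x8, the arguments in x0 to x2, and the result comes back in x0. raw_syscall3 is a hypothetical name, and SYS_write is taken from <sys/syscall.h> because the call number itself also differs between the two.

```c
#include <sys/syscall.h>   /* SYS_write: the per-architecture call number */

/* Hypothetical helper: one system call with three arguments. */
long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__)
    long ret;
    __asm__ volatile ("syscall"
        : "=a"(ret)                              /* result in rax */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)     /* rax, rdi, rsi, rdx */
        : "rcx", "r11", "memory");
    return ret;
#elif defined(__aarch64__)
    register long x8 __asm__("x8") = nr;         /* call number */
    register long x0 __asm__("x0") = a1;         /* first argument, then result */
    register long x1 __asm__("x1") = a2;
    register long x2 __asm__("x2") = a3;
    __asm__ volatile ("svc 0"
        : "+r"(x0)
        : "r"(x8), "r"(x1), "r"(x2)
        : "memory", "cc");
    return x0;
#else
#error "this sketch only covers x86-64 and aarch64 Linux"
#endif
}
```

    The C-level call is then identical on both machines, for example raw_syscall3(SYS_write, 1, (long)"hi\n", 3).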

    > Is there an official documentation?
    The source code probably.
    The number of architectures Linux is ported to (a couple of dozen or so) would make it all but impossible to maintain this microscopic detail.
    Yes the API for functions like 'write' are well documented, but the minute detail of which parameter goes in which register on which platform is best left in the code.

    > Everything I see seems to talk about "unistd.h"
    There are multiple files with this name.
    $ find usr/ lib* -name "unistd*.h" finds over 100 files on my system.


    Which brings me to my next point, inline assembler is like 'goto'. It's there if you need it, but it should be your last resort solution.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    First of all, good job at using markdown style syntax. It actually makes life much easier when trying to quote different parts of the post. Anyways, let's see

    > The same way you figure out which registers are used to pass in parameters.
    Well... Like I said, I don't know a lot about assembly. I watched some videos on YT and read a [pdf](http://www.egr.unlv.edu/~ed/assembly64.pdf), so I'm not 100% sure that I'm doing things the right way all the time. Also, I can see which registers to use in the "Linux System Call Table for x86 64" by Ryan A. Chapman, and there are other system call tables, but they don't say which register holds the returned value. I tried "read" and it is RAX (on 64-bit x86_64 Linux), so I guess this is the one used for every system call.
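
    For what it's worth, the syscall(2) man page has a per-architecture table listing the instruction, the register for the call number, the register for the return value, and the argument registers. For x86-64 it matches the guess above: the result always comes back in RAX, and arguments four to six go in r10, r8 and r9. A sketch of a six-argument wrapper under that convention follows; raw_syscall6 is a hypothetical name.

```c
/* Sketch for x86-64 Linux only. Hypothetical helper name. */
long raw_syscall6(long nr, long a1, long a2, long a3,
                  long a4, long a5, long a6)
{
    long ret;
    /* r10, r8 and r9 have no constraint letter, so pin them with register variables. */
    register long r10 __asm__("r10") = a4;
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    __asm__ volatile ("syscall"
        : "=a"(ret)                                  /* return value: rax */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3),        /* rax, rdi, rsi, rdx */
          "r"(r10), "r"(r8), "r"(r9)
        : "rcx", "r11", "memory");                   /* syscall overwrites rcx and r11 */
    return ret;
}
```

    A call that really needs all six slots is mmap (number 9 on x86-64): raw_syscall6(9, 0, 4096, 3, 0x22, -1, 0) asks for one anonymous, private, readable and writable page and gives back its address in RAX.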

    > And if your machine happens to be 64-bit ARM rather than 64-bit X86, all bets are off.
    Yeah, sorry. My machine is 64-bit x86_64 Linux. It would be cool to code on a 64-bit ARM Raspberry Pi (or something like that) tho :P

    > The source code probably.
    > The number of architectures Linux is ported to (a couple of dozen or so) would make it all but impossible to maintain this microscopic detail.
    > Yes the API for functions like 'write' are well documented, but the minute detail of which parameter goes in which register on which platform is best left in the code.
    WHAT??? You are kidding me right now, right? Please tell me that this is a joke! So I must find where this info is buried among those millions of lines of code? This reminds me of when I asked how to convert a float to a string and you sent me a link to the glibc source code XD!!!! Seriously now, I would expect them to at least document 64-bit/32-bit x86 and 64-bit ARM. Of course documenting all the platforms would be hell, but at least these 3 that are the most used. Damn, I don't want to imagine what programmers had to deal with before the internet...

    > There are multiple files with this name.
    > $ find usr/ lib* -name "unistd*.h" finds over 100 files on my system.
    First of all, the command has a typo (unless something else is going on). OK, now on to the main point. I don't know why there are so many files, but the file we are talking about is "/usr/include/unistd.h", which is "the name of the header file that provides access to the POSIX operating system API" as described by Wikipedia.

    > Which brings me to my next point, inline assembler is like 'goto'. It's there if you need it, but it should be your last resort solution.
    Well, on a funny note, I would disagree that it is like "goto". "goto" can make some things easier (I love it!!) but I don't think there is anything that you cannot do without it. Inline assembly, on the other hand, is necessary when you want to build something with no dependencies (nothing, no libc, rt, pthread etc.). At least for system calls. And I want to make a system library, so I need it. And hell! I promised myself that I won't give up no matter what! So in the end, thanks a lot for the help and happy holidays my friend!
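
    As a rough idea of what one entry in such a dependency-free library could look like, the sketch below wraps the raw call and turns the kernel's negative return value into the usual "-1 plus error code" convention. It assumes x86-64 Linux; sys_call3, my_write and my_errno are all hypothetical names standing in for whatever the library would actually use.

```c
/* Sketch for x86-64 Linux only. All names here are hypothetical. */

int my_errno;   /* stand-in for libc's errno */

/* Raw three-argument system call; the result is RAX as the kernel left it. */
static long sys_call3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile ("syscall"
        : "=a"(ret)
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)
        : "rcx", "r11", "memory");
    return ret;
}

/* libc-style wrapper: byte count on success, -1 with my_errno set on failure. */
long my_write(int fd, const void *buf, unsigned long len)
{
    long ret = sys_call3(1, fd, (long)buf, (long)len);   /* 1 = write */
    if (ret < 0 && ret >= -4095) {   /* negated errno code from the kernel */
        my_errno = (int)-ret;
        return -1;
    }
    return ret;
}
```

    Keeping the raw call and the error translation in separate functions means every further wrapper only needs its own call number and argument casts.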
    Last edited by rempas; 12-27-2021 at 12:03 PM.

  4. #4
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660

  5. #5
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Thanks! They will come in handy! Tho I actually implemented some more system calls, and it seems that I was right: everything puts the returned value in "RAX" (on my system). So we are fine. Thanks a lot for all the help!
