Thread: Calling exit() in hex assembly

  1. #16
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    I forgot to mention my thanks for the example and for clearing up my understanding of where the offset begins.

  2. #17
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,637
    I see you're struggling with the opcodes. I'd suggest using this website as a reference:
    coder64 edition | X86 Opcode and Instruction Reference 1.12

    It was extremely useful to me when I was trying to compile my own x86 bootstrapper. Compiling by hand... omg, it's difficult.
    Devoted my life to programming...

  3. #18
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Quote Originally Posted by GReaper View Post
    I see you're struggling with the opcodes. I'd suggest using this website as a reference:
    coder64 edition | X86 Opcode and Instruction Reference 1.12

    It was extremely useful to me when I was trying to compile my own x86 bootstrapper. Compiling by hand... omg, it's difficult.
    Yeah, I already found that reference, but it's not exactly clear about the bytes that come after the instruction. For example, some say they expect an immediate 16- or 32-bit value, but how does one tell it which to expect and where to send it? I barely managed to understand that the 1st 4 bits of an M byte indicate which register to send it to, but what about the last 4 bits? What do they represent? The bitwise struct I posted earlier shows what I thought it was, judging by the bytes I copied and that very reference, but a reference without links to examples is only useful if you already know how to use the bytes in question.

  4. #19
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,637
    Well, not all instructions follow that simple pattern. Most of the legacy (1-byte opcode) ones do, but then you have 1-byte "opcode+register" instructions, and 2-byte ones follow an entirely different pattern, not to mention abominations such as 3DNow! that append the opcode as an immediate at the end!!! That is why I quit trying to recognize patterns in the opcodes and only concern myself with the standard bitfields, I mean the ModRM and SIB ones.

    Remember that what determines the immediate size is the opcode itself plus any operand-size prefix plus the current CPU mode. For example, "0xB0" is always "mov al, imm8" but "0xB8" can be any of the other three "mov ax/eax/rax, imm16/32/64".
    Devoted my life to programming...
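To make the rule in the post above concrete, here is a minimal C sketch of how a decoder might pick the immediate size for the `0xB0`..`0xBF` MOV opcodes. The names `cpu_mode_t` and `mov_imm_size` are made up for illustration and do not come from the thread; the sketch only covers this one opcode range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, not from the thread. */
typedef enum { MODE_16, MODE_32, MODE_64 } cpu_mode_t;

/* Size in bytes of the immediate that follows a "mov reg, imm" opcode
 * in the range 0xB0..0xBF. `rex_w` is only meaningful in MODE_64. */
static unsigned mov_imm_size(uint8_t opcode, cpu_mode_t mode,
                             bool has_66_prefix, bool rex_w)
{
    assert(opcode >= 0xB0 && opcode <= 0xBF);

    if (opcode < 0xB8)
        return 1;               /* 0xB0..0xB7: mov r8, imm8, always 1 byte */

    if (mode == MODE_64 && rex_w)
        return 8;               /* REX.W takes precedence over 0x66 */

    /* The default operand size is 16 bits in 16-bit code and 32 bits
     * otherwise; the 0x66 prefix switches to the other size. */
    bool default_is_16 = (mode == MODE_16);
    bool is_16 = (default_is_16 != has_66_prefix);
    return is_16 ? 2 : 4;
}
```

For example, `mov_imm_size(0xB8, MODE_64, false, true)` gives 8, while the same opcode with no prefixes in long mode gives 4.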

  5. #20
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Quote Originally Posted by GReaper View Post
    Well, not all instructions follow that simple pattern. Most of the legacy (1-byte opcode) ones do, but then you have 1-byte "opcode+register" instructions, and 2-byte ones follow an entirely different pattern, not to mention abominations such as 3DNow! that append the opcode as an immediate at the end!!! That is why I quit trying to recognize patterns in the opcodes and only concern myself with the standard bitfields, I mean the ModRM and SIB ones.

    Remember that what determines the immediate size is the opcode itself plus any operand-size prefix plus the current CPU mode. For example, "0xB0" is always "mov al, imm8" but "0xB8" can be any of the other three "mov ax/eax/rax, imm16/32/64".
    Could you show me what you mean with some examples of the same action but with varying sizes, please? It would go a long way to getting me started, I think.

  6. #21
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,637
    Sure. The following are for long mode:
    Code:
    0x66 0xB9 0x34 0x12                               ; mov cx, 0x1234
    0xB9 0x34 0x12 0x00 0x00                          ; mov ecx, 0x1234
    0x48 0xB9 0x34 0x12 0x00 0x00 0x00 0x00 0x00 0x00 ; mov rcx, 0x1234
    x86-64 cannibalized the 1-byte INC and DEC instructions to define the REX prefixes. Here, 0x48 is REX.W, which makes the processor use 64-bit operands.
    Last edited by GReaper; 1 Week Ago at 11:17 AM.
    Devoted my life to programming...
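The three encodings in the post above can be reproduced with a small encoder. This is only a sketch for long mode and for registers 0..7 (no REX.B handling); `put_le` and `encode_mov_imm` are hypothetical helper names, not anything from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low `n` bytes of `value`, least significant byte first. */
static size_t put_le(uint8_t *out, uint64_t value, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(value >> (8 * i));
    return n;
}

/* Encode "mov reg, imm" for long mode.
 * `reg` is 0..7 (0 = ax/eax/rax, 1 = cx/ecx/rcx, ...), `bits` is 16, 32 or 64.
 * `out` must have room for 10 bytes. Returns the number of bytes written. */
static size_t encode_mov_imm(uint8_t *out, unsigned reg, unsigned bits,
                             uint64_t imm)
{
    size_t len = 0;

    assert(reg < 8);
    assert(bits == 16 || bits == 32 || bits == 64);

    if (bits == 16)
        out[len++] = 0x66;                /* operand-size prefix */
    if (bits == 64)
        out[len++] = 0x48;                /* REX.W */
    out[len++] = (uint8_t)(0xB8 + reg);   /* opcode + register number */
    len += put_le(out + len, imm, bits / 8);
    return len;
}
```

Calling `encode_mov_imm(buf, 1, 16, 0x1234)` writes `66 B9 34 12`, matching the first line of the listing. The register is selected by the low three bits of the opcode byte itself, so no ModRM byte is involved for this form.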

  7. #22
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Quote Originally Posted by GReaper View Post
    Sure. The following are for long mode:
    Code:
    0x66 0xB9 0x34 0x12                               ; mov cx, 0x1234
    0xB9 0x34 0x12 0x00 0x00                          ; mov ecx, 0x1234
    0x48 0xB9 0x34 0x12 0x00 0x00 0x00 0x00 0x00 0x00 ; mov rcx, 0x1234
    x86-64 cannibalized the 1-byte INC and DEC instructions to define the REX prefixes. Here, 0x48 is REX.W, which makes the processor use 64-bit operands.
    Yep, that definitely helped. I was totally misunderstanding where the definition of which register to use was. I probably won't get round to trying any more until Monday though. Anyway, thanks.

  8. #23
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Quote Originally Posted by GReaper View Post
    Sure. The following are for long mode:
    Code:
    0x66 0xB9 0x34 0x12                               ; mov cx, 0x1234
    0xB9 0x34 0x12 0x00 0x00                          ; mov ecx, 0x1234
    0x48 0xB9 0x34 0x12 0x00 0x00 0x00 0x00 0x00 0x00 ; mov rcx, 0x1234
    x86-64 cannibalized the 1-byte INC and DEC instructions to define the REX prefixes. Here, 0x48 is REX.W, which makes the processor use 64-bit operands.
    Now that I think of it, could I also have an example for the lea instruction? It uses extra bytes to define which register, and while I can guess the 1st 4 bits of those bytes, I don't know what the last 4 bits represent.

  9. #24
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,637
    LEA is complicated (in long mode, anything accessing memory is complicated, really...), so I'll just show you examples for 32-bit mode:
    Code:
    0x67 0x66 0x8D 0x45 0x0C ; lea  ax, [di+12] ;
    0x67      0x8D 0x45 0x0C ; lea eax, [di+12] ; 16-bit addressing
    
    0x66 0x8D 0x47 0x0C ; lea  ax, [edi+12] ;
         0x8D 0x47 0x0C ; lea eax, [edi+12] ; 32-bit addressing
    
    ; Here are some examples with the 32-bit SIB byte
    0x8D 0x04 0x7D 0x00 0xC0 0x07 0x00 ; lea eax, [edi*2 + 0x7C000]
    0x8D 0x0C 0x82                     ; lea ecx, [eax*4+edx]
    In long mode, you no longer have access to 16-bit addressing. Rather, 64-bit is the default and 0x67 changes to 32-bit.

    EDIT: Now that I think about it though, the 64-bit ModRM and SIB bytes behave identically to the 32-bit ones, assuming the upper 32 bits of the operands are cleared and no REX prefixes are present. So, the 32-bit examples I gave above, in long mode are:
    Code:
    ; Same exact binary values
    
    lea ax, [rdi+12]  ;
    lea eax, [rdi+12] ; 64-bit addressing
    
    lea eax, [rdi*2 + 0x7C000]
    lea ecx, [rax*4+rdx]
    Last edited by GReaper; 1 Week Ago at 09:30 AM.
    Devoted my life to programming...
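On the earlier question about what the bits of those extra bytes mean: the ModRM and SIB bytes are not split 4/4 but 2/3/3. ModRM is mod (2 bits), reg (3 bits), r/m (3 bits); SIB is scale (2 bits), index (3 bits), base (3 bits). Here is a minimal C sketch that splits the bytes used in the LEA examples above. The type and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { unsigned mod, reg, rm; } modrm_t;      /* 2 + 3 + 3 bits */
typedef struct { unsigned scale, index, base; } sib_t;  /* 2 + 3 + 3 bits */

/* Split a ModRM byte into its three fields. */
static modrm_t decode_modrm(uint8_t b)
{
    modrm_t m = { (b >> 6) & 3u, (b >> 3) & 7u, b & 7u };
    return m;
}

/* Split a SIB byte; `scale` is returned as the multiplier 1, 2, 4 or 8. */
static sib_t decode_sib(uint8_t b)
{
    sib_t s = { 1u << ((b >> 6) & 3u), (b >> 3) & 7u, b & 7u };
    return s;
}
```

With 32-bit addressing, `decode_modrm(0x47)` gives mod = 1 (an 8-bit displacement follows), reg = 0 (eax) and rm = 7 (edi), which is `[edi+disp8]`. `decode_modrm(0x0C)` gives reg = 1 (ecx) and rm = 4, which means a SIB byte follows; `decode_sib(0x82)` then gives scale = 4, index = 0 (eax) and base = 2 (edx), i.e. `[eax*4+edx]`.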

  10. #25
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Okay, still struggling. Here's the produced ELF I'm trying to get running:
    test.elf - Google Drive
    The section of code that initialises _start() & main():
    Code:
    start_t begin__start = {
    	/* mov edi,DWORD PTR [rsp] #Move argc into param1 register */
    	{ 0x8B, 0x3C, 0x24},
    	/* lea rsi,[rsp+0x8] #Move argv into param2 register */
    	{ 0x48, 0x8D, 0x74, 0x24, 0x08 },
    	/* call main() */
    	{ 0xE8, (2 + sizeof(syscall_t)) },
    	/* mov edi,eax #Move result of main() into param1 register */
    	{ 0x89, 0xC7 },
    	{
    		/* mov eax,id? of exit() #Set exit() for syscall instruction */
    		{ 0xB8, __NR_exit },
    		/* syscall */
    		0x50F
    	}
    };
    main_t begin_main = {
    #ifdef USE_STACK
    	/* push rbp */
    	{ 0x48, 0x55 },
    	/* mov rbp,rsp */
    	{ 0x48, 0x89, 0xE5 },
    #endif
    	/* mov eax,0x00000000 */
    	{ 0xB8, 0 },
    #ifdef USE_STACK
    	/* pop rbp */
    	{ 0x48, 0x5D },
    #endif
    	/* ret */
    	0xC3
    };
    And the output of readelf:
    Code:
    readelf -all test.elf
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 ff 00 00 00 00 00 00 00 00 
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            <unknown: ff>
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x298
      Start of program headers:          64 (bytes into file)
      Start of section headers:          176 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         2
      Size of section headers:           64 (bytes)
      Number of section headers:         4
      Section header string table index: 1
    readelf: Warning: Section 2 has an out of range sh_info value of 7
    
    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .shstrtab         STRTAB           00000000000001b0  000001b0
           0000000000000040  0000000000000001  AS       0     0     1
      [ 2] .symtab           SYMTAB           00000000000001f0  000001f0
           00000000000000a8  0000000000000018 WAI       1     7     8
      [ 3] .text             PROGBITS         0000000000000298  00000298
           000000000000001c  0000000000000001 WAXlp       0     0     1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
      L (link order), O (extra OS processing required), G (group), T (TLS),
      C (compressed), x (unknown), o (OS specific), E (exclude),
      l (large), p (processor specific)
    
    There are no section groups in this file.
    
    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      NULL           0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000000 0x0000000000000000         0x0
      LOAD           0x0000000000000000 0x0000000000000298 0x0000000000000298
                     0x000000000000001c 0x000000000000001c  RWE    0x0
    
     Section to Segment mapping:
      Segment Sections...
       00     
       01     
    
    There is no dynamic section in this file.
    
    There are no relocations in this file.
    
    The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
    
    Symbol table '.symtab' contains 7 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
         1: 00000000000001b0    64 SECTION LOCAL  DEFAULT    1 .shstrtab
         2: 00000000000001f0   168 SECTION LOCAL  DEFAULT    2 .symtab
         3: 0000000000000298    28 SECTION LOCAL  DEFAULT    3 .text
         4: 0000000000000298    28 FILE    LOCAL  DEFAULT  ABS mitsy.c
         5: 0000000000000298    22 NOTYPE  GLOBAL DEFAULT    3 _start
         6: 00000000000002ae     6 FUNC    GLOBAL DEFAULT    3 main
    
    No version information found in this file.
    Any ideas? I get the feeling that perhaps the problem is not in the executable bytes but in how I'm declaring them. I outright copied some from a sample ELF that was forced to the bare minimum at the start, which is how I got the argc/argv stuff and the setup of the exit() call. The only thing I actually modified to begin with was the relative address in the call to main(), because the address it referenced was no longer valid.
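One thing in the readelf output that may be worth checking, offered as an observation rather than a confirmed diagnosis: the LOAD program header has Offset 0 and FileSiz 0x1c, while .text sits at file offset 0x298. Read literally, that segment describes the first 0x1c bytes of the file, not the code. The ELF specification also expects p_vaddr and p_offset of a loadable segment to be congruent modulo the page size. The sketch below checks both conditions. `load_seg_t` and the function names are hypothetical, and the 4096-byte page size and the 0x400298 address in the usage note are assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a PT_LOAD program header. */
typedef struct {
    uint64_t offset;  /* p_offset: where the segment starts in the file */
    uint64_t vaddr;   /* p_vaddr:  where it is mapped in memory */
    uint64_t filesz;  /* p_filesz: how many file bytes it covers */
} load_seg_t;

/* True if the file range [off, off + size) lies inside the segment's
 * file image [offset, offset + filesz). */
static bool seg_covers_file_range(load_seg_t seg, uint64_t off, uint64_t size)
{
    return off >= seg.offset
        && size <= seg.filesz
        && off - seg.offset <= seg.filesz - size;
}

/* True if p_vaddr and p_offset are congruent modulo `page`. */
static bool seg_is_congruent(load_seg_t seg, uint64_t page)
{
    assert(page != 0);
    return seg.vaddr % page == seg.offset % page;
}
```

With the values from the output above, `{ 0x0, 0x298, 0x1c }`, both checks fail for the .text range (offset 0x298, size 0x1c) and a 4096-byte page. A header such as `{ 0x298, 0x400298, 0x1c }` would pass both, though the entry point and section addresses would then have to agree with whatever address is chosen.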

  11. #26
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,637
    That may not be the problem, but I see you use "0x48" with push and pop. I don't think that's right, because in long mode push and pop already default to a 64-bit operand size and have no 32-bit form.

    EDIT: Oh, I just realized. Your mov instruction is wrong. If you want to use an immediate, it should be 5 bytes, not 2 (since the immediate is 32-bit).
    Last edited by GReaper; 6 Days Ago at 07:04 AM.
    Devoted my life to programming...
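As a sketch of the point about push and pop: these are "opcode+register" instructions (0x50+r for push, 0x58+r for pop), and in long mode they need no REX.W. A REX prefix is only needed to reach r8..r15, and then it is REX.B (0x41), not REX.W. `encode_push_pop` is a hypothetical helper name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode "push r64" (is_pop == 0) or "pop r64" (is_pop != 0) for long mode.
 * `reg` is the register number 0..15 (5 = rbp). `out` must have room for
 * 2 bytes. Returns the number of bytes written (1 or 2). */
static size_t encode_push_pop(uint8_t *out, unsigned reg, int is_pop)
{
    size_t len = 0;

    assert(reg < 16);

    if (reg >= 8)
        out[len++] = 0x41;  /* REX.B: extends the register field to r8..r15 */
    out[len++] = (uint8_t)((is_pop ? 0x58 : 0x50) + (reg & 7));
    return len;
}
```

So `push rbp` is the single byte `55` and `pop rbp` is `5D`, while `push r12` comes out as `41 54`.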

  12. #27
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Quote Originally Posted by GReaper View Post
    That may not be the problem, but I see you use "0x48" with push and pop. I don't think that's right, because in long mode push and pop already default to a 64-bit operand size and have no 32-bit form.

    EDIT: Oh, I just realized. Your mov instruction is wrong. If you want to use an immediate, it should be 5 bytes, not 2 (since the immediate is 32-bit).
    Ahh kk, thx. I put those in thinking they would upgrade ebp to rbp; since they're not needed, I'll just remove them.

    Edit: lol, I must've caught your message before it was modified when reloading but caught the modified one when quoting. That mov is actually declared via a struct now, hence only 2 integers:
    Code:
    typedef struct ATTR_PACKED {
    	u8 x;
    	u32 s;
    } x8s32_t;
    Last edited by awsdert; 6 Days Ago at 07:10 AM.
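To show why two initialisers can still produce the 5 bytes that `mov eax, imm32` needs, here is a minimal sketch built around the struct from the post above. It assumes `ATTR_PACKED` expands to the GCC/Clang `packed` attribute and that `u8`/`u32` are the fixed-width types; neither definition appears in the thread. `emit_mov_eax` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: this is what ATTR_PACKED expands to (GCC/Clang only). */
#define ATTR_PACKED __attribute__((packed))

typedef uint8_t  u8;   /* assumed definitions of the thread's u8/u32 */
typedef uint32_t u32;

typedef struct ATTR_PACKED {
    u8  x;  /* opcode byte */
    u32 s;  /* 32-bit immediate, stored directly after the opcode */
} x8s32_t;

/* Without the packed attribute the compiler would normally insert 3 bytes
 * of padding after `x`, which would corrupt the instruction stream. */
_Static_assert(sizeof(x8s32_t) == 5, "packed struct must be 5 bytes");

/* Copy the in-memory bytes of "mov eax, imm" into `out` (5 bytes).
 * The immediate only lands in the right byte order on a little-endian
 * host, which is the case when building on x86-64 itself. */
static size_t emit_mov_eax(uint8_t *out, u32 imm)
{
    x8s32_t insn = { 0xB8, imm };

    assert(out != NULL);
    memcpy(out, &insn, sizeof insn);
    return sizeof insn;
}
```

For instance, `emit_mov_eax(buf, 60)` produces `B8 3C 00 00 00` on a little-endian machine (60 is `__NR_exit` on x86-64 Linux).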

  13. #28
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    Here's my current code with the elfs I was referencing:
    mitsy.tar.gz - Google Drive

  14. #29
    Registered User awsdert's Avatar
    Join Date
    Jan 2015
    Posts
    358
    If y'all are wondering what the progress is, it's zilch. Since I couldn't think of how to fix the execution problem, I decided to first redesign my code so it no longer requires fixed positions when building the ELF headers. So far, as a means of managing memory, I have each header built in a temporary file while sharing a buffer for reading/writing each one. The part I'm doing now is translating my old code to the new code and building the same file with it. When that's done, I'll circle back to the execution issue by trying to replicate ALL the various headers/data seen in the example.elf and sample.elf I created with gcc, instead of just what I thought was needed. If it still fails to execute after that, I'll know it's the byte code and not missing information that's the issue.
