Thread: How to map machine instructions in memory and execute them? (like a loader does)

  1. #16
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    But still I do wonder why the 32bit instructions worked in my 64bit machine
    What did your original assembly code look like?
    How did you create the machine code from it? What assembler did you use? What options?
    How exactly did you create the ELF file?
    As long as the string data was stored at a memory address representable in 32 bits, there's no reason why the code wouldn't work.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  2. #17
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by john.c View Post
    What did your original assembly code look like?
    How did you create the machine code from it? What assembler did you use? What options?
    How exactly did you create the ELF file?
    I didn't use assembly. I just created an ELF file (writing the bytes manually) and then used the
    OS's loader (running it with './' from the shell) to load and execute it. The original code is from
    a GitHub repository called `go-x64-executable`. You can find it here:
    GitHub - vishen/go-x64-executable: Generate ELF Linux 64-bit (x86-64) executable manually

    Also, I have created a version of the program in the D programming language. My version also
    includes the structs and values for some parts of the ELF file (what you would get from the "elf.h"
    header file). Both versions produce the same file, which contains 32-bit instructions while having the `E_CLASS` field set to 2,
    and it runs on my 64-bit machine...
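For reference, here is a minimal sketch of how that class byte can be read back from a file header. It assumes the standard ELF layout from "elf.h", where the byte is called `EI_CLASS` (the post above calls it `E_CLASS`) and sits at offset 4 of `e_ident`; the helper name `elf_class` and the sample bytes are made up for illustration.

```python
# Sketch only: read the class byte from an ELF identification block.
# Layout per the ELF specification (see elf.h):
#   e_ident[0:4] = b"\x7fELF", e_ident[4] = EI_CLASS (1 = 32-bit, 2 = 64-bit).
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4
ELF_CLASS_NAMES = {1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(header: bytes) -> str:
    """Return the class name stored in an ELF header (hypothetical helper)."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF header")
    return ELF_CLASS_NAMES.get(header[EI_CLASS], "unknown class")


# Illustrative e_ident (16 bytes) of a 64-bit little-endian ELF file.
sample_ident = bytes([0x7F, 0x45, 0x4C, 0x46, 2, 1, 1, 0]) + bytes(8)
print(elf_class(sample_ident))
```

The class byte only describes the file format. The instruction bytes themselves carry no "32-bit" or "64-bit" label, so the CPU decodes them according to the mode the process runs in.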

    Of course, using a hex editor (or any editor that lets you enter raw numbers rather than "unicode characters")
    to create the ELF file manually would work too and would produce the same result.

    Quote Originally Posted by john.c View Post
    As long as the string data was stored at a memory address representable in 32 bits, there's no reason why the code wouldn't work.
    Yeah, but addresses aside, the SAME "hex values" for the SAME opcodes mean different things in the 32-bit version and the 64-bit version, as @flp1969 said, so how
    would my CPU react if it saw these instructions? It would treat an opcode as its 64-bit version, which would have a completely different meaning than its 32-bit counterpart. Unless
    I'm getting something wrong...
    Last edited by rempas; 06-08-2022 at 12:08 AM. Reason: making a clarification

  3. #18
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    I didn't say ALL instruction opcodes are different...

  4. #19
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    Yeah, but addresses aside, the SAME "hex values" for the SAME opcodes mean different things in the 32-bit version and the 64-bit version, as @flp1969 said, so how
    would my CPU react if it saw these instructions?
    Yeah, that's not what he said at all. He said a very small number of x86 instructions are different in x64, none of which you used.
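To make "a very small number" concrete, here is a rough sketch listing some one-byte opcodes whose meaning changes in 64-bit mode. The table is deliberately not exhaustive, the helper name `meaning_in_64bit` is invented for this example, and the opcodes checked at the end are generic illustrations rather than the bytes from the file discussed in this thread.

```python
# Sketch only: a partial table of one-byte opcodes that are not the same
# instruction in 64-bit mode as in 32-bit mode. Not a complete list.
INVALID_IN_64BIT = {
    0x06: "push es", 0x07: "pop es", 0x0E: "push cs",
    0x16: "push ss", 0x17: "pop ss", 0x1E: "push ds", 0x1F: "pop ds",
    0x27: "daa", 0x2F: "das", 0x37: "aaa", 0x3F: "aas",
    0x60: "pusha", 0x61: "popa", 0x9A: "call far (direct)",
    0xCE: "into", 0xD4: "aam", 0xD5: "aad", 0xEA: "jmp far (direct)",
}


def meaning_in_64bit(opcode: int) -> str:
    """Describe what a one-byte opcode turns into in 64-bit mode (hypothetical helper)."""
    if 0x40 <= opcode <= 0x4F:
        return "REX prefix (was inc/dec r32)"
    if opcode == 0x63:
        return "movsxd (was arpl)"
    if opcode in INVALID_IN_64BIT:
        return f"invalid (was {INVALID_IN_64BIT[opcode]})"
    return "no change listed"


# 0xB8 = mov eax, imm32 and 0xCD = int imm8 are typical of tiny hand-made
# executables; 0x40 = inc eax in 32-bit code is one of the bytes that changes.
for op in (0xB8, 0xCD, 0x40):
    print(f"{op:#04x}: {meaning_in_64bit(op)}")
```

Code that only uses instructions like `mov` with a 32-bit register and `int` never touches any of the bytes in this table, which is why it behaves the same in both modes.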
    A little inaccuracy saves tons of explanation. - H.H. Munro

  5. #20
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    @flp1969 @john.c

    Oh, sorry guys! And thank you for helping me get started. I'm already starting to feel more confident.
    I also use an assembler (I can't decide between "nasm" and "gas" yet, as they seem to produce different
    output for some instructions) and then "objdump" to see the instructions in binary (well, hex
    actually, but you know what I mean).

  6. #21
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    NASM "optimizes" some instructions... For example, xor rax,rax is sometimes optimized to xor eax,eax (it does the same thing, but is smaller). GAS isn't an optimizing assembler.
    I prefer NASM because of this, and because it is easier to deal with structures in NASM than in GAS.

    I use GAS only for processors other than Intel/AMD.
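A small sketch of the size difference, written as raw bytes. The encodings are the standard ones (opcode 31 /r with ModRM C0, plus the REX.W prefix 48 for the 64-bit form); the Python names are only illustrative.

```python
# Sketch only: the two encodings of "zero RAX" compared byte by byte.
REX_W = 0x48                             # prefix selecting a 64-bit operand size
XOR_EAX_EAX = bytes([0x31, 0xC0])        # xor eax, eax
XOR_RAX_RAX = bytes([0x48, 0x31, 0xC0])  # xor rax, rax


def strip_rex_w(code: bytes) -> bytes:
    """Drop a leading REX.W prefix if present (hypothetical helper)."""
    return code[1:] if code[:1] == bytes([REX_W]) else code


print(len(XOR_EAX_EAX), XOR_EAX_EAX.hex(" "))
print(len(XOR_RAX_RAX), XOR_RAX_RAX.hex(" "))
print(strip_rex_w(XOR_RAX_RAX) == XOR_EAX_EAX)
```

The 64-bit form is the 32-bit form with one extra prefix byte, so dropping the prefix saves a byte without changing the result in RAX.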

  7. #22
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by flp1969 View Post
    NASM "optimizes" some instructions... For example, xor rax,rax is sometimes optimized to xor eax,eax (it does the same thing, but is smaller). GAS isn't an optimizing assembler.
    I prefer NASM because of this, and because it is easier to deal with structures in NASM than in GAS.

    I use GAS only for processors other than Intel/AMD.
    Yeah, that's exactly the behavior I noticed, and I'm personally against it. Well, I wouldn't mind it if I had a way to control this behavior. But it seems that
    even when I type something like this: `mov rax, qword 0x10`, NASM will still emit `mov eax, 0x10`, even though I explicitly told it I want to move a quad-word.
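For comparison, these are the three standard x86-64 encodings that can load 0x10 into RAX, sketched as byte builders. The function names are invented for this example, and which form a given assembler picks depends on its version and options.

```python
# Sketch only: three ways to encode "put 0x10 in RAX" and their sizes.
import struct


def mov_eax_imm32(value: int) -> bytes:
    """mov eax, imm32 (B8 id): 5 bytes, upper half of RAX is zeroed."""
    return b"\xB8" + struct.pack("<I", value)


def mov_rax_simm32(value: int) -> bytes:
    """mov rax, imm32 sign-extended (REX.W C7 /0 id): 7 bytes."""
    return b"\x48\xC7\xC0" + struct.pack("<i", value)


def mov_rax_imm64(value: int) -> bytes:
    """mov rax, imm64 (REX.W B8 io): 10 bytes."""
    return b"\x48\xB8" + struct.pack("<Q", value)


for encode in (mov_eax_imm32, mov_rax_simm32, mov_rax_imm64):
    code = encode(0x10)
    print(f"{encode.__name__:15} {len(code):2} bytes  {code.hex(' ')}")
```

All three leave the same value in RAX for a small positive constant like 0x10, so the only visible difference is the instruction length.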

    I do understand that GAS was created as a backend for GCC, so there is no point in making optimizations as GCC will do them, but still, NASM could at least
    have a flag that lets us choose whether we want optimizations or not. Also, assembly is considered a low-level language, so you probably know what you are doing
    and have control over everything, so there is no reason for the assembler to "help" you. But again, just my humble opinion. I hope everyone can have fun!
    Last edited by rempas; 06-10-2022 at 02:18 AM. Reason: fixed typo

  8. #23
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Quote Originally Posted by rempas View Post
    Yeah, that's exactly the behavior I noticed, and I'm personally against it. Well, I wouldn't mind it if I had a way to control this behavior. But it seems that
    even when I type something like this: `mov rax, qword 0x10`, NASM will still emit `mov eax, 0x10`, even though I explicitly told it I want to move a quad-word.
    mov eax,0x10 isn't like, say, mov al,0x10, because in x86-64 mode, writing to the 32-bit alias (E??) of a 64-bit register (R??) automatically zeroes the upper 32 bits (see the Intel SDM or the AMD developer manuals). Other optimizations can occur if, for example, you use push 0x10. In optimized code this instruction uses an 8-bit immediate (and the corresponding opcode). The same thing happens with conditional jumps (NASM tends to use an 8-bit RIP-relative displacement).
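The register rule above can be modelled in a few lines. This is only a software model of the documented behaviour, not something that runs on the CPU; the helper names and the starting value of RAX are arbitrary, while the byte sequences are the standard short and long encodings.

```python
# Sketch only: model how partial register writes affect a 64-bit register.
MASK64 = (1 << 64) - 1


def write_r32(reg64: int, value: int) -> int:
    """Write to a 32-bit alias (e.g. EAX): the upper 32 bits become zero."""
    return value & 0xFFFFFFFF


def write_r8(reg64: int, value: int) -> int:
    """Write to a low 8-bit alias (e.g. AL): all other bits are preserved."""
    return (reg64 & ~0xFF & MASK64) | (value & 0xFF)


rax = 0xDEADBEEFCAFEBABE          # arbitrary starting value
print(hex(write_r32(rax, 0x10)))  # models mov eax, 0x10
print(hex(write_r8(rax, 0x10)))   # models mov al, 0x10

# Short and long encodings of the other instructions mentioned above.
PUSH_IMM8 = bytes([0x6A, 0x10])                     # push 0x10 with an 8-bit immediate
PUSH_IMM32 = bytes([0x68, 0x10, 0x00, 0x00, 0x00])  # push 0x10 with a 32-bit immediate
JZ_REL8_OPCODE = bytes([0x74])                      # jz with an 8-bit displacement (2 bytes total)
JZ_REL32_OPCODE = bytes([0x0F, 0x84])               # jz with a 32-bit displacement (6 bytes total)
print(len(PUSH_IMM8), len(PUSH_IMM32))
```

Because the 32-bit write clears the upper half, `mov eax, 0x10` and `mov rax, 0x10` leave identical contents in RAX, which is what makes the shorter encoding a safe substitution.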

    Another interesting optimization happens when you code this "non-existent" instruction:

    Code:
    mov eax,[5*eax]
    movl (,%eax,5),%eax    # Invalid on GAS!
    This, in NASM, is translated to:

    Code:
    mov eax,[eax + 4*eax]
    You can force NASM not to optimize some of those instructions by using the -O0 option on the command line. See nasm --help.
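The `[5*eax]` rewrite works because the SIB byte can only express a scale of 1, 2, 4 or 8, so a factor of 5 has to be split into base + 4*index with the same register in both roles. A rough sketch of that split follows; the function names are made up here and this is not NASM's actual code.

```python
# Sketch only: split [n*reg] into a form the SIB byte can encode.
VALID_SCALES = (1, 2, 4, 8)


def split_multiplier(n: int) -> tuple:
    """Return (uses_base, scale): [reg + scale*reg] if uses_base else [scale*reg]."""
    if n in VALID_SCALES:
        return (False, n)
    if n - 1 in VALID_SCALES:  # 3, 5 and 9 become reg + 2/4/8 * reg
        return (True, n - 1)
    raise ValueError(f"[{n}*reg] cannot be encoded")


def sib_byte(scale: int, index: int, base: int) -> int:
    """Pack scale (2 bits), index (3 bits) and base (3 bits) into one SIB byte."""
    return (VALID_SCALES.index(scale) << 6) | (index << 3) | base


EAX = 0  # register number of EAX
uses_base, scale = split_multiplier(5)
print(uses_base, scale)
# In 32-bit code, mov eax,[eax + 4*eax] is opcode 8B, ModRM 04, then the SIB byte.
print(bytes([0x8B, 0x04, sib_byte(scale, EAX, EAX)]).hex(" "))
```

A factor such as 6 or 7 has no such split, which is why only a few "non-existent" multipliers are accepted.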

    Why do you want to create bigger code if you can generate a smaller one?
    Last edited by flp1969; 06-10-2022 at 04:43 AM.

  9. #24
    Registered User
    Join Date
    Oct 2021
    Posts
    138
    Quote Originally Posted by flp1969 View Post
    mov eax,0x10 isn't like, say, mov al,0x10, because in x86-64 mode, writing to the 32-bit alias (E??) of a 64-bit register (R??) automatically zeroes the upper 32 bits (see the Intel SDM or the AMD developer manuals). Other optimizations can occur if, for example, you use push 0x10. In optimized code this instruction uses an 8-bit immediate (and the corresponding opcode). The same thing happens with conditional jumps (NASM tends to use an 8-bit RIP-relative displacement).

    Another interesting optimization happens when you code this "non-existent" instruction:

    Code:
    mov eax,[5*eax]
    movl (,%eax,5),%eax    # Invalid on GAS!
    This, in NASM, is translated to:

    Code:
    mov eax,[eax + 4*eax]
    You can force NASM not to optimize some of those instructions by using the -O0 option on the command line. See nasm --help.

    Why do you want to create bigger code if you can generate a smaller one?
    Again, thank you for the info! And btw, the only way I would prefer to not optimize is if I wanted to check the machine instructions to learn.
