Thread: Size of Assembly instructions

  1. #1
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102

    Size of Assembly instructions

    Hello. I'm not sure where to post this, but here goes.

    I need to read instructions one by one from a binary, but the fact that IA-32 instructions are not fixed-size makes it difficult. Is there some resource available on the Internet that has the size information for every opcode?

    Here's an example
    Code:
    //I could read the first byte and determine the length of the rest of the instruction if I had some sort of a reference.
    
    B8 78 56 34 12          mov eax, 0x12345678
    //B8 would suggest that it's an instruction that moves a dword into eax, so I'd have to read an additional 4 bytes to complete the instruction
    
    6A 3E                push 0x3e
    //6A, the opcode to push a byte value on the stack, would mean that I only need to read one more byte
    So, is there a list with all the opcodes and their instruction size? Thanks!
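
The idea in the question (look at the first byte, then know how many more bytes to read) can be sketched as a small lookup. This is only a minimal sketch under stated assumptions: 32-bit code, no prefix bytes, and only a handful of one-byte opcodes that take no ModR/M byte. The function name `simple_insn_len` is hypothetical and is not taken from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total length in bytes of the instruction starting at
 * code[0], for a few one-byte opcodes that have no ModR/M byte.
 * Assumes 32-bit code and no prefixes. Returns 0 for anything not listed. */
size_t simple_insn_len(const uint8_t *code)
{
    assert(code != NULL);
    uint8_t op = code[0];

    if (op >= 0x50 && op <= 0x5F) return 1;  /* push r32 / pop r32 */
    if (op >= 0xB0 && op <= 0xB7) return 2;  /* mov r8, imm8 */
    if (op >= 0xB8 && op <= 0xBF) return 5;  /* mov r32, imm32 */

    switch (op) {
    case 0x90: return 1;  /* nop */
    case 0xC3: return 1;  /* ret */
    case 0x6A: return 2;  /* push imm8 */
    case 0xEB: return 2;  /* jmp rel8 */
    case 0x68: return 5;  /* push imm32 */
    case 0xE8: return 5;  /* call rel32 */
    case 0xE9: return 5;  /* jmp rel32 */
    default:   return 0;  /* not covered by this sketch */
    }
}
```

For the two examples above, the bytes B8 78 56 34 12 give 5 and 6A 3E gives 2. Anything that needs a ModR/M byte or a prefix falls through to 0 here.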

  2. #2
    Password:
    Join Date
    Dec 2009
    Location
    NC
    Posts
    587
    There are libraries for dealing with x86 opcodes, but I don't know if they can do that. A Synaptic search for "opcode" returns libasm0, which might be useful.

  3. #3
    Password:
    Join Date
    Dec 2009
    Location
    NC
    Posts
    587
    Or you could do it the fun way... Here's a good ref: coder32 edition | X86 Opcode and Instruction Reference 1.11

  4. #4
    Registered User
    Join Date
    Oct 2006
    Posts
    250
    Intel® 64 and IA-32 Architectures Software Developer's Manuals

    You will want to take a look at parts 2A and 2B.

  5. #5
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102
    Thanks for the links, although it still seems to be a lot of work and hassle to get the size from that information (that's what I'll resort to if nothing better shows up). Is there a summary somewhere with the size of each opcode's instruction in bytes?

  6. #6
    Registered User
    Join Date
    Oct 2006
    Posts
    250
    Laziness will get you nowhere. Did you even bother to take a look at the referenced documents?

    Hint: Part 2B, Appendix A. I can assure you, that's the most concise, clear, not to mention complete overview you will find, anywhere.

  7. #7
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102
    Yes, I had seen the documents before I started this thread. I was just hoping to find a ready-made list of all opcodes taking 1 byte, all opcodes taking 2 bytes and so on. If there's no such resource, I'll work with the Intel manual, thank you for that.

  8. #8
    Guest Sebastiani's Avatar
    Join Date
    Aug 2001
    Location
    Waterloo, Texas
    Posts
    5,708
    Quote Originally Posted by Overlord
    Yes, I had seen the documents before I started this thread. I was just hoping to find a ready-made list of all opcodes taking 1 byte, all opcodes taking 2 bytes and so on. If there's no such resource, I'll work with the Intel manual, thank you for that.
    Unfortunately, it doesn't work like that; the length of the opcode often depends on the value of preceding bytes. Another problem, which isn't so obvious, is that data can be embedded in the code; you'll need to skip over it so that it doesn't get construed as instructions. Also, keep in mind that the code can even generate *additional* code! If you want to handle that, too, you'll need to implement a virtual machine of sorts.

    In other words, be prepared for a major undertaking.
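
One concrete case of length depending on a preceding byte is the 0x66 operand-size prefix: in 32-bit code it turns the 4-byte immediate of `mov r32, imm32` (opcodes B8 to BF) into a 2-byte one. The following is a rough sketch only, assuming 32-bit code and handling just this one prefix and this one opcode range. `mov_imm_len` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of "mov reg, imm" (opcodes B8..BF) in 32-bit
 * code, allowing for an optional 0x66 operand-size prefix.
 * Returns 0 if the bytes are not that instruction. */
size_t mov_imm_len(const uint8_t *code)
{
    assert(code != NULL);
    size_t pos = 0;
    size_t imm = 4;              /* default operand size in 32-bit code */

    if (code[pos] == 0x66) {     /* operand-size override: imm16 instead */
        imm = 2;
        pos++;
    }
    if (code[pos] < 0xB8 || code[pos] > 0xBF)
        return 0;                /* not covered by this sketch */

    return pos + 1 + imm;        /* prefix (if any) + opcode + immediate */
}
```

So B8 78 56 34 12 (mov eax, 0x12345678) is 5 bytes, while 66 B8 34 12 (mov ax, 0x1234) is 4: the same opcode byte, but a different instruction length.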
    Code:
    #include <cmath>
    #include <complex>
    bool euler_flip(bool value)
    {
        return std::pow
        (
            std::complex<float>(std::exp(1.0)), 
            std::complex<float>(0, 1) 
            * std::complex<float>(std::atan(1.0)
            *(1 << (value + 2)))
        ).real() < 0;
    }

  9. #9
    Registered User
    Join Date
    Oct 2006
    Posts
    250
    Quote Originally Posted by Overlord
    Yes, I had seen the documents before I started this thread. I was just hoping to find a ready-made list of all opcodes taking 1 byte, all opcodes taking 2 bytes and so on. If there's no such resource, I'll work with the Intel manual, thank you for that.
    I am not sure why I even bother answering, but here we go, straight from the reference I gave you:

    "A.3 ONE, TWO, AND THREE-BYTE OPCODE MAPS"

    The document then goes on to give you:

    "Table A-2. One-byte Opcode Map: (00H - F7H) *"

    And so on.

    As already mentioned by Sebastiani, disassembling a piece of code is a lot more involved for the majority of cases.

    Now get off your lazy butt and take a look at the references you have been given.

  10. #10
    Registered User
    Join Date
    Jan 2006
    Location
    Latvia
    Posts
    102
    Quote Originally Posted by Sebastiani
    Unfortunately, it doesn't work like that; the length of the opcode often depends on the value of preceding bytes
    Yeah, I'm aware of that, but fortunately it seems the operations with preceding bytes (two- and three-byte opcodes, starting with 0x0F) are not used as often as one-byte operations. In fact they are all MMX and SSE opcodes, which I will not have in the binaries I'm working with. Thanks for the hints.
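
A note of caution on the 0x0F escape byte: it also introduces ordinary integer instructions, for example the near conditional jumps (0F 80 to 0F 8F, followed by a rel32), CPUID (0F A2) and RDTSC (0F 31), so a reader of compiled binaries will likely meet it. Below is a small sketch of how those few cases could be sized, assuming 32-bit code and no prefixes. `escape_insn_len` is a hypothetical name and the list is far from complete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of a few two-byte (0x0F-escaped) integer
 * instructions in 32-bit code without prefixes.
 * Returns 0 for anything not listed. */
size_t escape_insn_len(const uint8_t *code)
{
    assert(code != NULL);
    if (code[0] != 0x0F)
        return 0;                       /* not a two-byte opcode */

    uint8_t op = code[1];
    if (op >= 0x80 && op <= 0x8F)
        return 6;                       /* jcc rel32: 0F 8x + 4-byte offset */
    if (op == 0xA2 || op == 0x31)
        return 2;                       /* cpuid, rdtsc: no operands */

    return 0;                           /* not covered by this sketch */
}
```

For instance, 0F 84 followed by a 4-byte offset (jz/je near) comes out as 6 bytes.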

    Quote Originally Posted by MWAAAHAAA
    straight from the reference I gave you:

    "A.3 ONE, TWO, AND THREE-BYTE OPCODE MAPS"
    As I said, I have looked at those references, and it's not the size of the opcodes I'm after but the size of the whole instructions that the opcodes introduce (I'm only going to work with one-byte opcodes). There is no such information in the manual. Thanks anyway.
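
Even with one-byte opcodes only, a fixed size per opcode is not enough for opcodes that take a ModR/M byte (8B, `mov r32, r/m32`, is one). The ModR/M byte can be followed by a SIB byte and a displacement, so the length has to be computed from it. Here is a minimal sketch of that computation, assuming 32-bit addressing (no 0x67 prefix). `modrm_len` is a hypothetical name, and any immediate operand would have to be added separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes taken by the ModR/M byte plus any SIB
 * byte and displacement, assuming 32-bit addressing.
 * `p` points at the ModR/M byte; p[1] is read only when a SIB byte follows. */
size_t modrm_len(const uint8_t *p)
{
    assert(p != NULL);
    uint8_t mod = p[0] >> 6;     /* top two bits */
    uint8_t rm  = p[0] & 7;      /* low three bits */
    size_t len = 1;              /* the ModR/M byte itself */

    if (mod == 3)
        return len;              /* register operand, nothing follows */

    if (rm == 4) {               /* a SIB byte follows */
        uint8_t base = p[1] & 7;
        len += 1;
        if (mod == 0 && base == 5)
            len += 4;            /* no base register: disp32 */
    } else if (mod == 0 && rm == 5) {
        len += 4;                /* absolute address: disp32 */
    }

    if (mod == 1) len += 1;      /* disp8 */
    if (mod == 2) len += 4;      /* disp32 */
    return len;
}
```

With the opcode byte added, 8B 45 08 (mov eax, [ebp+8]) is 1 + 2 = 3 bytes, 8B 05 plus a 4-byte address is 1 + 5 = 6 bytes, and 89 D8 (mov eax, ebx) is 1 + 1 = 2 bytes.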

    EDIT:
    This is what I was looking for: http://webcache.googleusercontent.co...v&client=opera
    Too bad the site is dead and I have to rely on google's cache. I can't even download the source

    EDIT2:
    I'll use this library: http://ragestorm.net/distorm/
    Thanks for suggesting a library, User Name:
    Last edited by Overlord; 07-31-2010 at 10:05 AM.
