Thread: _asm - what is it?

  1. #1
    Reverse Engineer maxorator's Avatar
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318

    _asm - what is it?

    I have no idea what _asm does.
    I found no good explanations on Google, and the MSDN explanation didn't make anything clearer. I just understood that it's some kind of assembler which has its own commands.

    Also, I would like somebody to explain to me what this function exactly does (a function from Vice City Multiplayer 0.1c source code):
    Code:
    BYTE CVehicle::GetVehicleSubtype()
    {
    	if(!m_pVehicle) return 0;
    	DWORD dwVehicle = (DWORD)m_pVehicle;
    
    	_asm mov ecx, dwVehicle
    	_asm mov edx, [ecx+288]
    	_asm mov eax, [edx+204]
    
    	_asm and eax, 0F0000h
    	_asm jz ret_car
    	_asm sub eax, 10000h
    	_asm jz ret_bike
    	_asm sub eax, 10000h
    	_asm jz ret_heli
    	_asm sub eax, 20000h
    	_asm jz ret_boat
    	_asm sub eax, 40000h
    	_asm jz ret_plane
    
    	return 0;
    
    ret_car:	return	VEHICLE_SUBTYPE_CAR;
    ret_bike:	return	VEHICLE_SUBTYPE_BIKE;
    ret_heli:	return	VEHICLE_SUBTYPE_HELI;
    ret_boat:	return	VEHICLE_SUBTYPE_BOAT;
    ret_plane:	return	VEHICLE_SUBTYPE_PLANE;
    }

  2. #2
    erstwhile
    Join Date
    Jan 2002
    Posts
    2,227
    ecx, eax and edx are x86 registers. 'mov' means 'move' (it copies a value), 'sub' is 'subtract' and 'and' is a bitwise 'and'. 'jz' means 'jump if the zero flag is set' to the label that follows (the various ret_*); the zero flag is set, unsurprisingly, when a previous operation such as 'and' or 'sub' produces a result of zero. The _asm keyword tells the compiler that what follows is an assembly instruction; in this context the assembly language is used inline.
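
    To make the and/sub/jz chain from the first post concrete, here is a rough C++ sketch of the same logic. The SUBTYPE_* names and their values are placeholders I made up (the real VEHICLE_SUBTYPE_* values aren't shown in the thread), and the function names are hypothetical too. classifyChain follows the assembly step by step; classifySwitch is how one might write it directly, since each 'sub' just peels off one more possible value of the masked bits.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder values: the real VEHICLE_SUBTYPE_* constants are not shown
// in the thread, so these are assumptions for illustration only.
enum VehicleSubtype : std::uint8_t {
    SUBTYPE_NONE = 0,
    SUBTYPE_CAR,
    SUBTYPE_BIKE,
    SUBTYPE_HELI,
    SUBTYPE_BOAT,
    SUBTYPE_PLANE
};

// Follows the assembly one instruction at a time.
inline VehicleSubtype classifyChain(std::uint32_t eax) {
    eax &= 0xF0000u;                        // and eax, 0F0000h
    if (eax == 0) return SUBTYPE_CAR;       // jz  ret_car
    eax -= 0x10000u;                        // sub eax, 10000h
    if (eax == 0) return SUBTYPE_BIKE;      // jz  ret_bike
    eax -= 0x10000u;                        // sub eax, 10000h
    if (eax == 0) return SUBTYPE_HELI;      // jz  ret_heli
    eax -= 0x20000u;                        // sub eax, 20000h
    if (eax == 0) return SUBTYPE_BOAT;      // jz  ret_boat
    eax -= 0x40000u;                        // sub eax, 40000h
    if (eax == 0) return SUBTYPE_PLANE;     // jz  ret_plane
    return SUBTYPE_NONE;                    // the plain "return 0"
}

// The same decision written as a switch on the masked bits.
inline VehicleSubtype classifySwitch(std::uint32_t flags) {
    switch (flags & 0xF0000u) {
        case 0x00000u: return SUBTYPE_CAR;
        case 0x10000u: return SUBTYPE_BIKE;
        case 0x20000u: return SUBTYPE_HELI;
        case 0x40000u: return SUBTYPE_BOAT;
        case 0x80000u: return SUBTYPE_PLANE;
        default:       return SUBTYPE_NONE;
    }
}

// Checks that both forms agree for every possible value of the masked bits.
inline void checkAgreement() {
    for (std::uint32_t nibble = 0; nibble < 16; ++nibble) {
        const std::uint32_t flags = nibble << 16;
        assert(classifyChain(flags) == classifySwitch(flags));
    }
}
```

    So 'and' does not compare anything by itself; it keeps only bits 16-19, and each 'jz' then asks "is what's left exactly zero?".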

    Assembly language has a one-to-one relationship between mnemonic commands and CPU instructions, whereas higher-level languages such as C have a one-to-many relationship, i.e., one statement in C will usually translate to many CPU instructions.

    For a complete list of CPU instructions, visit Intel's website and download the instruction set manual for your processor.
    Caution: this person may be a carrier of the misinformation virus.

  3. #3
    Reverse Engineer maxorator's Avatar
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318
    Did I get it right that...
    Code:
    _asm mov ecx, dwVehicle
    _asm mov edx, [ecx+288]
    _asm mov eax, [edx+204]
    ... this basically means:
    Code:
    ecx=dwVehicle;
    edx=ecx+288;
    eax=edx+204;
    And this:
    Code:
    	_asm and eax, 0F0000h
    	_asm jz ret_car
    Means if eax is equal to 0F0000 where "h" means it's hex and if it is, it goes to ret_car and this...
    Code:
    	_asm sub eax, 10000h
    	_asm jz ret_bike
    ... subtracts 10000 from eax and then checks again if it's equal now and if it is, it goes to ret_bike etc.

    I think I didn't get it right...
    This doesn't seem to be something very long so can someone please code it into normal C++?
    Last edited by maxorator; 06-26-2006 at 06:48 AM.

  4. #4
    Registered User (TNT)'s Avatar
    Join Date
    Aug 2001
    Location
    UK
    Posts
    339
    Just to point out that when you say eax=edx+204, that's incorrect. [edx+204] is basically a pointer dereference: edx+204 is the memory address, and the [] refer to the item stored at that address. So mov eax, [edx+204] copies the contents of that memory location into eax.

    Also, subtracting 10000h does not subtract 10000; the h denotes that it's hex, so the value is 65536 in decimal.
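
    A small C++ sketch of the difference, in case it helps. load32 is a hypothetical helper name, not something from the VC:MP source; it reads the 32-bit value stored at base+offset, which is what the brackets do. Plain address arithmetic without the load (eax = edx + 204) would be 'lea eax, [edx+204]' in assembly instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads the 32-bit value stored `offset` bytes past `base`, i.e. the
// effect of `mov eax, [base+offset]`. memcpy is used so the read is
// valid even if the address is not 4-byte aligned.
inline std::uint32_t load32(const unsigned char* base, std::size_t offset) {
    assert(base != nullptr);
    std::uint32_t value;
    std::memcpy(&value, base + offset, sizeof value);
    return value;
}

// The trailing h marks a hexadecimal literal: 10000h is 0x10000.
static_assert(0x10000 == 65536, "10000h is 65536 in decimal");
```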
    Last edited by (TNT); 06-26-2006 at 07:51 AM.
    TNT
    You Can Stop Me, But You Cant Stop Us All

  5. #5
    Reverse Engineer maxorator's Avatar
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318
    Does it check if it's equal, smaller, larger, what does it actually compare?

    And is the first value (ecx) the memory address of dwVehicle or the value of dwVehicle?

  7. #7
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    I'm left scratching my head as to why someone would write such a simple function in assembly... to kill portability, and to obfuscate code? (Though you could comment assembly all the same.)

    To me: a well-written C/C++ program will generally not need assembly. When it does need it, you'll know why. (The code does processor-specific things, the code needs extreme speed, etc.) Other than that, mixing assembly with C (when not needed) should be avoided.
    With this code, aside from different representations in compilers breaking it, there are always issues of how structures/classes are packed. If m_pVehicle is a pointer to a structure (I'm assuming so, given the 'p' and the large offsets), then future changes to the structure could break the offsets in the assembly...
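
    A minimal sketch of the packing issue, using two made-up structs (PackedExample and DefaultExample are not from the code in question). With #pragma pack(1) the second member sits at byte offset 1; without it the compiler is free to insert padding, and on common compilers the same member usually ends up at offset 4. A hard-coded offset in assembly only matches one of those layouts.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct PackedExample {          // hypothetical struct, no padding allowed
    std::uint8_t  flag;
    std::uint32_t value;
};
#pragma pack(pop)

struct DefaultExample {         // same members, default alignment rules
    std::uint8_t  flag;
    std::uint32_t value;
};

// Under pack(1) the layout is fully determined by the member sizes.
static_assert(offsetof(PackedExample, value) == 1, "no padding after flag");
static_assert(sizeof(PackedExample) == 5, "1 + 4 bytes");

// The default layout is implementation-defined; it can only be at least
// as spread out as the packed one.
static_assert(offsetof(DefaultExample, value) >= offsetof(PackedExample, value),
              "padding can only push the member further out");
```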
    long time; /* know C? */
    Unprecedented performance: Nothing ever ran this slow before.
    Any sufficiently advanced bug is indistinguishable from a feature.
    Real Programmers confuse Halloween and Christmas, because dec 25 == oct 31.
    The best way to accelerate an IBM is at 9.8 m/s/s.
    recursion (re - cur' - zhun) n. 1. (see recursion)

  8. #8
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    I have to agree with Cactus_Hugger, that asm insert seems entirely gratuitous and completely unnecessary.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  9. #9
    Reverse Engineer maxorator's Avatar
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318
    Does AND compare if they're equal?
    Are 288 and 204 hex values?
    Does condition after SUB mean if the value is 0 now after subtracting?
    Do [] brackets change anything in the first lines?

    Could anyone please rewrite it into C++ code then?
    I still don't totally understand it.
    Last edited by maxorator; 06-27-2006 at 03:00 AM.

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Ok, roughly speaking

    > DWORD dwVehicle = (DWORD)m_pVehicle;
    m_pVehicle is a pointer, which we cast into a dword to store in a register

    > _asm mov edx, [ecx+288]
    This is basically either
    var = m_pVehicle->member;
    var = m_pVehicle[288/sizeof(type)];

    Exactly which member depends on how the compiler laid out that structure, or the size of each element of the array. Knowing the type of m_pVehicle would help make sense of the offset value.

    > _asm mov eax, [edx+204]
    Ditto again - another pointer dereference, either an offset into another struct, or an array index.
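
    As a rough sketch of that two-step dereference: the original runs in a 32-bit process where pointers are 4 bytes, so to keep this runnable anywhere I'm simulating memory as a byte array where an "address" is just an index. FakeMemory and readSubtypeFlags are names I invented for the illustration, not anything from the real source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simulated flat 32-bit address space (an assumption for illustration):
// an "address" is simply an index into `bytes`.
struct FakeMemory {
    std::vector<unsigned char> bytes;

    std::uint32_t read32(std::uint32_t addr) const {
        assert(static_cast<std::size_t>(addr) + 4 <= bytes.size());
        std::uint32_t v;
        std::memcpy(&v, &bytes[addr], sizeof v);
        return v;
    }

    void write32(std::uint32_t addr, std::uint32_t v) {
        assert(static_cast<std::size_t>(addr) + 4 <= bytes.size());
        std::memcpy(&bytes[addr], &v, sizeof v);
    }
};

// Mirrors the first four instructions of GetVehicleSubtype.
inline std::uint32_t readSubtypeFlags(const FakeMemory& mem,
                                      std::uint32_t vehicleAddr) {
    const std::uint32_t ecx = vehicleAddr;            // mov ecx, dwVehicle
    const std::uint32_t edx = mem.read32(ecx + 288);  // mov edx, [ecx+288]
    const std::uint32_t eax = mem.read32(edx + 204);  // mov eax, [edx+204]
    return eax & 0xF0000u;                            // and eax, 0F0000h
}
```

    In other words, the value at offset 288 is itself treated as a pointer, and the flags live 204 bytes into whatever it points at.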

    Post some types if you want specifics.

  11. #11
    Reverse Engineer maxorator's Avatar
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318
    Structures used:
    Code:
    typedef struct _VECTOR {
    	float X,Y,Z;
    } VECTOR, *PVECTOR;
    
    typedef struct _MATRIX4X4 {
    	VECTOR vLookRight;
    	float  pad_r;
    	VECTOR vLookUp;
    	float  pad_u;
    	VECTOR vLookAt;
    	float  pad_a;
    	VECTOR vPos;
    	float  pad_p;
    } MATRIX4X4, *PMATRIX4X4;
    
    #define _pad(x,y) BYTE x[y]
    
    #pragma pack(1)
    typedef struct _ENTITY_TYPE {
    
    	DWORD	func_table; // 0
    	MATRIX4X4 mat; // 4-68
    	_pad(__pad0,8); // 68-76
    	PDWORD	pModel; // 76-80
    	BYTE	nControlFlags; // 80-81
    	BYTE	nControlFlags2; // 81-82
    	_pad(__pad1,10); // 82-92
    	WORD	nModelIndex; // 92-94
    	_pad(__pad2,18); // 94-112
    	VECTOR  vecMoveSpeed; // 112-124
    	VECTOR  vecTurnSpeed; // 124-136
    	_pad(__pad3,146); // 136-282
    	BYTE	byteSunkFlags; // 282-283
    	BYTE	byteLockedFlags; // 283-284
    
    } ENTITY_TYPE;
    
    #pragma pack(1)
    typedef struct _VEHICLE_TYPE {
    	ENTITY_TYPE entity; // 0-284
    
    	_pad(__pad0a,132); // 284-416
    	BYTE	byteColor1; // 416-417
    	BYTE	byteColor2; // 417-418
    	_pad(__pad1a,6); // 418-424
    	PED_TYPE * pDriver; // 424-428
    	PED_TYPE * pPassengers[7]; // 428-456 (probably 8)
    	_pad(__pad2a,4); // 456-460
    	BYTE	bytePassengersCount; // 460-461
    	_pad(__pad2b,3); // 461-464
    	BYTE	byteMaxPassengers; // 464-465
    	_pad(__pad3a,23); // 465-488
    	float	fSteerAngle1; // 488-492
    	float	fSteerAngle2; // 492-496
    	float	fAcceleratorPedal; // 496-500
    	float	fBrakePedal; // 500-504
    	_pad(__pad4a,12); // 504-516
    	float	fHealth; // 516-520
    	_pad(__pad5a,40); // 520-560
    	DWORD	dwDoorsLocked; // 560-564
    	_pad(__pad6a,4); // 564-568
    	PDWORD  pdwDamageEntity; // 568-572
    	DWORD	nRadio; // 572-576
    	BYTE	byteHorn; // 576-577
    	DWORD   dwUnk1; // 577-581
    	BYTE	byteSiren; // 581-582
    	_pad(__pad7a,874); // 582-1456
    	float	fSpecialWeaponRotation1; // 1456 (following 2 are rhino turret and firetruck spray)
    	float	fSpecialWeaponRotation2; // 1460	
    	/// ............
    } VEHICLE_TYPE;
    And:
    Code:
    VEHICLE_TYPE	*m_pVehicle;
    One strange question - does an executable file consist of assembler commands?
    Last edited by maxorator; 06-27-2006 at 08:35 AM.

  12. #12
    int x = *((int *) NULL); Cactus_Hugger's Avatar
    Join Date
    Jul 2003
    Location
    Banks of the River Styx
    Posts
    902
    AND does a bitwise and on the two operands. (EAX and 0x0F0000) The result is in EAX.

    An executable file consists of machine code (and some other stuff). It's what assembly source code is translated to. For the most part, 1 line of assembly = 1 machine instruction, so it's almost a straightforward translation. C/C++ ends up as machine code too, and some compilers will output assembly code if you ask them to.
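
    To show how direct that translation is, here is a toy decoder sketch. It only understands one instruction form in 32-bit code with no prefixes, 'mov r32, [r32+disp32]' (opcode 8B with a ModRM byte whose mod field is 10 and no SIB byte), and gives up on everything else. decodeMovLoad is a hypothetical name, and a real disassembler handles vastly more than this; it also prints the displacement as unsigned, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decodes only `mov r32, [r32+disp32]`; returns "" for anything else.
inline std::string decodeMovLoad(const unsigned char* code, std::size_t length) {
    static const char* const regs[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
    };
    assert(code != nullptr);
    if (length < 6 || code[0] != 0x8B) return "";

    const unsigned modrm = code[1];
    const unsigned mod = modrm >> 6;          // addressing mode
    const unsigned reg = (modrm >> 3) & 7u;   // destination register
    const unsigned rm  = modrm & 7u;          // base register
    if (mod != 2u || rm == 4u) return "";     // rm == 4 would mean a SIB byte

    // The 32-bit displacement is stored least significant byte first.
    const std::uint32_t disp =
        static_cast<std::uint32_t>(code[2]) |
        (static_cast<std::uint32_t>(code[3]) << 8) |
        (static_cast<std::uint32_t>(code[4]) << 16) |
        (static_cast<std::uint32_t>(code[5]) << 24);

    return std::string("mov ") + regs[reg] + ", [" + regs[rm] + "+" +
           std::to_string(disp) + "]";
}
```

    Feeding it the six bytes 8B 82 CC 00 00 00 gives back "mov eax, [edx+204]", and 8B 91 20 01 00 00 gives "mov edx, [ecx+288]".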

    Assembly is a lot lower level than C, but you can get a lot of power and speed out of it, as you have control over every instruction the processor executes. Of course, it takes more code, and it is generally considered harder to write.

    The assembly makes no sense to me now. That first dereference seems to point to the middle of the structure. Does:
    Code:
    	DWORD dwVehicle = (DWORD)m_pVehicle;
    
    	_asm mov ecx, dwVehicle
    	_asm mov edx, [ecx+288]
    Not access m_pVehicle + 288 bytes? And is that not in the middle of the first pad? But, perhaps I'm reading this entirely wrong.
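
    One way to check that reading against the posted layout is with offsetof. This sketch uses cut-down stand-ins for the structs (EntityModel and VehicleModel are hypothetical names; the entity is collapsed to a 284-byte blob and only the first members of VEHICLE_TYPE are modelled), so it only tests the offsets quoted in the comments, not the real game data.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct EntityModel {                 // stand-in for ENTITY_TYPE, 0-284
    unsigned char bytes[284];
};

struct VehicleModel {                // stand-in for the start of VEHICLE_TYPE
    EntityModel   entity;            // 0-284
    unsigned char pad0a[132];        // 284-416, the first _pad
    std::uint8_t  byteColor1;        // 416-417
};
#pragma pack(pop)

// True if a 4-byte read at `offset` lies entirely inside the first pad.
constexpr bool insideFirstPad(std::size_t offset) {
    return offset >= offsetof(VehicleModel, pad0a) &&
           offset + 4 <= offsetof(VehicleModel, byteColor1);
}

static_assert(offsetof(VehicleModel, pad0a) == 284, "pad starts at 284");
static_assert(insideFirstPad(288), "[ecx+288] reads from the unnamed pad");
```

    So, going by the posted offsets, the dword at +288 does sit in the unnamed region right after the entity part.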
    Last edited by Cactus_Hugger; 06-27-2006 at 10:45 AM.

  13. #13
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    I think I see what's happened.

    It isn't a C++ program at all, it's a barely reverse engineered old ASM program with random bits converted to C++ (more likely just C in a C++ file).

    All those _pad()'s are basically "I don't know what that bit does yet".

  14. #14
    Reverse Engineer maxorator's Avatar
    Join Date
    Aug 2005
    Location
    Estonia
    Posts
    2,318
    Vice City Multiplayer gave a name only to some memory addresses they needed to use in their code.

    Can machine code be converted to asm? I mean someone already said that it's straightforward translation.
    How is machine code built up of asm?
    I want to learn as much as I can about these things
    Last edited by maxorator; 06-27-2006 at 01:35 PM.
