-
_asm - what is it?
I have no idea what _asm does.
I found no good explanations on Google, and the MSDN explanation didn't make anything clearer. I just understood that it's some kind of assembler which has its own commands.
Also, I would like somebody to explain to me what this function exactly does (a function from Vice City Multiplayer 0.1c source code):
Code:
BYTE CVehicle::GetVehicleSubtype()
{
if(!m_pVehicle) return 0;
DWORD dwVehicle = (DWORD)m_pVehicle;
_asm mov ecx, dwVehicle
_asm mov edx, [ecx+288]
_asm mov eax, [edx+204]
_asm and eax, 0F0000h
_asm jz ret_car
_asm sub eax, 10000h
_asm jz ret_bike
_asm sub eax, 10000h
_asm jz ret_heli
_asm sub eax, 20000h
_asm jz ret_boat
_asm sub eax, 40000h
_asm jz ret_plane
return 0;
ret_car: return VEHICLE_SUBTYPE_CAR;
ret_bike: return VEHICLE_SUBTYPE_BIKE;
ret_heli: return VEHICLE_SUBTYPE_HELI;
ret_boat: return VEHICLE_SUBTYPE_BOAT;
ret_plane: return VEHICLE_SUBTYPE_PLANE;
}
-
ecx, eax and edx are x86 registers. The 'mov' instruction means 'move'. 'jz' means 'jump if the zero flag is set' to the label that follows (the various ret_*); the zero flag is set, unsurprisingly, if a previous operation, like 'and' or 'sub', results in zero. 'sub' is 'subtract' and 'and' is a bitwise 'and'. The _asm keyword tells the compiler that what follows is an assembly instruction; used like this, inside C/C++ code, it is called inline assembly.
Assembly language has a roughly one-to-one relationship between mnemonics and CPU instructions, whereas in higher-level languages such as C the relationship is one-to-many, i.e., one statement in C will translate to many CPU instructions.
For a complete list of CPU instructions, visit Intel's website and download the instruction set manual for your processor.
-
Did I get it right that...
Code:
_asm mov ecx, dwVehicle
_asm mov edx, [ecx+288]
_asm mov eax, [edx+204]
... this basically means:
Code:
ecx=dwVehicle;
edx=ecx+288;
eax=edx+204;
And this:
Code:
_asm and eax, 0F0000h
_asm jz ret_car
Means "if eax is equal to 0F0000" (where "h" means it's hex), and if it is, it goes to ret_car, and this...
Code:
_asm sub eax, 10000h
_asm jz ret_bike
... subtracts 10000 from eax and then checks again if it's equal now and if it is, it goes to ret_bike etc.
I think I didn't get it right...
This doesn't seem to be something very long so can someone please code it into normal C++?
-
Just to point out that when you say eax=edx+204, that's incorrect. edx+204 is a memory address, and the [] refer to the contents stored at that address. So mov eax, [edx+204] copies the memory contents at that address into eax, much like dereferencing a pointer.
Also, subtracting 10000h does not subtract 10000; the h denotes that it's hex, so the value is 65536 decimal.
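To make the address-versus-contents difference concrete, here is a minimal C++ sketch. The helper name loadDword is made up for illustration and is not from the VC:MP source; it just shows what a bracketed operand like [edx+204] does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the 32-bit value stored 'offset' bytes past 'base'.
// This mirrors "mov eax, [base+offset]": the brackets mean "load from that address".
// Without the brackets you would only get the address base+offset itself.
std::uint32_t loadDword(const unsigned char* base, std::size_t offset)
{
    std::uint32_t value;
    std::memcpy(&value, base + offset, sizeof value);  // memcpy sidesteps alignment issues
    return value;
}

// 10000h is a hex literal: 0x10000 in C++ notation, which is 65536 in decimal.
static_assert(0x10000 == 65536, "10000h is 65536 decimal");
```

So "eax = edx + 204" would be plain pointer arithmetic, while the real instruction is closer to "eax = loadDword(edx, 204)".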
-
Does it check if it's equal, smaller, larger, what does it actually compare?
And is the first value (ecx) the memory address of dwVehicle or the value of dwVehicle?
-
I'm left scratching my head as to why someone would write such a simple function in assembly... to kill portability, and to obfuscate code? (Though you could comment assembly all the same.)
To me: A well written C/C++ program will generally not need assembly. When it does need it, you'll know why. (Code does processor specific fun, code needs extreme speed, etc.) Other than that, mixing assembly with C (when not needed) should be avoided.
With this code, aside from different representations in compilers breaking it, there are always issues with how structures/classes are packed. If m_pVehicle is a pointer to a structure (I'm assuming so, given the 'p' and the large offsets), then future changes to the structure could break the offsets in the assembly...
-
I have to agree with Cactus_Hugger, that asm insert seems entirely gratuitous and completely unnecessary.
-
Does AND compare if they're equal?
Are 288 and 204 hex values?
Does condition after SUB mean if the value is 0 now after subtracting?
Do [] brackets change anything in the first lines?
Could anyone please rewrite it into C++ code then?
I still don't totally understand it.
-
Ok, roughly speaking
> DWORD dwVehicle = (DWORD)m_pVehicle;
m_pVehicle is a pointer, which we cast into a dword to store in a register
> _asm mov edx, [ecx+288]
This is basically either
var = m_pVehicle->member;
var = m_pVehicle[288/sizeof(type)];
Exactly which member depends on how the compiler laid out that structure, or the size of each element of the array. Knowing the type of m_pVehicle would help make sense of the offset value.
> _asm mov eax, [edx+204]
Ditto again - another pointer dereference, either an offset into another struct, or an array index.
Post some types if you want specifics.
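In the meantime, a rough C++ equivalent using raw byte offsets might look like the sketch below. Treat it as an approximation under assumptions: the VEHICLE_SUBTYPE_* values are placeholders (the real ones aren't shown here), the function name is made up, and the pointer at offset 288 is read at the native pointer size, whereas the original 32-bit code reads 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Placeholder values: the real VEHICLE_SUBTYPE_* constants are not shown in this thread.
enum : std::uint8_t {
    VEHICLE_SUBTYPE_CAR   = 1,
    VEHICLE_SUBTYPE_BIKE  = 2,
    VEHICLE_SUBTYPE_HELI  = 3,
    VEHICLE_SUBTYPE_BOAT  = 4,
    VEHICLE_SUBTYPE_PLANE = 5
};

// Hypothetical stand-in for CVehicle::GetVehicleSubtype(), taking the raw vehicle pointer.
std::uint8_t GetVehicleSubtypeSketch(const unsigned char* pVehicle)
{
    if (!pVehicle) return 0;

    // mov edx, [ecx+288] : load the pointer stored 288 bytes into the vehicle
    const unsigned char* pInner;
    std::memcpy(&pInner, pVehicle + 288, sizeof pInner);

    // mov eax, [edx+204] : load the 32-bit value stored 204 bytes into that object
    std::uint32_t flags;
    std::memcpy(&flags, pInner + 204, sizeof flags);

    // and eax, 0F0000h, then the chain of sub/jz pairs, written as a switch
    switch (flags & 0x000F0000u) {
        case 0x00000u: return VEHICLE_SUBTYPE_CAR;
        case 0x10000u: return VEHICLE_SUBTYPE_BIKE;
        case 0x20000u: return VEHICLE_SUBTYPE_HELI;
        case 0x40000u: return VEHICLE_SUBTYPE_BOAT;
        case 0x80000u: return VEHICLE_SUBTYPE_PLANE;
        default:       return 0;   // no jz taken, falls through to "return 0"
    }
}
```

The switch values come from adding up the subtractions: 10000h, then 10000h + 10000h = 20000h, then + 20000h = 40000h, then + 40000h = 80000h.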
-
Structures used:
Code:
typedef struct _VECTOR {
float X,Y,Z;
} VECTOR, *PVECTOR;
typedef struct _MATRIX4X4 {
VECTOR vLookRight;
float pad_r;
VECTOR vLookUp;
float pad_u;
VECTOR vLookAt;
float pad_a;
VECTOR vPos;
float pad_p;
} MATRIX4X4, *PMATRIX4X4;
#define _pad(x,y) BYTE x[y]
#pragma pack(1)
typedef struct _ENTITY_TYPE {
DWORD func_table; // 0
MATRIX4X4 mat; // 4-68
_pad(__pad0,8); // 68-76
PDWORD pModel; // 76-80
BYTE nControlFlags; // 80-81
BYTE nControlFlags2; // 81-82
_pad(__pad1,10); // 82-92
WORD nModelIndex; // 92-94
_pad(__pad2,18); // 94-112
VECTOR vecMoveSpeed; // 112-124
VECTOR vecTurnSpeed; // 124-136
_pad(__pad3,146); // 136-282
BYTE byteSunkFlags; // 282-283
BYTE byteLockedFlags; // 283-284
} ENTITY_TYPE;
#pragma pack(1)
typedef struct _VEHICLE_TYPE {
ENTITY_TYPE entity; // 0-284
_pad(__pad0a,132); // 284-416
BYTE byteColor1; // 416-417
BYTE byteColor2; // 417-418
_pad(__pad1a,6); // 418-424
PED_TYPE * pDriver; // 424-428
PED_TYPE * pPassengers[7]; // 428-456 (probably 8)
_pad(__pad2a,4); // 456-460
BYTE bytePassengersCount; // 460-461
_pad(__pad2b,3); // 461-464
BYTE byteMaxPassengers; // 464-465
_pad(__pad3a,23); // 465-488
float fSteerAngle1; // 488-492
float fSteerAngle2; // 492-496
float fAcceleratorPedal; // 496-500
float fBrakePedal; // 500-504
_pad(__pad4a,12); // 504-516
float fHealth; // 516-520
_pad(__pad5a,40); // 520-560
DWORD dwDoorsLocked; // 560-564
_pad(__pad6a,4); // 564-568
PDWORD pdwDamageEntity; // 568-572
DWORD nRadio; // 572-576
BYTE byteHorn; // 576-577
DWORD dwUnk1; // 577-581
BYTE byteSiren; // 581-582
_pad(__pad7a,874); // 582-1456
float fSpecialWeaponRotation1; // 1456 (following 2 are rhino turret and firetruck spray)
float fSpecialWeaponRotation2; // 1460
/// ............
} VEHICLE_TYPE;
And:
Code:
VEHICLE_TYPE *m_pVehicle;
One strange question - does an executable file consist of assembler commands?
-
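AND does a bitwise AND on the two operands (EAX and 0x0F0000). The result is stored in EAX, and the zero flag is set if that result is zero, so it is a mask rather than an equality test.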
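To show that neither AND nor SUB is a comparison by itself, here is a literal, step-by-step C++ translation of the and/sub/jz chain. The function name and the integer return codes are made up for illustration; each "if (eax == 0)" plays the role of jz testing the zero flag after the previous instruction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical trace of the jump chain. Returns which label would be reached:
// 0 = ret_car, 1 = ret_bike, 2 = ret_heli, 3 = ret_boat, 4 = ret_plane,
// or -1 if no jz is taken and execution falls through to "return 0".
int followJumpChain(std::uint32_t eax)
{
    eax &= 0x000F0000u;        // and eax, 0F0000h : keep only bits 16-19
    if (eax == 0) return 0;    // jz ret_car       : those bits were all clear

    eax -= 0x10000u;           // sub eax, 10000h
    if (eax == 0) return 1;    // jz ret_bike      : masked value was 10000h

    eax -= 0x10000u;           // sub eax, 10000h
    if (eax == 0) return 2;    // jz ret_heli      : masked value was 20000h

    eax -= 0x20000u;           // sub eax, 20000h
    if (eax == 0) return 3;    // jz ret_boat      : masked value was 40000h

    eax -= 0x40000u;           // sub eax, 40000h
    if (eax == 0) return 4;    // jz ret_plane     : masked value was 80000h

    return -1;                 // nothing matched
}
```

For example, followJumpChain(0x12345678) masks down to 40000h and ends at ret_boat, while a value like 30000h never hits zero and falls through.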
An executable file consists of machine code (and some other stuff). It's what assembly source code is translated to. For the most part, 1 line of assembly = 1 machine instruction, so it's almost a straightforward translation. C/C++ ends up as machine code too, and some compilers will output assembly code if you ask them to.
Assembly is a lot lower level than C, but you can get a lot of power and speed out of it, as you have control over every instruction the processor executes. Of course, it takes more code, and it is generally considered harder to write.
The assembly makes no sense to me now. That first dereference seems to point to the middle of the structure. Does:
Code:
DWORD dwVehicle = (DWORD)m_pVehicle;
_asm mov ecx, dwVehicle
_asm mov edx, [ecx+288]
Not access m_pVehicle + 288 bytes? And is that not in the middle of the first pad? But, perhaps I'm reading this entirely wrong.
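One way to double-check that reading is to let the compiler do the arithmetic. The sketch below uses cut-down stand-ins for the posted structs (EntityLayout and VehicleLayout are made-up names, only the sizes are kept, and pModel is assumed to be a 32-bit pointer as in the original build).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct EntityLayout {                 // size-only stand-in for ENTITY_TYPE
    std::uint32_t func_table;         // 0-4
    float         mat[16];            // 4-68   (MATRIX4X4: 4 vectors + 4 pad floats)
    std::uint8_t  pad0[8];            // 68-76
    std::uint32_t pModel;             // 76-80  (assumed 32-bit pointer)
    std::uint8_t  nControlFlags;      // 80-81
    std::uint8_t  nControlFlags2;     // 81-82
    std::uint8_t  pad1[10];           // 82-92
    std::uint16_t nModelIndex;        // 92-94
    std::uint8_t  pad2[18];           // 94-112
    float         vecMoveSpeed[3];    // 112-124
    float         vecTurnSpeed[3];    // 124-136
    std::uint8_t  pad3[146];          // 136-282
    std::uint8_t  byteSunkFlags;      // 282-283
    std::uint8_t  byteLockedFlags;    // 283-284
};
struct VehicleLayout {                // first few members of VEHICLE_TYPE only
    EntityLayout  entity;             // 0-284
    std::uint8_t  pad0a[132];         // 284-416
    std::uint8_t  byteColor1;         // 416-417
};
#pragma pack(pop)

static_assert(sizeof(EntityLayout) == 284, "entity should end at byte 284");
static_assert(offsetof(VehicleLayout, pad0a) == 284, "first vehicle pad starts at 284");
static_assert(offsetof(VehicleLayout, byteColor1) == 416, "first vehicle pad ends at 416");

// True if a byte offset into the vehicle lands inside the first unnamed pad.
constexpr bool isInsideFirstPad(std::size_t offset)
{
    return offset >= offsetof(VehicleLayout, pad0a) &&
           offset <  offsetof(VehicleLayout, byteColor1);
}
```

With these sizes, isInsideFirstPad(288) is true: offset 288 sits 4 bytes into the first pad, so the assembly is reading a field that the struct simply doesn't name.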
-
I think I see what's happened.
It isn't a C++ program at all, it's a barely reverse engineered old ASM program with random bits converted to C++ (more likely just C in a C++ file).
All those _pad()'s are basically "I don't know what that bit does yet".
-
Vice City Multiplayer gave a name only to some memory addresses they needed to use in their code.
Can machine code be converted to asm? I mean someone already said that it's straightforward translation.
How is machine code built up of asm?
I want to learn as much as I can about these things :D
-