I know that asking here is a long shot but what the hell.
We are reverse engineering dwg for autocad 2004. Autodesk has changed the format of the dwg somewhat. The information that I've been able to gather from different sources is contradictory. Evan Yares of OpenDWG.org claims a simple Xor / Magic number encryption scheme on file headers and section headers. Autodesk says there is no encryption but that opendwg just hasn't figured out the format yet. That sounds good to me but I still can't figure out the headers. Here's what I do know.
-Using a utility called FileMon.exe I was able to "watch" the file access that autocad was doing. It reads the first 64 bytes, then seeks to 128 and reads 108 bytes. That much happens every time. Then it seeks to some variable position. From save to save, the only bytes that change are within the 108 bytes. This tells me that the variable position (offset) is stored in the 108 bytes somewhere.
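A rough Python sketch of that comparison, for anyone who wants to try it on two saves of the same drawing. The offsets (64 bytes at 0, 108 bytes at 128) are the ones FileMon showed; the function names are made up for illustration.

```python
def read_regions(data: bytes):
    """Split out the two regions AutoCAD was seen reading on open."""
    first = data[0:64]             # 64 bytes from offset 0
    second = data[128:128 + 108]   # 108 bytes from offset 128
    return first, second


def changed_offsets(a: bytes, b: bytes):
    """Return the positions (relative to the region start) where two regions differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    # Synthetic stand-ins for two saves; real use would read the two .dwg files.
    save1 = bytearray(256)
    save2 = bytearray(256)
    save2[130] = 0xFF   # pretend two bytes inside the 108-byte block changed
    save2[140] = 0x01
    _, block1 = read_regions(bytes(save1))
    _, block2 = read_regions(bytes(save2))
    print(changed_offsets(block1, block2))
```

Bytes that change between saves while the drawing content stays the same are the first candidates for the stored offset.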
Well rather than rambling on and on, I'll wait to see if one of you guys has even looked at this sort of thing. Consider it a hacking challenge like that website someone posted.
What happens if you save a blank file? That will simplify things a little when trying to decipher what things are in the file... hmmm... this sounds kinda interesting. Post whatever you know...
blank file is where I usually start these things actually. autocad saves out a bunch of things even with no entities drawn.
What I THINK I know is that you get a "master key" 32 bit value that you have to XOR with each "section key" to get a final section key. that section key has to be XOR'd with the 16 bytes that follow to get your section header. It's producing values that seem reasonable but not obvious as to what purpose they serve. I do have some idea that the third long of the section header is a section length. make any sense? Anyone that wants to take a stab can probably get Autocad 2004 on kazaa if they are feeling brave.
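Here is that scheme as a small Python sketch. Several things are guesses and not confirmed by anything above: that the values are little-endian, that the section key is stored as the 4 bytes directly in front of the 16 it protects, and that the XOR is applied to each 32-bit long separately. `decode_section_header` is a hypothetical name.

```python
import struct


def decode_section_header(raw: bytes, master_key: int):
    """Decode a 4-byte section key followed by 16 XOR'd bytes (assumed layout)."""
    if len(raw) != 20:
        raise ValueError("expected 4-byte key + 16 bytes")
    (section_key,) = struct.unpack_from("<I", raw, 0)
    final_key = (section_key ^ master_key) & 0xFFFFFFFF
    longs = struct.unpack_from("<4I", raw, 4)
    # XOR each 32-bit long with the final key
    return [value ^ final_key for value in longs]


if __name__ == "__main__":
    # Made-up numbers, only to show the round trip.
    master = 0x11223344
    stored = 0xA5A5A5A5
    final = master ^ stored
    plain = [1, 2, 0x200, 4]
    raw = struct.pack("<I", stored) + struct.pack("<4I", *(v ^ final for v in plain))
    print([hex(v) for v in decode_section_header(raw, master)])
```

If the third long really is the section length, it should show up as a plausible size in the decoded list; which index that is depends on whether the key itself counts as the first long.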
A run over with SoftIce or some other debugger should give you all the information you need. Tried that yet?
you know it's funny. I have softice and I've tried to use it but maybe I don't know enough about it or assembly. Basically if I knew how to break at the point where certain things are read in then great! I would have it. I can't figure out where that is though. any tips?
Well, you can break on API calls or window messages. (others, too, but these are usually the most useful)
Is there any API call(s) you can think of which are relevant to the parsing of these DWG files? GetFileTime? GetOpenFilename? CreateFile? MessageBox? Anything?
You just need to catch some API call that's close to the desired code. After breaking out of the call, you'll most likely land right where you want to be, or at least close by.
I'm not *that* great with this kind of stuff (trying to get better) so I don't know how much more help I'll be able to provide.
Sadly, I don't have a copy of SoftIce or Autocad, so I can't help you directly.
There goes Autodesk again, phasing out older software. They did the same thing when they jumped from 98 to 2000, bastages.
I didn't even know version 2004 was out yet.
they just released it apparently (so you wonder why it's not called 2003). Anyway, autocad changes the version at least a little every time. This time was major though. The encryption seems like it's very basic and a few companies have cracked it already.
Eibro, thanks for the advice. I was thinking of breaking at ReadFile but I don't know how to "break at API calls" yet. I can look that up though. I was worried about doing that though because ReadFile is used for lots of things (temp files, settings files, etc...) But that sounds like a good plan.
dramatic progress today. Softice turned out to be VERY helpful. 108 byte encryption key is cracked!!! WooHoo!!
still 32 bit keys to be handled. I understand those to be a bit more irritating as they aren't constant.
// this had better get me a raise :p
If you haven't looked it up already (which you probably have done) you can break on window messages using bmsg and break on API calls using bpx.
I have AutoCAD 2004 installed. I have no clue what you are
talking about. If I can be of help you can PM me-- if possible in
terms a child can understand. I'll be glad to help any way.
rick, I'm trying to read the new dwg format. First of all, there is no spec. It's a proprietary format so you have to figure it out on your own.
I usually get started by looking at the format in HexWorkshop (or any hex editor). Second, I save the file out with a slight change (line moved). HexWorkshop will give me a file comparison on the two files.
Next, I want to know what Autocad does, I open Filemon.exe and watch the disk access. I see that it reads 64 bytes from offset 0, then 108 bytes from offset 128, etc...
Then you can start getting some idea which parts of the file serve what purpose. Looking at the hex you can sometimes see numbers that you recognize from the lines you saved out. In the case of autocad there are 2 bit and 4 bit values for a couple of the more common numbers so it is a little bit harder to read. Plus there is bit shifting so that they can use 7 bit bytes etc.
Now the point. The new format encrypts the file header with a 108 byte key (simple XOR). That part I have. The section headers are 20 bytes in length and appear to be encrypted with their own 32 bit keys. The keys vary from section to section but are very similar to each other. Using softice I can see in assembly as autocad works. I do not have full understanding of assembly yet so it's difficult to find where these 32 bit keys are coming from. Here's what I've seen:
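For anyone following along, this is all a "simple XOR" with a fixed key amounts to, sketched in Python. The key bytes below are placeholders and not the real 108-byte key, and the function names are made up for illustration.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings. Applying it twice restores the input."""
    if len(data) != len(key):
        raise ValueError("data and key must be the same length")
    return bytes(d ^ k for d, k in zip(data, key))


def recover_key(encrypted: bytes, known_plain: bytes) -> bytes:
    """With a fixed XOR key, one known plaintext/ciphertext pair gives the key."""
    return xor_bytes(encrypted, known_plain)


if __name__ == "__main__":
    key = bytes(range(108))                 # placeholder, NOT the real key
    header = b"example header".ljust(108, b"\x00")
    encrypted = xor_bytes(header, key)
    assert xor_bytes(encrypted, key) == header
    assert recover_key(encrypted, header) == key
    print("round trip ok")
```

Because the same key is used on every file, the decrypted header can be recovered from any save once the key is known.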
- 1st and 2nd section headers read by autocad don't appear to be encrypted
- on the encrypted ones, the key comes from an operation on the "master key" as follows:
masterkey ^ (somevar1 + somevar2 + somevar3)
The somevars are variables whose origin I can't figure out yet. Two of them are usually 128 though. This is where assembly knowledge comes in. Softice helps a lot but it's not going to tell me the basics that I don't know.
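The key formula above, written out in Python with the 32-bit wraparound made explicit (the add happens in a 32-bit register, so I'm assuming it wraps). Since XOR is reversible, a known master key and an observed section key give back the sum, which narrows down the unknown variable when the other two are 128. All names and numbers here are illustrative.

```python
MASK32 = 0xFFFFFFFF


def derive_section_key(master_key: int, var1: int, var2: int, var3: int) -> int:
    """masterkey ^ (somevar1 + somevar2 + somevar3), truncated to 32 bits."""
    return (master_key ^ ((var1 + var2 + var3) & MASK32)) & MASK32


def recover_sum(master_key: int, section_key: int) -> int:
    """Undo the XOR to get back (somevar1 + somevar2 + somevar3) mod 2**32."""
    return (master_key ^ section_key) & MASK32


if __name__ == "__main__":
    master = 0x12345678          # made-up master key
    key = derive_section_key(master, 128, 128, 0x40)
    total = recover_sum(master, key)
    # If two of the variables are 128, the third is whatever is left over.
    third = (total - 256) & MASK32
    print(hex(key), hex(total), hex(third))
```

Running `recover_sum` over several section headers from one file would show how the sum moves from section to section, which may hint at what the third variable is.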
Anyway, today is another day of debugging for me.