Thread: Generating a Microsoft Word document using C

  1. #1
    Registered User
    Join Date
    Dec 2011
    Posts
    3

    Generating a Microsoft Word document using C

    Hi,

    I hope somebody can point me in the right direction with this. Is it possible to generate a Microsoft Word document using C? Is it very difficult? Or would I be better off using Visual Basic or something like that?

    Ideally I'd like to get it going in C, but I'm not sure where to start or what I would need.

    Some advice on this would be greatly appreciated.

    Thanks.

  2. #2
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Of course it's possible... but you would need to know the file format, including its embedded text formatting commands, before you could do much more than create plain text files.

    And FWIW... when you use Word to create MSWord files... you are using C.
    (Word itself was written largely in C and C++)

  3. #3
    Registered User
    Join Date
    Dec 2011
    Posts
    3
    Are the file formats easy to access, as in if I have MSWord installed can I just look them up in the program files or something? I'm guessing I'd need various headers and such in the C program too?
    Does anyone know of any tutorials or anything like that?

  4. #4
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Well, you can read the specs here -> Microsoft Office Binary (doc, xls, ppt) File Formats

    > Or would I be better off using Visual Basic or something like that?
    But if you have some choice of implementation language, then choose something other than C.
    In particular, look for languages which have high level bindings which give you access to "word" objects like paragraphs, pages and so forth.

    Picking these apart from a low level language will be hard work.

  5. #5
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    This is not a simple task... the .doc (etc) file formats are not published (that I'm aware of) and you're not going to just look them up. You will most likely have to create test files with known, exact content, and reverse engineer a description of the file format before you can begin coding any kind of file creation task.

    Take a step back... What EXACTLY are you trying to do?


    EDIT... I see Salem has provided info.... excellent.

  6. #6
    Fordy
    Join Date
    Aug 2001
    Posts
    5,793
    There won't be much info out there. The format of the pre-2007 versions is proprietary and MS don't want you to mess with it unless it's through one of their APIs. The COM APIs can be used, but doing so in C is an uphill struggle. C++ is easier, and if you are using MSVC++, it's much easier. If you want to avoid that route, you can always try to look into the raw format, but it's not easy (the wv project on the web has reverse-engineered the format and publishes a library to read these files, but not write them).

    With the 2007 XML-based files (.docx), life is a little easier. Open one of these files in a decompression program (7zip works well) and you will see it's a zipped archive. The body of the Word content is in "word/document.xml", but to edit this file directly you'd need three things: a zip library to access the content, an XML library to read and edit it, and knowledge of the XML format for Word 2007. At least that format is published by MS.
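
    To give a feel for what "word/document.xml" contains, here is a rough C sketch that writes a minimal body with one paragraph per input string. The function names (write_xml_escaped, write_document_xml) are made up for illustration, and the input is assumed to be UTF-8 text. It only produces the XML part: a real .docx also needs the other package parts (such as "[Content_Types].xml" and "_rels/.rels") and everything zipped together, which this sketch does not attempt.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write text to out, escaping the characters that
 * XML reserves in element content (&, <, >). Returns 0 or -1. */
static int write_xml_escaped(FILE *out, const char *text)
{
    while (*text != '\0') {
        /* Copy the longest run that needs no escaping in one call. */
        size_t plain = strcspn(text, "&<>");
        if (plain > 0 && fwrite(text, 1, plain, out) != plain)
            return -1;
        text += plain;
        if (*text == '\0')
            break;

        const char *entity = (*text == '&') ? "&amp;"
                           : (*text == '<') ? "&lt;"
                                            : "&gt;";
        if (fputs(entity, out) == EOF)
            return -1;
        ++text;
    }
    return 0;
}

/* Hypothetical helper: emit a minimal word/document.xml with one <w:p>
 * per input string. Assumes the strings are UTF-8.
 * Returns 0 on success, -1 on a write error. */
int write_document_xml(FILE *out, const char *const *paragraphs, size_t count)
{
    if (fputs("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
              "<w:document xmlns:w=\"http://schemas.openxmlformats.org/"
              "wordprocessingml/2006/main\">\n"
              "<w:body>\n", out) == EOF)
        return -1;

    for (size_t i = 0; i < count; ++i) {
        /* xml:space="preserve" keeps leading and trailing spaces in the run. */
        if (fputs("<w:p><w:r><w:t xml:space=\"preserve\">", out) == EOF ||
            write_xml_escaped(out, paragraphs[i]) != 0 ||
            fputs("</w:t></w:r></w:p>\n", out) == EOF)
            return -1;
    }

    return fputs("</w:body>\n</w:document>\n", out) == EOF ? -1 : 0;
}
```

    Calling write_document_xml(stdout, paras, n) with an array of strings prints the XML, which you can compare against the "word/document.xml" you see when you open a real file in 7zip.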

    If it were me and I had to produce a Word file from my code, I'd either go through MSVC++ and use COM, or use VB or C#.

  7. #7
    Registered User
    Join Date
    Dec 2011
    Posts
    3
    Ok thanks for the info. I do have a choice of languages so it's not a problem choosing a different one. It just would have been nice to use C since it's what I'm most familiar with, but I can see it would be quite a difficult task, and I'm no expert.

  8. #8
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Probably best for your C program to just output plain text. Then spawn Word with whatever command-line flags it needs to open, read, and export using its native format. I think Word can be run like that.
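
    The plain-text half of that idea is only a few lines of C. This is a rough sketch; write_plain_text is a made-up name, and launching Word afterwards is left out because the exact command line isn't given here.

```c
#include <stdio.h>

/* Hypothetical helper: write each string as one line of a plain-text file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
int write_plain_text(const char *path, const char *const *lines, size_t count)
{
    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -1;

    int status = 0;
    for (size_t i = 0; i < count && status == 0; ++i) {
        if (fputs(lines[i], out) == EOF || fputc('\n', out) == EOF)
            status = -1;
    }

    /* fclose can also report a failed flush, so check it too. */
    if (fclose(out) == EOF)
        status = -1;
    return status;
}
```

    The resulting file can then be opened in Word by hand, or by whatever launch command turns out to work on your system.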
