Thread: Operating Files using c...???

  1. #1
    Registered User
    Join Date
    Jan 2012
    Location
    Chennai, Tamil nadu, India
    Posts
    30

    Operating Files using c...???

    Hello friends, is it possible to read a Microsoft Office document file (.docx format) in C?
    I know it's a complicated task. The file to be read is well formatted: it contains tables and text in different font sizes, but no images. How can I solve this? I just want to read the document and display it on the console. Thanks in advance!

  2. #2
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Rehman khan
    Is it possible to read a Microsoft Office document file (.docx format) in C?
    Yes.

    Quote Originally Posted by Rehman khan
    I know it's a complicated task. The file to be read is well formatted: it contains tables and text in different font sizes, but no images. How can I solve this? I just want to read the document and display it on the console. Thanks in advance!
    You could investigate the Office Open XML format (not to be confused with the, um, more open OpenDocument format). A caveat: from what I have heard, Microsoft does not quite implement the format according to the specification that it itself pushed for standardisation, but hopefully that will not be a problem. Alternatively, if you can get the tables exported in, say, CSV format, then your life will be much easier.
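    To give an idea of why the CSV route is easier, here is a minimal sketch of splitting one CSV line into fields in C. The function name split_csv_line is made up for illustration, and it assumes the simplest possible CSV: no quoted fields and no commas inside a field. A real export from Word or Excel may well contain quoted fields, so treat this as a starting point only.

```c
#include <stddef.h>
#include <string.h>

/* Split one CSV line in place into at most max_fields fields.
 * Hypothetical helper: assumes NO quoted fields and NO embedded commas.
 * Each comma is overwritten with '\0' and fields[i] points into line.
 * Fields beyond max_fields are silently dropped.
 * Returns the number of fields stored. */
size_t split_csv_line(char *line, char *fields[], size_t max_fields)
{
    size_t count = 0;
    char *p = line;

    /* Cut the line at the first CR or LF, if any. */
    line[strcspn(line, "\r\n")] = '\0';

    while (count < max_fields) {
        char *comma;

        fields[count++] = p;
        comma = strchr(p, ',');
        if (comma == NULL)
            break;              /* last field on the line */
        *comma = '\0';          /* terminate this field */
        p = comma + 1;          /* next field starts after the comma */
    }
    return count;
}
```

    You would typically read each line with fgets() into a char buffer, pass it to this function, and then print or align the fields for the console as you see fit.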

  3. #3
    Registered User ledow
    Join Date
    Dec 2011
    Posts
    435
    Possible: Yes. Easy: No. Worthwhile: Almost certainly not.

    Have a look at the code complexity for any major file format. You're literally talking about writing an OpenOffice/LibreOffice import filter or thereabouts to make it display. Can you pull SOME plain-text information out of an office file? Yeah, quite easily. A .docx file is just a ZIP archive of XML files, so it can be unpacked with a ZIP library (zlib on its own only handles the compression, not the archive container; something like minizip or libzip handles the container), but *interpreting* the contents is another matter entirely. You have to parse one of the most hideously documented and inconsistent XML-based standards known to man in order to work out what text is displayed where and in what format. And that's *before* you even touch on things like images, tables, etc.
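    As a rough illustration of the "plain text is easy" part, here is a minimal sketch in C. It assumes you have already unpacked the main document part, word/document.xml, from the archive (for example with `unzip -p file.docx word/document.xml`) and loaded it into a string. The function name docx_xml_to_text is made up for this example. It relies on the fact that visible text in WordprocessingML sits in <w:t> elements and that </w:p> closes a paragraph. It does not decode XML entities (so &amp; stays as-is), ignores fonts and sizes completely, and flattens tables into one line per cell paragraph, which is exactly the "interpreting" work that remains hard.

```c
#include <stddef.h>
#include <string.h>

/* Copy the visible text of a WordprocessingML string into out.
 * Hypothetical helper, a sketch only:
 *   - text is taken from <w:t> / <w:t attr="..."> elements
 *   - each </w:p> (end of paragraph) becomes a newline
 *   - no entity decoding, no formatting, no table layout
 * Output is truncated to fit out_size and always NUL-terminated.
 * Returns the number of bytes written, excluding the terminator. */
size_t docx_xml_to_text(const char *xml, char *out, size_t out_size)
{
    size_t n = 0;
    const char *p = xml;

    if (out_size == 0)
        return 0;

    while ((p = strchr(p, '<')) != NULL) {
        if (strncmp(p, "</w:p>", 6) == 0) {
            if (n + 1 < out_size)
                out[n++] = '\n';
            p += 6;
        } else if (strncmp(p, "<w:t", 4) == 0 &&
                   (p[4] == '>' || p[4] == ' ')) {
            /* The check on p[4] rejects <w:tbl>, <w:tr>, <w:tc>, <w:tab/>. */
            const char *start = strchr(p, '>');
            const char *end;

            if (start == NULL)
                break;
            start++;
            end = strstr(start, "</w:t>");
            if (end == NULL)
                break;          /* malformed input: stop quietly */
            while (start < end && n + 1 < out_size)
                out[n++] = *start++;
            p = end + 6;
        } else {
            p++;                /* some other tag: skip it */
        }
    }
    out[n] = '\0';
    return n;
}
```

    Printing the resulting buffer with fputs() gives you the raw text, one paragraph per line. Anything beyond that (column alignment for tables, headings, font sizes) means reading the surrounding elements as well, and that is where the effort explodes.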

    For older versions of Office, the problem was even worse. Projects like antiword and wvWare are incredibly complex just to do simple extractions of data from those files.

    If you're set on doing this, you're going to need to learn ZIP archive handling (zlib plus a library that understands the container), XML parsing, and the documentation of the docx format. Good luck doing that. To my knowledge, outside of closed-source offerings from Microsoft itself, there isn't anything short of a full office suite that's capable of displaying an MS Office document anywhere near reliably (and even there, it's cited as the worst-operating part of suites like OpenOffice/LibreOffice because there are just so many things that aren't documented or don't work how they should).

    If you want to display a Word document, use Word. You could probably do some old-style embedding tricks, like the way that browsers "embed" PDF and Java plugin content into their windows (used to be called OLE in my day, but apparently that's old hat now), but that would need Word on the PC and is no different to just opening up the document in Word, really. Otherwise, you have an awfully long, rocky road in front of you, one that virtually nobody has seriously attempted in the last 20 years except large organisations with codebases of millions of lines developed over decades. Even the OpenOffice/LibreOffice import filters came from StarOffice originally (a commercial product from Star Division, which was bought by Sun, the company also responsible for Java, which in turn was bought by Oracle).

