| | #1 |
| Registered User Join Date: Sep 2006
Posts: 98
| Reading Microsoft Word documents |
| Mavix is offline | |
| | #2 |
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| The file format produced by Word (a .doc file) is a binary format that contains all sorts of "extra data" beyond the basic text that is "the real document". I would guess there are libraries available to read it, but I'm not sure. The easiest solution is perhaps to save the document as text or in a "mostly text" format (RTF, for example). -- Mats |
| matsp is offline | |
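To make the "binary format" point concrete: .doc files written by Word 97 through 2003 (which covers the Office 2000 mentioned in this thread) are OLE2 compound files, and those start with a fixed 8-byte signature. The sketch below only checks for that signature; it does not parse the document. The function names `has_ole_signature` and `file_is_ole` are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Signature at the start of an OLE2 compound file, the container
   format used by Word 97-2003 .doc files. */
static const unsigned char OLE_MAGIC[8] =
    { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

/* Hypothetical helper: returns 1 if buf (len bytes) starts with the
   OLE2 signature, 0 otherwise. */
int has_ole_signature(const unsigned char *buf, size_t len)
{
    return len >= sizeof OLE_MAGIC &&
           memcmp(buf, OLE_MAGIC, sizeof OLE_MAGIC) == 0;
}

/* Hypothetical helper: returns 1 if the file looks like an OLE2
   container, 0 if it does not, -1 if it cannot be opened. */
int file_is_ole(const char *path)
{
    unsigned char head[8];
    size_t n;
    FILE *f = fopen(path, "rb");   /* binary mode: no newline translation */

    if (f == NULL)
        return -1;
    n = fread(head, 1, sizeof head, f);
    fclose(f);
    return has_ole_signature(head, n);
}
```

A call such as `file_is_ole("something.doc")` returning 0 would suggest the file is something else with a .doc extension (plain text or RTF, for instance), in which case it can probably be read directly.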
| | #4 |
| Registered User Join Date: Sep 2006
Posts: 98
| I have Word installed, but it's quite old (Office 2000). Isn't there some other way to read Word documents? |
| Mavix is offline | |
| | #5 |
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| Depends on what you want to do - what are you trying to achieve? If you can describe what your end goal is, then we can almost certainly describe some way of getting there (or towards that goal). But "just reading a Word document" isn't trivial, because the information in the file is stored in quite a complex manner. For example, if you enable "visible changes", both the previous text and the new text for multiple generations of the document may be kept in the file. -- Mats |
| matsp is offline | |
| | #6 |
| Anti-Poster Join Date: Feb 2002
Posts: 1,241
| Another alternative is to use the COM IFilter interface provided by Microsoft Desktop Search. You'll lose all the font information, but you will at least be able to get the words out of the document. It may be possible to use OpenOffice to programmatically open Word docs. In any case, it's not a simple endeavor.
__________________ Rule #1: Every rule has exceptions |
| pianorain is offline | |
| | #7 | |
| Registered User Join Date: Sep 2006
Posts: 98
| Quote:
| |
| Mavix is offline | |
| | #8 |
| Kernel hacker Join Date: Jul 2007 Location: Farncombe, Surrey, England
Posts: 15,686
| Well, if you want to do that, you could try something like this: Code: FILE *fin = fopen("something.doc", "rb");
FILE *fout = fopen("someelse.txt", "w");
int c;  /* int rather than char, so EOF is distinguishable */
while ((c = fgetc(fin)) != EOF) {
    /* keeps every 7-bit byte, control characters included */
    if (isascii(c)) fputc(c, fout);
}
fclose(fin);
fclose(fout);
Alternatively, try the "Word Viewer": Microsoft Word Viewer download site It allows you to copy text out of a Word document, it's free, and you don't have to write a single line of code (and it's probably going to do a better job of sorting out what's what in your document too). -- Mats |
| matsp is offline | |
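A slightly stricter variant of the byte filter above, as a rough sketch rather than a real .doc reader: it keeps only runs of printable characters, much like the Unix `strings` tool, so stray control bytes and short bits of binary noise are dropped. The function name `extract_text_runs` and the `MIN_RUN` threshold of 4 are assumptions made for this example. Text that Word stored as 16-bit characters will be missed, because the printable bytes are then interleaved with zero bytes.

```c
#include <ctype.h>
#include <stdio.h>

#define MIN_RUN 4  /* assumed threshold: shorter runs are treated as noise */

/* Copies every run of at least MIN_RUN printable characters from in
   to out, one run per line. Returns the number of runs written. */
long extract_text_runs(FILE *in, FILE *out)
{
    char pending[MIN_RUN];  /* holds a run until it is long enough to keep */
    size_t len = 0;         /* current run length, capped at MIN_RUN */
    long runs = 0;
    int c;                  /* int, so EOF is distinguishable from data */

    while ((c = fgetc(in)) != EOF) {
        if (isprint(c) || c == '\t') {
            if (len < MIN_RUN) {
                pending[len++] = (char)c;
                if (len == MIN_RUN)          /* run just became long enough */
                    fwrite(pending, 1, MIN_RUN, out);
            } else {
                fputc(c, out);               /* already emitting this run */
            }
        } else {
            if (len == MIN_RUN) {            /* close a run that was emitted */
                fputc('\n', out);
                runs++;
            }
            len = 0;
        }
    }
    if (len == MIN_RUN) {                    /* run that ends at end of file */
        fputc('\n', out);
        runs++;
    }
    return runs;
}
```

To use it, open the input with `fopen("something.doc", "rb")` and the output with `fopen("someelse.txt", "w")`, check both for `NULL`, call `extract_text_runs(fin, fout)`, and close both files. The output will still contain some junk (style names, field codes and the like), so the Word Viewer route is likely to give cleaner results.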
| | #9 |
| Registered User Join Date: Sep 2006
Posts: 98
| Thanks for the help! |
| Mavix is offline | |