Thread: Reading PDF with C++ Console appl

  1. #1
    Registered User
    Join Date
    Mar 2005
    Posts
    4

    Reading PDF with C++ Console appl

    Hi,

    I would like to write a little program that converts a PDF (text-only contents) to a TXT file. I searched the web (mostly with Google) but didn't find any example code for such a program, only pre-made programs. I don't need to modify the PDF contents, only convert the PDF to TXT.

    Can somebody help me out?

    Thanks, Codorke

  2. #2
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    If you go to wotsit.org you can get the file format for PDF. It shouldn't be too difficult if you just want to extract the text, but keep in mind that the text may not all be stored in logical reading order because of formatting.
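
    To illustrate the kind of scanning involved, here is a rough sketch, not a real PDF parser. It assumes the page content stream is already available uncompressed as a string (in practice streams are usually Flate-compressed and must be inflated first), and it only looks at literal strings "(...)" that are directly followed by the Tj show-text operator. The function name extract_tj_strings is made up for this example. Strings come out in file order, which is not necessarily reading order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect literal strings "(...)" that are directly
// followed by the Tj operator in an UNCOMPRESSED content stream.
// Simplifications: TJ arrays, hex strings <...>, font encodings and
// octal or \n-style escapes are not handled.
std::vector<std::string> extract_tj_strings(const std::string& content) {
    std::vector<std::string> result;
    std::size_t i = 0;
    while (i < content.size()) {
        if (content[i] != '(') { ++i; continue; }
        ++i;                       // step past the opening parenthesis
        std::string text;
        int depth = 1;             // PDF allows balanced nested parentheses
        while (i < content.size() && depth > 0) {
            char c = content[i++];
            if (c == '\\' && i < content.size()) {
                text += content[i++];          // keep the escaped character as-is
            } else if (c == '(') {
                ++depth;
                text += c;
            } else if (c == ')') {
                if (--depth > 0) text += c;    // the final ')' closes the string
            } else {
                text += c;
            }
        }
        // Skip whitespace between the string and the operator that follows it.
        std::size_t j = i;
        while (j < content.size() &&
               (content[j] == ' ' || content[j] == '\t' ||
                content[j] == '\r' || content[j] == '\n')) {
            ++j;
        }
        if (depth == 0 && content.compare(j, 2, "Tj") == 0) {
            result.push_back(text);
        }
    }
    return result;
}
```

    Many real PDFs will yield nothing with this approach, because their streams are compressed or their text is written with other operators.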

  3. #3
    Registered User
    Join Date
    Mar 2005
    Posts
    4
    Thanks for replying so quickly!

    I have never written a converter before or worked much with files, so I still have some questions (after visiting wotsit.org).

    To open a PDF file, can I still use fstream and read the contents with the '>>' operator?
    If so, after opening the file and putting the contents into a variable, I need to search for the actual text according to the PDF format I found on wotsit.org.

    Or am I wrong?

    Codorke

  4. #4
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Correct. Just use an fstream to read the data into a char array in your program. Then you decode it and read off the information you want.
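
    A minimal sketch of the reading step, under a few assumptions: the file is opened in binary mode and read as raw bytes (formatted extraction with '>>' skips whitespace by default, which would corrupt the data), and the bytes go into a std::vector<char> rather than a fixed-size array. The names read_file_bytes and looks_like_pdf are invented for this example.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file into memory as raw bytes.
std::vector<char> read_file_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);   // binary: no newline translation
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    // istreambuf_iterator reads unformatted bytes, including whitespace.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Hypothetical helper: a PDF file starts with the header "%PDF-".
bool looks_like_pdf(const std::vector<char>& data) {
    static const std::string magic = "%PDF-";
    return data.size() >= magic.size() &&
           std::equal(magic.begin(), magic.end(), data.begin());
}
```

    Calling read_file_bytes("input.pdf") and then looks_like_pdf on the result is a cheap sanity check before attempting any decoding; the file name here is only a placeholder.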

  5. #5
    Registered User
    Join Date
    Mar 2005
    Posts
    4
    Thanks for the help!

  6. #6
    Registered User
    Join Date
    Mar 2005
    Posts
    4
    Mmm, it's quite difficult to really understand the PDF file format. Is there any other (easier) way to read in the contents of a PDF (text only) with a C++ program? Or has anyone done this before?

    Can somebody help me out, please? It's for my final year project ...

  7. #7
    Sang-drax
    Join Date
    May 2002
    Location
    Göteborg, Sweden
    Posts
    2,072
    There are open-source PDF programs. Download them and look at the source code.
    The PDF format isn't easy...
    Last edited by Sang-drax : Tomorrow at 02:21 AM. Reason: Time travelling
