Hello, I am new to C++ and am having trouble implementing my book index program. Here are the requirements:
There are two components to converting a text file into the desired "paged" format: generating HTML versions of the pages and preparing an index of the pages.
Splitting the Book into Pages
Input to the system will be a book in ASCII .txt format, such as this one. The name of the file containing this book will be supplied as a command line parameter (the only one required by this program.)
The first step in preparing this for the web will be to split this text into web pages, each page but the last containing MAX_LINES_PER_PAGE lines.
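One way to picture the page-splitting step is the sketch below. It assumes the book's lines have already been read into a vector; `splitIntoPages` and its `maxLinesPerPage` parameter are hypothetical names standing in for however the program ends up using MAX_LINES_PER_PAGE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the assignment): group the book's lines
// into pages of at most maxLinesPerPage lines. Only the last page may be
// shorter than the others.
std::vector<std::vector<std::string>> splitIntoPages(
    const std::vector<std::string>& lines, std::size_t maxLinesPerPage)
{
    assert(maxLinesPerPage > 0);
    std::vector<std::vector<std::string>> pages;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (i % maxLinesPerPage == 0) {
            pages.emplace_back();  // start a new page every maxLinesPerPage lines
        }
        pages.back().push_back(lines[i]);
    }
    return pages;
}
```

With 160 lines and a limit of 75, this yields three pages holding 75, 75, and 10 lines.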
The generated web pages will be written to files named "pageNNNN.html", where NNNN is a 4 digit number starting at 0001, then 0002, and so on.
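The zero-padded file names could be built with the standard stream manipulators, as in this small sketch. The function name `pageFileName` is an assumption made for illustration, not something the assignment prescribes.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: build "pageNNNN.html" for a 1-based page number.
std::string pageFileName(int pageNumber)
{
    assert(pageNumber >= 1);  // pages are numbered starting at 0001
    std::ostringstream name;
    // setw applies only to the next item written, i.e. the page number.
    name << "page" << std::setw(4) << std::setfill('0') << pageNumber << ".html";
    return name.str();
}
```

For example, `pageFileName(24)` returns `"page0024.html"`.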
Each generated page will consist of an HTML "wrapper" around the selected lines of text. The wrapper includes the book title (extracted from the Gutenberg text file) and links to the previous page, the next page, and the index page. For example, page 0024 of a book would look like:
<html>
<head>
<title>BookTitle</title>
</head>
<body>
<p>
<a href="page0001.html">First</a>,
<a href="page0023.html">Prev</a>,
<a href="page0025.html">Next</a>,
<a href="indexPage.html">Index</a>
</p>
<hr/>
Lines of text from the book appear here, exactly as they appear in the text file.
</body>
</html>
The first page will not have the "Prev" link. The final page will not have the "Next" link. The book title can be extracted from the earliest line in the text file that begins with "Title:".
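The title lookup and the link row might be sketched as follows. `extractTitle` and `navigationLinks` are hypothetical names, skipping the blanks after "Title:" is an assumption about the Gutenberg header layout, and the block carries its own small file-name helper so that it compiles by itself.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: return the text after "Title:" on the earliest line
// that begins with it, or an empty string if no such line exists.
std::string extractTitle(std::istream& book)
{
    const std::string marker = "Title:";
    std::string line;
    while (std::getline(book, line)) {
        if (line.compare(0, marker.size(), marker) == 0) {
            // Assumption: blanks between "Title:" and the title are skipped.
            const std::string::size_type start =
                line.find_first_not_of(" \t", marker.size());
            return start == std::string::npos ? std::string() : line.substr(start);
        }
    }
    return std::string();
}

// Hypothetical helper: "pageNNNN.html" for a 1-based page number.
std::string pageFileName(int pageNumber)
{
    std::ostringstream name;
    name << "page" << std::setw(4) << std::setfill('0') << pageNumber << ".html";
    return name.str();
}

// Hypothetical helper: the row of links at the top of a page.
// "Prev" is omitted on the first page and "Next" on the last one.
std::string navigationLinks(int page, int lastPage)
{
    assert(page >= 1 && page <= lastPage);
    std::ostringstream nav;
    nav << "<a href=\"" << pageFileName(1) << "\">First</a>, ";
    if (page > 1) {
        nav << "<a href=\"" << pageFileName(page - 1) << "\">Prev</a>, ";
    }
    if (page < lastPage) {
        nav << "<a href=\"" << pageFileName(page + 1) << "\">Next</a>, ";
    }
    nav << "<a href=\"indexPage.html\">Index</a>";
    return nav.str();
}
```

Calling `navigationLinks(24, 30)` produces the same four links as the page 0024 example, written on a single line.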
This is the first stage in a semester project that will challenge you to design and develop a larger and more complicated program than you have been accustomed to in the past.
Generating an Index
The final page generated by the program will be stored in "indexPage.html". This page will look like:
<html>
<head>
<title>BookTitle</title>
</head>
<body>
<p>
<a href="page0001.html">First</a>
</p>
<hr/>
<p>
<a href="#A">A</a>
<a href="#B">B</a>
<a href="#C">C</a>
...
<a href="#Z">Z</a>
</p>
<hr/>
<h1>Index</h1>
<h2 id="A">A</h2>
<ul>
<li>angle
<a href="page0001.html">1</a>
<a href="page0003.html">3</a>
<a href="page0023.html">23</a>
</li>
<li>arcs
<a href="page0025.html">25</a>
<a href="page0026.html">26</a>
</li>
</ul>
<h2 id="B">B</h2>
<ul>
<li>bars
...
</body>
</html>
The main portion of the page has a section for each letter from A..Z. Each section has an <h2> header and a <ul>...</ul> list. Inside that list will be one <li>...</li> entry for each index term beginning with the corresponding letter. Each such entry will contain the index term followed by a list of <a>...</a> links to pages where that term occurs.
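As a rough sketch of how the lettered sections could be produced, the function below walks a map from term to page numbers. The name `renderIndexSections` and the choice of `std::map<std::string, std::set<int>>` are assumptions for illustration; they rely on the terms already being lower-case and non-empty, and on the map keeping them in alphabetical order.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: emit the A..Z sections of indexPage.html.
// `index` maps each lower-case term to the set of pages it occurs on.
std::string renderIndexSections(const std::map<std::string, std::set<int>>& index)
{
    std::ostringstream out;
    for (char letter = 'A'; letter <= 'Z'; ++letter) {
        out << "<h2 id=\"" << letter << "\">" << letter << "</h2>\n<ul>\n";
        const char lower = static_cast<char>(letter - 'A' + 'a');
        // The map is sorted, so all terms for this letter are contiguous,
        // starting at the first key that is not less than the letter itself.
        for (auto it = index.lower_bound(std::string(1, lower));
             it != index.end() && it->first[0] == lower; ++it) {
            out << "<li>" << it->first;
            for (int page : it->second) {
                assert(page >= 1);  // pages are numbered from 0001
                char href[32];
                std::snprintf(href, sizeof href, "page%04d.html", page);
                out << " <a href=\"" << href << "\">" << page << "</a>";
            }
            out << " </li>\n";
        }
        out << "</ul>\n";
    }
    return out.str();
}
```

A section is written for every letter, even when no term starts with it, which is one reading of "a section for each letter from A..Z".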
- The index terms will be listed in alphabetical order.
- An index term is a word occurring in the book. It consists of consecutive alphabetic characters and must either occur at the beginning of a line or must be preceded by a blank.
- Words of 3 letters or fewer will not be used as index terms.
- All index terms will be converted to lower case before being inserted into the index. Words in the text that differ only in the upper/lower case of their letters will be considered to be instances of the same index term.
- For an index term to be useful, it must direct one to a limited portion of the book. Consequently, any word that occurs on more than PAGE_THRESHOLD percent of the total pages will not be treated as an index term.
- The constants MAX_LINES_PER_PAGE and PAGE_THRESHOLD will be declared in a header file indexConstants.h.
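The term rules above can be sketched roughly as below. `extractTerms` and `dropCommonTerms` are hypothetical names. The sketch also assumes that a term ends at the first non-alphabetic character (so "ANGLE," yields "angle"), and that a word preceded by anything other than a blank, such as "(arcs)", is skipped; the assignment text does not spell out either case.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: collect the lower-cased index terms found on one line.
std::vector<std::string> extractTerms(const std::string& line)
{
    std::vector<std::string> terms;
    std::string::size_type i = 0;
    while (i < line.size()) {
        const bool atWordStart = (i == 0 || line[i - 1] == ' ');
        if (atWordStart && std::isalpha(static_cast<unsigned char>(line[i]))) {
            std::string word;
            // Consume the run of consecutive alphabetic characters.
            while (i < line.size() && std::isalpha(static_cast<unsigned char>(line[i]))) {
                word += static_cast<char>(std::tolower(static_cast<unsigned char>(line[i])));
                ++i;
            }
            if (word.size() > 3) {  // words of 3 letters or fewer are ignored
                terms.push_back(word);
            }
        } else {
            ++i;
        }
    }
    return terms;
}

// Hypothetical helper: remove terms that occur on more than pageThreshold
// percent of the pages. `index` maps each term to the pages it occurs on.
void dropCommonTerms(std::map<std::string, std::set<int>>& index,
                     int totalPages, int pageThreshold)
{
    assert(totalPages > 0);
    for (auto it = index.begin(); it != index.end();) {
        // pagesWithTerm / totalPages > pageThreshold / 100, without division.
        const long long lhs = static_cast<long long>(it->second.size()) * 100;
        const long long rhs = static_cast<long long>(pageThreshold) * totalPages;
        if (lhs > rhs) {
            it = index.erase(it);
        } else {
            ++it;
        }
    }
}
```

Storing the pages in a `std::set<int>` means a word that appears several times on one page is still counted once for that page.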
Here is my driver so far:
#include <fstream>
#include <iostream>
using namespace std;
// This program runs with exactly one command line parameter:
// 1) The name of the book file to be converted into web pages
int main (int argc, char** argv)
{
  if (argc != 2) {
    cerr << "Usage: " << argv[0] << " textFileName" << endl;
    return 1;
  }
  ifstream bookIn (argv[1]);  // open the book file named on the command line
  return 0;
}
Here is indexConstants.cpp:
// Special constants controlling the indexing program
extern const int PAGE_THRESHOLD = 25;
extern const int MAX_LINES_PER_PAGE = 75;