I have a worksheet saved in XML Spreadsheet 2003 file format.

I’m supposed to read a specific value from it, given it’s cell address, such as „A5“ or „B3“. As I have almost-absolutely no previous knowlege of XML, I though to ask you advice how to parse through this thing (most easily) without using a thirdparty XML parser or building one myself (which I couldn’t do anyway), before I start cracking.
Looking through the XML file (shown at the end of the post), I see that I can easily ignore most of it. The first thing that I care about seems to be the <Worksheet ss:Name=“Sheet1“> tag. The next object of interest seems to be:


Code:
<Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="2" x:FullColumns="1"
   x:FullRows="1" ss:DefaultRowHeight="15">
Which tells me how large the spreadsheet is (ExpanedColumnCount and ExpandedRowCount), which in the current example is 2, 2 (which is B2 in Excel’s language).
From there on comes the listing of rows and cells, like this:
Code:
   <Row ss:AutoFitHeight="0"/>
   <Row ss:AutoFitHeight="0">
    <Cell ss:Index="2"><Data ss:Type="String">THISCELL!</Data></Cell>
   </Row>
With „B2“ given as the address, I see that there’s no data on the first row. In the second row, however, is one cell value. What’s important with CELL tag’s is that it has an Index argument. Given in the example above, the B2 cell value („THISCELL!“) has no cells before it. So reading the index is crucial to count which cell value to read (or write).
So, knowing all this, I thought of to solve this problem like this:
1) Convert given Excel address to normal value ((ie „B2“ -> „2, 2“)
2) Search for the starting tag of the worksheet.
3) Count how many <Row> tags I have.
4) Loop through the rows until given y coordinate (from address). Also stop if reach the end of the worksheet.
5) Check if there are cells in the row, then read how many cells there are and what’s the first cell’s index is (if present).
6) Loop through all cells until given x coordinate (subtracting first cell’s index value from it first).
7) Read the value.
What do you think ? Is it the optimal way to do it (quick and simple, using C’s strtok_r or something of C++’s equivalent) ?
The main question ofcourse is how to parse the tags (to read their arguments and stuff). I’m worried that the same tag can be given in two ways:

<Row/> or.. <Row>data here</Row>

Any advice how to proceed with that ? An example or article or just some pointers how to start parsing those tags (reading their arguments and open and closing braces and so on and on and on..).

Best wishes, D5

P.S. The whole XML file (with one value at B2 location) used also in the example above:

Code:
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Andrew</Author>
  <LastAuthor>Andrew</LastAuthor>
  <Created>2008-05-22T18:24:54Z</Created>
  <Version>12.00</Version>
 </DocumentProperties>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>11985</WindowHeight>
  <WindowWidth>24855</WindowWidth>
  <WindowTopX>240</WindowTopX>
  <WindowTopY>120</WindowTopY>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font ss:FontName="Calibri" x:CharSet="186" x:Family="Swiss" ss:Size="11"
    ss:Color="#000000"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
 </Styles>
 <Worksheet ss:Name="Sheet1">
  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="2" x:FullColumns="1"
   x:FullRows="1" ss:DefaultRowHeight="15">
   <Row ss:AutoFitHeight="0"/>
   <Row ss:AutoFitHeight="0">
    <Cell ss:Index="2"><Data ss:Type="String">THISCELL!</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Unsynced/>
   <Selected/>
   <Panes>
    <Pane>
     <Number>3</Number>
     <ActiveRow>1</ActiveRow>
     <ActiveCol>1</ActiveCol>
    </Pane>
   </Panes>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
 <Worksheet ss:Name="Sheet2">
  <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1"
   x:FullRows="1" ss:DefaultRowHeight="15">
   <Row ss:AutoFitHeight="0"/>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Unsynced/>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
 <Worksheet ss:Name="Sheet3">
  <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1"
   x:FullRows="1" ss:DefaultRowHeight="15">
   <Row ss:AutoFitHeight="0"/>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Unsynced/>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
</Workbook>