Thread: Parsing website data: intermittent rubbish characters retrieved

  1. #1
    Registered User
    Join Date
    Mar 2004
    Posts
    114

    Parsing website data: intermittent rubbish characters retrieved

    Hi, I am writing a program to parse data from a website. To do that, I first need to download the page.

    Step 1: download the file
    Code:
    CString Data;
    //CString Buffer;
    DeleteUrlCacheEntry(url); // delete the old stupid cache

    HINTERNET IntOpen = ::InternetOpen("Sample", LOCAL_INTERNET_ACCESS, NULL, 0, 0);
    HINTERNET handle = ::InternetOpenUrl(IntOpen, url, NULL, NULL, NULL, NULL);
    HANDLE hFile = ::CreateFile("c:\\index.txt", GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

    char Buffer[1024];
    DWORD dwRead = 0;
    while (::InternetReadFile(handle, Buffer, sizeof(Buffer), &dwRead) == TRUE)
    {
    	if (dwRead == 0)
    		break;
    	DWORD dwWrite = 0;
    	::WriteFile(hFile, Buffer, dwRead, &dwWrite, NULL);
    	Data += Buffer;
    }

    ::CloseHandle(hFile);
    ::InternetCloseHandle(handle);
    The CString "Data" then contains the page as plain text.
    Step 2: parse the data using brackets
    Because a lot of the data sits between <> brackets, the brackets can be used to locate the desired value.

    Code:
    // This function looks for "item" in the text, skips past "bracket_distance" '>' characters,
    // and returns the text up to the next '<'.
    // e.g. "dsfsd<><><><>6.35<>", item = dsfsd, bracket_distance = 4 returns "6.35"
    CString Mydialog::Parse_Backets(CString file_string, CString item, int bracket_distance)
    {
    	file_string.ReleaseBuffer();
    
    	int start_index;
    	int end_index;
    	start_index = file_string.Find(item);
    	if(start_index == -1)
    	{
    		CString error_string = "Error";
    		error_flag = 1;
    		return error_string;
    	}
    	for(int i =0; i <bracket_distance; i++)
    	{
    		start_index = file_string.Find(">",start_index)+1;
    	}
    	end_index = file_string.Find("<",start_index) - 1;
    	file_string=file_string.Mid(start_index, end_index-start_index+1  );
    	return file_string;
    }
    Now the problem: once in a while I get rubbish characters. For example, when the actual value shown in the browser is 0.55, I get 0.aj5m5, or even 0.1595.

    The website is http://stquote.sgx.com/live/st/STStock.asp?stk=G

    Does anyone know how to solve this problem?
    Using:
    - MFC
    - VC6.0
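
The bracket-skipping step in Parse_Backets can be sketched in portable C++ as follows. This is only an illustration, not the poster's code: `parseBrackets` is a hypothetical name, `std::string` stands in for `CString`, and the bool return replaces the `error_flag` member. The one deliberate change is that a missing '>' or '<' is reported as a failure. In the original, `Find` returning -1 followed by `+1` resets the index to 0, so the search silently restarts from the beginning of the text.

```cpp
#include <cstddef>
#include <string>

// Hypothetical portable version of the bracket-skipping idea.
// Returns true and stores the extracted text in `out` on success;
// returns false if `item`, a '>' or the closing '<' cannot be found.
bool parseBrackets(const std::string& text, const std::string& item,
                   int bracketDistance, std::string& out)
{
    std::size_t start = text.find(item);
    if (start == std::string::npos)
        return false;

    // Skip past `bracketDistance` closing brackets, failing if one is missing.
    for (int i = 0; i < bracketDistance; ++i)
    {
        std::size_t gt = text.find('>', start);
        if (gt == std::string::npos)
            return false;
        start = gt + 1;
    }

    // The value runs from `start` up to (not including) the next '<'.
    std::size_t end = text.find('<', start);
    if (end == std::string::npos)
        return false;

    out = text.substr(start, end - start);
    return true;
}
```

With the example from the comment in the original function, `parseBrackets("dsfsd<><><><>6.35<>", "dsfsd", 4, out)` leaves "6.35" in `out`.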

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > Data+=Buffer;
    Because this assumes that Buffer is '\0' terminated, and that simply isn't true at all.
    The fact that you're getting such good results so far is down to dumb luck, nothing more.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Mar 2004
    Posts
    114
    Thanks for the advice.
    How should I fix the null-terminated string problem?
    Last edited by hanhao; 07-31-2007 at 04:36 AM.

  4. #4
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    I dunno, but I thought that adding a '\0' would have been an easy thing to do.
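
A minimal sketch of the fix discussed above, written in portable C++ so it can be run anywhere. The names here are assumptions for illustration: `readChunk` is a hypothetical stand-in for `InternetReadFile` (it fills the buffer and reports a byte count, without writing a '\0'), and `std::string` stands in for `CString`. The point is that each chunk is appended using the byte count that was actually read, so no terminator is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for InternetReadFile: copies up to bufSize bytes of
// `source`, starting at `offset`, into `buffer` and reports the count in
// `bytesRead`. Like the real call, it does NOT append a terminating '\0'.
bool readChunk(const std::string& source, std::size_t& offset,
               char* buffer, std::size_t bufSize, std::size_t& bytesRead)
{
    bytesRead = std::min(bufSize, source.size() - offset);
    std::memcpy(buffer, source.data() + offset, bytesRead);
    offset += bytesRead;
    return true;
}

// Reads `source` in fixed-size chunks, appending exactly the bytes read.
std::string downloadAll(const std::string& source)
{
    std::string data;
    char buffer[8];                            // deliberately tiny: forces several chunks
    std::memset(buffer, 'X', sizeof(buffer));  // junk content, like an uninitialised buffer
    std::size_t offset = 0;
    std::size_t bytesRead = 0;
    while (readChunk(source, offset, buffer, sizeof(buffer), bytesRead))
    {
        if (bytesRead == 0)
            break;                             // end of data
        data.append(buffer, bytesRead);        // length-aware append, no '\0' required
    }
    return data;
}
```

In the MFC code from post #1, the equivalent change would presumably be `Data += CString(Buffer, dwRead);`, or reading at most `sizeof(Buffer) - 1` bytes and setting `Buffer[dwRead] = '\0'` before the append. Neither variant was tested against VC6 here.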

  5. #5
    Madly in anger with you
    Join Date
    Nov 2005
    Posts
    211
    I wrote this for someone a couple of months ago; it might be of interest.

    http://s134k.koreru.org/shiznit/GetIP.c

    Intel Core 2 Quad Q6600 @ 2.40 GHz
    3072 MB PC2-5300 DDR2
    2 x 320 GB SATA (640 GB)
    NVIDIA GeForce 8400GS 256 MB PCI-E
