Thread: String fun

  1. #46
    Wanabe Laser Engineer chico1st
    Join Date
    Jul 2007
    Posts
    168
    Actually, I just ran that if statement 2 billion times and it added about 5s to my execution time,
    which is OK with me.
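
    For reference, a minimal sketch of one way to time a branch like that, assuming only clock() from <time.h>; the 2-billion count and the dummy condition are stand-ins, and an optimizing compiler may delete the loop unless the result is used:

    Code:
    #include <stdio.h>
    #include <time.h>
    
    int main(void){
    	long long i, hits = 0;
    	clock_t start = clock();
    
    	for (i = 0; i < 2000000000LL; i++){
    		if (i & 1)		//the branch under test
    			hits++;
    	}
    
    	//printing hits keeps the loop from being optimized away entirely
    	printf("%lld hits in %.2f s\n", hits,
    	       (double)(clock() - start) / CLOCKS_PER_SEC);
    	return 0;
    }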

  2. #47
    Wanabe Laser Engineer chico1st
    Join Date
    Jul 2007
    Posts
    168
    Here is my new code:

    I don't know why it doesn't work: I don't write enough data to the final binary file.
    This is heavily influenced by matsp's code.

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    ///////////////////////////////////////////////////////////////////////////////
    //		
    //		Error Output Meaning:
    //		0 - Success, no error
    //		1 - Not Enough Memory for Data array
    //		2 - Couldn't open binary file
    //		3 - Couldn't open one of the binary channel files
    //		4 - Binary channel read error
    //		5 - Binary channel write error
    ///////////////////////////////////////////////////////////////////////////////
    
    #define RowsInArray 10000	//number of rows in data array, modified to optimize execution
    #define MaxNumberOfFiles 20							
    unsigned short AllData2D[MaxNumberOfFiles][RowsInArray];	//big malloc'd array Data points around in.
    
    int main(){
    	FILE *f[MaxNumberOfFiles];	//may open between 1 and MaxNumberOfFiles files with very similar names
    	FILE *fbin;				//binary file pointer
    
    	double Header[62];		//header array...explained later
    	short ElementsInHeader;	//track the element count so we don't write more header elements than are needed
    							//(easier than malloc'ing the header dynamically)
    	char FileName[255];		//File to be opened
    	char BaseName[] = "Chan%.0d.log";	//used to create filename to open
    	int i,j,k;					//counters
    	short NumberOfFiles = 2;	//number of files to process (up to MaxNumberOfFiles)
    
    	int NumberOfElements;		//number of elements in AllData1D = RowsInArray * NumberOfFiles
    								//kept in a variable so it isn't recalculated 1280 times (once per chunk)
    	int TotalRows = 12800000;		//number of datapoints taken per channel
    	
    	short *AllData1D;			//big malloc'd buffer holding one chunk of all channels, interleaved
    	
    	//Malloc one big memory block to hold a chunk of all channels
    	//**This is more efficient than mallocing multiple times
    	NumberOfElements = NumberOfFiles * RowsInArray;
    	AllData1D = malloc(sizeof(short) * NumberOfElements);
    	if (AllData1D == NULL){
    		printf("Not Enough Memory for Data\n");
    		return(1);
    	}
        
    
    	//open binary file
    	if ((fbin = fopen("BinaryData.log", "wb"))==NULL){
    		printf("Couldn't open binary file\n");
    		exit(2);
    	}
    	
    	//create the header array
    	//The header is weird; I didn't create it, but you have to have it ...
    	//...and so on...
    	Header[0] = (double)3;
    	Header[1] = (double)NumberOfFiles;
    	for (i=0; i<NumberOfFiles; i++){
    		j=i*3+2;
    		Header[j] = (double)128000000;
    		Header[(j+1)] = (double)0;
    		Header[(j+2)] = (double)0.00015625;
    	}
    	ElementsInHeader = NumberOfFiles*3 + 2;	//same as j+3 after the loop, but doesn't rely on j
    	
    	//Write header to binary file	
    	fwrite(Header, sizeof(Header[0]), ElementsInHeader, fbin);
    
    	//open binary channel files
    	for (i=0; i<NumberOfFiles; i++){
    		//create filename to open
    		sprintf(FileName, BaseName, (i+1));
    		
    		//open files
    		if ((f[i] = fopen(FileName, "rb"))==NULL){
    			printf("Couldn't open Binary Channel file #%d\n", (i+1));
    			exit(3); 
    		}
    	}
    	
    	//get data from Binary files and write it to an array that is Data[RowsInArray][NumberOfFiles]
    	//this is ~1000 times faster than writing one line at a time.
    	//10,000 rows at a time was the optimised value for the test computer... actually 100,000
    	//was better but there can be 20x10,000 elements = 200,000 which is optimal.
    	//anything around 1,000,000 elements will blow the computer stack and crash the program.
    
    //////////////////////////////////////////////////////////////////////////////////////////
    	//loop until all data taken (= TotalRows/RowsInArray) number of times
    	for (k=0; k<(TotalRows/RowsInArray); k++){
    		for (i=0; i<NumberOfFiles; i++){					//loop for all files
    			if (fread(AllData2D[i], sizeof(AllData2D[i]), 1, f[i]) != 1){	//fread returns the item count, so success is 1
    				printf("Binary channel read error\n");
    				return(4);
    			}
    		}
    
    		// store it in the 1D buffer in the right order (in columns)
    		for(i=0; i<NumberOfFiles; i++)
    		{
    			for(j=0; j<RowsInArray; j++)
    				AllData1D[j * NumberOfFiles + i] = (short)(AllData2D[i][j]/2);
    		}
    		//write the whole chunk. Note: sizeof(AllData1D) here would be just
    		//sizeof(short *), the pointer's size, and would write far too little data
    		if (fwrite(AllData1D, sizeof(short), NumberOfElements, fbin) != (size_t)NumberOfElements){
    			printf("Binary channel write error\n");
    			return(5);
    		}
    		
    	}
    ///////////////////////////////////////////////////////////////////////////////////////
    	
    
    	//free malloc'd Data
    	free(AllData1D);
    		
    
    	//Close channel Files
    	for (i=0; i<NumberOfFiles; i++)
    		fclose(f[i]);
    
    	//close Binary
    	fclose(fbin);
    	
    	return(0);
    }

  3. #48
    Registered User
    Join Date
    Aug 2007
    Posts
    42
    A few crazy ideas to improve performance; coding is left as an exercise for the reader.

    But first a question: how fast do you need it to be? Is this a one-time task or a repetitive one?

    Are you bottlenecked on I/O bandwidth using the above program? (iostat, vmstat, etc.)

    From simplest to most complex.

    1. Put the input files on separate drives. Definitely put the output on a separate drive.
    Possibly implement disk striping that is properly aligned with your file I/O.

    2. Use async I/O for writing. That way you can hand off the write to the OS and continue on with meaningful work. A bit more complex, since you have to come back after the fact and verify that the write succeeded.

    3. Stop using malloc, fread, fwrite. Those calls block. Instead use open and mmap, and read/write directly through pointers. Let the operating system handle page faults like it's supposed to. You also get the added benefit of avoiding double buffering (see the sketch after this list).

    4. Implement a producer/consumer model (see Practical UNIX Programming for the concurrency details) such that producer threads handle the file reads and your consumer thread writes. Of course, all of this requires thread synchronization and a serious jump in complexity.
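
    A minimal sketch of idea 3, assuming POSIX open/mmap (on Windows the rough equivalent is CreateFileMapping/MapViewOfFile); the file name is a placeholder from the thread and error handling is trimmed:

    Code:
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    
    int main(void){
    	struct stat st;
    	size_t i, n;
    	long long sum = 0;
    	unsigned short *data;
    	int fd = open("Chan1.log", O_RDONLY);
    
    	if (fd < 0 || fstat(fd, &st) < 0)
    		return 1;
    
    	//map the whole input file; the kernel pages it in on demand,
    	//so there is no extra copy through a stdio buffer
    	data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    	if (data == MAP_FAILED)
    		return 2;
    
    	n = (size_t)st.st_size / sizeof(unsigned short);
    	for (i = 0; i < n; i++)		//touch the samples directly
    		sum += data[i];
    	printf("%lu samples, checksum %lld\n", (unsigned long)n, sum);
    
    	munmap(data, (size_t)st.st_size);
    	close(fd);
    	return 0;
    }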

  4. #49
    Wanabe Laser Engineer chico1st
    Join Date
    Jul 2007
    Posts
    168
    Wow, my coding abilities were being stretched already with the previous code. This is just going to take me to the next level.

    First, though, I'm going to make this one work and see how fast the operation is. Then, if time permits, I will attempt to make the code run even faster.

    And yes, this is my bottleneck at the moment. Everything in my program runs sequentially, and this is the slowest operation. I run this code once for every sequence that runs. The entire sequence is run repetitively.

  5. #50
    Registered User
    Join Date
    Aug 2007
    Posts
    42
    I'm going to bet that it is I/O bound, right? What OS are you running against?
    If it's I/O, then add more disks, and adjust your routine accordingly to take advantage of the disks available.

  6. #51
    Wanabe Laser Engineer chico1st
    Join Date
    Jul 2007
    Posts
    168
    Yeah, I/O seems to be the limiting factor, and I don't think that will change. I only have the one physical hard drive. I have network connections too, but I don't think those would be faster.

    I am using Windows, too.

  7. #52
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by bean66
    3. Stop using malloc, fread, fwrite. Those calls block. Instead use open and mmap, and read/write directly through pointers. Let the operating system handle page faults like it's supposed to. You also get the added benefit of avoiding double buffering.
    Malloc is only called at startup, so it's not a major part of the time used, no matter how you look at it.

    I doubt that using fread/fwrite adds much overhead over the native system calls in comparison to the time it takes the disk to deliver the data, so using any other method to read/write the data will most likely make little difference. mmap (or the corresponding Windows call, which I'd have to look up) would probably work better for reading, but for writing it would only work if the system is a 64-bit version of Windows, as the output file is bigger than the 4GB addressable range of 32-bit space (it comes to about 5GB).
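
    For the record, the corresponding Windows calls are CreateFileMapping and MapViewOfFile; a minimal read-only sketch, with the channel file name as a placeholder and error handling trimmed:

    Code:
    #include <windows.h>
    #include <stdio.h>
    
    int main(void){
    	const unsigned short *data;
    	HANDLE hFile, hMap;
    
    	hFile = CreateFileA("Chan1.log", GENERIC_READ, FILE_SHARE_READ,
    	                    NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    	if (hFile == INVALID_HANDLE_VALUE)
    		return 1;
    
    	//size 0,0 means "map the whole file"; pages are brought in on demand
    	hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    	if (hMap == NULL)
    		return 2;
    
    	data = MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    	if (data == NULL)
    		return 3;
    
    	printf("first sample: %u\n", (unsigned)data[0]);
    
    	UnmapViewOfFile(data);
    	CloseHandle(hMap);
    	CloseHandle(hFile);
    	return 0;
    }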

    Using async writes would probably help quite a bit, up until the system can't take any more async writes. Not sure about reads - we still need to wait for the read before we can write, but I guess we could issue 20 reads in parallel and then wait for all of them to finish (see the sketch below).

    Of course, changing the hardware would also help... Either by adding more disks and storing the input files on separate disks, or by using some sort of striped data arrangement (where data is split over multiple disks).
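
    A minimal sketch of the parallel-read idea, assuming Windows overlapped I/O (CreateFile with FILE_FLAG_OVERLAPPED, one event per request, WaitForMultipleObjects); the file names, file count, and chunk size are placeholders from the thread, not tuned values:

    Code:
    #include <windows.h>
    #include <stdio.h>
    
    #define N	20		//channel files in flight (must be <= MAXIMUM_WAIT_OBJECTS)
    #define CHUNK	20000	//bytes per read: 10000 shorts, as in the thread
    
    int main(void){
    	HANDLE h[N], ev[N];
    	OVERLAPPED ov[N];
    	static char buf[N][CHUNK];	//static so the buffers don't live on the stack
    	char name[64];
    	DWORD got;
    	int i;
    
    	for (i = 0; i < N; i++){
    		sprintf(name, "Chan%d.log", i + 1);
    		h[i] = CreateFileA(name, GENERIC_READ, FILE_SHARE_READ, NULL,
    		                   OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
    		if (h[i] == INVALID_HANDLE_VALUE)
    			return 1;
    
    		ev[i] = CreateEventA(NULL, TRUE, FALSE, NULL);
    		ZeroMemory(&ov[i], sizeof(ov[i]));
    		ov[i].hEvent = ev[i];		//one event per outstanding read
    
    		//kick off the read; ERROR_IO_PENDING just means "still in flight"
    		if (!ReadFile(h[i], buf[i], CHUNK, NULL, &ov[i]) &&
    		    GetLastError() != ERROR_IO_PENDING)
    			return 2;
    	}
    
    	//all N reads are now running concurrently; block until every one is done
    	WaitForMultipleObjects(N, ev, TRUE, INFINITE);
    
    	for (i = 0; i < N; i++){
    		if (!GetOverlappedResult(h[i], &ov[i], &got, FALSE))
    			return 3;
    		printf("file %d: %lu bytes\n", i + 1, (unsigned long)got);
    		CloseHandle(ev[i]);
    		CloseHandle(h[i]);
    	}
    	return 0;
    }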

    --
    Mats
    Last edited by matsp; 08-24-2007 at 11:53 AM.

  8. #53
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    You say it doesn't write enough data - how much data DO you get in the final file, and what happens when it finishes - does it give any error, or just finish "happily"?

    Some comments on the code:
    1. Make the "128000000" a constant of some sort - it's repeated several times in different places - and even if you don't think it's going to change right now, using a constant will make it easier to change later. Either "const int X = ..." or "#define X ..." where X is a reasonable name. By the way, you seem to have two different numbers in there: TotalRows is 12800000 (5 zeros) and the data you write to the header is 128000000 (6 zeros) - presumably both should be the same? That would be a possible reason why it's not writing the right amount of data...
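
    For instance, a minimal sketch of that suggestion (the name SamplesPerChannel is invented for illustration):

    Code:
    #include <stdio.h>
    
    //one name for the magic number, used by both the header and the loop bound
    #define SamplesPerChannel 128000000
    
    int main(void){
    	double headerField = (double)SamplesPerChannel;	//goes into Header[]
    	int totalRows = SamplesPerChannel;				//loop bound
    	printf("%.0f %d\n", headerField, totalRows);
    	return 0;
    }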

    2.
    Code:
    unsigned short AllData2D[MaxNumberOfFiles][RowsInArray];	//big malloc'd array Data points around in.
    I don't see anything "malloc'd" in relation to this... i.e., the comment is wrong.

    3.
    Code:
    	//get data from Binary files and write it to an array that is Data[RowsInArray][NumberOfFiles]
    	//this is ~1000 times faster than writing one line at a time.
    	//10,000 rows at a time was the optimised value for the test computer... actually 100,000
    	//was better but there can be 20x10,000 elements = 200,000 which is optimal.
    	//anything around 1,000,000 elements will blow the computer stack and crash the program.
    If you use malloc or a global variable, this comment is incorrect with regard to blowing the stack and crashing.

    --
    Mats

  9. #54
    Registered User
    Join Date
    Aug 2007
    Posts
    42
    Quote Originally Posted by matsp
    Malloc is only called at startup, so it's not a major part of the time used, no matter how you look at it.
    Agreed.

    Quote Originally Posted by matsp
    I doubt that using fread/fwrite adds much overhead over the native system calls in comparison to the time it takes the disk to deliver the data, so using any other method to read/write the data will most likely make little difference. mmap (or the corresponding Windows call, which I'd have to look up) would probably work better for reading, but for writing it would only work if the system is a 64-bit version of Windows, as the output file is bigger than the 4GB addressable range of 32-bit space (it comes to about 5GB).
    I disagree: fread/fwrite reads from disk, then makes a copy into your user address space, and then the code copies that buffer into its integer arrays. That's three copies.

    Quote Originally Posted by matsp
    Using async writes would probably help quite a bit, up until the system can't take any more async writes. Not sure about reads - we still need to wait for the read before we can write, but I guess we could issue 20 reads in parallel and then wait for all of them to finish.

    Of course, changing the hardware would also help... Either by adding more disks and storing the input files on separate disks, or by using some sort of striped data arrangement (where data is split over multiple disks).
    Combining the async read/write with the hardware changes will most definitely yield the best performance, as it keeps the reading operational and in parallel, i.e. 5 reads running concurrently vs. 1. That should be up to ~5 times quicker.

  10. #55
    Wanabe Laser Engineer chico1st
    Join Date
    Jul 2007
    Posts
    168
    Wow! Finished product! (I'll upgrade it after I test it out on Monday, with all of the hardware, to see if there are any bugs when I integrate all my programs/hardware together.)

    It works, and you guys are amazing. My original version of this code took about 25 days to execute (it was in LabVIEW, which isn't a bad language; it makes development and hardware integration super fast, but the execution speed wasn't so great).

    Now it executes in about 10 minutes.

    The whole group of programs takes about 30 minutes. Thanks so much.

    I'm very excited to try out those other suggestions too! *Fingers crossed that there are no issues*
