Thread: Multi-threading and file access assignment (test on Dual-Core)

  1. #1
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788

    Multi-threading and file access assignment (test on Dual-Core)

    Here is the story (skip straight to the Problem if you are not interested). I have an assignment to do on a dual-core machine: take some code that counts words in a file and rewrite it using multi-threading and semaphores. I looked at the code, and it is horrible, really horrible... I thought: if they want me to use semaphores, why not give me a sample that really requires them? So I changed the code a little bit... and a little bit more, and came up with this version, which does the task (it counts the number of words in the file) but does not require semaphores.

    Description of the program: The program can be run without parameters (it then uses one thread for the calculations) or with one parameter from 0 to 9 that sets the number of threads to use (0 stands for 10 threads). The only locking used is inside the fgets function; no additional locks are required in my code... The program counts the number of words in the file "InFile1.txt" (a sample file is attached; the filename is hardcoded, so to use another file, rename it to that and put it in the same folder as the exe).
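
    To illustrate why no extra lock is needed, here is a minimal single-threaded sketch of the same bookkeeping. It assumes each worker only ever adds to its own slot and that the slots are summed after all workers have finished. The names count_words_in_line and count_with_private_totals are hypothetical and not part of the program below; the round-robin loop merely stands in for threads taking lines from the shared file.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts whitespace-separated words in one line. */
static int count_words_in_line(const char *line)
{
    int words = 0;
    int in_word = 0;
    size_t i;
    for (i = 0; line[i] != '\0'; i++) {
        if (isspace((unsigned char)line[i])) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            words++;
        }
    }
    return words;
}

/* Simulates `workers` threads taking lines in turn. Each worker adds only
   to its own slot in `partial`; the slots are summed once at the end, so
   no slot is ever written by two workers. */
static int count_with_private_totals(const char *const *lines, size_t n_lines,
                                     int *partial, size_t workers)
{
    size_t i;
    int total = 0;
    for (i = 0; i < workers; i++)
        partial[i] = 0;
    for (i = 0; i < n_lines; i++)
        partial[i % workers] += count_words_in_line(lines[i]);
    for (i = 0; i < workers; i++)
        total += partial[i];
    return total;
}
```

    Which worker handles which line does not change the total, because the per-line counts are simply added together.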

    Problem: I have no idea whether running this program with 1 thread or with 2 threads makes any difference on a dual-core machine. So if someone can run it once without parameters and once as <Threaded 2> on a dual-core system, I would appreciate it.

    Thanks.

    Here is the code.
    Code:
    /***********************************************************
    * Threaded.c
    ************************************************************
    * Description 
    * Counts number of words in the file InFile1.txt
    * Uses from 1 to 10 threads to perform the calculations 
    * depending on the command line arguments
    *************************************************************
    * Compiled using MSVS Express 2005
    * With Warning 4996 disabled
    *************************************************************
    * Usage:
    * Threaded 
    * for running in the single thread mode
    * Threaded x
    * where x is from 1 to 9 to run using x threads
    * Threaded 0
    * to run using 10 threads
    *************************************************************
    * (c) vart 2007
    *************************************************************/
    #include <windows.h>
    #include <stdio.h>
    #include <ctype.h> /* isspace, isdigit */
    #include <time.h>
    
    #define MAX_THREADCOUNT 10
    struct TotalWords
    {
    	int TotalEvenWords;
    	int TotalOddWords;
    };
    static struct TotalWords empty = {0};
    
    struct Params
    {
    	struct TotalWords count;
    	FILE*  fd;
    };
    void GetWordAndLetterCount(unsigned char *Line, struct TotalWords* current)
    {
    	int Letter_Count = 0;
    	int i;
    	for (i=0;Line[i];i++)
    	{
    		if (!isspace(Line[i])) 
    			Letter_Count++; 
    		else 
    		{
    			if(Letter_Count > 0)
    			{
    				if (Letter_Count % 2) 
    				{
    					current->TotalOddWords++; 
    				}
    				else 
    				{
    					current->TotalEvenWords++;
    				}
    				Letter_Count = 0;
    			}
    		}
    	}
    	if(Letter_Count > 0)
    	{
    		/* last word can have no white space at the end - we still need to count it */
    		if (Letter_Count % 2) 
    		{
    			current->TotalOddWords++; 
    		}
    		else 
    		{
    			current->TotalEvenWords++;
    		}
    		Letter_Count = 0;
    	}
    	return ;     
    }
    
    DWORD WINAPI CountWords(LPVOID arg) 
    {
    	struct Params* current = (struct Params*)arg;
    	unsigned char inLine[BUFSIZ];
    
    	current->count = empty; /* initializes counters to zero */
    	while (fgets((char*)inLine,sizeof inLine,current->fd))
    	{
    		GetWordAndLetterCount(inLine, &current->count) ;
    	}
    	return 0;
    }
    
    int main(int argc, char* argv[])
    {
    	int threadCount = 1;
    	HANDLE hThread[MAX_THREADCOUNT];
    	struct Params par[MAX_THREADCOUNT];
    	FILE* fd = NULL;
    	/* cast to unsigned char: isdigit is undefined for negative char values */
    	if(argc == 2 && isdigit((unsigned char)argv[1][0]))
    		threadCount = argv[1][0] - '0';
    
    	if (threadCount == 0) threadCount = 10;
    	fd = fopen("InFile1.txt", "r"); // Open file for read
    	if(fd == NULL)
    		perror("Cannot open file");
    	else
    	{
    		clock_t start,stop;
    		start = clock();
    		if(threadCount == 1)
    		{
    			par[0].fd = fd;
    			CountWords(par);
    		}
    		else
    		{
    			int i;
    			for (i = 0; i < threadCount; i++) 
    			{
    				par[i].fd = fd;
    				hThread[i] = CreateThread(NULL,0,CountWords,par+i,0,NULL);
    			}
    			WaitForMultipleObjects(threadCount, hThread, TRUE, INFINITE);
    			for (i = 1; i < threadCount; i++) 
    			{
    				par[0].count.TotalEvenWords += par[i].count.TotalEvenWords;
    				par[0].count.TotalOddWords += par[i].count.TotalOddWords;
    			}
    		}
    		fclose(fd);
    		stop = clock();
    		printf("Total Words = %8d\n\n", par[0].count.TotalEvenWords +	par[0].count.TotalOddWords);
    		printf("Total Even Words = %7d\nTotal Odd Words  = %7d\n", par[0].count.TotalEvenWords, par[0].count.TotalOddWords);
    		printf(" Calculated during %f clocks\n", (double)(stop-start));
    	}
    	return 0;
    }
    The uploaded sample file is really short. With this file the timing results are zero on my notebook... So for a real test I'd ask you to provide some text file of around 5 MB or bigger (I could not upload one here)...
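
    If no large text file is at hand, one can be generated. The following is only a sketch: the function name make_test_file and the filler sentence are made up for illustration, and the caller chooses the path.

```c
#include <stdio.h>

/* Writes a fixed 44-byte line (9 words) repeatedly until at least
   min_bytes have been written. Returns the number of bytes written,
   or -1 on failure. The name and the sentence are illustrative only. */
static long make_test_file(const char *path, long min_bytes)
{
    static const char line[] = "the quick brown fox jumps over the lazy dog\n";
    const long line_len = (long)(sizeof line - 1);
    long written = 0;
    FILE *out = fopen(path, "wb"); /* binary mode: no newline translation */
    if (out == NULL)
        return -1;
    while (written < min_bytes) {
        if (fputs(line, out) == EOF) {
            fclose(out);
            return -1;
        }
        written += line_len;
    }
    if (fclose(out) != 0)
        return -1;
    return written;
}
```

    Calling make_test_file("InFile1.txt", 5L * 1024 * 1024) should give a file slightly over 5 MB. Every line holds 9 words, so the expected total is 9 times the number of lines written, which makes the program's output easy to check.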
    Last edited by vart; 03-24-2007 at 03:24 AM.
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  2. #2
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    I think the idea is that one thread reads the raw file data into a buffer and another counts the words in that buffer.
    Of course that would be better done using async I/O, but whatever.
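
    One detail of that design, sketched here under my own assumptions: if the reader hands over raw blocks rather than whole lines, a word can be split across two buffers, so the counter has to carry a little state from one buffer to the next. The struct WordState and the function count_chunk are hypothetical names, and the sketch leaves out the threading and the hand-over itself.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical state carried from one buffer to the next. */
struct WordState {
    int words;    /* words started so far */
    int in_word;  /* nonzero if the previous buffer ended inside a word */
};

/* Counts words in buf[0..len), continuing from *st. A word is counted
   when its first character is seen, so a word that is split across two
   buffers is counted exactly once. */
static void count_chunk(struct WordState *st, const char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (isspace((unsigned char)buf[i])) {
            st->in_word = 0;
        } else if (!st->in_word) {
            st->in_word = 1;
            st->words++;
        }
    }
}
```

    To keep the even/odd split from the original program, the running letter count of the unfinished word would have to be carried over in the same way.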

    You will, of course, still not notice a difference between a single-threaded and a dual-core multi-threaded version, because the program is completely I/O bound.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  3. #3
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Quote Originally Posted by CornedBee
    I think the idea is that one thread reads the raw file data into a buffer and another counts the words in that buffer.
    The original idea of the assignment was to use global counters and lock access to them with semaphores... The course is about splitting loops between threads (each thread executes several iterations of the loop), not about splitting functionality such as I/O and calculations...
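
    Splitting a loop between threads usually comes down to giving each thread its own range of iterations. A minimal sketch of that arithmetic, assuming the total number of iterations is known up front (chunk_range is a hypothetical helper, not something from the assignment):

```c
#include <stddef.h>

/* Computes the half-open range [*begin, *end) of loop iterations for
   thread `idx` out of `nthreads` (nthreads must be nonzero). Any
   remainder is spread over the first threads, one extra iteration each. */
static void chunk_range(size_t total, size_t nthreads, size_t idx,
                        size_t *begin, size_t *end)
{
    size_t base = total / nthreads;
    size_t extra = total % nthreads;
    *begin = idx * base + (idx < extra ? idx : extra);
    *end = *begin + base + (idx < extra ? 1 : 0);
}
```

    Each thread would then run for (i = begin; i < end; i++) over its own range. In the semaphore variant it would add its result to the global counter between WaitForSingleObject and ReleaseSemaphore calls on a shared semaphore handle.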
