Thread: Read size of data array when reading .txt data

  1. #1
    Registered User
    Join Date
    Apr 2007
    Posts
    7

    Read size of data array when reading .txt data

    Hi, a newbie here (just started learning C two weeks ago). I like C as a fresh start. I have been using R and Matlab for most of my numerical analysis, but they take a lot of time and CPU when I work with massive data sets. That motivated me to take advantage of compiler-based languages instead of interpreter-based ones.

    Anyways, I would like to know how to let a program read the size of the data array being imported, and how to store the value at the jth row and the kth column into a j*k array, in C.

    I tried the following code to start with. I type the file name and let the program read the file, but it did not work. The values in the file are of the form xx.xxx (double).

    Code:
    #include<stdio.h>
     
    int main()
     {
    	
    /* define variables */
    	FILE *fp;	
    	double read_arr[6][2];
    	char filename[80];
    	int j=0,k=0;
    	
    /* ask and get a file name to be read */
    	printf("Filename=: ");
    	gets(filename);
    	
    /* Show error message when the file doesn't exist in the current directory */
    	if((fp=fopen(filename,"r"))== NULL)
    	{
    		printf("Cannot open file\n");
    		exit(1);
    	
    	while(!feof(fp))
    		{
    			for(j=0;j<10;j++)
                            for(k=0;k<2;k++)
    			{
    				fscanf(fp,"%f",&(read_arr[i][j]));
    				printf("%f",read_arr[i][j]);
    			}
    			printf("\n");
                            
    		}
    		fclose(fp);
    }}
    I am not sure why this does not work. No compiling errors, but no printout either.

    So the first question is which part I got wrong, and the second is: instead of hard-coding 6 and 2 in read_arr[6][2], how can I let the program determine the size of the array? I have many data files whose sizes are not the same. Eventually, I would like to loop this process and save the results while I am doing something else.

  2. #2
    Registered User
    Join Date
    Oct 2006
    Location
    Canada
    Posts
    1,243
    as others will tell you, using 'gets' is a bad idea. read this. in short, use this instead:
    Code:
       fgets(filename,sizeof(filename),stdin);
    but please read the link to find out why.

    also, in your if statement, it appears you want only the two statements to execute (if the file cant be opened). however you will notice you have misplaced a bracket. remove the second bracket (on the last line in your output) and add one (closing bracket) just before the while loop.

    regarding your actual question.. i dont fully understand what you need to do! sorry. im exhausted, i will read it again in the morning before work.

    edit: it seems that you want an array, however you dont know what size of an array you want? if thats the case, its a perfect example of when to use 'dynamic memory'. search this site, or wherever you find helpful, for that term, and other terms such as 'malloc' and 'free', which make up dynamic memory.

    can you give more information on the content/structure the file contains? is it just a list of numbers (data type 'double')? each one on different lines? separated by a space, etc. is the input file just a bunch of simple numbers like this?
    Last edited by nadroj; 04-20-2007 at 11:37 PM.
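
    A minimal sketch of the dynamic memory idea from the post above, assuming the number of rows and columns is only known at run time. The helper names matrix_alloc and matrix_at are made up for illustration; only malloc and free come from the standard library. The whole matrix lives in one block and element (r, c) is found by row-major indexing.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate rows*cols doubles as one contiguous block (hypothetical helper).
   Returns NULL if a dimension is zero or the allocation fails.
   The contents are uninitialized, exactly as with plain malloc. */
double *matrix_alloc(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return NULL;
    return malloc(rows * cols * sizeof(double));
}

/* Address of element (r, c) in a row-major block that has `cols` columns. */
double *matrix_at(double *m, size_t cols, size_t r, size_t c)
{
    return &m[r * cols + c];
}
```

    Typical use would be m = matrix_alloc(n, p), then *matrix_at(m, p, r, c) = value, and finally free(m) once the data is no longer needed.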

  3. #3
    Deathray Engineer MacGyver's Avatar
    Join Date
    Mar 2007
    Posts
    3,210
    Your terrible formatting has led you to this problem. Since I'm in such a good mood at the moment, not only will I attempt to fix your formatting issue, but I'll point out some changes I made to make your code safer.

    Code:
    #include<stdio.h>
    #include <string.h> /* Added because of strlen() -- MG */
    #include <stdlib.h> /* Added because of exit() -- MG */
     
    int main(void) /* Added to make this C99 compliant, since I believe void must be in here.
    		If I'm wrong, no doubt somebody will try to take pleasure in correcting me -- MG */
    {
    	/* define variables */
    	FILE *fp;
    	double read_arr[6][2];
    	char filename[80];
    	int j=0,k=0,iEnd; /* Added integer iEnd to keep track of string end -- MG */
    
    	/* ask and get a file name to be read */
    	printf("Filename=: ");
    	fgets(filename,sizeof(filename),stdin); /* gets() is dangerous and can be used to take over programs.
    					fgets() is the safer alternative. -- MG */
    	iEnd = strlen(filename) - 1; /* Find where the last char of filename is located at. -- MG */
    	if(filename[iEnd] == '\n') /* fgets() includes the '\n' char if there is room. -- MG */
    	{
    		filename[iEnd] = '\0'; /* So we replace it with a '\0' to mark the "real" end of the string. -- MG */
    	}
    
    	/* Show error message when the file doesn't exist in the current directory */
    	if((fp=fopen(filename,"r"))== NULL)
    	{
    		printf("Cannot open file\n");
    		exit(1);
    	} /* Remember not to forget to add braces when they are needed -- MG */ 
    	
    	while(!feof(fp))
    	{
    		for(j=0;j<6;j++) /* 6 rows, to stay inside read_arr[6][2] */
    		{ /* Good idea to use braces even when you don't have to -- MG*/ 
    			for(k=0;k<2;k++)
    			{
    				fscanf(fp,"%lf",&(read_arr[j][k])); /* Replaced non-declared i; "%lf" is the scanf conversion for double. -- MG */
    				printf("%f",read_arr[j][k]); /* Replaced non-declared i. -- MG */
    			}
    			printf("\n");
    		} /* Matching braces helps. -- MG */
    	}
    	fclose(fp);
    }
    I didn't do any error checking. That can be your fun job.
    Last edited by MacGyver; 04-21-2007 at 11:02 PM. Reason: Lost an ending tag
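
    On the error checking left as an exercise above: one possible sketch is to test the return value of fscanf on every read instead of looping on feof, which only becomes true after a read has already failed. The function name read_fixed and the ROWS/COLS macros are assumptions made for this example; the 6 by 2 size is taken from the original post.

```c
#include <stdio.h>

#define ROWS 6
#define COLS 2

/* Read up to ROWS*COLS doubles from fp into arr, row by row.
   Stops at the first failed conversion or at end of file.
   Returns the number of values actually stored. */
int read_fixed(FILE *fp, double arr[ROWS][COLS])
{
    int count = 0;

    for (int j = 0; j < ROWS; j++) {
        for (int k = 0; k < COLS; k++) {
            /* fscanf returns the number of items converted; anything
               other than 1 means no double was read this time. */
            if (fscanf(fp, "%lf", &arr[j][k]) != 1)
                return count;
            count++;
        }
    }
    return count;
}
```

    A caller can compare the return value with ROWS*COLS to tell whether the file was shorter than expected.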

  4. #4
    Registered User
    Join Date
    Apr 2007
    Posts
    7
    I am very sorry for taking your time like this, and I appreciate your comments, code, and suggestions. Although I only started learning C two weeks ago, I hope to learn much more. I only know R, SAS, and Matlab, and I had never thought about reading files into an array without read(...) in R, proc data in SAS, or load in Matlab. Those programs read files automatically. Again, thank you very much; I will keep my C primer at hand and keep reading.

  5. #5
    Registered User
    Join Date
    Apr 2007
    Posts
    7
    Oh, what I meant by the size of array is the size of data to be read from a local directory, not an array where the data will be stored in C. For example, in Matlab, I could just type:

    load filename.txt;
    % n as the number of rows, p as the number of columns of this particular filename.txt
    [n,p]=size(filename);

    So just two commands read a txt file into a multidimensional array. But Matlab, and the others, take a very long time to read and execute when the data gets very large. That time issue motivated me to study Fortran, C, or Java as an alternative.

    I am very sorry to have you confused. And I thank you two for useful comments..

  6. #6
    Registered User
    Join Date
    Oct 2006
    Location
    Canada
    Posts
    1,243
    Quote Originally Posted by Taquito
    Oh, what I meant by the size of array is the size of data to be read from a local directory
    do you mean 'file' instead of 'directory'?

    can you try and explain again what exactly you want to do? sorry im not fully understanding.

    you want to read a file, like you can in matlab, and store it into a two-dimensional array? if thats the case can you post a few sample lines of a file, such as one you would use in your above matlab example.

  7. #7
    Registered User
    Join Date
    Apr 2007
    Posts
    7
    Dear nadroj:

    Thanks. Suppose I have a txt file of 5 rows and 3 columns; and this file name is sample.txt. The data I would like to read into C would look like:

    1 10 11
    2 20 12
    3 30 13
    4 40 14
    5 50 15


    To read this sample.txt file into Matlab,

    % read txt file into matlab
    load sample.txt;

    % rename this data as SAMPLE
    SAMPLE=sample;

    % Take the number of rows and columns from this imported sample.txt data
    [n,p]=size(SAMPLE);

    Then when I type n and p, then Matlab will return

    n=5 % returns the number of rows
    p=3 % returns the number of columns

    When I type SAMPLE, then Matlab will show

    SAMPLE=
    1 10 11
    2 20 12
    3 30 13
    4 40 14
    5 50 15

    So, without knowing the actual size of this data, as long as I successfully read the .txt file into Matlab, the data is already in matrix form. So when I type

    SAMPLE(5,1)

    Matlab will return

    SAMPLE(5,1)=5

    When I type

    SAMPLE(5,1)*SAMPLE(3,2)

    Matlab will return

    SAMPLE(5,1)*SAMPLE(3,2)=150

    So Matlab automatically stores the values in a matrix. In C, I would have to open the file through a pointer, define the size (rows and columns) of the data I would like to import, then store the values into a 2-dimensional array using a loop.

    Since I have 253 different txt files whose numbers of rows and columns differ from one another, changing the size of the array (e.g. read_arr[i][j] in my original code) and redefining it every time is cumbersome.

  8. #8
    Registered User
    Join Date
    Oct 2006
    Location
    Canada
    Posts
    1,243
    ok so the content of the input file is just a bunch of numbers. each number separated by a space, and each row of the matrix separated by a newline. i was confused by your initial post:
    Quote Originally Posted by Taquito
    The values in the file are xx.xxx (double).
    is there a maximum row or column size? i think it would be easier to do this in C++, with standard datastructures, such as vectors, arraylists, etc. they will handle resizing pointers and all that nasty stuff.

    i can only think of one way to do this, but it doesnt seem very efficient at all. Taquito, take a look if you like, but i suggest wait and see what others offer.

    my idea would be to read the file character by character, and whenever you find a '\n' (newline), you increase a counter for the number of rows. once you have that, you know one dimension of your matrix, the number of rows:
    Code:
    int** matrix;
    int rowCount = 0;
    // .. loop through file counting number of '\n's..
    matrix = malloc(rowCount * sizeof(int*));
    and do a similar approach to each column of the matrix, starting again at the beginning of the file, and in each row count the number of ' ' (spaces).

    i imagine theres a much better way to do this, and hope someone can let me know too!

    edit: is there the same number of numbers on each line? if i have 4 numbers on line one, will every line have 4 numbers? or can a line have a different length?
    Last edited by nadroj; 04-21-2007 at 11:45 PM.
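
    The counting pass described above could be sketched as follows, under the assumption that every line holds the same number of values. Instead of counting spaces it counts whitespace-separated tokens, so the column count comes out directly. The name count_dims is hypothetical; fgetc, isspace and rewind are standard.

```c
#include <ctype.h>
#include <stdio.h>

/* Count the lines that contain at least one token, and the number of
   whitespace-separated tokens on the first such line.
   Rewinds fp before returning so the data can be read in a second pass. */
void count_dims(FILE *fp, size_t *rows, size_t *cols)
{
    size_t tokens_on_line = 0;
    int c, in_token = 0;

    *rows = 0;
    *cols = 0;
    while ((c = fgetc(fp)) != EOF) {
        if (!isspace(c)) {
            if (!in_token) {          /* first character of a new token */
                in_token = 1;
                tokens_on_line++;
            }
        } else {
            in_token = 0;
            if (c == '\n' && tokens_on_line > 0) {
                if (*rows == 0)
                    *cols = tokens_on_line;
                (*rows)++;
                tokens_on_line = 0;
            }
        }
    }
    if (tokens_on_line > 0) {         /* last line had no trailing newline */
        if (*rows == 0)
            *cols = tokens_on_line;
        (*rows)++;
    }
    rewind(fp);
}
```

    With the 5 by 3 sample.txt from earlier in the thread this should report 5 rows and 3 columns, after which the values can be read with fscanf in a second pass.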

  9. #9
    Registered User
    Join Date
    Apr 2007
    Posts
    7
    nadroj,

    There is the same number of numbers on each line. As Matlab and other statistical packages do not handle NA, I changed the NAs into zeros (0). So suppose I have an n*p matrix that contains some numbers; these numbers are either doubles or zeros. At this moment, each line has the same number of numbers. In the future, I will have to deal with cases where each line has an irregular number of numbers.

    Hmmmm. C++... I think I will learn C++ in two years or so... C program is very deep and can be efficient through efficient designs, I believe. As I am not very interested in creating a GUI program, I thought C provides me with the best opportunity to improve computing speeds, although some websites discuss C vs C++... I am not yet at that level, as I just started to learn C... I am curious, but it must depend on programming purposes....

    I want to ask you: What book(s) do you recommend is the best that are full of good tutorials and examples to C beginners and intermediates? I am looking for math/statistical library for C, and found Gnu math/statistical library on the web. Is this library pretty good? (if you know). C programming is growing on me very much!!!!

  10. #10
    Registered User
    Join Date
    Oct 2006
    Location
    Canada
    Posts
    1,243
    theres a thread on book recommendations here. there are tutorials and quizzes on cprogramming.com (look at the left menu).

    youre right, C can be used to create very fast and efficient programs, if designed well. you might be able to write a faster program using C++ and the datastructures mentioned above, rather than using C and creating your own (and maybe improperly). however, it would be good practice to stick with C, as you seem to want to, so you can get appreciation for C++ and utilizing its OOP.

    since you realize that in the future you may want to handle lines each with unique number of numbers on it, it would be good to write your program so it can handle that now, rather than having to redo a lot of things when you decide to change it.

    again, i stick with my approach as mentioned above.. but would also appreciate the comments from more proficient programmers.

    to count the number of rows in your matrix, read the file character by character. whenever you find a newline ('\n') then increment your counter.

    now to count the number of columns on each line, start at the beginning of the file, read each character until you reach a '\n' (ie read each character on a line). any time you read a space ' ', before the '\n' is found, you increment the column count for that row. since a line with N spaces holds N+1 numbers, start each row's count at 1.
    Code:
    int** matrix; // obviously, our actual matrix of numbers
    int rowCount = 0; // obviously, the number of rows in the matrix
    int* columns; // an array of ints, ie columns[0] will store the number of columns in row 0, etc
    
    // open file
    // make a loop, reading the file character by character
    // inside the loop, increase rowCount when you find '\n' character
    
    // once done the loop, rewind the file to the beginning
    
    matrix = malloc(rowCount * sizeof(int*));
    columns = malloc(rowCount * sizeof(int));
    
    // make a loop, from X = 0 to rowCount - 1
    // initialize columns[X] to 1 (one number before the first space)
    // make an inner loop, reading the file character by character until a '\n' is reached
    // inside this inner loop, increase columns[X] whenever you see a space ' '
    // once the inner loop is done (for this iteration of the outer loop) then do:
    matrix[X] = malloc(columns[X] * sizeof(int));
    all this would do is get the size of the matrix! however the (i think) hard part would be done, and you would know the number of rows, and the length of each row (columns), even if each row is different length. you still have to then start reading at the beginning of the file, and store each number into its corresponding place in the matrix.

    again, im sure theres a better way to do this in C, and hope someone can shed some light.

    edit: the above method would get very screwed up if the input file was something like:
    Code:
    1 2    3              4                  (even more spaces here)
    therefore it would be better to use a different delimiter (instead of space) or an entirely different way altogether. if you can safely assume the input file will be in the exact format you expect, then this method should work.
    Last edited by nadroj; 04-22-2007 at 12:12 PM.
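
    One way around the repeated-spaces problem mentioned in the edit above is to let strtod find the numbers rather than counting delimiters. This is only a sketch of that alternative, not the method from the post: parse_row is a made-up name, and it assumes each line has already been read into a string (for example with fgets).

```c
#include <stdlib.h>

/* Parse up to max doubles from one line of text into out.
   strtod skips leading whitespace itself, so runs of spaces or tabs
   between numbers do not matter. Returns how many values were stored. */
size_t parse_row(const char *line, double *out, size_t max)
{
    size_t n = 0;
    char *end;

    while (n < max) {
        double v = strtod(line, &end);
        if (end == line)      /* nothing converted: end of line or junk */
            break;
        out[n++] = v;
        line = end;           /* continue right after the parsed number */
    }
    return n;
}
```

    Because the function reports how many values it found, it also copes with rows of different lengths: the return value for each line is that row's column count.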

  11. #11
    Registered User
    Join Date
    Apr 2007
    Posts
    7
    Thanks It worked. I am planning to stay focused on C for a year and a half, and will move on to C++ afterward. One of my friends working as a security programmer asked me why I study C instead of java... I think C is pretty good for me, especially a guy who wants to create programs to compute, manage, and sort data files... I do not do GUI programming. So Thanks, I will keep learning C!!!

  12. #12
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    A hint on getting your braces right: whenever you open a block, close it at the same time. Whenever you type
    Code:
    {
    , also type the corresponding
    Code:
    }
    . THEN type your code between the braces. You are far less likely to end up with a missing brace. (In fact, if you always follow this rule, it should be obvious that you can NEVER have a missing brace.)

    By the way... The board wouldn't let me type the curly braces without sticking them in code tags. Kinda lame.

  13. #13
    Registered User
    Join Date
    Oct 2006
    Location
    Canada
    Posts
    1,243
    Taquito, im interested, if you dont mind, to see your code

  14. #14
    Registered User
    Join Date
    Apr 2007
    Posts
    7
    nadroj, let me get back to you with my code when I come back from CA..... I am away from my PC now.

    Last Post: 02-25-2003, 05:16 AM