Thread: .txt file splitter partial program

  1. #1
    Registered User
    Join Date
    Sep 2009
    Posts
    9

    .txt file splitter partial program

    Hi guys, I've got some code for a C program that splits a text file.

    It splits a text file into different components, and currently works for text files with YYYYMMDD.txt format. For example, here are two daily files:

    20090101.txt
    ACD,20090101,2738.0801,2738.0801,2738.0801,2738.0801,0
    ADR,20090101,626.15,626.15,626.15,626.15,0
    ADRA,20090101,2.58,2.58,2.58,2.58,0

    20090102.txt
    ACD,20090102,2738.08,2799.71,2717.33,2799.64,0
    ADR,20090102,626.15,642.47,625.42,640.72,0
    ADRA,20090102,0.45,2.42,0.45,2.42,0


    What it does is make ACD.txt, ADR.txt and ADRA.txt and append each day's data onto the end of each file. For example:

    ACD.txt
    20090101,2738.0801,2738.0801,2738.0801,2738.0801,0
    20090102,2738.08,2799.71,2717.33,2799.64,0


    The code I've got only works for files in the YYYYMMDD.txt type format. I want to modify it so that it works for any text format (such as TEST_YYYYMMDD.txt).

    If someone could point me to what needs changing, that would be greatly appreciated. I've studied the code and can't see it.

    Code:
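    #include <stdio.h>
    #include <string.h>
    #define EXTENSION ".txt"
    int main(void)
    	{
    
    	int count=0;
    	int ch;                /* int, not char, so EOF can be detected */
    	char fileName[100];
    	char TempfileName[100];
    	char TempString[256];
    	FILE *fp;
    	FILE *fpTmp;
    
    	printf("Enter Filename: ");
    
    	/* %99s leaves room for the '\0'; %s stops reading at the first space */
    	if(scanf("%99s", fileName) != 1)
    		return -1;
    
    	printf("Opening %s...\n", fileName);
    
    	fp = fopen(fileName, "rb");
    
    	if(fp == NULL)
    		{
    		fprintf(stderr, "Error Opening File: %s\n", fileName);
    		return -1;
    		}
    
    	else
    		printf("%s Opened Successfully.\n\n\n",fileName);
    
    	for(;;)
    		{
    		
    		/* the symbol is everything before the first comma */
    		for(count=0; (ch = fgetc(fp)) != EOF && ch != ',' && ch != '\n'; )
    			if(ch != '\r' && count < (int)(sizeof TempfileName - sizeof EXTENSION))
    				TempfileName[count++] = (char)ch;
    		
    		if(ch == EOF)
    			break;         /* end of the input file */
    		if(ch == '\n')
    			continue;      /* blank line or no comma: skip it */
    		
    		TempfileName[count] = '\0';
    		
    		strcat(TempfileName,EXTENSION);
    		
    		/* the rest of the line is that day's data */
    		if(fgets(TempString,256,fp) == NULL)
    			break;
    		
    		fpTmp = fopen(TempfileName,"ab");
    		
    		if(fpTmp == NULL)
    			{
    			fprintf(stderr, "Error Opening File: %s\n", TempfileName);
    			fclose(fp);
    			return -1;
    			}
    		else
    			printf("%s Opened Successfully.\n",TempfileName);
    		
    		/* a final line with no newline still needs one in the output */
    		if(strchr(TempString, '\n') == NULL && strlen(TempString) < sizeof TempString - 1)
    			strcat(TempString, "\n");
    		
    		fputs(TempString, fpTmp);
    		
    		printf("%sWritten to %s\n", TempString, TempfileName);
    		
    		fclose(fpTmp);
    		
    		printf("%s Closed.\n\n",TempfileName);
    		
    		}
    
    	fclose(fp);
    	return 0;
    	}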
    Also, this program extracts the text files into the same directory. How would I go about putting them into a subdirectory such as "Extracted"? E.g. if the program was in C:/Data, it would extract to C:/Data/Extracted.

    Thanks all.

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    You can't make your program work with "any format", because a format has, by definition, a particular set of things that must conform to it, or it is not a format, at all.

    You can test the start of the line for the chars "TEST_", and then proceed to process the file just the same as you would otherwise, ignoring the "TEST_" portion.
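    A minimal sketch of that prefix test, whether the string is a data line or a file name. The helper name skip_prefix is made up for illustration, and "TEST_" is just the example prefix from the first post:

```c
#include <string.h>

/* Return a pointer just past `prefix` if `name` starts with it,
   otherwise return `name` unchanged. Hypothetical helper. */
const char *skip_prefix(const char *name, const char *prefix)
{
    size_t n = strlen(prefix);

    if (strncmp(name, prefix, n) == 0)
        return name + n;
    return name;
}
```

    Calling skip_prefix(fileName, "TEST_") would then hand back the part of the name after the prefix, or the whole name when the prefix is absent.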

    Normally, I build up a char array to hold the path I want, and work with it from there. Remember to put an end-of-string char, '\0', right after the last char in your path name each time you build it up. It's also easier to work with forward slashes, '/', in the path rather than backslashes, '\', since the backslash is the escape character in C string literals (each one has to be written as "\\").

    So:
    Code:
    FILE *out1;
    char path1[50];
    
    /* path1 is gradually built up (depending on what the program needs) to: */
    /*     "C:/directoryIWant/sub-directoryIwant/filename"                    */
    
    /* then */
    
    out1 = fopen(path1, "w");  /* always overwrite, or use "a" to always append text data to the file */
    
    to make this work, you MUST have the end of string char in place, at the end of the char array, path1.
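    One way to build such a path so that the end-of-string char is always in place is snprintf. This is only a sketch: it assumes a compiler with C99 support (older ones such as Turbo C do not have snprintf), and build_path is a hypothetical helper name:

```c
#include <stdio.h>

/* Build "dir/name" into out. Returns 0 on success, -1 if it would not fit.
   build_path is a hypothetical helper, not part of the original program. */
int build_path(char *out, size_t size, const char *dir, const char *name)
{
    int n = snprintf(out, size, "%s/%s", dir, name);

    /* snprintf terminates the string whenever size > 0;
       n >= size means the result was cut short */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

    For example, build_path(path1, sizeof path1, "C:/Data/Extracted", "ACD.txt") would leave "C:/Data/Extracted/ACD.txt" in path1, ready to pass to fopen.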

  3. #3
    Registered User
    Join Date
    Sep 2009
    Posts
    9
    I don't understand why it cannot work for any format. By format I mean the text file's name. The data within the text file is the same each time. The program asks for the file name, so is there any reason it couldn't be called 20090101.txt or File1.txt? If both contain the same data and the file name is entered, the program should process it, shouldn't it? (It doesn't, and I can't see why it won't.)

    Regarding the file path, isn't there a way to place it into a subdirectory of the folder without specifying the full path? Depending on the computer I am using, the directory can be different (i.e. it's on a memory stick). Right now it just places it within the same directory; if it could place it in a sub-folder within the same directory without specifying the path, that would be fantastic.

    I may be thinking totally wrong as I am not a programmer. But my two points above seem common sense to me - but then again, I don't know much.

  4. #4
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    The program can work with any file name you want. That just needs you to add in the logic. Tell me what you're stuck on, because that part isn't clear to me. How about a concrete example, step by step?

    Usually, problems like this are the result of not remembering to put the end-of-string char into its place, right at the end of the path+filename string.

    Paths are relative, so yes, you don't need to have absolute paths. I just showed that, as an example.

    "SubDirectoryNameYouWant/filenameYouWant"

    I'm rusty using relative paths, so let me confirm the above. There may be differences of course, between putting the file into a subdirectory that already exists, compared to putting the file into a subdirectory that doesn't exist yet.

    What OS are you using?

    OK, I see my problem. I use full path names because I frequently use Turbo C/C++ for little utility programs, and they can't use relative paths (at least not in my version).

    I haven't used relative paths in Windows or Linux, but I believe both of them support that in their major C compilers.

    Note that if you use a char string for the file name or path, you have to first remove the newline char from it. I use this:

    Code:
    i = -1;
    while(charArray[++i] != '\n' && charArray[i] != '\0');  //note the semi-colon; also stops if there is no newline
    
    charArray[i] = '\0';            //this is outside the one line while loop
    If the \n is still in the string (common when using fgets() to make up the charArray), it won't work.
    Last edited by Adak; 09-07-2009 at 05:27 AM.
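    The same newline removal can also be written with the standard strcspn function, which returns the index of the first char found in a given set, or the string's length if none is found. A small sketch (strip_newline is a hypothetical helper name):

```c
#include <string.h>

/* Cut the string at the first '\r' or '\n', if there is one. */
void strip_newline(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}
```

    Because strcspn falls back to the string's length, a string with no newline is left unchanged.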

  5. #5
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by ojm View Post
    I don't understand why it cannot work for any format. By format I mean the text file's name. The data within the text file is the same each time. The program asks for the file name, so is there any reason it couldn't be called 20090101.txt or File1.txt? If both contain the same data and the file name is entered, the program should process it, shouldn't it? (It doesn't, and I can't see why it won't.)
    The code you have does process any file name that doesn't have a space in it (since %s stops reading at spaces.) I don't know why you think it doesn't.
    Quote Originally Posted by ojm View Post
    Regarding the file path, isn't there a way to place it into a subdirectory of the folder without specifying the full path? Depending on the computer I am using, the directory can be different (i.e. it's on a memory stick). Right now it just places it within the same directory; if it could place it in a sub-folder within the same directory without specifying the path, that would be fantastic.
    All you have to do is write the code that makes this happen. (I.e., when they give you a filename, you have to save the path, and then add the directory and filename to that path when you try to write something out.)
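    A rough sketch of that idea: take the directory part of whatever path the user typed, then tack the subdirectory and output name onto it. It assumes a C99 compiler (for snprintf), and sibling_path is a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Build "<directory of input><subdir>/<name>" into out.
   Returns 0 on success, -1 if the result would not fit.
   sibling_path is a hypothetical helper name. */
int sibling_path(char *out, size_t size, const char *input,
                 const char *subdir, const char *name)
{
    const char *slash = strrchr(input, '/');
    const char *back  = strrchr(input, '\\');
    int dirlen, n;

    /* use whichever separator appears last in the input path */
    if (slash == NULL || (back != NULL && back > slash))
        slash = back;

    /* length of the directory part, including its trailing separator */
    dirlen = (slash == NULL) ? 0 : (int)(slash - input) + 1;

    n = snprintf(out, size, "%.*s%s/%s", dirlen, input, subdir, name);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

    With the input "C:/Data/20090101.txt", the subdirectory "Extracted" and the name "ACD.txt", this builds "C:/Data/Extracted/ACD.txt". With a bare file name it builds the relative path "Extracted/ACD.txt".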

  6. #6
    Registered User
    Join Date
    Sep 2009
    Posts
    9
    Figured it out: it works with text files that have 8 characters or less in the name of the file. This would explain why I was thinking it would only work for YYYYMMDD.txt files, not ones with words before them.

    Can anyone spot what it is that would need to be changed to allow more than 8 characters in the file name?

    The subdirectory question: the sub-directory won't have to be created, it would always be there. Can the fputs be changed to write to the sub directory, or does code have to be placed to "change" the directory and then write?

    (I'm using Windows XP).

  7. #7
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by ojm View Post
    Figured it out: it works with text files that have 8 characters or less in the name of the file. This would explain why I was thinking it would only work for YYYYMMDD.txt files, not ones with words before them.

    Can anyone spot what it is that would need to be changed to allow more than 8 characters in the file name?

    The subdirectory question: the sub-directory won't have to be created, it would always be there. Can the fputs be changed to write to the sub directory, or does code have to be placed to "change" the directory and then write?

    (I'm using Windows XP).
    If you say 20090901.txt works but 20090901a.txt doesn't, I say either you're hallucinating or you've changed the code and need to post the new stuff. What your program cannot handle is spaces in the file name (or path) -- if you need to deal with spaces you're going to have to change over to fgets.
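    For reference, a sketch of what changing over to fgets could look like. fgets keeps the newline the user typed, so it has to be removed before the name goes to fopen. read_filename is a hypothetical helper name:

```c
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf and drop the line ending.
   Returns 1 if a non-empty name was read, 0 otherwise.
   read_filename is a hypothetical helper name. */
int read_filename(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\r\n")] = '\0';   /* fgets keeps the newline */
    return buf[0] != '\0';
}
```

    In the original program this would stand in for the scanf call, e.g. read_filename(stdin, fileName, sizeof fileName), and names containing spaces would then be read whole.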

  8. #8
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Oh, and as to your other question:
    Can the fputs be changed to write to the sub directory, or does code have to be placed to "change" the directory and then write?
    fputs can only write to the opened file. If you mean can the fopen be changed, or does code have to be placed to change the directory, the answer is "yes". You can change the fopen, or you can change the working directory, whichever you want to do.
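    A sketch of the second option, changing the working directory before opening the output file. It assumes chdir from <unistd.h> on POSIX systems and _chdir from <direct.h> on Windows compilers; open_after_chdir and the change_dir macro are hypothetical names:

```c
#include <stdio.h>
#ifdef _WIN32
#include <direct.h>
#define change_dir _chdir
#else
#include <unistd.h>
#define change_dir chdir
#endif

/* Switch the working directory to `dir`, then open `name` for appending.
   Returns NULL if either step fails. Hypothetical helper. */
FILE *open_after_chdir(const char *dir, const char *name)
{
    if (change_dir(dir) != 0)
        return NULL;
    return fopen(name, "ab");
}
```

    Note that the change applies to the whole process, so any relative path opened afterwards (including the input file) is resolved against the new directory. Building the full output path and changing only the fopen call avoids that side effect.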

  9. #9
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Which gives the following. Try, try again.

  10. #10
    Registered User
    Join Date
    Sep 2009
    Posts
    9
    Okay, it's not an eight-character problem; I jumped to that conclusion a bit early. But look at these results:

    Enter Filename: index20090101.txt
    Opening index20090101.txt...
    Error Opening File: index20090101.txt

    Enter Filename: 20090101index.txt
    Opening 20090101index.txt...
    20090101index.txt Opened Successfully.

    Enter Filename: index_20090101.txt
    Opening index_20090101.txt...
    Error Opening File: index_20090101.txt

    Enter Filename: 20090101_index.txt
    Opening 20090101_index.txt...
    20090101_index.txt Opened Successfully.

    So if the text comes before the numbers it doesn't work. Any suggestions on why this is?


    Regarding the 20090101a.txt, if using a20090101.txt -- it doesn't work.

    Enter Filename: a20090101.txt
    Opening a20090101.txt...
    Error Opening File: a20090101.txt
    Last edited by ojm; 09-07-2009 at 05:51 PM.

  11. #11
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    I'm guessing the first line in the below screen shot is what you forgot to do. If there's actually something going on, you can call strerror inside your "Error Opening File" bit to see what it is.

  12. #12
    Registered User
    Join Date
    Sep 2009
    Posts
    9
    C:\DOCUME~1\pzqkvv\Desktop\INDEX_~1>split
    Enter Filename: 20090101a.txt
    Opening 20090101a.txt...
    20090101a.txt Opened Successfully.


    ACD.txt Opened Successfully.
    20090102,2738.08,2799.71,2717.33,2799.64,0
    Written to ACD.txt
    ACD.txt Closed.

    ADR.txt Opened Successfully.
    20090102,626.15,642.47,625.42,640.72,0
    Written to ADR.txt
    ADR.txt Closed.

    ADRA.txt Opened Successfully.
    20090102,0.45,2.42,0.45,2.42,0
    Written to ADRA.txt
    ADRA.txt Closed.

    ADRD.txt Opened Successfully.
    20090102,0.08,0.61,0.08,0.61,0
    Written to ADRD.txt
    ADRD.txt Closed.

    ADRN.txt Opened Successfully.
    20090102,0.94,4.98,0.76,4.29,
    Written to ADRN.txt
    ADRN.txt Closed.


    C:\DOCUME~1\pzqkvv\Desktop\INDEX_~1>rename 20090101a.txt a20090101.txt

    C:\DOCUME~1\pzqkvv\Desktop\INDEX_~1>split
    Enter Filename: a20090101.txt
    Opening a20090101.txt...
    Error Opening File: a20090101.txt

    C:\DOCUME~1\pzqkvv\Desktop\INDEX_~1>


    I'm not sure how to call strerror (from what I am trying it says that variable doesn't exist).

  13. #13
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    It's a function.
    Code:
    printf("Error Opening File %s: %s\n", fileName, strerror(errno));
    EDIT: You will need #include <errno.h> at the top if you don't have it already.
    Last edited by tabstop; 09-07-2009 at 06:38 PM.
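    Put together with the headers it needs, that could look like the sketch below. It assumes fopen sets errno on failure, which POSIX systems and the common Windows C runtimes do but the C standard itself does not promise. open_or_report is a hypothetical helper name:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Open `name` for reading; on failure, report the reason on stderr.
   open_or_report is a hypothetical helper name. */
FILE *open_or_report(const char *name)
{
    FILE *fp = fopen(name, "rb");

    if (fp == NULL)
        fprintf(stderr, "Error Opening File %s: %s\n", name, strerror(errno));
    return fp;
}
```

    strerror itself is declared in <string.h>; <errno.h> is only needed for errno.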

  14. #14
    Registered User
    Join Date
    Sep 2009
    Posts
    9
    Well. My compiler didn't have errno.h, so I just changed compiler... and guess what: it now works with EVERY combination of file name. That is pretty annoying. Sorry for the wasted time all. For the record: I was using Miracle C, and switched to Dev-C++.

    Thanks very much for your help tabstop (and others).

    Now onto the topic of the subdirectory: would code be placed about the "fputs(TempString, fpTmp);" line to change the directory?

  15. #15
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by ojm View Post
    Well. My compiler didn't have errno.h, so I just changed compiler... and guess what: it now works with EVERY combination of file name. That is pretty annoying. Sorry for the wasted time all. For the record: I was using Miracle C, and switched to Dev-C++.

    Thanks very much for your help tabstop (and others).

    Now onto the topic of the subdirectory: would code be placed about the "fputs(TempString, fpTmp);" line to change the directory?
    Nothing. Once you open the file for output, that's the end of that. If you want to be in a certain directory you have to open the file there.
