Thread: Data Organization and Excel

  1. #1
    Registered User
    Join Date
    May 2008
    Posts
    2

    Data Organization and Excel

    Hi guys, I just finished a course in C programming and hardly learned anything. My teacher was a grad student teaching for the first time. The book was wordy and too long to read. Anyway, I walked away with a few basic concepts and a strong fascination for programming.
    //my major is Biochemistry, so this is a big deal.

    I changed my major and this class doesn't count toward anything, but I wish to use what I learned to help my friend out. At the moment she is working in a research lab which has some sort of machine that outputs .csv files. After converting them to txt files, she delimits them in Excel and spends hours on end moving the data around using copy and paste.
    /* this is years of data we are talking about, with each sample giving off about 12-15 data points per week */
    Anyway, as I was learning about functions, strings, and finally structures, I thought I could help her out by writing one program that does all the work for her.

    /* If you're still reading, you're a real trooper */

    Just to be clear, I don't need code; I just need some direction.
    My question is: what is the best method of reorganizing this data?
    Each data set for each sample is in a csv file in a folder.
    Each folder has that week's csv files for each sample. There are a lot of folders in one big folder.
    Ideas
    1. Use C to reorganize the data txt file by txt file. Then delimit them.
    2. Is it possible to use C to organize all the txt files in a folder, output to one big txt file, and delimit that?
    3. Is it possible to grab all the txt files in the big folder, organize them into a bigger txt file, and delimit that?
    4. What functions should I read up on?
    5. Should I even be using C for this, or should I just purge the txt files and use Excel to reorganize them?
    I really don't know the best course of action here. I do know that programming, even if it takes me 10 hours to figure it out, will in the long run save my friend days of mindless copy-pasting.
    Thanks for reading. I know this is a bit wordy compared to other posts /*like that book*/ but I just wanted you to get all the details.
    Last edited by uniqst3r; 05-25-2008 at 10:43 PM.

  2. #2
    Registered User
    Join Date
    Apr 2008
    Posts
    396
    > 2. Is it possible to use C to organize all the txt files in a folder and output to one big txt file and delimit that.
    Yes.

    > 3. Is it possible to grab all the txt files in the big folder and organize them out into a bigger txt file and delimit that?
    What's the difference with the previous option?

    > 4. What functions should i read up on?
    You will have to read docs/tutorials about file-related functions like fopen/fclose/fread/fwrite and directory-related functions like opendir/readdir. The directory functions are system-dependent: opendir/readdir are POSIX, not part of standard C.
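As a rough illustration of the directory functions mentioned above, here is a minimal sketch that lists every .txt file in one folder. It assumes a POSIX system (opendir/readdir are not available in plain standard C), and the helper names has_suffix and list_txt_files are made up for this example.

```c
#include <dirent.h>   /* POSIX, not standard C: opendir, readdir, closedir */
#include <stdio.h>
#include <string.h>

/* Returns 1 if name ends with suffix, otherwise 0. */
int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);

    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Writes the path of every "*.txt" entry in dir_path to out.
   Returns the number of matches, or -1 if the directory cannot be opened. */
int list_txt_files(const char *dir_path, FILE *out)
{
    DIR *dir = opendir(dir_path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        if (has_suffix(entry->d_name, ".txt")) {
            fprintf(out, "%s/%s\n", dir_path, entry->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

A main function could call list_txt_files("week1", stdout), where "week1" stands in for one of the weekly folders. Each printed path can then be passed to fopen.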

    > 5. Should I even be using C for this[...]
    It's true that a higher-level scripting language could make the development easier here. But you must take into account the time needed to master that other language.

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    As a general rule, I'd recommend doing whatever needs to be done using Excel - and teaching your friend the easy way to do this work in Excel, as well.

    It's the "teach your friend to fish, rather than just giving her a fish for today" idea. That would allow her to extend whatever you do later on, even if you're not around to assist.

    Be sure you have a current backup of the data, before you start doing anything with it, of course!

    If you still want to do this programmatically, then any language you know should be capable of doing it. C is not somehow better at this. Faster run time, and longer to program it, probably; but not better.

    Until you post examples of the work as it is now, and as you want it to be, we know nothing of the real details. Details are details; they are not summary descriptions. You have mentioned moving files and folders around, and delimiting data. You have not said, or better yet shown, why this is necessary or advantageous.

    My summary answer then, is to use Excel, if at all possible. Whatever way you do it, be sure to back up your data first, and check your results very thoroughly, for errors.

    Nothing would be worse for your friend than having to explain to her boss, that the data she was to work on, is now corrupted because of a program that was made up to "save her work". Another angle on this is that if you make it too easy, the bosses may say "we don't need her anymore", and she could lose her job.

    Good luck!

  4. #4
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Two snippets of Perl

    while ( <> )
    This reads each line in turn from all the input files listed on the command line. If you need to know the name of each file on the command line, you can get that too.

    my @f = split(/ /,$_);
    splits the current input line into an array of strings (that's @f), using space as a delimiter. You can choose your delimiters by changing / /

    In Perl the whole thing is probably a dozen lines of code, but it might take you a couple of weeks to produce something solid in C. You have to manage all the memory yourself.
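For comparison with the Perl split above, here is a minimal C sketch of breaking one line into fields with the standard strtok function. The function name split_fields and the fixed-size array of field pointers are assumptions for this example, not something from the thread. The caller owns the buffer, and strtok overwrites the delimiters in it.

```c
#include <string.h>

/* Splits line in place on any character in delims and stores pointers to
   at most max_fields fields in fields[]. Returns the number stored.
   Note: strtok modifies line and silently skips empty fields. */
int split_fields(char *line, const char *delims, char *fields[], int max_fields)
{
    int count = 0;
    char *token = strtok(line, delims);

    while (token != NULL && count < max_fields) {
        fields[count++] = token;
        token = strtok(NULL, delims);
    }
    return count;
}
```

Passing " \n" as delims splits on spaces and drops the trailing newline that fgets leaves. Passing ",\n" would do the same for comma-separated input, but because strtok skips empty fields, a row like "1,,red" would come back with two fields rather than three.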

    > 2. Is it possible to use C to organize all the txt files in a folder and output to one big txt file and delimit that.
    Almost all programming languages will allow you to create programs which can support
    myProg *.txt

    > 3. Is it possible to grab all the txt files in the big folder and organize them out into a bigger txt file and delimit that?
    Again, */*.txt is a good bet.
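To make the myProg *.txt idea concrete, here is a minimal C sketch that appends every file named on the command line to one output stream. The names append_file and merge_files are hypothetical, and the 1024-byte line buffer is an arbitrary choice (longer lines are simply copied in several pieces).

```c
#include <stdio.h>

/* Copies the contents of the file at path to out, a buffer at a time.
   Returns 0 on success, -1 if the file cannot be opened. */
int append_file(const char *path, FILE *out)
{
    char line[1024];
    FILE *in = fopen(path, "r");

    if (in == NULL)
        return -1;

    while (fgets(line, sizeof line, in) != NULL)
        fputs(line, out);

    fclose(in);
    return 0;
}

/* Appends each named file to out, skipping any that cannot be opened.
   Returns how many files were copied. */
int merge_files(int count, char *paths[], FILE *out)
{
    int i;
    int merged = 0;

    for (i = 0; i < count; i++) {
        if (append_file(paths[i], out) == 0)
            merged++;
        else
            fprintf(stderr, "skipping %s: cannot open\n", paths[i]);
    }
    return merged;
}
```

A main function would call merge_files(argc - 1, argv + 1, stdout) and the output could be redirected to one big file. This relies on the shell expanding *.txt into a list of names, which Unix shells do. The Windows command prompt does not, so there the program would have to list the directory itself.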

    > which outputs .csv files. After converting them to txt files she delimits in excel
    Erm, Excel will read .csv without the need to convert anything. This seems like an unnecessary step.

    Another idea would be to experiment with Excel's macro record/replay feature. One thing you can do from that is look at the source code, which is VBA. This might be another route to where you want to get to. It too would be easily capable of performing the actions you've suggested on a single document.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    May 2008
    Posts
    2
    OK, a few things.
    First off, there is a data-obtaining machine which outputs the files with no extension at all, and she has to rename them to txt/csv. I taught her to do that with bat files.

    About teaching her to catch fish vs. giving her fish: I just took a semester-long course in C, and I doubt she'll be up to learning a whole programming language even though it would save her time (I asked her).

    Of course we will save the data.

    She won't lose her job because it is volunteer work and her "bosses" want the data in a certain format. They don't care how long it takes her to do it.

    The work is sort of top secret, so she changes the data up before giving it to me while maintaining the same format.

    My problem is that I'm not all that good with C: I can never think up the programs myself, though I do understand how they work once I read them.
    ex.
    When I was first asked to write a program that computes factorials, I had no idea how, even after hours of staring at #include <stdio.h>
    and int main (void)

    After that I was lost. Once I saw the code I was like, oh OK, that makes sense.

    This happened with every assignment.
    I actually don't have the txt she sent me on this computer. I'll try to get it and give more details, but basically it's a txt file that has headings and then data under them. ex.

    SAMPLE #   TEMP   CONDITION
    1          23     blue
    2          32     red
    3          23     red
    4          29     blue
    5          42     blue

    etc etc.
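A minimal sketch of how one row of a file like the example above could be read into a C structure, assuming every data row is a sample number, a temperature, and a one-word condition separated by whitespace. The struct layout, the field names, and the 15-character limit on the condition are assumptions for illustration only.

```c
#include <stdio.h>

/* One row of the example table; the field names are illustrative. */
struct sample {
    int  number;          /* SAMPLE # column */
    int  temp;            /* TEMP column */
    char condition[16];   /* CONDITION column, e.g. "blue" */
};

/* Parses a line such as "1 23 blue" into *s.
   Returns 1 on success, 0 if the line does not match (e.g. the heading). */
int parse_sample(const char *line, struct sample *s)
{
    /* %15s leaves room for the terminating '\0' in condition[16]. */
    return sscanf(line, "%d %d %15s", &s->number, &s->temp, s->condition) == 3;
}
```

Because the heading line starts with text rather than a number, parse_sample returns 0 for it, so a read loop can call it on every line and simply skip the ones that fail.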
    To clarify, samples are taken from one subject (blood samples etc.), and each subject has his/her own txt file. The txt files for all the subjects for a given week are in one folder together.
    The folder for that week is in another folder that holds the txt files for as long as the experiment is run.

    The folder and txt file names are organized by date and subject number, respectively.
    The only variable is that some folders don't have the same number of txt files, due to inability to obtain data, or because some subjects may have died (they're blood cultures or something).

    So it is confirmed that it is possible. Now I'm going to get more details and read up on the aforementioned functions and their use.

    If anyone has any good sources on the web I can read, please provide links or something.
    Thank you.
    Last edited by uniqst3r; 05-26-2008 at 01:41 AM.
