Thread: Code to search a file and take out a column

  1. #1
    Registered User
    Join Date
    Jan 2009
    Posts
    6


    Hey guys, I need to write a program that can search through different files (each with 3 columns of numbers) and put all of the first columns in one file, all of the second columns in a different one, etc.

    I've had some basic C++ classes, but I'm not quite sure how to go about this. Is this even possible in C++? Looking for some pointers to get started.

  2. #2
    Registered User
    Join Date
    Apr 2006
    Posts
    2,149
    Sure it's possible.

    You read the input file line by line, extract each column, and immediately write it to its corresponding output file. Loop until the end of the input file.
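    A minimal sketch of that line-by-line approach, assuming whitespace-separated columns. The function name split_lines and the choice to skip rows with fewer than three fields are my own assumptions, not something from the original question. Values are copied as text, so no digits are lost on the way through.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Reads `in` one line at a time and writes each of the first three
// whitespace-separated fields to its own output stream.
// Assumption: rows with fewer than three fields are silently skipped.
void split_lines(std::istream& in, std::ostream& col1,
                 std::ostream& col2, std::ostream& col3)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string a, b, c;
        // Fields are kept as text so the numbers are copied unchanged.
        if (fields >> a >> b >> c) {
            col1 << a << '\n';
            col2 << b << '\n';
            col3 << c << '\n';
        }
    }
}
```

    In a real program you would pass a std::ifstream for the input and three std::ofstream objects for the outputs.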

    This sounds like a job for a scripting language though. I wouldn't do this in C++ unless it was part of a larger program.
    It is too clear and so it is hard to see.
    A dunce once searched for fire with a lighted lantern.
    Had he known what fire was,
    He could have cooked his rice much sooner.

  3. #3
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Sure it's possible.

    Try say
    fin >> a >> b >> c;

    Then
    f1 << a;

    etc
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  4. #4
    Registered User
    Join Date
    Jan 2009
    Posts
    6
    Thanks guys. I also forgot to say that the files have like 4000 numbers in them. So if I read in the file, do I read in each number or just the columns? 4000 numbers is a lot lol.

    If you are wondering, this is for some undergrad research I'm doing, and the columns of numbers are energy density, pressure, and particle density for neutron stars.

  5. #5
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    That's what while loops are for.
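    A sketch of the loop, assuming three whitespace-separated numbers per row. The function name split_columns and all of the file-name parameters are hypothetical. The extraction itself is the loop condition, so the loop stops at the end of the file or at the first thing that isn't a number, whether there are 4 rows or 4000.

```cpp
#include <fstream>
#include <iomanip>
#include <limits>
#include <string>

// Splits a three-column text file into three one-column files.
// Returns the number of rows copied, or -1 if a file could not be opened.
// All four paths are supplied by the caller (hypothetical names).
long split_columns(const std::string& input_path,
                   const std::string& out1_path,
                   const std::string& out2_path,
                   const std::string& out3_path)
{
    std::ifstream fin(input_path);
    if (!fin)
        return -1;
    std::ofstream f1(out1_path), f2(out2_path), f3(out3_path);
    if (!f1 || !f2 || !f3)
        return -1;

    // Print enough digits that each double survives the round trip.
    const int digits = std::numeric_limits<double>::max_digits10;
    f1 << std::setprecision(digits);
    f2 << std::setprecision(digits);
    f3 << std::setprecision(digits);

    long rows = 0;
    double a, b, c;
    // The read is the loop condition: it fails at end of file or on bad input.
    while (fin >> a >> b >> c) {
        f1 << a << '\n';
        f2 << b << '\n';
        f3 << c << '\n';
        ++rows;
    }
    return rows;
}
```

    Only one row is held in memory at a time, so the size of the file doesn't matter.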

  6. #6
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    I assume this is a CSV file. If so then read the file in either line by line or all at once. Go through each line, find the commas, and separate into vectors, lists, or some other data structure of your choosing.

    basic_string has find() and a host of other find_<x> functions that will assist you and make this a snap.
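    For example, a comma split built on std::string::find could be sketched like this. The function name split_csv_line is hypothetical, and the sketch assumes plain fields with no quoting or embedded commas.

```cpp
#include <string>
#include <vector>

// Splits `line` at each comma using std::string::find.
// Assumption: fields are plain text with no quotes or embedded commas.
std::vector<std::string> split_csv_line(const std::string& line)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type comma = line.find(',', start);
        if (comma == std::string::npos) {
            // No more commas: the rest of the line is the last field.
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, comma - start));
        start = comma + 1;
    }
    return fields;
}
```

    Each element of the returned vector can then be written to the output file for its column.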

  7. #7
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    838

  8. #8
    Registered User VirtualAce
    Join Date
    Aug 2001
    Posts
    9,607
    C++ can handle this task quite easily and in very few lines of code. I see no need for a script here.

  9. #9
    Master Apprentice phantomotap
    Join Date
    Jan 2008
    Posts
    5,108
    C++ can handle this task quite easily and in very few lines of code. I see no need for a script here.
    ^_^

    Salem already has it half written.

    Soma

  10. #10
    Registered User
    Join Date
    Jan 2009
    Posts
    6
    So the easiest way is to use the file input (fin >>) approach?

  11. #11
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    A lot of the "what's best/easiest" depends on EXACTLY what the format is. If the file is simply something like this:
    Code:
    1 2 3
    4 5 6
    then using the method described by Salem will be the simplest method (and perfectly adequate as long as you don't need to care about "idiot entered rubbish into the file" or "file got destroyed by an attempt to copy it that failed", etc.).

    If you have a file that contains "extra decorations" (e.g. quotes or commas), then you will need a more advanced method of reading the data, as you need to somehow deal with those "decorations". The method described by Salem won't work DIRECTLY, but if you know the format and deal with it appropriately, it can be solved by a method SIMILAR to Salem's, again with the proviso that if you need to deal with error conditions, it gets a bit harder. Handling errors ALWAYS adds more work than just doing the stuff that doesn't deal with errors; in fact, more of the code I write deals with "if(error)..." than with the "real work".
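    As an illustration only, here is a sketch for one possible decorated format, "a, b, c" with commas between the numbers. The function name parse_comma_row is hypothetical and quotes are not handled. It is the same stream extraction as before, with a char read to swallow each comma, and most of the lines are error checks.

```cpp
#include <sstream>
#include <string>

// Parses one line of the form "a, b, c" (comma-separated numbers).
// Returns true and fills a, b, c only if the whole line matches;
// on failure the output arguments are left untouched.
bool parse_comma_row(const std::string& line, double& a, double& b, double& c)
{
    std::istringstream in(line);
    char sep1 = 0, sep2 = 0;
    double x, y, z;
    // Same idea as fin >> a >> b >> c, plus a char read for each comma.
    if (!(in >> x >> sep1 >> y >> sep2 >> z))
        return false;
    if (sep1 != ',' || sep2 != ',')
        return false;
    // Anything other than whitespace after the third number is rubbish.
    std::string extra;
    if (in >> extra)
        return false;
    a = x;
    b = y;
    c = z;
    return true;
}
```

    A caller would read the file with std::getline, pass each line to this function, and report the line number whenever it returns false.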

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  12. #12
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    838
    Code:
    <?php
    $filename = "path/to/file";
    $ext = ".csv";
    $infile = fopen($filename.$ext, "r");
    $outfiles = array();
    // fgetcsv() returns false at end of file, which ends the loop
    while (($line = fgetcsv($infile)) !== false)
    {
        $i = 0;
        foreach ($line as $cell)
        {
            $i++;
            // open each column's output file once, on first use
            if (!isset($outfiles[$i]))
            {
                $outfiles[$i] = fopen($filename."col".$i.$ext, "a");
            }
            // one value per line in the column file
            fwrite($outfiles[$i], $cell."\n");
        }
    }
    foreach ($outfiles as $file)
    {
        fclose($file);
    }
    fclose($infile);
    ?>
    the above is an example in PHP. whatever tool you decide to use, the algorithm will be something similar to the above.
    Last edited by m37h0d; 01-25-2009 at 01:31 PM. Reason: forgot second call to fgetcsv; fclose - modified to eliminate excessive fopen/fclose . is cat operator, not +

  13. #13
    l'Anziano DavidP
    Join Date
    Aug 2001
    Location
    Plano, Texas, United States
    Posts
    2,743
    Your PHP code needs some work. You are calling fopen on every single iteration of the output loop. That means you will open each output file "n" times, where "n" is the number of rows of output. That's a bit too much.

    In addition, you are only parsing one line of the input CSV file.

    "Circular logic is good because it is."

  14. #14
    Algorithm Dissector iMalc
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Do you even need to write a program for this?
    If this is just for a one-off, you can just import it into Excel and then copy-n-paste the columns out into other files. (if you have access to Excel)
    Or you could just attach the file to your post here and I'll split it for you.
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  15. #15
    3735928559
    Join Date
    Mar 2008
    Location
    RTP
    Posts
    838
    Quote Originally Posted by DavidP
    Your PHP code needs some work.
    it was a quick and dirty example of generally how it might be done, not a turnkey solution. i never ran it. i don't even think i have PHP installed on this machine.

    Quote Originally Posted by DavidP
    You are calling fopen over every single iteration of the output loop. That means you will open each output file "n" times where "n" is number of rows of output. That's a bit too much.
    normally i'd agree but for the purposes of the OP who cares?

    Quote Originally Posted by DavidP
    In addition, you are only parsing one line of the input CSV file.
    good catch there. i forgot the call to fgetcsv at the end of the while loop.
