Thread: String fun

  1. #1
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168

    String fun

    I have a comma-separated value file.
    I need to look at it and find \n <---carriage return
    then insert a value just before that \n
    then continue searching from that \n for the next \n



    I am creating a spreadsheet

  2. #2
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
    I was planning on using
    char *strchr(const char *, int); to find the \n,
    but I don't know how to insert a value into the string.

    Or is this even possible?

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Assuming you have a long enough buffer[1], you should be able to use "fgets" to read the string, and once it's been read, the last char according to strlen should be '\n'. Verify that this is the case, remove it and strcat the new value at the end of the string.

    Remember that you will have to read from one file and write to a different one, as you can't (trivially) insert text into an existing file.

    [1] If you don't know AT ALL how long your line may be, this is perhaps not the best idea... But as long as we're discussing something that may be a spread-sheet, it probably won't be ginormous lines, so some statically allocated string of (say) 10000 chars should be sufficient.
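
    A minimal sketch of the approach described above, assuming lines fit in a 10000-char buffer. The function name append_to_each_line and the LINE_MAX_LEN constant are made up for illustration. It writes the pieces with fprintf instead of calling strcat, which avoids needing spare room in the buffer for the new value:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_MAX_LEN 10000  /* assumed upper bound on line length */

/* Copy every line from `in` to `out`, inserting `extra` just before the
 * trailing '\n'. Returns 0 on success, -1 if a line did not fit in the
 * buffer or a read/write failed. */
int append_to_each_line(FILE *in, FILE *out, const char *extra)
{
    static char line[LINE_MAX_LEN];

    assert(in != NULL && out != NULL && extra != NULL);

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);

        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';          /* remove the newline */
        } else if (!feof(in)) {
            return -1;                     /* line longer than the buffer */
        }
        /* Original text, then the new value, then the newline again. */
        if (fprintf(out, "%s%s\n", line, extra) < 0)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

    You would open the existing file for reading and a second file for writing, then call something like append_to_each_line(in, out, ",42").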

    --
    Mats

  4. #4
    Registered User
    Join Date
    Jul 2006
    Posts
    162
    \n is not a carriage return, \r is. \n is new-line.
    Last edited by simpleid; 08-16-2007 at 12:51 PM.

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by simpleid View Post
    \n is not a carriage return, \r is. \n is new-line.
    Technically, '\r' is "Carriage Return" and '\n' is "linefeed" or "newline". The first returns the print-head (or cursor) to the beginning of the line, the second forwards the paper (cursor) one line.

    Windows stores both in the file, whilst Unix/Linux traditionally stores only newline and prepends a carriage return in the actual output "automagically" where needed. Macs previously did it the other way around, storing only '\r' and appending a '\n' when needed in the output process. Since more recent versions of MacOS are based on BSD(?), they probably have the same mechanism as Linux.

    And if you are NOT reading/writing the file in binary, the difference is hidden in Windows and MacOS as long as you use the standard file-io functions (as opposed to using the native system calls, for example) - your program only ever sees a '\n' at the end of the line.
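
    One way to see the translation for yourself is the sketch below: it writes a single line in text mode, then reads the file back in binary mode and counts the bytes. The function name stored_line_bytes is hypothetical, and the path you pass in is a scratch file that gets overwritten:

```c
#include <assert.h>
#include <stdio.h>

/* Write "a\n" in text mode, then count the bytes actually stored on disk
 * by reading the file back in binary mode. Returns -1 on error. */
long stored_line_bytes(const char *path)
{
    FILE *f;
    long count = 0;

    assert(path != NULL);

    f = fopen(path, "w");        /* text mode: '\n' may be translated */
    if (f == NULL)
        return -1;
    fputs("a\n", f);
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");       /* binary mode: no translation */
    if (f == NULL)
        return -1;
    while (fgetc(f) != EOF)
        count++;
    fclose(f);
    return count;
}
```

    On Unix/Linux this should typically report 2 bytes, and on Windows 3, because the '\n' is stored there as '\r' followed by '\n'.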

    --
    Mats

  6. #6
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
    OK, I have 20 x 128 000 000 data points in this spreadsheet, so this isn't going to work; it's way too slow.

    I'm going to do something crazy and write this data to an ODBC MySQL database. I'm scared, but here goes.

  7. #7
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
    how do i get mysql.h

    i think i need it for my include.

    also
    MySQL databases may be used by programs written in the C programming language on Socrates and Plato and on the IS Solaris workstations
    what?

  8. #8
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by chico1st View Post
    OK, I have 20 x 128 000 000 data points in this spreadsheet, so this isn't going to work; it's way too slow.

    I'm going to do something crazy and write this data to an ODBC MySQL database. I'm scared, but here goes.
    Writing it to a SQL database is probably not the solution either - the SQL database engine can't write to the disk any faster than any other method of storing data to disk - and that is by far the biggest factor in writing such a large amount of data to disk. On top of that, you'll have the SQL database trying to keep indices and such of your data, which won't make it any better.

    So, you want to write 20 x 128000000 16-bit data entries to disk, right? And how long a time do you have to do that?

    --
    Mats

  9. #9
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
    nevermind i got mysql.h

    the reason i want to use the database is that i can just write to a new field as opposed to finding the new line and writing to a new file when i need to add data.

    The data I get comes in arrays of 1 x 128 000 000, so writing to a spreadsheet (tdv, csv) is no good, due to all the string work I need to do.

    I have a few hours i can do this in, lets say 3. I think the database will be very nice to work with once i figure out how to access it. I have experience in MySQL but never with C/MySQL before.

  10. #10
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Let's see if I got this right...

    You get some 128M x 20 data points (or 20 x 128M - whichever way you want to put it), and you want to create a CSV file of it, yes?

    And you want this file to be ready in less than 3 hours or thereabouts?

    Why not just create a binary file format first, and then convert that to a .csv (or get a machine with a bit more than 20 x 128M x sizeof(datatype) of RAM, then just write it when you've collected all the data)? It's only just over 5GB if you can store it as 16-bit integers (20 x 128,000,000 x 2 bytes = 5.12GB). You'll need a 64-bit version of the OS (and a matching 64-bit compiler, of course) - but that's available from all major OS vendors (Microsoft, Apple and Linux distros).

    I'll have a little play and see if I can come up with any other suggestions.

    --
    Mats

  11. #11
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
    No, no: I get 1 x 128 000 000 arrays at 20 different points, about 10 minutes apart.
    Then I have to stick them together in a particular format and create a binary file with it, and then it goes into someone else's program.

  12. #12
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I just wrote 20 x 128M 16-bit numbers to a file on my not-very-new-but-not-ancient machine, and it took 430 seconds (writing 20000 values at a time). That's in a binary file.

    Of course, if you only get 128M numbers at a time, you'll probably end up doing this many times.
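
    For reference, a chunked binary write along those lines could look roughly like this. It is not the actual test program, just a sketch; write_samples and CHUNK are names invented here, and int16_t is assumed as the 16-bit type:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK 20000  /* values handed to each fwrite call */

/* Write `count` 16-bit values to `out`, CHUNK values at a time.
 * `out` should be opened in binary mode. Returns 0 on success. */
int write_samples(FILE *out, const int16_t *data, size_t count)
{
    assert(out != NULL && (data != NULL || count == 0));

    while (count > 0) {
        size_t n = count < CHUNK ? count : CHUNK;

        if (fwrite(data, sizeof *data, n, out) != n)
            return -1;               /* disk full or other write error */
        data += n;
        count -= n;
    }
    return 0;
}
```

    The values are written in the machine's native byte order, so whatever reads the file back has to agree on that.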

    One solution I can think of is this:

    Use a fixed format CSV file - that is, pad the numbers with suitable number of spaces/zeros so that each entry takes up exactly the same amount of space, and have all 20 columns present in the file. Then all you need to do is to put the right data in the right place in the file, and you can calculate where the data belongs (saving the string manipulation at least to some extent, as well as saving the need to read one file and write another).

    So your file would look a bit like this:
    Code:
    00.00000,01.00000,02.00000,03.00000,04.00000,05.00000,06.00000,07.00000
    -1.00000,-2.00000,-3.00000,-4.00000,
    Does that make some sense?

    Just don't forget to figure out if '\n' is one or two characters in the implementation you have.
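
    A rough sketch of the fixed-format idea, assuming 20 columns, 8-character values like in the sample above, and a file opened in binary mode so that '\n' is exactly one byte. The names init_fixed_csv, put_cell, NUM_COLS, FIELD_W and CELL_W are all made up for this example:

```c
#include <assert.h>
#include <stdio.h>

#define NUM_COLS 20            /* columns per row */
#define FIELD_W  8             /* characters per value, e.g. "01.00000" */
#define CELL_W   (FIELD_W + 1) /* value plus its ',' or '\n' separator */

/* Fill `rows` rows with zero placeholders so every cell already exists. */
int init_fixed_csv(FILE *f, long rows)
{
    assert(f != NULL && rows >= 0);

    for (long r = 0; r < rows; r++)
        for (int c = 0; c < NUM_COLS; c++)
            if (fprintf(f, "%0*.5f%c", FIELD_W, 0.0,
                        c == NUM_COLS - 1 ? '\n' : ',') < 0)
                return -1;
    return 0;
}

/* Overwrite the cell at (row, col) in place. The file must be opened in
 * binary mode ("r+b" or "w+b") so the offset calculation holds. */
int put_cell(FILE *f, long row, int col, double value)
{
    char text[32];
    int n;

    assert(f != NULL && row >= 0 && col >= 0 && col < NUM_COLS);

    n = snprintf(text, sizeof text, "%0*.5f", FIELD_W, value);
    if (n != FIELD_W)
        return -1;             /* value does not fit the fixed field */
    if (fseek(f, (row * NUM_COLS + col) * (long)CELL_W, SEEK_SET) != 0)
        return -1;
    return fwrite(text, 1, FIELD_W, f) == (size_t)FIELD_W ? 0 : -1;
}
```

    With 128M rows the file is far bigger than 2GB, so where long is only 32 bits you would need a 64-bit seek such as fseeko (POSIX) or _fseeki64 (Microsoft) instead of plain fseek.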

    --
    Mats

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    If you want one binary file with 20 x 128M numbers, where you get each 128M array at 10-minute intervals, why not just write each 128M array to disk in a separate file, then read from the 20 different files into one 2D array and write that back out to a single file? It shouldn't take much longer than double the 430 seconds my experiment took. And it's dead simple.
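
    Something along these lines could do the merge. Note that this sketch does not load everything into one big 2D array; it reads a chunk from each of the 20 files at a time, so it needs very little RAM. It also assumes the combined format is simply "sample 0 from every file, then sample 1 from every file, ...", which may not be the particular format the other program wants. merge_columns, NUM_FILES and CHUNK are invented names:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_FILES 20     /* one input file per capture */
#define CHUNK     4096   /* samples read per file per pass (arbitrary) */

/* Interleave the inputs into `out`: sample 0 of every file, then sample 1
 * of every file, and so on. Stops at the shortest input. Returns the
 * number of rows written, or -1 on a write error. */
long merge_columns(FILE *inputs[NUM_FILES], FILE *out)
{
    static int16_t col[NUM_FILES][CHUNK];
    static int16_t row_major[CHUNK][NUM_FILES];
    long total = 0;

    assert(inputs != NULL && out != NULL);

    for (;;) {
        size_t rows = CHUNK;

        /* Read the next chunk of every column. */
        for (int f = 0; f < NUM_FILES; f++) {
            size_t got = fread(col[f], sizeof col[f][0], CHUNK, inputs[f]);
            if (got < rows)
                rows = got;
        }
        if (rows == 0)
            break;

        /* Transpose the chunk so each row holds one value per file. */
        for (size_t r = 0; r < rows; r++)
            for (int f = 0; f < NUM_FILES; f++)
                row_major[r][f] = col[f][r];

        if (fwrite(row_major, sizeof row_major[0], rows, out) != rows)
            return -1;
        total += (long)rows;
        if (rows < CHUNK)
            break;                 /* at least one input is exhausted */
    }
    return total;
}
```

    All 21 files should be opened in binary mode ("rb" for the inputs, "wb" for the output).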

    --
    Mats

  14. #14
    Wanabe Laser Engineer chico1st's Avatar
    Join Date
    Jul 2007
    Posts
    168
    Yeah, that's what I'm doing, but it took a long time.

  15. #15
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by chico1st View Post
    Yeah, that's what I'm doing, but it took a long time.
    How long?

    --
    Mats
