Thread: Writing to other files

  1. #1
    Registered User
    Join Date
    May 2009
    Posts
    27

    Writing to other files

    I have a rather complex program where I need to pull in data from a website using loops, direct that data to an output file for calculations and then direct everything to another file for raw data which will then be used for a different program.

    I understand there's file redirection, but I want to be sure I've got the procedure correct.

    To make it easy, I'll call the first file "loop1", the second "calc1", and the third "out1".


    To direct loop1 to calc1, do I put this somewhere near the end of loop1 or at the beginning of calc1?

    loop1 > calc1


    Then, since calc1 will be for specific functions:

    loop1 < calc1 > out1


    What am I missing? I'm new at this and not sure of anything....

  2. #2
    tabstop (and the Hat of Guessing)
    Join Date
    Nov 2007
    Posts
    14,336
    Are you writing C or a shell script?

    If you're writing C, you should look into things like fopen and friends.

  3. #3
    MK27 (spurious conceit)
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    This is not C syntax, and to be honest this is a task better done in some other language. I am going to guess you found this info here:
    UNIX File Redirection
    Which is sort of misleading, as it seems to be part of a C tutorial (but it actually refers to shell commands; this would be a good thing to do with shell script *except* for the web grabbing part).

    Also, you refer to two files in your first sentence, then all of a sudden you are referring to three files...unless one is supposed to represent the web source?

    AFAIK you can't redirect "directly" this way in C. The closest you will get is something like this:
    Code:
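    #include <stdio.h>
    
    int main(void) {
    	FILE *one=fopen("tmp.txt","r"), *two=fopen("copy.txt","w");
    	char buffer[128];
    	/* fopen returns NULL on failure, so check both streams before use */
    	if (!one || !two) { perror("fopen"); return 1; }
    	/* copy one line (or one 127-byte piece of a long line) per pass */
    	while (fgets(buffer,128,one)) fprintf(two,"%s", buffer);
    	fclose(one);
    	fclose(two);
    	return 0;
    }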
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  4. #4
    Registered User
    Join Date
    May 2009
    Posts
    27
    Tabstop... C or Shell... Might be both.



    Thanks... I only know C, and maybe shell, but the two run together to the point where I'm royally confused. My programming experience is 4 years old and hasn't been put to use in that time.

    But for now...

    Yes, I got it from UNIX File Redirection, it's all I could find.

    Sorry if that first part was misleading. I meant file 1 is using wget and making loops, and the second file is just calculations from file 1's output. The third (out1) is the file in which I will gather the output from the other two files and store it for use in another program. If that requires C or C shell, I'm okay. Otherwise, I'd have to learn something completely new, and I'm not sure my boss has the patience for that.

    The code you mention here:

    1. Is [128] just a random size buffer, or do I need to figure out an exact buffer size? this will be a LARGE file.

    2. If I'm not printing to screen, but to the output and calc files, does fprintf become fgets2 or something like that? Do I need to return something else to send it to another file?


    Thanks!

  5. #5
    tabstop (and the Hat of Guessing)
    Join Date
    Nov 2007
    Posts
    14,336
    If you're doing it via the shell, then you don't do jack inside the program itself. You just write the file to the screen and let the person who calls it decide where the output goes.

    128 is just a random size buffer.

    fprintf already is printing to a file. You don't print to the screen with fprintf (well, you can, but you don't).
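A minimal sketch of the "write to the screen and let the caller redirect" approach, assuming a hypothetical helper `emit_rows` and a made-up `index,value` row format (neither comes from the thread). The function writes only to the stream it is handed; passing `stdout` leaves the destination up to the shell.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `count` rows of "index,value" to `out` and
   returns the number of rows written. Pass stdout so the caller's shell
   can decide where the output ends up. */
int emit_rows(FILE *out, const double *values, int count) {
    assert(out != NULL && values != NULL);
    int written = 0;
    for (int i = 0; i < count; i++) {
        /* fprintf reports failure with a negative return value */
        if (fprintf(out, "%d,%.2f\n", i, values[i]) < 0) break;
        written++;
    }
    return written;
}
```

If `main` calls `emit_rows(stdout, values, n)`, then running `./loop1 > raw.txt` in the shell sends the rows to raw.txt with no file handling inside the program, while anything printed to `stderr` still reaches the terminal.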

  6. #6
    Registered User
    Join Date
    May 2009
    Posts
    27
    Quote Originally Posted by tabstop View Post
    If you're doing it via the shell, then you don't do jack inside the program itself. You just write the file to the screen and let the person who calls it decide where the output goes.

    128 is just a random size buffer.

    fprintf already is printing to a file. You don't print to the screen with fprintf (well, you can, but you don't).
    Thanks.

    So in other words, two files? One c program to make the loops/wget program and then a shell to write the file to the screen (and let the person ('me') who calls it decide where the output goes)?

  7. #7
    tabstop (and the Hat of Guessing)
    Join Date
    Nov 2007
    Posts
    14,336
    You don't need to write a shell for this; I'm assuming you're already using a shell (csh, bash, whatever).

    At this point, you're actually going to have to define your terms: what does program 1 do, exactly? Does it *just* do the web scraping? Does it also do calculations? If it doesn't do calculations, then by definition you will need program 2 to do the calculations (unless that's somebody else's job; not clear). If your program 1 needs to write two different things (summary data and raw data), then you can't use redirection for that.

    (Edit: The point I'm making here is that "[write] a shell to write the file to the screen" doesn't make a lot of sense. The C program that gets the data will write the file to the screen; any shell script that you write (or shell tools that you use or whatever) would maybe redirect things that are already going to the screen.)
    Last edited by tabstop; 05-19-2009 at 11:07 AM.
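When one program has to produce two different outputs (raw data and a summary), a single `>` redirect is not enough, so the program opens the files itself. The following is only a sketch under assumptions: the helper name `write_raw_and_summary`, the file layout, and the sum/mean summary are all hypothetical, not something the thread specifies.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes every value to raw_path and a one-line
   sum/mean summary to summary_path. Returns 0 on success, -1 on failure. */
int write_raw_and_summary(const char *raw_path, const char *summary_path,
                          const double *values, int count) {
    assert(raw_path && summary_path && values && count > 0);
    FILE *raw = fopen(raw_path, "w");
    if (!raw) return -1;
    FILE *summary = fopen(summary_path, "w");
    if (!summary) { fclose(raw); return -1; }

    double sum = 0.0;
    for (int i = 0; i < count; i++) {
        fprintf(raw, "%.2f\n", values[i]);   /* raw data, one value per line */
        sum += values[i];
    }
    fprintf(summary, "sum=%.2f mean=%.2f\n", sum, sum / count);

    /* fclose flushes buffered output; a failure here means data was lost.
       The non-short-circuit | makes sure both streams get closed. */
    int failed = (fclose(raw) != 0) | (fclose(summary) != 0);
    return failed ? -1 : 0;
}
```

An alternative that keeps redirection usable is to send one output to `stdout` and the other to a named file, so the caller still controls at least one destination.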

  8. #8
    MK27 (spurious conceit)
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Liz320 View Post
    I only know C
    I think you are exaggerating. Here is some more info:

    Quote Originally Posted by Liz320 View Post
    1. Is [128] just a random size buffer, or do I need to figure out an exact buffer size? this will be a LARGE file.
    It can be any size, as long as it matches the parameter used in fgets. The significant factor is not the total size of the file, it is the length of each line in the file. However, even that is not such a big deal because if the line is longer than 127 bytes (fgets saves one for the null terminator), it will pick up where it left off. So technically, you could make buffer only 2 bytes long, but that will not be very efficient.
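The "picks up where it left off" behaviour can be seen with a deliberately tiny buffer. This is a sketch only; the function name `copy_in_chunks` and the 8-byte buffer are illustrative assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Copies `in` to `out` through a deliberately tiny buffer and returns how
   many fgets calls it took. */
int copy_in_chunks(FILE *in, FILE *out) {
    assert(in != NULL && out != NULL);
    char buffer[8];               /* holds at most 7 characters plus '\0' */
    int calls = 0;
    while (fgets(buffer, sizeof buffer, in)) {
        fputs(buffer, out);       /* pieces are written in the order read */
        calls++;
    }
    return calls;
}
```

A single line of 20 characters plus its newline needs three calls here (7 + 7 + 7 characters), and the copy is still byte-for-byte identical. A larger buffer just means fewer calls.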
    Quote Originally Posted by Liz320 View Post
    2. If I'm not printing to screen, but to the output and calc files, does fprintf become fgets2 or something like that? Do I need to return something else to send it to another file?
    No. I have never heard of "fgets2". Anyway, printf is to stdout (usually the screen). fprintf can be any stream, eg, fprintf(stdout, "blah") would be to the screen. But instead, I used the FILE handle.

    Using a shell script to do this task would be easier since you can just use the same commands as you would on the command-line
    Code:
    #!/bin/bash
    
    wget -r -np -k -p http://cboard.cprogramming.com/ -o LOG &&
    will start wget and background (fork) it so you can now include more commands. The shell has its own language; for example, at the command line try:
    Code:
    for x in *; do if [ -f "$x" ]; then echo "$x"; fi; done
    Which means you could write a fairly short shell script to accomplish this. The only problem(s) with learning bash (as opposed to C) is that there are far fewer resources to help you*, and it is terribly picky and tricky to debug. If you do this kind of thing often enough and/or Linux is your primary OS, I would recommend ordering a (beginner) book on bash/"the shell".

    Perl is also a great option. There are lots of resources for that. But if you want to stick with C, stick with C. In any case, I would say as a near total beginner this might take you a while to work out. But it would be a good "introductory", post "hello world" exercise in any language.

    *There are some tutorials and the "Bash Beginner's Guide". There is actually a bash-oriented forum somewhere too, but the response time is probably slow.
    Last edited by MK27; 05-19-2009 at 11:10 AM.

  9. #9
    Registered User
    Join Date
    May 2009
    Posts
    27
    Quote Originally Posted by tabstop View Post
    You don't need to write a shell for this; I'm assuming you're already using a shell (csh, bash, whatever).

    At this point, you're actually going to have to define your terms: what does program 1 do, exactly? Does it *just* do the web scraping? Does it also do calculations? If it doesn't do calculations, then by definition you will need program 2 to do the calculations (unless that's somebody else's job; not clear). If your program 1 needs to write two different things (summary data and raw data), then you can't use redirection for that.
    I was using C, unless, as I mentioned, shell would be more appropriate for a given step. (addressed further)

    Program 1 is pulling info off the web. That's it for that one. I'm planning for it to be all raw data.

    Program 2 is just calculations using a few of the variables from program 1. (again, all raw data, or what works best).

    Program 3 needs to feed everything into a spreadsheet eventually. So I figured it's just storing data in a better form for the spreadsheet. It made sense to me to split these up. I am getting conflicting information as to how large this will be, so it may depend on how I approach it. Last thing I want to do is make something so large that it can't run or it runs in the terabytes or something insane and I make someone mad at me.

    Yes, (God help us all) I am handling this entire mess!

    Thanks again! [*I got your edit too]


    Quote Originally Posted by MK27 View Post
    I think you are exaggerating
    you're right, I'm exaggerating. I 'hardly' only know C!


    Quote Originally Posted by MK27 View Post
    It can be any size, as long as it matches the parameter used in fgets. The significant factor is not the total size of the file, it is the length of each line in the file. However, even that is not such a big deal because if the line is longer than 127 bytes (fgets saves one for the null terminator), it will pick up where it left off. So technically, you could make buffer only 2 bytes long, but that will not be very efficient.
    So I'm essentially sending each line to the new file? I just need to know the size of the line, then.




    Quote Originally Posted by MK27 View Post
    No. I have never heard of "fgets2". Anyway, printf is to stdout (usually the screen). fprintf can be any stream, eg, fprintf(stdout, "blah") would be to the screen. But instead, I used the FILE handle.
    I'm sorry. I didn't know fprintf was file. I associated it with printing on screen, which just got cleared up. fgets2 hasn't been invented....................................yet.



    Definitely not a bad idea to learn Bash/Perl at some point. I am hoping to master C, but I know there are shortcuts. My program is ridiculously large. It's like asking someone who's built a treehouse to build a skyscraper. Breaking it down does seem to help.

    I'm having to parse URLs and then loop those several times to get all of the data in (too many variables 'within' the URL are dependent on one another). I thought that a bit of code at the end of the script would help direct it to a new file (I was saying this before looking at your notes above about the lines, and Tabstop's edit, which makes sense). My experience level, coupled with the amount of data I'm working with, has me treading very cautiously when I do finally run it. Just making sure I know the estimated output and give someone fair warning on the memory allocation.

    Thanks to you both, MK27 and Tabstop!
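The three-program split described in this post maps naturally onto a filter: a program that reads from one stream and writes to another, so the shell can connect the pieces (for example `./loop1 | ./calc1 > out1`). The sketch below is an assumption-laden illustration: the function name `summarize`, the one-number-per-line input, and the `count,sum,mean` CSV layout are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical "calc1" core: reads whitespace-separated numbers from `in`
   and writes a CSV header plus one CSV row (count,sum,mean) to `out`.
   Returns the number of values read. */
int summarize(FILE *in, FILE *out) {
    assert(in != NULL && out != NULL);
    double value, sum = 0.0;
    int count = 0;
    while (fscanf(in, "%lf", &value) == 1) {
        sum += value;
        count++;
    }
    fprintf(out, "count,sum,mean\n");
    /* guard against dividing by zero when the input was empty */
    fprintf(out, "%d,%.2f,%.2f\n", count, sum, count ? sum / count : 0.0);
    return count;
}
```

Calling `summarize(stdin, stdout)` from `main` makes the program usable in a pipeline, and a spreadsheet can import the resulting CSV directly. Because it processes one value at a time, memory use stays constant regardless of how large the input is.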

  10. #10
    Registered User
    Join Date
    May 2009
    Posts
    27
    Another question...

    By using Shell, am I essentially going to be using redirects?

  11. #11
    tabstop (and the Hat of Guessing)
    Join Date
    Nov 2007
    Posts
    14,336
    Quote Originally Posted by Liz320 View Post
    Another question...

    By using Shell, am I essentially going to be using redirects?
    Do you mean "if I use redirection in a shell, am I going to be using redirects?" If so, the answer is yes. If you don't mean that, then you have to explain what "using Shell" refers to.

  12. #12
    Registered User
    Join Date
    May 2009
    Posts
    27
    Quote Originally Posted by tabstop View Post
    Do you mean "if I use redirection in a shell, am I going to be using redirects?" If so, the answer is yes. If you don't mean that, then you have to explain what "using Shell" refers to.
    I meant, as we discussed yesterday, that I'd be writing a shell script to redirect the files (again, as was mentioned yesterday). So, having just figured out what "redirects" are, I just wanted to confirm I was on track.

    Thanks.

  13. #13
    MK27 (spurious conceit)
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Liz320 View Post
    Another question...

    By using Shell, am I essentially going to be using redirects?
    If you use a shell script you can use those unix shell methods from your Original Post, yeah.

    Also, I woke up sweating in the middle of the night because I suddenly realized I did this
    Quote Originally Posted by MK27 View Post
    Code:
    #!/bin/bash
    
    wget -r -np -k -p http://cboard.cprogramming.com/ -o LOG &&
    will start wget and background (fork) it so you can now include more commands.
    That should be &, not &&. && makes the shell wait for wget to finish and only runs the next command if wget succeeded. Maybe that is a good idea. Anyway, & will fork and move on immediately (I hate the idea of someone being frustrated by my mistake).

    You can actually use $1 (first parameter) here too, so if the script started:
    Code:
    #!/bin/bash
    
    wget -r -np -k -p "$1" -o LOG &
    you could launch it "myscript http://cboard.cprogramming.com".

    Also, that thing about C & the line length: no, you do not need to know it. You just need to make sure the fgets size parameter is no larger than the buffer. fgets() stops at a newline, but if the line is too long, it will stop before that, and pick up where it left off on the next pass. So it doesn't matter how long the lines are. Since you are using *nix, it would also be fine to use fread(), which does not stop at a newline and hence would be more efficient, since it fills the buffer completely on every read except the last. Just write out only as many bytes as fread() reports it read.
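A minimal sketch of the fread() variant, assuming a hypothetical helper named `copy_blocks` and an arbitrary 4096-byte buffer. The key detail is that fwrite() is given the count fread() returned, not the buffer size.

```c
#include <assert.h>
#include <stdio.h>

/* Copies `in` to `out` in fixed-size blocks; returns the total number of
   bytes copied, or -1 if a read or write error occurred. */
long copy_blocks(FILE *in, FILE *out) {
    assert(in != NULL && out != NULL);
    char buffer[4096];
    size_t got;
    long total = 0;
    while ((got = fread(buffer, 1, sizeof buffer, in)) > 0) {
        /* write exactly the bytes read; the last block is usually partial */
        if (fwrite(buffer, 1, got, out) != got) return -1;
        total += (long)got;
    }
    /* fread returns 0 for both end-of-file and errors, so ask which it was */
    return ferror(in) ? -1 : total;
}
```

Unlike the fgets() loop, this never treats the data as text, so it also copies files containing embedded null bytes correctly.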

  14. #14
    Registered User
    Join Date
    May 2009
    Posts
    27
    Thanks! Although, I'm still trying to find out what '-r -np -k -p $1 -o LOG' refer to, so I wouldn't have gotten too far to be frustrated. Hope you slept okay after that. You're right on bash not having much in the way of tutorials, by the way.

    Dumb question.. *nix = unix?

  15. #15
    MK27 (spurious conceit)
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Liz320 View Post
    Thanks! Although, I'm still trying to find out what '-r -np -k -p $1 -o LOG' refer to, so I wouldn't have gotten too far to be frustrated. Hope you slept okay after that. You're right on bash not having much in the way of tutorials, by the way.

    Dumb question.. *nix = unix?
    Yes, that way we include unix, linux, and the beats

    -r recursive
    -np don't go into parent directories for anything, except...
    -p do retrieve stylesheets & similar
    -k convert all links relative to local copy
    -o LOG write wget's log messages to a file called LOG instead of the terminal

    I got that from the man page as a means to download an entire site, and it actually works very well most of the time. No muss, no fuss, just straight to index.html...

    That bash forum is at bashscripts.org, but the posts are every few days, so make sure it's a good one. There is a slightly more active GNU bash forum at gnabble, but the former site has other, more varied resources. The Linux Documentation Project (LDP) maintains a "Bash Beginner's Guide" and an "Advanced" guide; the two are not really related in a "read the beginner first!" way, so look through them both.

    Bash is very picky about spaces, but the error messages do not explain that.
