Thread: Unix Minishell Need Help Adding Pipelines

  1. #1
    Registered User
    Join Date
    Nov 2012
    Posts
    8

    Hey guys,



    So I'm making a UNIX minishell, and am trying to add pipelines, so I can do things like this:


    Code:
    ps aux | grep dh | grep -v grep | cut -c1-5
    However I'm having trouble wrapping my head around the piping part. I replace all the "|" characters with 0, and then run each line as a normal line. However, I am trying to divert the output and input. The input of a command needs to be the output of the previous command, and the output of a command needs to be the input of the next command.
    I'm doing this using pipes, however I can't figure out where to call pipe() and where to close them. From the main processing function, processline(), I have this code:

    Code:
    if((pix = findUnquotChar(line_itr, '|')))
    {
    	line_itr[pix++] = 0;
    	if(pipe (fd_a) < 0) perror("pipe");
    	processline(line_itr, inFD, fd_a[1], pl_flags);
    	line_itr = &(line_itr[pix]);
    
    	while((pix = findUnquotChar(line_itr, '|')) && pix < line_len)
    	{
    		close(fd_a[1]);
    		line_itr[pix++] = 0;
    		if(pipe (fd_b) < 0) perror("pipe");
    		processline(line_itr, fd_a[0], fd_b[1], pl_flags);
    		close(fd_a[0]);
    		close(fd_b[1]);
    		line_itr = &(line_itr[pix]);
    	}
    	return;
    }
    So I'm recursively sending the commands between the "|" characters to processline (the code above is itself inside processline).


    The 2nd and 3rd parameter of processline are the inputFD and outputFD respectively, so I need to process a command, write the output to a pipe, and then call processline again on the next command, however this time the output of the previous command is the input.


    This just doesn't seem like it can work though with one pipe, so in the above example I'm trying to make it work with two, by possibly flip flopping back and forth between using fd_a[0] as input for commands that occur during an odd iteration in the while loop, and use fd_b[0] for input on commands that occur during an even iteration.



    I'm just having trouble seeing how this is possible with a single pipe. If you guys need any additional info, just ask. Here's the entire processline function in case you want to take a look:
    [C] processline - Pastebin.com

  2. #2
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Not sure I follow your current approach entirely, but... if your pipeline consists of n commands, you need n - 1 pipes, and you probably want to assign them iteratively rather than recursively. I would create an array of pipes (really an array of pairs of file descriptors returned by pipe()), and then loop through the commands deciding which file descriptors to set in the commands (if any - stdin will be the default input and stdout will be the default output for all of them). Since pipe() puts the read end at index 0 and the write end at index 1, it looks like so:
    Code:
    stdin    -> cmd1 -> fd[0][1]
    fd[0][0] -> cmd2 -> fd[1][1]
    fd[1][0] -> cmd3 -> fd[2][1]
    fd[2][0] -> cmd4 -> fd[3][1]
    fd[3][0] -> cmd5 -> stdout
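
    A minimal sketch of the array-of-pipes idea, assuming each command has already been split into a NULL-terminated argv. The names run_pipeline, close_pipes and MAX_CMDS are hypothetical and not taken from the poster's shell. Each child reads from the read end (index 0) of the pipe before it and writes to the write end (index 1) of the pipe after it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMDS 16   /* arbitrary limit chosen for this sketch */

/* Close both ends of the first `count` pipes. */
static void close_pipes(int fd[][2], int count)
{
    for (int j = 0; j < count; j++) {
        close(fd[j][0]);
        close(fd[j][1]);
    }
}

/* Run n commands as one pipeline using n - 1 pipes created up front.
 * Returns the exit status of the last command, or -1 on error. */
int run_pipeline(char **cmds[], int n)
{
    int fd[MAX_CMDS - 1][2];
    pid_t pids[MAX_CMDS];
    int started = 0;

    assert(n >= 1 && n <= MAX_CMDS);

    for (int i = 0; i < n - 1; i++) {
        if (pipe(fd[i]) < 0) {
            perror("pipe");
            close_pipes(fd, i);
            return -1;
        }
    }

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); break; }
        if (pid == 0) {
            /* stdin from the previous pipe, stdout to the next one */
            if (i > 0)     dup2(fd[i - 1][0], STDIN_FILENO);
            if (i < n - 1) dup2(fd[i][1], STDOUT_FILENO);
            close_pipes(fd, n - 1);      /* child keeps only fds 0 and 1 */
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
        pids[started++] = pid;
    }

    /* The parent must close every end, or readers never see EOF. */
    close_pipes(fd, n - 1);

    int status = 0, result = -1;
    for (int i = 0; i < started; i++)
        if (waitpid(pids[i], &status, 0) == pids[i] && i == n - 1)
            result = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return result;
}
```

    Creating every pipe first keeps the loop simple, at the cost of holding all n - 1 pipes open at once in the parent until the children have been started.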

  3. #3
    Registered User
    Join Date
    Nov 2012
    Posts
    8
    In reply to sean (post #2):

    That actually seems fairly close to what I'm trying to do. I'm iterating through each command separated by pipes, so if I had:


    echo a | echo b | echo c | echo d

    I would have one loop iteration each for "echo a", "echo b", "echo c", and "echo d". I then pass each of those 4 commands recursively to my shell again, with an input file descriptor, an output file descriptor, and a few flags.


    My teacher said I can't have more than two pipes open simultaneously, so I can't do the approach you suggested exactly like that. Right now I'm trying an approach like this:


    Code:
    if((pix = findUnquotChar(line_itr, '|')))
    {
    	line_itr[pix++] = 0;
    	if(pipe (fd) < 0) perror("pipe");
    	processline(line_itr, inFD, fd[1], pl_flags);
    	close(fd[1]);
    	line_itr = &(line_itr[pix]);
    
    
    	while((pix = findUnquotChar(line_itr, '|')) && pix < line_len)
    	{
    		if(pix != 0) line_itr[pix++] = 0;
    		if(fdFlipF)
    		{
    			if(pipe(&(fd[2])) < 0) perror("pipe");
    			if(!pix) fd[3] = 1;
    			processline(line_itr, fd[0], fd[3], pl_flags);
    			close(fd[3]);
    			close(fd[0]);
    		}
    		else
    		{
    			if(pipe(fd) < 0) perror("pipe");
    			if(!pix) fd[1] = 1;
    			processline(line_itr, fd[2], fd[1], pl_flags);
    			close(fd[1]);
    			close(fd[2]);
    		}
    
    
    		line_itr = &(line_itr[pix]);
    		fdFlipF = !fdFlipF;
    	}
    	return;
    }

    This might be somewhat hard to understand, but I have to use two different pipe variables, and I flip-flop between them. I haven't got it working yet, and it doesn't seem like the right way to do it (although I think it will work eventually). My minishell doesn't follow standard shell conventions; it's not implemented like bash.


    If for some reason you feel like you have some time on your hands, here's the main file of my shell. It might help you understand what I'm trying to do.

    [C] msh.c - Pastebin.com


    Thank you so much for the help, I've been banging my head against the wall trying to figure this out.

  4. #4
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Oh I see, that makes sense now. So the reason this is hard to do with a single pipe is that as soon as you reuse that pipe for a second command, it's going to start writing data into the same pipe it's trying to read from - and it's just going to end up reading in its own output once the intended data is out. If you were to do this with a single pipe, your program would need to read everything from the output pipe into a memory buffer, and then re-write it to the next input pipe before the next program could run.

    So your approach is to use 2 pipes, so that instead of having to read, buffer, and write it yourself, each alternating program will do that for you, leaving the previous pipe available and empty for the next program to use. That's good - seems like it would work. I see a couple of problems with your implementation though. As a general recommendation, I still think it might be cleaner to implement this iteratively, using a loop instead of a recursive function. I'd suggest you see if you can write a simple example that just pipes data between two hard-coded commands, then three, and then scale up to a loop once you get that working.


    • You're creating new pipes every time you call processline(), so technically you still aren't complying with what your teacher wants - you'll create more than 2 pipes. You can ensure and prove that you never use more than 2 pipes by creating your pipes ahead of time, and then just passing them around to the commands that need them.



    • It looks like you're overwriting one end of the pipe with stdin for your first command, and trying to use dup to use stdout for your last command. What you actually want to do is leave the results of pipe() alone and use each pair of fds between consecutive commands. Does that make sense?
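
    The "two hard-coded commands" starting point suggested above could look roughly like this. It is only a sketch: pipe_two is a hypothetical helper, and it uses fork/dup2/execvp directly rather than the poster's processline:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "left | right" with exactly one pipe.
 * Returns right's exit status, or -1 on error. */
int pipe_two(char *const left[], char *const right[])
{
    int fd[2];

    assert(left && left[0] && right && right[0]);
    if (pipe(fd) < 0) { perror("pipe"); return -1; }

    pid_t a = fork();
    if (a < 0) {
        perror("fork");
        close(fd[0]); close(fd[1]);
        return -1;
    }
    if (a == 0) {                        /* left: stdout -> write end */
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]); close(fd[1]);
        execvp(left[0], left);
        perror(left[0]);
        _exit(127);
    }

    pid_t b = fork();
    if (b < 0) {
        perror("fork");
        close(fd[0]); close(fd[1]);
        waitpid(a, NULL, 0);
        return -1;
    }
    if (b == 0) {                        /* right: stdin <- read end */
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]); close(fd[1]);
        execvp(right[0], right);
        perror(right[0]);
        _exit(127);
    }

    /* The parent closes both ends; otherwise right never sees EOF. */
    close(fd[0]); close(fd[1]);

    int status = 0;
    waitpid(a, NULL, 0);
    if (waitpid(b, &status, 0) != b) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Note that the fds returned by pipe() are never overwritten here. The children duplicate them onto stdin/stdout with dup2, and every process closes the originals afterwards.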

  5. #5
    Registered User
    Join Date
    Nov 2012
    Posts
    8
    In reply to sean (post #4):

    I finally figured it out. I read your comment and rewrote it, then talked to my teacher, who gave me a few additional tips. Here's what I have now, which seems to work great:

    Code:
    if((pix = findUnquotChar(line_itr, '|')))
    {
    	line_itr[pix] = 0;
    
    
    	if(pipe (fd) < 0) perror("pipe");
    	processline(line_itr, inFD, fd[1], pl_flags);
    	close(fd[1]);
    	saveFD = fd[0];
    	line_itr = &(line_itr[pix+1]);
    		
    	while((pix = findUnquotChar(line_itr, '|')) && pix < line_len)
    	{
    		line_itr[pix] = 0;
    			
    		if(pipe (fd) < 0) perror("pipe");
    		processline(line_itr, saveFD, fd[1], pl_flags);
    		close(fd[1]);
    		close(saveFD);
    		saveFD = fd[0];
    		line_itr = &(line_itr[pix+1]);
    	}
    
    
    	pl_flags.WAIT = FLAGS.WAIT;
    	processline(line_itr, saveFD, outFD, pl_flags);
    	close(saveFD);
    
    
    	return;
    }

    Now I have just one problem: unless I explicitly tell my shell to wait for each command, it finishes before the output makes it all the way to the last command, so nothing gets printed. If I tell it to wait each and every time, it works pretty well.

    How exactly do I wait for all the zombies once the shell is finished?
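
    One common pattern, sketched here under assumptions rather than taken from the poster's shell: start every stage without waiting, remember each child's pid, and only wait once the whole pipeline is running. run_rolling and MAX_CMDS are hypothetical names. The parent keeps at most the previous read end plus one new pipe open at any time:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMDS 16   /* arbitrary limit chosen for this sketch */

/* Start all n stages first, then wait for each recorded pid.
 * Returns the last command's exit status, or -1 on error. */
int run_rolling(char **cmds[], int n)
{
    pid_t pids[MAX_CMDS];
    int in_fd = STDIN_FILENO;   /* read end feeding the next stage */
    int started = 0;

    assert(n >= 1 && n <= MAX_CMDS);

    for (int i = 0; i < n; i++) {
        int fd[2] = { -1, -1 };
        int last = (i == n - 1);

        if (!last && pipe(fd) < 0) { perror("pipe"); break; }

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            if (!last) { close(fd[0]); close(fd[1]); }
            break;
        }
        if (pid == 0) {
            if (in_fd != STDIN_FILENO) {
                dup2(in_fd, STDIN_FILENO);
                close(in_fd);
            }
            if (!last) {
                dup2(fd[1], STDOUT_FILENO);
                close(fd[0]); close(fd[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
        pids[started++] = pid;

        /* Parent: drop the old read end and the new write end. */
        if (in_fd != STDIN_FILENO) close(in_fd);
        if (!last) { close(fd[1]); in_fd = fd[0]; }
        else       { in_fd = STDIN_FILENO; }
    }
    if (in_fd != STDIN_FILENO) close(in_fd);   /* only after an early break */

    /* Deferred waiting: every child is reaped, none is left a zombie. */
    int status = 0, result = -1;
    for (int i = 0; i < started; i++)
        if (waitpid(pids[i], &status, 0) == pids[i] && i == n - 1)
            result = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return result;
}
```

    Waiting only after all stages are started lets the commands run concurrently, and waiting on the recorded pids means the status of the last command is still available.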


    Thanks

  6. #6
    Registered User
    Join Date
    Nov 2012
    Posts
    8
    I should also add, waiting on all the zombies doesn't work, either that or I did it wrong. I tried this:


    while(waitpid(-1, &status, WNOHANG > 0));


    For some reason the only thing that works at the moment is waiting each and every time. Do you have any idea what one does in this situation?

  7. #7
    Salem (and the hat of int overfl)
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > while(waitpid(-1, &status, WNOHANG > 0));
    Erm do you have the () matched up correctly here?

    while( waitpid(-1, &status, WNOHANG) > 0 );
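
    Even with the parentheses fixed, a WNOHANG loop only reaps children that have already exited: waitpid returns 0 when children exist but none has finished yet, so the loop can end while the pipeline is still running. A hedged sketch of a blocking alternative follows; reap_all is a hypothetical helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block until every child of this process has been reaped.
 * Stores the status of the most recently reaped child in *status_out
 * and returns how many children were reaped. */
int reap_all(int *status_out)
{
    int status, count = 0;

    assert(status_out != NULL);
    for (;;) {
        pid_t pid = waitpid(-1, &status, 0);   /* 0 = block, no WNOHANG */
        if (pid > 0) {
            *status_out = status;
            count++;
        } else if (pid < 0 && errno == EINTR) {
            continue;                          /* interrupted, retry */
        } else {
            break;                             /* ECHILD: nothing left */
        }
    }
    return count;
}
```

    Because children can finish in any order, the status stored last is not necessarily that of the last command in the pipeline. Waiting on that command's specific pid is the safer way to get its exit status.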

  8. #8
    Registered User
    Join Date
    Nov 2012
    Posts
    8
    In reply to Salem (post #7):

    Ha, you're right, I didn't. I didn't copy paste this though, I just typed it out, that's probably why.


    I have actually nearly found a solution. I changed the way SIGCHLD is handled by setting its handler to SIG_IGN. However, I then realized that if I did this, I wouldn't be able to collect the exit statuses of the child processes.



    So, I am working on a new method. What I'm currently considering is writing a sig_handler function for SIGCHLD that will increment a global sig_atomic_t variable n each time a child is spawned. Then at the end I will wait n times, ensuring I take care of the zombies and wait the correct number of times, while still being able to collect the exit status of the last command in a pipe. I am currently also waiting in another file (expand.c) for command expansion, so when I wait there I will decrement n. Does this seem like an appropriate way to handle the problem?



    I was wondering if it would be possible to use a single signal_handler, as I currently already have one that handles SIGINT. I was thinking of combining the two, and having a switch/if statement determining what the signal number is, and then reacting accordingly.



    If not, could I at least use the same struct sigaction variable for both handlers, by setting the properties of the sigaction variable, calling sigaction for SIGINT, changing the struct sigaction variables, and then calling sigaction for SIGCHLD?
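
    Both ideas are possible in principle: the handler receives the signal number as its argument, and sigaction() copies the struct it is given, so the same variable can be reused for a second call. The following is a sketch under assumptions, not the poster's code; on_signal, install_handlers, got_sigint and reaped are hypothetical names. Since standard signals are not queued, several children exiting close together may produce a single SIGCHLD, which is why the handler loops over waitpid instead of counting signals:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;   /* set when SIGINT arrives */
static volatile sig_atomic_t reaped = 0;       /* children reaped so far */

/* One handler for both signals, dispatching on the signal number. */
static void on_signal(int signo)
{
    int saved_errno = errno;        /* waitpid may clobber errno */

    switch (signo) {
    case SIGINT:
        got_sigint = 1;
        break;
    case SIGCHLD:
        /* Reap everything that has exited; one signal may stand
         * for several children. */
        while (waitpid(-1, NULL, WNOHANG) > 0)
            reaped++;
        break;
    }
    errno = saved_errno;
}

/* Install on_signal for SIGINT and SIGCHLD, reusing one struct. */
int install_handlers(void)
{
    struct sigaction sa;

    assert(SIGINT != SIGCHLD);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);

    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGINT, &sa, NULL) < 0) return -1;

    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;   /* change fields, call again */
    if (sigaction(SIGCHLD, &sa, NULL) < 0) return -1;
    return 0;
}
```

    One trade-off to keep in mind: if the handler reaps children, a later waitpid on a specific pid elsewhere in the shell can fail with ECHILD, so this version discards exit statuses. A shell that needs the status of the last command would have to record it in the handler or do its waiting outside the handler.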


    Thanks
