Thread: C CGI program, and CONTENT_TYPE and boundaries.

  1. #1
    Dino (Jack of many languages)
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332

    C CGI program, and CONTENT_TYPE and boundaries.

    I'm writing my first CGI program, and it's in C. Right now, I'm just outputting all the environment variables and their values.

    My question relates to the method="post" enctype="multipart/form-data" combination, and the boundary separators.

    As seen in the attached image, the boundary indicator in CONTENT_TYPE starts with 4 hyphens and ends with an "M".

    In the lower part of the image (which you can't see very well), I'm breaking up the attached file (a Ruby script) at newline characters, and substituting a "<br />" in its place so it will display properly.

    However, it appears the boundary markers differ slightly. In CONTENT_TYPE the boundary starts with 4 hyphens, but when read from STDIN, the markers start with 6 hyphens.

    And in the smaller image, you can see that the last boundary marker even has 2 trailing hyphens.

    When I read the content, here is my loop:

    Code:
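    	for ( i = 0 ; i < clength ; ++i ) { 
    		int c = fgetc(stdin) ;	/* int, not char, so EOF can be detected */
    		if (c == EOF) break ;	/* body was shorter than CONTENT_LENGTH */
    		if (c == '\n') printf("<br />") ; 
    		else putchar(c) ; 
    	}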
    clength is a straight atoi() of CONTENT_LENGTH, which leads me to think I'm reading the right amount of data.

    Any ideas?

    Thanks, Todd
    Mainframe assembler programmer by trade. C coder when I can.

  2. #2
    Dino (Jack of many languages)
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    I think I've come to a theory of operation for this question.

    I think the way it is supposed to work is that if I find the boundary marker on any line (delimited by the newline character), then that is a separator "line" and marks the boundary between one set of data and another.

    I was taking the boundary marker literally, and only ignoring those characters that comprise the boundary marker.

    Is this correct?
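
    A minimal sketch of that line-by-line check, in case it helps anyone reading later. It assumes each line has already been read into a NUL-terminated buffer (so it would not cope with binary uploads containing NUL bytes), and the names classify_line and line_kind are made up for illustration. It treats a line as a separator only if it is two hyphens, then the boundary value, then optional whitespace and the line ending; the same thing followed by two more hyphens is taken as the final marker.

```c
#include <string.h>

/* Hypothetical names, for illustration only. */
enum line_kind { LINE_DATA, LINE_DELIMITER, LINE_CLOSE };

/*
 * Classify one line of the request body.
 * `line` is a NUL-terminated line (it may still end in "\r\n" or "\n").
 * `boundary` is the boundary value exactly as it appears in CONTENT_TYPE,
 * without the two extra leading hyphens.
 */
static enum line_kind classify_line(const char *line, const char *boundary)
{
    size_t blen = strlen(boundary);
    int closing = 0;

    /* Every delimiter line starts with two hyphens... */
    if (strncmp(line, "--", 2) != 0)
        return LINE_DATA;
    line += 2;

    /* ...followed by the boundary value itself. */
    if (strncmp(line, boundary, blen) != 0)
        return LINE_DATA;
    line += blen;

    /* Two more hyphens mark the last delimiter in the body. */
    if (strncmp(line, "--", 2) == 0) {
        closing = 1;
        line += 2;
    }

    /* Allow trailing spaces/tabs, then the line ending. */
    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '\r')
        line++;
    if (*line == '\n')
        line++;

    /* Anything left over means this was ordinary data. */
    if (*line != '\0')
        return LINE_DATA;

    return closing ? LINE_CLOSE : LINE_DELIMITER;
}
```

    With a made-up boundary value of "----ExampleBoundaryM", the line "------ExampleBoundaryM\r\n" would come back as LINE_DELIMITER and "------ExampleBoundaryM--\r\n" as LINE_CLOSE.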

    Thanks, Todd
    Mainframe assembler programmer by trade. C coder when I can.

  3. #3
    Registered User
    Join Date
    Jan 2008
    Posts
    290
    You should probably check out RFC 2045 and RFC 2046 for more info.

    Quote Originally Posted by RFC 2046
    The Content-Type field for multipart entities requires one parameter,
    "boundary". The boundary delimiter line is then defined as a line
    consisting entirely of two hyphen characters ("-", decimal value 45)
    followed by the boundary parameter value from the Content-Type header
    field, optional linear whitespace, and a terminating CRLF.
    Those RFCs contain more information than I'd be willing to wade through, but I did find that snippet from doing a quick search for "boundary".
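
    That quote would also account for the hyphen count: a boundary value that itself begins with 4 hyphens shows up in the body with 2 more in front, giving 6. Here is a rough sketch of building the delimiter string from the header value. It assumes the parameter is spelled in lowercase as "boundary=" (the name is actually case-insensitive), and make_delimiter is a hypothetical helper name, not part of any library.

```c
#include <string.h>

/*
 * Build the delimiter ("--" + boundary value) from a Content-Type string.
 * Returns 0 on success, -1 if no boundary parameter is found or the
 * output buffer is too small. Hypothetical helper, for illustration.
 */
static int make_delimiter(const char *content_type, char *out, size_t outsz)
{
    const char *p = strstr(content_type, "boundary=");
    size_t len;

    if (p == NULL)
        return -1;
    p += strlen("boundary=");

    if (*p == '"') {
        /* Quoted value: take everything up to the closing quote. */
        p++;
        len = strcspn(p, "\"");
    } else {
        /* Unquoted value: stop at a separator or whitespace. */
        len = strcspn(p, "; \t\r\n");
    }

    /* Need room for "--", the value, and the terminating NUL. */
    if (len == 0 || len + 3 > outsz)
        return -1;

    memcpy(out, "--", 2);
    memcpy(out + 2, p, len);
    out[len + 2] = '\0';
    return 0;
}
```

    In a CGI program the first argument would typically come from getenv("CONTENT_TYPE"), after checking that it is not NULL.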

  4. #4
    Dino (Jack of many languages)
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Thank you. That confirms what I had deduced.

    Todd
    Mainframe assembler programmer by trade. C coder when I can.
