Thread: pointer problem

  1. #1
    Registered User
    Join Date
    Feb 2003
    Posts
    596

    pointer problem

    This is a sort of "bare-bones" LR parser: all it's supposed to do is prompt for a parsing table (table.txt), then prompt for an expression like:
    id * id
    or:
    (id + id) * id
    or:
    id + *
    and it will respond either "OK" (for inputs like the first 2 examples) or "syntax error" (for input like the third).

    I guess I must be doing something wrong with the pointers, probably char **table, but WHAT?

    ***Everything seems to work correctly when I compile and run it in MSVC 6.0.***

    When I compile it with GCC it compiles OK but I get a segmentation fault when I run it. And when I tried to debug it with some printf statements, I got some unexpected results: when I give strlen() a pointer to "id" (at least I thought it was pointing to "id") it returns the length as 3 instead of 2.

    And where I expected my printf's (in parser.c) to give:
    Code:
    2 is the length of
            id
     ho ho ho
            id
    0
    instead I get:
    Code:
    3 is the length of
           id
     ho ho ho
            id
    1
    And this section (line 63 of parser.c)
    Code:
    		while(strcmp(*(table+col),lexeme_tbl[lookup(nextToken)])!=0){
    //DEBUG
    printf("%10s %10s\n", *(table+col),lexeme_tbl[lookup(nextToken)] );
    //ENDDEBUG
    			col++;
    		}
    is supposed to find the column of the parsing table that corresponds to the token returned by lex. So if I just give it
    id
    as the input (since id is the first token in the table), it should immediately drop out of the while loop. Instead it seems to be finding that "id" and "id" are not the same, keeps incrementing col until it runs beyond "table"s memory, & segfaults.
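A minimal sketch of a bounded version of that column search, assuming the heading row is row 0 of the flat `table` array and `cols` is its width. `find_column` is a hypothetical helper, not part of the posted code. It returns -1 when no heading matches, so a mismatch becomes an error the caller can report rather than a walk past the end of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Search the heading row (row 0) of a flat table of strings for `name`.
 * Columns 1..cols-1 are examined; column 0 holds the "state" label.
 * Returns the column index, or -1 if no heading matches.
 * find_column is a hypothetical helper, not part of the posted code.
 */
int find_column(char **table, int cols, const char *name)
{
    int col;

    assert(table != NULL && name != NULL);
    for (col = 1; col < cols; col++) {
        if (strcmp(table[col], name) == 0)
            return col;
    }
    return -1;  /* not found: caller can report an error instead of crashing */
}
```

In parser.c the call would look something like `col = find_column(table, cols, lexeme_tbl[lookup(nextToken)]);` followed by a check for -1.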




    parser.c:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "readTable.h"
    #include "sstack.h"
    #include "lex.h"
    
    char *lexeme_tbl[] = {0,"$","(", ")", "*", "+", "id"};
    int lexeme_idx[] = {0,36,40,41,42,43,128};
    char *stk[STACK_SIZE];
    
    
    int main() {
    
    	int i,j,rows,cols,states,actions,gotos,top=0,nextToken,rule,row,col;
    	char *action;
    	char **table;
    
    // the lhs array is a lookup table of just the LHSs of the grammar rules,
    //	indexed by rule number; lhs[0] is not used
    	char *lhs[7] = {"VOID","E","E","T","T","F","F"};
    // the rhs_count array tells how many symbols to pop for a reduce action
    	int rhs_count[7] = {0,3,1,3,1,3,1};
    // (it would probably be more realistic for the grammar rules
    //	to be read in from a file, or interactive input)
    
    // read in the parsing table
    	table = readTable(&states, &actions, &gotos);
    
    	rows = states + 1;
    	cols = actions + gotos + 1;
    
    // get user input string
    	printf("\n\nEnter input expression.\n\n>");
    
    // the actual parsing begins here
    // start with state 0 on the stack
    	push("0");
    
    // put first input character into buffer to begin lexical analysis
    	myGetChar();
    	nextToken = lex();
    
    
    //DEBUG
    printf("parser.c: returned from lex\n");
    //ENDDEBUG
    
    // continue to process the rest of the input
    	while (nextToken != 0){
    
    // find the table column for nextToken
    		col=1;
    
    //DEBUG
    printf("%d is the length of\n",(int)strlen(*(table+col)));
    printf("%10s\n", *(table+col));
    printf(" ho ho ho\n");
    printf("%10s\n",lexeme_tbl[lookup(nextToken)]);
    printf("%d\n",strcmp(*(table+col),lexeme_tbl[lookup(nextToken)]));
    //ENDDEBUG
    
    		while(strcmp(*(table+col),lexeme_tbl[lookup(nextToken)])!=0){
    //DEBUG
    printf("%10s %10s\n", *(table+col),lexeme_tbl[lookup(nextToken)] );
    //ENDDEBUG
    			col++;
    		}
    
    //DEBUG
    printf("parser.c: out of first while loop\n");
    //ENDDEBUG
    
    //	find the appropriate action for this token as shown by the parse table;
    //	 row 0 has the column headings, so the action row number = state+1
    		action = *(table+(1+atoi(peek()))*cols+col);
    
    //DEBUG
    printf("action is %s\n",action);
    printf("parser.c: ready to check S-rules\n");
    //ENDDEBUG
    
    //	S-rules:
    		if(*action=='S'){
    			push(lexeme_tbl[lookup(nextToken)]);
    			push(action+1); // this trims off the "S" leaving just the state
    
    // nextToken was consumed (pushed onto stack) so call lex to get new nextToken
    			nextToken = lex();
    		} // end if(*action=='S')
    
    // R-rules:
    		else if (*action=='R'){
    			rule=atoi(action+1);
    			for(i=0;i<rhs_count[rule];i++){
    				pop(); pop();
    			}
    // find the row for the goto rule (1st row is column headings, so row = state+1)
    //    peek() gives the state currently at top of stack
    			row = 1+atoi(peek());
    			push(lhs[rule]);
    
    //	find table column for goto rule
    			col = actions+1; //this locates beginning of GOTO section of table
    
    //	next, find the column headed by the nonterminal on the lhs of the reduction rule
    			while(strcmp(*(table+col),lhs[rule]))
    				col++;
    			if( **(table+row*cols+col)!= 'B'){
    				push (*(table+row*cols+col));
    			}
    			else{
    				printf("Syntax error.\n");
    				nextToken=0;
    			}
    		} // end if(*action=='R')
    
    // accept rule:
    		else if (*action=='a'){
    			printf("OK.\n");
    			nextToken = 0;
    		} // end if(*action=='a')
    		else{	// default: *action=='B'; parse table error
    			printf("Syntax error.\n");
    			nextToken = 0;
    		}
    
    	} //end while
    
    // the while loop ends when input is accepted, or the parse table rules
    //  reveal a syntax error, or lex detects a syntax error and returns a '0'
    	
    
    	return 0;
    }
    lex.c
    Code:
    /*
      a simple lexical analyzer for lr_parser assignment.
      lex() prompts for keyboard input, returning c-strings that it recognizes.
      lex() recognizes inputs of "id", "+", "*", "(", ")", and whitespace.
      anything else is reported as a syntax error.
    */
    
    #include <string.h>
    #include <stdio.h>
    #include "lex.h"
    #define ID_CODE 128
    #define IDENT_MAXSIZE 25
    
    
    extern int lexeme_idx[];
    
    int lex() {
    	char lexeme[IDENT_MAXSIZE];
    	int i=0, temp;
    
    //	myGetChar(); //moved this to parser.c; to be done only before the first call to lex
    
    /* consume blank spaces between lexemes */
    	while(charClass == SPACE)
    		myGetChar();
    
    	switch(charClass) {
    	case LETTER:
    		lexeme[i]=ch;
    		i++;
    		lexeme[i]='\0';
    		myGetChar();
    		while(charClass == LETTER){
    			lexeme[i]=ch;
    			i++;
    			lexeme[i]='\0';
    			if(i>2){
    				printf("Syntax error; invalid identifier: %s\n",lexeme);
    				return 0;
    			}
    			myGetChar();
    		}
    
    /* in this implementation, if the lexeme is a char string other than "id" it is an error: */
    		if(strcmp(lexeme, "id")){
    			printf("Syntax error: invalid identifier: %s\n",lexeme);
    			return 0;
    		}
    		return ID_CODE;
    		break;
    	case OPER:
    		temp = ch;
    		myGetChar(); // to load next character into buffer
    		return temp;				
    		break;
    	case CR: // put a $ to terminate the input
    		ch = '$';
    		return ch;
    		break;
    	case INVALID:
    		lexeme[i]=ch;
    		i++;
    		lexeme[i]='\0';
    		printf("Syntax error: invalid token: %s\n",lexeme);
    		return 0;
    		break;
    	}
    }
    
    
    int myGetChar(){
    	ch = getchar();
    	if ((ch > 96 && ch < 123))
    		charClass = LETTER;
    	else if ((ch > 39 && ch < 44))
    		charClass = OPER;
    	else if (ch == ' ')
    		charClass = SPACE;
    	else if (ch == 13 || ch == 10)
    		charClass = CR;
    	else
    		charClass = INVALID;
    	return ch;
    }
    
    
    int lookup(int lex_code){
    	int i=0;
    
    	while(lexeme_idx[i] != lex_code)
    		i++;
    	return i;
    }
    sstack.c
    Code:
    #include <assert.h>
    #include <stdio.h>
    #include "sstack.h"
    
    extern char *stk[STACK_SIZE];
    
    
    void clear(){
    	top = 0;
    }
    
    Bool is_empty(){
    	return top==0;
    }
    
    Bool is_full(){
    	return (top==STACK_SIZE);
    }
    
    void push(char *str){
    	assert (! is_full());
    	stk[top] = str;
    	top++;
    }
    
    char *pop(){
    	assert (! is_empty());
    	top--;
    	return stk[top];
    }
    
    char *peek(){
    	assert (! is_empty());
    	return stk[top-1];
    }
    readTable.c
    Code:
    /*
    function to read a parsing table from a text file & return it as an array
    the file structure is:
    line 1: number of states
    line 2: number of action columns
    line 3: number of goto columns
    lines 4...: the entries in a single column (one entry per line) for row 1 (the column headings),
    				followed by row 2, and so on...
    
    readTable returns a pointer to the array
    */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int ch; // ch holds character input from keyboard; also used in file lex.c
    
    char **readTable (int *states, int *actions, int *gotos) {
    
    	FILE *infile;
    	int rows, cols;
    	char filename[81];
    	char inbuffer[10];
    	char **p_table;
    	char *table_entry;
    	
    	int i,j,k;
    
    /* get the name of the input file */
    	printf("Enter the name of the parsing table file.\n\n>");
    
    	for(i=0; (i<80) && ((ch = getchar()) != EOF) && (ch != '\n'); i++)
    		filename[i] = ch;
    
    	filename[i] = '\0';
    
    
    /* read in the dimensions of the table & allocate an array of pointers to hold it */
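    	infile = fopen(filename,"r");
    	if (infile == NULL) {	/* cannot continue without the table file */
    		perror(filename);
    		exit(EXIT_FAILURE);
    	}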
    	
    	for(i=0; (i<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); i++)
    		inbuffer[i] = ch;
    	inbuffer[i] = '\0';
    	*states = atoi(inbuffer);
    
    	for(i=0; (i<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); i++)
    		inbuffer[i] = ch;
    	inbuffer[i] = '\0';
    	*actions = atoi(inbuffer);
    
    	for(i=0; (i<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); i++)
    		inbuffer[i] = ch;
    	inbuffer[i] = '\0';
    	*gotos = atoi(inbuffer);
    
    	rows = (*states) + 1;
    	cols = (*actions) + (*gotos) + 1;
    
    	p_table = malloc((rows)*(cols)*sizeof(char*));
    
    
    /* read in each table entry (string), allocate an array to hold it,
    	and set the corresponding entry in the table array to point to the string */
    
    /* first do row 1 (column headings) */
    	*p_table = "state";  // needed only if the table is to be printed
    	for(j=1; j<(cols); j++){
    
    		for(k=0; (k<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); k++)
    			inbuffer[k] = ch;
    		inbuffer[k] = '\0';
    
    		table_entry = malloc(strlen(inbuffer)+1);
    		strcpy(table_entry,inbuffer);
    		*(p_table+j)=table_entry;
    	}
    
    /* now the rest of the table */
    	for(i=1; i<rows; i++)
    		for(j=0; j<cols; j++){
    
    			for(k=0; (k<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); k++)
    				inbuffer[k] = ch;
    			inbuffer[k] = '\0';
    
    			table_entry = malloc(strlen(inbuffer)+1);
    			strcpy(table_entry,inbuffer);
    			*(p_table+i*cols+j)=table_entry;
    		}
    
    
    //DEBUG
    for(k=0; k<rows*cols; k++)
    	printf("%s\n",*(p_table+k));
    printf("leaving readTable\n");
    
    
    for(i=0;i<rows;i++){
    	for(j=0;j<cols;j++)
    		printf("%8s",*(p_table+i*cols+j));
    	printf("\n");
    }
    //ENDDEBUG
    
    
    
    	fclose(infile);	/* the table is fully read; release the file */
    	return p_table;
    
    
    	
    }
    readTable.h
    Code:
    #ifndef readTable_h
    #define readTable_h
    
    char **readTable(int *states, int *actions, int *gotos);
    
    #endif
    lex.h
    Code:
    #ifndef lex_h
    #define lex_h
    
    int lex();
    int myGetChar();
    int lookup(int);
    
    extern int ch; /* defined once, in readTable.c */
    enum {LETTER, OPER, SPACE, CR, INVALID} charClass;
    
    #endif
    sstack.h
    Code:
    #ifndef sstack_h
    #define sstack_h
    
    #define STACK_SIZE 30
    
    
    int top;
    typedef int Bool;
    
    
    void clear();
    Bool is_empty();
    Bool is_full();
    void push(char*);
    char *pop();
    char *peek();
    
    #endif
    this is the makefile I'm using for gcc (NOT for msvc):
    Code:
    parser.exe:	parser.o	sstack.o	readTable.o	lex.o
    	gcc	-o	parser.exe	parser.o	sstack.o	readTable.o	lex.o
    
    parser.o:	parser.c	sstack.h	readTable.h	lex.h
    	gcc	-c	parser.c
    
    sstack.o:	sstack.c	sstack.h
    	gcc	-c	sstack.c
    
    readTable.o:	readTable.c
    	gcc	-c	readTable.c
    
    lex.o:	lex.c	lex.h
    	gcc	-c	lex.c
    table.txt
    Code:
    12
    6
    3
    id
    +
    *
    (
    )
    $
    E
    T
    F
    0
    S5
    B
    B
    S4
    B
    B
    1
    2
    3
    1
    B
    S6
    B
    B
    B
    acc
    B
    B
    B
    2
    B
    R2
    S7
    B
    R2
    R2
    B
    B
    B
    3
    B
    R4
    R4
    B
    R4
    R4
    B
    B
    B
    4
    S5
    B
    B
    S4
    B
    B
    8
    2
    3
    5
    B
    R6
    R6
    B
    R6
    R6
    B
    B
    B
    6
    S5
    B
    B
    S4
    B
    B
    B
    9
    3
    7
    S5
    B
    B
    S4
    B
    B
    B
    B
    10
    8
    B
    S6
    B
    B
    S11
    B
    B
    B
    B
    9
    B
    R1
    S7
    B
    R1
    R1
    B
    B
    B
    10
    B
    R3
    R3
    B
    R3
    R3
    B
    B
    B
    11
    B
    R5
    R5
    B
    R5
    R5
    B
    B
    B
    the parse table looks like this:
    Code:
       state      id       +       *       (       )       $       E       T       F
    
           0      S5       B       B      S4       B       B       1       2       3
    
           1       B      S6       B       B       B     acc       B       B       B
    
           2       B      R2      S7       B      R2      R2       B       B       B
    
           3       B      R4      R4       B      R4      R4       B       B       B
    
           4      S5       B       B      S4       B       B       8       2       3
    
           5       B      R6      R6       B      R6      R6       B       B       B
    
           6      S5       B       B      S4       B       B       B       9       3
    
           7      S5       B       B      S4       B       B       B       B      10
    
           8       B      S6       B       B     S11       B       B       B       B
    
           9       B      R1      S7       B      R1      R1       B       B       B
    
          10       B      R3      R3       B      R3      R3       B       B       B
    
          11       B      R5      R5       B      R5      R5       B       B       B
    Last edited by R.Stiltskin; 10-17-2005 at 12:26 PM.

  2. #2
    Unregistered User
    Join Date
    Sep 2005
    Location
    Antarctica
    Posts
    341
    Well, I'm not going to weed through all that code, but when you built it in MSVC, did you do a debug build or a release build? It would be interesting to see if it crashed when you did a release build as opposed to a debug build.

  3. #3
    End Of Line Hammer's Avatar
    Join Date
    Apr 2002
    Posts
    6,231
    Sounds like you need to learn how to use gdb.
    I stuck all your code into one source file, compiled and ran it. Here is what happened:

    Code:
    $ gcc -ggdb junk1.c
    
    $ gdb ./a
    GNU gdb 6.3.50_2004-12-28-cvs (cygwin-special)
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB.  Type "show warranty" for details.
    This GDB was configured as "i686-pc-cygwin"...
    (gdb) run
    Starting program: /c/junk/a.exe
    Enter the name of the parsing table file.
    
    >junk1.txt
    
    Enter input expression.
    
    >id * id
    parser.c: returned from lex
    3 is the length of
           id
     ho ho ho
            id
    13
             id
             id
    Program received signal SIGSEGV, Segmentation fault.
    0x610d8e5c in strcmp () from /usr/bin/cygwin1.dll
    
    (gdb) where
    #0  0x610d8e5c in strcmp () from /usr/bin/cygwin1.dll
    #1  0x00401248 in main () at junk1.c:122
    (gdb)
    As you can see, it gives a filename and line number denoting the crash point. Try it yourself, and see how you get on.

    If you haven't got gdb, then this ain't really relevant!
    When all else fails, read the instructions.
    If you're posting code, use code tags: [code] /* insert code here */ [/code]

  4. #4
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    In msvc it works correctly in release build as well as in debug build.

    I just found that using gcc (in linux) my readTable function was putting '\r' (char 13) at the end of each table entry, so that's what was causing the printf() and strcmp() confusion and seg faults.

    I fixed that by changing
    Code:
    			for(k=0; (k<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); k++)
    				inbuffer[k] = ch;
    to
    Code:
    			for(k=0; (k<9) && ((ch = fgetc(infile)) != EOF) && (ch != 10) && (ch != 13); k++)
    				inbuffer[k] = ch;
    so it's not crashing anymore.

    The gcc version is still not working right, but at least I'm making some progress.
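A stray '\r' like the one described above is hard to spot with `%s`, because the terminal just moves the cursor. Here is a rough sketch of a debugging helper that renders non-printable bytes as `\xNN` escapes. `show_bytes` is a made-up name, and it assumes a C99 library that provides `snprintf`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `s` into `out` with non-printable bytes shown as \xNN escapes,
 * so that "id\r" is rendered as id\x0d instead of looking like "id".
 * show_bytes is a hypothetical debugging helper, not part of the posted code.
 * Returns `out`. Output is truncated if it would not fit in `outsize` bytes.
 */
char *show_bytes(const char *s, char *out, size_t outsize)
{
    size_t used = 0;

    assert(s != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        char piece[5];  /* longest piece is "\xNN" plus the terminator */

        if (c >= 0x20 && c < 0x7f)
            snprintf(piece, sizeof piece, "%c", c);
        else
            snprintf(piece, sizeof piece, "\\x%02x", (unsigned)c);

        if (used + strlen(piece) + 1 > outsize)
            break;  /* no room left: stop rather than overflow */
        strcpy(out + used, piece);
        used += strlen(piece);
    }
    return out;
}
```

With a local `char buf[64];`, a debug line such as `printf("[%s]\n", show_bytes(*(table+col), buf, sizeof buf));` would make the extra byte visible.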

  5. #5
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    You're right, Hammer, I do need to learn how to use a debugger & I've been meaning to for a few years now. Never seems to be enough time though.

  6. #6
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    To add to Hammer's post, gdb requires that the executable contain its symbol table to have the most useful information. Most (all?) compilers have an option to not include the symbol table so the final executable is smaller. But with the symbol table gdb can tell you all sorts of useful stuff like what value different variables had when the program crashed, etc. From where Hammer left off in gdb the next few logical steps would be to move up one level in the stack frame (with the 'up' command) since you can be pretty certain that the crash wasn't actually caused by strcmp(). Once you move up you can do 'list' to show the code around the point where the program crashed. Then you can 'print <variable name>' to show what its value was when the program crashed.
    If you understand what you're doing, you're not learning anything.

  7. #7
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Problem solved -- but I don't really understand why it occurred in the first place. It seems that when I tried to run the program in my linux machine (using the same input file "table.txt" that I was using in Windows), it was stumbling over cr characters (13) that I hadn't written code to handle. But why was it working with the MSVC compiler without that code? Does the MS compiler "automatically" ignore cr characters, or did those characters not exist in the Windows input file (written in Notepad2) and then magically appear when I copied that file into my linux system?

    There definitely seems to be something "unobvious" going on here. Can anybody shed any light on it?


    PS: the solution was simply to add code to consume (& discard) the extraneous 10 and 13 chars; it now says
    Code:
    /* first do row 1 (column headings) */
    	*p_table = "state";  // needed only if the table is to be printed
    	for(j=1; j<(cols); j++){
    
    	/* get rid of any CR or LF characters floating around */
    		while( ((ch = fgetc(infile)) == 10) || (ch == 13) );
    
    	/* now read in the actual table entries */
    		inbuffer[0] = ch;
    	/* read characters from infile, ignoring EOF, CR and LF */
    		for(k=1; (k<9) && ((ch = fgetc(infile)) != EOF)  && (ch != 10) && (ch != 13); k++)
    			inbuffer[k] = ch;
    		inbuffer[k] = '\0';
    
    		table_entry = malloc(strlen(inbuffer)+1);
    		strcpy(table_entry,inbuffer);
    		*(p_table+j)=table_entry;
    	}
    instead of
    Code:
    /* first do row 1 (column headings) */
    	*p_table = "state";  // needed only if the table is to be printed
    	for(j=1; j<(cols); j++){
    
    		for(k=0; (k<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); k++)
    			inbuffer[k] = ch;
    		inbuffer[k] = '\0';
    
    		table_entry = malloc(strlen(inbuffer)+1);
    		strcpy(table_entry,inbuffer);
    		*(p_table+j)=table_entry;
    	}
    and similarly for the rest of the table.
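Since the same read loop appears for the heading row and for the rest of the table, one option is to factor it into a single function. This is only a sketch under some assumptions: `read_entry` is a hypothetical name, blank lines are skipped (as in the fix above), and, unlike the posted loop, characters beyond the buffer size are discarded rather than left for the next read.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read one table entry (one line) from `fp` into `buf`, which holds
 * `size` bytes. Leading CR/LF characters are skipped and the entry ends
 * at the next CR, LF, or EOF, so "\n" and "\r\n" files both work.
 * Returns the number of characters stored, or -1 if EOF is reached
 * before any entry character is found.
 * read_entry is a hypothetical helper, not part of the posted code.
 */
int read_entry(FILE *fp, char *buf, int size)
{
    int c, k = 0;

    assert(fp != NULL && buf != NULL && size > 0);

    /* discard line terminators left over from the previous entry */
    do {
        c = fgetc(fp);
    } while (c == '\n' || c == '\r');

    if (c == EOF) {
        buf[0] = '\0';
        return -1;
    }

    /* copy characters until the line ends; excess characters are dropped */
    while (c != EOF && c != '\n' && c != '\r') {
        if (k < size - 1)
            buf[k++] = (char)c;
        c = fgetc(fp);
    }
    buf[k] = '\0';
    return k;
}
```

Both loops in readTable could then call `read_entry(infile, inbuffer, sizeof inbuffer)` and treat -1 as a truncated table file.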

  8. #8
    Gawking at stupidity
    Join Date
    Jul 2004
    Location
    Oregon, USA
    Posts
    3,218
    MS Windows actually uses \r\n as its newline sequence. *NIX uses \n and (I think) MacOS uses \r.

  9. #9
    End Of Line Hammer's Avatar
    Join Date
    Apr 2002
    Posts
    6,231
    How did you get the file from your Windows machine to the *nix one?

    If you ftp it, using standard text mode transfer (normally default), then the CRLFs will be converted, meaning that CRLF on Windows will become LF on *nix.

    If you ftp'd using binary mode, then the conversion won't happen and your program will need to cater for the CR. The same is going to be true for file sharing across the two systems.

    And yes, Macs do use CR (\r).
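To check which of the three conventions mentioned above a file actually uses, something along these lines could be run over its raw bytes (read in binary mode). It is a sketch only; `eol_counts` and `count_eols` are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Counts of each line-terminator style found in a buffer. */
struct eol_counts {
    size_t crlf;  /* "\r\n" pairs */
    size_t lf;    /* lone "\n" */
    size_t cr;    /* lone "\r" */
};

/* Scan `len` bytes of `data` and tally the line terminators. */
struct eol_counts count_eols(const char *data, size_t len)
{
    struct eol_counts n = {0, 0, 0};
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (data[i] == '\r') {
            if (i + 1 < len && data[i + 1] == '\n') {
                n.crlf++;
                i++;  /* consume the LF that belongs to this CR */
            } else {
                n.cr++;
            }
        } else if (data[i] == '\n') {
            n.lf++;
        }
    }
    return n;
}
```

A file with a nonzero `crlf` count and zero `lf` count was most likely saved with Windows line endings and copied over without translation.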

    Take 5 minutes to learn how to use gdb. The basics are dead easy, and they will help you solve a lot of problems that you'll encounter. Often all you need to solve a problem is to know what line of code caused the crash, and when you go back to look at the source, you spot the problem straight away.

  10. #10
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Yeah, I just looked at the file in hex and as you both said, it does have 0D 0A at the end of each line. And I transferred it by uploading it to a Yahoo briefcase & downloading to the other computer, so there was no translation, and it's no wonder that my program went haywire on the linux machine.

    But my question is this: the input file had CR/LF at the end of each line all along, and my original program had no code to handle the CR. I was reading in 1 character at a time, using
    Code:
    		for(k=0; (k<9) && ((ch = fgetc(infile)) != EOF) && (ch != '\n'); k++)
    			inbuffer[k] = ch;
    		inbuffer[k] = '\0';
    so why did my original program work correctly in MSVC?

    Does the MSVC compiler automatically translate
    Code:
    ch != '\n'
    to
    Code:
    ch != 10 && ch != 13
    ???

  11. #11
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    7. It is easier to write an incorrect program than understand a correct one.
    40. There are two ways to write error-free programs; only the third one works.*

  12. #12
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    Thanks Dave. You got me looking in the right place. In
    Code:
    FILE *fopen( const char *filename, const char *mode );
    I used "r", which is good since the modes "t" (text) and "b" (binary) are Microsoft extensions and not ANSI portable. But ... here are excerpts of what I found in the MSVC documentation regarding the t and b modes:
    t

    Open in text (translated) mode. In this mode, CTRL+Z is interpreted as an end-of-file character on input. In files opened for reading/writing with "a+", fopen checks for a CTRL+Z at the end of the file and removes it, if possible. This is done because using fseek and ftell to move within a file that ends with a CTRL+Z, may cause fseek to behave improperly near the end of the file.

    Also, in text mode, carriage return–linefeed combinations are translated into single linefeeds on input, and linefeed characters are translated to carriage return–linefeed combinations on output. When a Unicode stream-I/O function operates in text mode (the default), the source or destination stream is assumed to be a sequence of multibyte characters. Therefore, the Unicode stream-input functions convert multibyte characters to wide characters (as if by a call to the mbtowc function). For the same reason, the Unicode stream-output functions convert wide characters to multibyte characters (as if by a call to the wctomb function).

    b

    Open in binary (untranslated) mode; translations involving carriage-return and linefeed characters are suppressed.

    If t or b is not given in mode, the default translation mode is defined by the global variable _fmode.
    _fmode
    The _fmode variable sets the default file-translation mode for text or binary translation. It is declared in STDLIB.H as

    extern int _fmode;

    The default setting of _fmode is _O_TEXT for text-mode translation. _O_BINARY is the setting for binary mode.
    So MSVC let me get away with ignoring the CRs in my original code. Its default behavior treated my "r" mode as "t", meaning that on input it was translating the CR/LFs to just LFs which my code could handle.

    On my linux machine, I was using the same input text file with CR/LF on each line. gcc did not do the Microsoft-style translation, so of course it crashed.

    Good lesson.
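The behaviour described above can be observed directly with a small experiment: write "id\r\n" as raw bytes, then read it back in a chosen mode and count what arrives. This is an illustrative sketch; `roundtrip_crlf` and the scratch file name are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "id\r\n" to `path` as raw bytes, then read it back using `mode`
 * ("r" or "rb") and store the bytes in `out`. Returns the number of bytes
 * read, or -1 on any I/O failure.
 * roundtrip_crlf is a hypothetical demo helper.
 */
int roundtrip_crlf(const char *path, const char *mode, char *out, int outsize)
{
    FILE *fp;
    int c, n = 0;

    assert(path != NULL && mode != NULL && out != NULL);

    fp = fopen(path, "wb");  /* binary: the CR and LF are written untouched */
    if (fp == NULL)
        return -1;
    if (fputs("id\r\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, mode);  /* reopen with the mode under test */
    if (fp == NULL)
        return -1;
    while (n < outsize && (c = fgetc(fp)) != EOF)
        out[n++] = (char)c;
    fclose(fp);
    return n;
}
```

With "rb" the result is 4 bytes on any system. With "r" it should be 3 on a Windows C runtime in its default text mode (the CR is dropped) and 4 on Linux, which is the difference that bit the original program.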

  13. #13
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by R.Stiltskin
    I used "r", which is good since the modes "t" (text) and "b" (binary) are Microsoft extensions and not ANSI portable.
    Hm? Two standard read modes are
    • "r" - text mode
    • "rb" - binary mode
    I've only seen the "rt" combination, never a "t" alone (not that I'd use it anyways).

  14. #14
    Registered User
    Join Date
    Feb 2003
    Posts
    596
    The "t" and "b" are described in MSVC 6.0 help for fopen() (quoted in my previous post). I never noticed them until this afternoon. Previously, I had only used "r" or "w" or "a".

    In what environment do you use "rb"?

    My linux man page for fopen() only lists mode options r, r+, w, w+, a, a+, and then says:
    ... The mode string can also include the letter ‘‘b’’ either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with ANSI X3.159-1989 (‘‘ANSI C’’) and has no effect; the ‘‘b’’ is ignored on all POSIX conforming systems, including Linux. (Other systems may treat text files and binary files differently, and adding the ‘‘b’’ may be a good idea if you do I/O to a binary file and expect that your program may be ported to non-Unix environments.)

  15. #15
    Just Lurking Dave_Sinkula's Avatar
    Join Date
    Oct 2002
    Posts
    5,005
    Quote Originally Posted by R.Stiltskin
    The "t" and "b" are described in MSVC 6.0 help for fopen() (quoted in my previous post).
    Yes, I just don't think you read it right.
    http://msdn.microsoft.com/library/de...c_._wfopen.asp
    In addition to the above values, the following characters can be included in mode to specify the translation mode for newline characters:
    t
    Open in text [...]
    Quote Originally Posted by R.Stiltskin
    In what environment do you use "rb"?
    Any, all.

    [edit]Try looking at a standard or two.
    Last edited by Dave_Sinkula; 10-19-2005 at 06:23 PM.
