Thread: Why does the local socket program core dump?

  1. #1
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28

    Why does the local socket program core dump?

    Hi all,
    I am learning from a Linux programming book and find that the following program dumps core on Fedora 13. Could you please help me see what is happening? It seems nothing is written to the socket. Has anything about sockets changed in the kernel? Thanks a lot.
    server.c
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>
    /* Read text from the socket and print it out. Continue until the
       socket closes. Return nonzero if the client sent a "quit"
       message, zero otherwise. */
    int server (int client_socket)
    {
      while (1) {
        int length;
        char* text;
        /* First, read the length of the text message from the socket. If
           read returns zero, the client closed the connection. */
        if (read (client_socket, &length, sizeof (length)) == 0)
          return 0;
        /* Allocate a buffer to hold the text. */
        text = (char*) malloc (length);
        /* Read the text itself, and print it. */
        read (client_socket, text, length);
        printf ("%s\n", text);
        /* If the client sent the message "quit", all done. */
        if (!strcmp (text, "quit")) {
          /* Free the buffer. */
          free (text);
          return 1;
        }
        /* Free the buffer. */
        free (text);
      }
    }
    int main (int argc, char* const argv[])
    {
      const char* const socket_name = argv[1];
      int socket_fd;
      struct sockaddr_un name;
      int client_sent_quit_message;
      /* Create the socket. */
      socket_fd = socket (PF_LOCAL, SOCK_STREAM, 0);
      /* Indicate that this is a server. */
      name.sun_family = AF_LOCAL;
      strcpy (name.sun_path, socket_name);
      bind (socket_fd, &name, SUN_LEN (&name));
      /* Listen for connections. */
      listen (socket_fd, 5);
      /* Repeatedly accept connections, spinning off one server() to deal
         with each client. Continue until a client sends a “quit” message. */
      do {
        struct sockaddr_un client_name;
        socklen_t client_name_len;
        int client_socket_fd;
        /* Accept a connection. */
        client_socket_fd = accept (socket_fd, &client_name, &client_name_len);
        /* Handle the connection. */
        client_sent_quit_message = server (client_socket_fd);
        /* Close our end of the connection. */
        close (client_socket_fd);
      }
      while (!client_sent_quit_message);
      /* Remove the socket file. */
      close (socket_fd);
      unlink (socket_name);
      return 0;
    }
    client.c
    Code:
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>
    /* Write TEXT to the socket given by file descriptor SOCKET_FD. */
    void write_text (int socket_fd, const char* text)
    {
      /* Write the number of bytes in the string, including
         NUL-termination. */
      int length = strlen (text) + 1;
      write (socket_fd, &length, sizeof (length));
      /* Write the string. */
      write (socket_fd, text, length);
    }
    int main (int argc, char* const argv[])
    {
      const char* const socket_name = argv[1];
      const char* const message = argv[2];
      int socket_fd;
      struct sockaddr_un name;
      /* Create the socket. */
      socket_fd = socket (PF_LOCAL, SOCK_STREAM, 0);
      /* Store the server’s name in the socket address. */
      name.sun_family = AF_LOCAL;
      strcpy (name.sun_path, socket_name);
      /* Connect the socket. */
      connect (socket_fd, &name, SUN_LEN (&name));
      /* Write the text on the command line to the socket. */
      write_text (socket_fd, message);
      close (socket_fd);
      return 0;
    }
    Then run
    ./server /tmp/testsocket

    and run
    ./client /tmp/testsocket "test1"
    to see what is shown on the ./server side.

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    So run it in gdb, wait for it to crash, then examine the variable(s) in use at the line that caused the crash.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Super Moderater.
    Join Date
    Jan 2005
    Posts
    374
    Yes that's good advice.

  4. #4
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28
    Thanks a lot for your advice, Salem.
    As I remember, the server side stops in strlen() at:
    movl (%eax), %ecx
    where %eax is shown as 0 (a NULL pointer). So I guess ./server got nothing from the local socket.
    The code works well under RH5.
    I am still trying to learn more about gdb so I can get more information from the crash. Can anyone guide me?
    BTW, is there any way to inspect the contents of a local socket?
    Last edited by chenayang; 08-12-2010 at 08:13 AM.

  5. #5
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    So where exactly does your server code call strlen()?

    Useful gdb commands to begin with are
    bt - to show you where you are in the call sequence
    break - set a breakpoint; a good place to put one is the start of the function where the crash seems to originate
    step / next - for stepping through the code
    print - for printing things like variables.

    On a code note, read() and write() return information, which you're only very lightly checking.
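    One way to check those return values, as a minimal sketch. The helper names read_full and write_full are hypothetical (they are not from the book); they loop until the requested number of bytes has been transferred and report errors and early end-of-file instead of ignoring them.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `count` bytes from `fd` into `buf`.
   Returns 1 on success, 0 if the peer closed the connection before
   `count` bytes arrived, and -1 on error (errno is set by read). */
int read_full (int fd, void* buf, size_t count)
{
  char* p = buf;
  while (count > 0) {
    ssize_t n = read (fd, p, count);
    if (n < 0) {
      if (errno == EINTR)
        continue;               /* interrupted by a signal: retry */
      return -1;                /* real error, e.g. EBADF when fd is -1 */
    }
    if (n == 0)
      return 0;                 /* end of file before all bytes arrived */
    p += n;
    count -= (size_t) n;
  }
  return 1;
}

/* Hypothetical helper: write exactly `count` bytes from `buf` to `fd`.
   Returns 0 on success and -1 on error (errno is set by write). */
int write_full (int fd, const void* buf, size_t count)
{
  const char* p = buf;
  while (count > 0) {
    ssize_t n = write (fd, p, count);
    if (n < 0) {
      if (errno == EINTR)
        continue;               /* interrupted by a signal: retry */
      return -1;
    }
    p += n;
    count -= (size_t) n;
  }
  return 0;
}
```

    In server(), a call such as read_full (client_socket, &length, sizeof (length)) would then return -1 for a bad descriptor rather than leaving length untouched.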

  6. #6
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28
    Thanks. The environment is not available right now, so I will have to provide more information later. I think text = (char*) malloc (length) returns a NULL pointer when it gets a wrong length from the socket. I will add a check for the return value.
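    A note on that assumption, with a small sketch: malloc() does not necessarily return NULL just because the length is wrong. If read() fails, length is never written and holds whatever was on the stack, so it seems safer to check both the length and the pointer before using them. The names alloc_message and message_is_terminated and the MAX_MESSAGE_LENGTH limit below are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed upper bound for one message; not taken from the book. */
#define MAX_MESSAGE_LENGTH 4096

/* Hypothetical helper: allocate a zeroed buffer for a message whose
   length (including the terminating NUL) was read from the socket.
   Returns NULL if the length is implausible or the allocation fails. */
char* alloc_message (int length)
{
  if (length <= 0 || length > MAX_MESSAGE_LENGTH)
    return NULL;
  return calloc ((size_t) length, 1);
}

/* Hypothetical helper: return 1 if the `length` bytes at `text` end in
   a NUL byte, so that passing `text` to printf ("%s") is safe. */
int message_is_terminated (const char* text, int length)
{
  return length > 0 && text[length - 1] == '\0';
}
```

    The caller still has to test the returned pointer against NULL before reading into it or printing it.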

  7. #7
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28
    The following is the gdb information:
    First:
    $ gdb --args ./socket-server /tmp/sockettest
    (gdb) b 20 //break at malloc()
    Breakpoint 1 at 0x804862f: file socket-server.c, line 20.
    (gdb) r
    Starting program: /..../socket-server /tmp/sockettest
    ======
    Then start client:
    For client:
    $ gdb --args ./socket-client /tmp/sockettest "test1"
    (gdb) r
    ======
    Then continue on the server side:
    Breakpoint 1, server (client_socket=-1) at socket-server.c:20
    20 text = (char*) malloc (length);
    (gdb) b 23 //Then break at line23 printf ("%s\n", text);
    Breakpoint 2 at 0x8048656: file socket-server.c, line 23.

    Assembly code for the printf is as follows:
    23 printf ("%s\n", text);
    => 0x08048656 <+82>: 8b 45 f4 mov eax,DWORD PTR [ebp-0xc]
    0x08048659 <+85>: 89 04 24 mov DWORD PTR [esp],eax
    0x0804865c <+88>: e8 c7 fe ff ff call 0x8048528 <puts@plt>
    Stepping over the instruction at 0x0804865c triggers the error:
    Program received signal SIGSEGV, Segmentation fault.
    __strlen_ia32 () at ../sysdeps/i386/i586/strlen.S:99
    99 movl (%eax), %ecx /* get word (= 4 bytes) in question */
    ====

    I found the accept() return value is -1 and errno is 22.
    Does anyone know why?

  8. #8
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Error numbers are in errno.h

    Try something like
    Code:
    client_socket_fd = accept (socket_fd, &client_name, &client_name_len);
    if ( client_socket_fd < 0 ) {
        perror("Accept failed");
    }

  9. #9
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28
    Hi Salem, thank you so much for showing me how to print the error details.
    Error 22 is EINVAL (invalid argument), which for accept() means:
    Socket is not listening for connections, or addrlen is invalid (e.g., is negative).
    It seems something may be wrong in the client code as well, so I will check all the return values later to see which step fails.
    But I wonder whether my Fedora 13 installation has a problem, since all the code runs well on RH5. I would appreciate it if someone with Fedora could try the code.

  10. #10
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > Socket is not listening for connections, or addrlen is invalid (e.g., is negative).
    Well then I would suggest you add error checking to your other functions like bind and listen.
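    As a rough sketch of what that error checking could look like, the server's setup can be wrapped in one function that reports which call failed. The name make_listening_socket is hypothetical, and AF_UNIX with a zeroed address and sizeof (name) is used here in place of the book's AF_LOCAL and SUN_LEN.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a local stream socket bound to `path`
   and listening for connections. Returns the descriptor, or -1 after
   printing which call failed. */
int make_listening_socket (const char* path)
{
  struct sockaddr_un name;
  int fd;

  /* strcpy into sun_path would overflow for an overlong path. */
  if (strlen (path) >= sizeof (name.sun_path)) {
    fprintf (stderr, "socket path too long\n");
    return -1;
  }
  fd = socket (AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0) {
    perror ("socket()");
    return -1;
  }
  memset (&name, 0, sizeof (name));
  name.sun_family = AF_UNIX;
  strcpy (name.sun_path, path);
  if (bind (fd, (struct sockaddr*) &name, sizeof (name)) < 0) {
    perror ("bind()");
    close (fd);
    return -1;
  }
  if (listen (fd, 5) < 0) {
    perror ("listen()");
    close (fd);
    return -1;
  }
  return fd;
}
```

    main() would then stop early when the function returns -1 instead of going on to call accept() on a socket that was never set up.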

  11. #11
    brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    I don't see any call to setsockopt(server, SOL_SOCKET, SO_REUSEADDR, ...) which means the server socket may fail to bind to the port if you run the server repeatedly. In that case, the call to bind() may end up returning -1 and any further calls to accept() will return -1 as well.

    Check the return value of bind(). If it is -1, print out the last error message with this:

    Code:
    perror("bind()");
    It'll probably tell you something about the address already being in use.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  12. #12
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28
    Hi Salem and brewbuck, I have printed the result of each socket function:
    $ ./socket-server /tmp/sockettest
    bind(): Success
    Listen(): Success
    accept(): Invalid argument

    For client:
    ./socket-client /tmp/sockettest "test"
    socket(): Success
    socket_fd in main is 3
    connect(): Success
    socket_fd in write_text() is 3

    I am trying to get the errno for the client's write(), but it seems the client also exits once the server side exits.
    Before I run the server code I have removed the local socket file.

  13. #13
    brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Ugh. Sorry, my brain must be somewhere else. The SO_REUSEADDR thingie does not apply to UNIX domain sockets. But clearly, the failure is happening when you try to accept() a connection. The error is EINVAL which means you have passed something bogus in one of the parameters to accept().

  14. #14
    Registered User
    Join Date
    Nov 2008
    Posts
    30
    Quote Originally Posted by chenayang
    Code:
        /* Accept a connection. */
        client_socket_fd = accept (socket_fd, &client_name, &client_name_len);
    I think client_name_len should be initialized to the size of the client_name structure, which is sizeof (struct sockaddr_un).

  15. #15
    Registered User
    Join Date
    Feb 2008
    Location
    China
    Posts
    28

    Quote Originally Posted by salquestfl
    Code:
        /* Accept a connection. */
        client_socket_fd = accept (socket_fd, &client_name, &client_name_len);
    I think client_name_len should be initialized to the size of the client_name structure, which is sizeof (struct sockaddr_un).

    Thanks a lot, salquestfl. Initializing client_name_len with a non-negative value really does solve it:
    Code:
    //NG// socklen_t client_name_len;
    //OK// socklen_t client_name_len = sizeof (client_name);
    socklen_t client_name_len = 0;
    But I still don't know why. I haven't found any reference that says it must be initialized. And why does it work on some systems without an initial value?
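
    A likely explanation, hedged since nobody in the thread confirmed it: accept() uses its third argument as a value-result parameter. On entry it must hold the size of the buffer that the second argument points to; on return it holds the size of the address actually stored. An uninitialized local just contains whatever was left on the stack. If that value happens to be negative when read as an int, Linux rejects the call with EINVAL (the accept() text quoted earlier in the thread: "addrlen is invalid (e.g., is negative)"). If it happens to be zero or positive, the call goes through, which would explain why the same program appears to work on another system or build. A minimal sketch, where accept_client is a hypothetical helper name:

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: accept one connection on a listening local
   socket. Returns the connected descriptor, or -1 after printing the
   reason for the failure. */
int accept_client (int listen_fd)
{
  struct sockaddr_un client_name;
  /* Value-result argument: it must hold the buffer size on entry;
     accept() overwrites it with the size of the stored address. */
  socklen_t client_name_len = sizeof (client_name);
  int client_fd = accept (listen_fd, (struct sockaddr*) &client_name,
                          &client_name_len);
  if (client_fd < 0)
    perror ("accept()");
  return client_fd;
}
```

    If the peer address is not needed at all, passing NULL for both the address and the length is also allowed.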

    Last Post: 11-26-2001, 04:43 PM