Thread: UDP and packets greater than 1500 bytes

  1. #1
    Registered User
    Join Date
    Jan 2011
    Posts
    5

    UDP and packets greater than 1500 bytes

    Hi, I'm developing a TFTP client and server and I want to select the UDP
    payload size dynamically to boost transfer performance.

    I have tested it with two Linux machines (one has a gigabit Ethernet card, the
    other a fast Ethernet one). I changed the MTU of the gigabit card to 2048 bytes
    and left the other at 1500.

    I used setsockopt(sockfd, IPPROTO_IP, IP_MTU_DISCOVER, &optval,
    sizeof(optval)) to set the MTU_DISCOVER flag to IP_PMTUDISC_DO.

    From what I have read, this option should set the DF bit to one, which should
    make it possible to find the minimum MTU of the network (the MTU of the host
    that has the lowest MTU). However, it only gives me an error when I send a
    packet bigger than the MTU of the machine I'm sending from.

    Also, the other machine (the server in this case) doesn't receive the oversized
    packets (the server has an MTU of 1500). All the oversized UDP packets are
    dropped; the only way is to send packets of 1472 bytes.

    Why do the hosts do this? From what I have read, if I send a packet larger than
    the MTU, the IP layer should fragment it.

  2. #2
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Can you show the code you're using?

    I routinely transfer 48k UDP messages on Windows with no problems at all. If I exceed the packet size, more than one packet is sent... but that's link-layer stuff and you shouldn't have to worry about it at the socket level.

  3. #3
    Registered User
    Join Date
    Jan 2011
    Posts
    5
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>             // IP_MTU_DISCOVER, IP_PMTUDISC_DO, IP_MTU
    #include <arpa/inet.h>
    #include <unistd.h>
    #include <string.h>
    
                                // datagram size = MTU - ( ip header + udp header ), i.e. 1500 - ( 20 + 8 ) = 1472
    #define DGRAM_SIZE 2020
    
    int main()
    {
        int sockfd = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if ( sockfd == -1 ) { perror("socket"); exit(1); }
    
        int optval = IP_PMTUDISC_DO;
        if (setsockopt(sockfd, IPPROTO_IP, IP_MTU_DISCOVER, &optval, sizeof(optval)) == -1) { perror("MTU discover"); exit(1); }
    
        void *buf = malloc(DGRAM_SIZE);
        memset(buf, 0x1f, DGRAM_SIZE);
    
        struct sockaddr_in svraddr;
    
        memset(&svraddr, 0, sizeof(svraddr));
        svraddr.sin_family = AF_INET;   // sin_family takes AF_INET, not PF_INET
        svraddr.sin_port = htons(69);
        svraddr.sin_addr.s_addr = inet_addr("192.168.1.107");
    
        if ( connect(sockfd, (struct sockaddr*)&svraddr, sizeof(svraddr)) == -1 ) { perror("connect"); exit(1); }
    
        int dgram_size;
        socklen_t dg_len = sizeof(dgram_size);
    /*
        if ( getsockopt(sockfd, IPPROTO_IP, IP_MTU, &dgram_size, &dg_len) == -1 ) {
                perror("getsockopt"); exit(1);
        }
    
        printf("Max fragment size ( before sending ): %d\n", dgram_size);
    */
        if (sendto(sockfd, buf, DGRAM_SIZE, 0, (struct sockaddr*)&svraddr, sizeof(svraddr)) == -1 ) {
            dgram_size = 0;
    
            if ( getsockopt(sockfd, IPPROTO_IP, IP_MTU, &dgram_size, &dg_len) == -1 ) {
                perror("getsockopt"); exit(1);
            }
    
            perror("sendto");
            printf("Max fragment size ( after sending ): %d\n", dgram_size);
        } else printf("Success\n");
    
        free(buf);
        close(sockfd);
        return 0;
    }

  4. #4
    Registered User
    Join Date
    Jan 2011
    Posts
    5
    Quote Originally Posted by CommonTater View Post
    Can you show the code you're using?

    I routinely transfer 48k UDP messages on Windows with no problems at all. If I exceed the packet size, more than one packet is sent... but that's link-layer stuff and you shouldn't have to worry about it at the socket level.
    I have posted the code. As you can see, I'm setting the DF bit to 1 on purpose, because I want the network to tell me what the minimum MTU is.

    This way I can dynamically choose the best datagram size to maximize throughput by eliminating IP fragmentation.

    But the problem is that neither the router nor the other host helps the sender discover their MTUs. They simply drop the packets, so the sender doesn't receive any ICMP error messages.

    In your case, I think, you're relying on IP fragmentation.

  5. #5
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by pabloski View Post
    I have posted the code. As you can see, I'm setting the DF bit to 1 on purpose, because I want the network to tell me what the minimum MTU is.

    This way I can dynamically choose the best datagram size to maximize throughput by eliminating IP fragmentation.

    But the problem is that neither the router nor the other host helps the sender discover their MTUs. They simply drop the packets, so the sender doesn't receive any ICMP error messages.

    In your case, I think, you're relying on IP fragmentation.
    I'm not sure that MTU discovery steps beyond the local port.

    Also, In my case I'm not using a connected socket to send datagrams...

    Code:
    // initialize Client port/return port number
    WORD InitNetwork(WORD Port,HWND Win,PTCHAR Password)
      { WSADATA     wsadata;              // winsock startup
        TCHAR       hn[MAX_HOSTNAME];     // host name
        DWORD       hs = MAX_HOSTNAME;    // host name size
        SOCKADDR    la;                   // local address
        WORD        lp = Port;            // local port
        // Load the Winsock DLL
        if (WSAStartup(MAKEWORD(2,0),&wsadata))
          return 0;
        // initialize local socket
        hSocket = socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP);
        if (hSocket == INVALID_SOCKET)
          return 0;
        // initialize localhost IP  
        GetComputerName(hn,&hs);
        if (!GetHostAddr(hn,Port,&la))
          return 0;
        // bind on user designated Port
        while(bind(hSocket,&la,sizeof(SOCKADDR)))
          { SetHostPort(++lp,&la);
            if (lp > (Port + 256))
              return 0; }
        // setup for messages
        SetRxWindow(Win);
        SetPassCode(Password);
        // start rx thread
        rxThread = CreateThread(NULL,0,&ReceiveDatagrams,NULL,0,0);
        return lp; }
    Note that I am binding the sockets but not connecting them. Perhaps that is the difference. I don't think that's an exploit so much as using the full feature set of the protocol.

  6. #6
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by pabloski View Post
    Code:
        if (sendto(sockfd, buf, DGRAM_SIZE, 0, (struct sockaddr*)&svraddr, sizeof(svraddr)) == -1 ) {
            dgram_size = 0;
    }
    Are your datagrams always 2020 bytes?
    You may be running into a problem with a dgram_size of 0 if any error occurs on the unbound socket... (although this might be a platform difference; I'm used to doing this on Windows).

    Code:
    BOOL SendDatagram(SOCKADDR To,WORD Command,PVOID txData,INT DataSize)
      { INT       sdgram;                                 // datagram size
        uint8_t   dg[MAX_DATAGRAM] = {0};                 // datagram data
        uint32_t* pass  = (uint32_t*)(dg + DGRAM_PASS);   // map passcode
        uint16_t* cmd   = (uint16_t*)(dg + DGRAM_CMD);    // map command
        uint8_t*  data  = (uint8_t*) (dg + DGRAM_DATA);   // map data
    
        // insert packet values  
        *pass = PassCode;
        *cmd  = Command;
        if ((txData != NULL) && (DataSize > 0))
          memcpy(data,txData,DataSize);
        sdgram = DataSize + DGRAM_DATA + sizeof(TCHAR); 
        // send it; report success only if the whole datagram went out
        return (sendto(hSocket,(PCHAR)dg,sdgram,0,&To,sizeof(SOCKADDR)) == sdgram); }

  7. #7
    Registered User
    Join Date
    Jan 2011
    Posts
    5
    Yes, the datagram size is 2020 bytes, because I'm just sending random datagrams of 2020 bytes for now.

    Path MTU discovery should work as advertised. When I send a datagram, the first device that needs to fragment it will not fragment it (because the Don't Fragment flag is set) and will send back an ICMP message describing the error.

    I should catch that error message and reduce the datagram size. This way I obtain the lowest MTU of all the devices participating in the communication.

    The problem is I don't receive the ICMP messages.

  8. #8
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by pabloski View Post
    Yes, the datagram size is 2020 bytes, because I'm just sending random datagrams of 2020 bytes for now.

    Path MTU discovery should work as advertised. When I send a datagram, the first device that needs to fragment it will not fragment it (because the Don't Fragment flag is set) and will send back an ICMP message describing the error.

    I should catch that error message and reduce the datagram size. This way I obtain the lowest MTU of all the devices participating in the communication.

    The problem is I don't receive the ICMP messages.
    Ahhh... I'm no ICMP maven... perhaps someone else can help you better...

    My only thought at this point is to ask whether you're listening on the correct ports.

  9. #9
    Registered User
    Join Date
    Jan 2011
    Posts
    5
    Quote Originally Posted by CommonTater View Post
    Ahhh... I'm no ICMP maven... perhaps someone else can help you better...

    My only thought at this point is to ask whether you're listening on the correct ports.
    I have done a little research on the topic and the problem lies in the network devices.

    Layer 2 switches simply cannot do IP fragmentation, and obviously they cannot send ICMP errors in response to the Don't Fragment flag either. Layer 3 switches can, but only for machines connected to different subnets. For machines in the same subnet they do layer 2 switching.

    I have also seen that it is effectively a requirement that all the network devices inside a subnet have the exact same MTU; the penalty is packet dropping.

  10. #10
    Banned
    Join Date
    Aug 2010
    Location
    Ontario Canada
    Posts
    9,547
    Quote Originally Posted by pabloski View Post
    I have done a little research on the topic and the problem lies in the network devices.

    Layer 2 switches simply cannot do IP fragmentation, and obviously they cannot send ICMP errors in response to the Don't Fragment flag either. Layer 3 switches can, but only for machines connected to different subnets. For machines in the same subnet they do layer 2 switching.

    I have also seen that it is effectively a requirement that all the network devices inside a subnet have the exact same MTU; the penalty is packet dropping.
    So, then... your best bet is to discover the MTU of your own machine, treat it as global, and trust the network to handle things for you. Personally, I've never had to deal with this, but from what you're saying I doubt it's going to improve throughput by any amount worth the trouble...

    Spending 80% of your time on a 1% improvement hardly seems practical.

