Thread: Setting up a performance test C vs awk?

  1. #1
    Registered User
    Join Date
    Nov 2015
    Posts
    21

    Setting up a performance test C vs awk?

    I was told someone's awk code ran almost as fast as C. Not sure that's close to a true statement even in simple tasks. I wrote two tests and ran them with Linux's time command. Is this a fair and valid test? I need to give my management an honest answer.

    Note that the code runs, but I have to transcribe it by hand.

    ### C

    Code:
    #include <stdio.h>
    int main(void)
    {
      FILE *fp;
      fp = fopen("/dev/null", "w");
      if (fp == NULL)   /* bail out rather than pass NULL to fprintf */
      {
        perror("fopen");
        return 1;
      }
      char my_string[10] = "123rte";
      int i;
      for ( i = 0; i < 10000; i++ )
      {
        fprintf(fp, "%s\n", my_string);
      }
      fclose(fp);
      return 0;
    }
    ### awk

    Code:
    #!/bin/bash
    
    my_string="123rte"
    for (( n=0; n<10000; n++)) do
      echo $my_string | awk '{ print $1 }' > /dev/null 2>&1
    done

  2. #2
    Registered User
    Join Date
    Nov 2015
    Posts
    21
    Forgot to mention that the C version takes ~ 0.004 seconds and the awk version takes ~20 seconds.

  3. #3
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,738
    No, that's not a fair test. If you run "man awk", you will see that a single invocation can do many complex things. Your example is slow because each awk call does almost no work, so the run time is dominated by the overhead of creating and destroying 10,000 processes.
    Devoted my life to programming...

  4. #4
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,738
    Try this instead, which is more or less equivalent to the C code:
    Code:
    #!/bin/bash
     
    my_string="123rte"
    yes $my_string | head -n 10000 | awk '{ print $1 }' > /dev/null 2>&1
    Devoted my life to programming...

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    The equivalent awk code would be
    Code:
    $ cat foo.c
    #include <stdio.h>
    int main(void)
    {
      FILE *fp;
      fp = fopen("/dev/null", "w");
      char my_string[10] = "123rte";
      int i;
      for ( i = 0; i < 1000000; i++ )
      {
        fprintf(fp, "%s\n", my_string);
      }
      fclose(fp);
      return 0;
    }
    $ gcc foo.c
    $ time ./a.out 
    
    real	0m0.113s
    user	0m0.112s
    sys	0m0.000s
    $ cat foo.awk
    #!/bin/awk -f
    BEGIN {
      for ( i = 0; i < 1000000; i++ )
        print "123rte";
    }
    $ time awk -f foo.awk > /dev/null
    
    real	0m0.140s
    user	0m0.136s
    sys	0m0.000s
    Yes, awk is almost as fast.

    But part of the attraction is being able to do regular expressions directly in the language, instead of having to do an awful lot of code yourself (even with a library).
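
    For a rough sense of that difference, here is a minimal sketch of line matching in C with the POSIX <regex.h> interface. The helper name count_matches and the pattern used in the note below are made up for illustration; they are not taken from anyone's real code in this thread.

    ```c
    #include <regex.h>
    #include <stddef.h>

    /* Count how many of the n strings in `lines` match the POSIX extended
     * regular expression `pattern`.
     * Returns -1 if the pattern does not compile. */
    int count_matches(const char *pattern, const char *const lines[], int n)
    {
        regex_t re;
        int count = 0;

        /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
            return -1;

        for (int i = 0; i < n; i++) {
            if (regexec(&re, lines[i], 0, NULL, 0) == 0)
                count++;
        }

        regfree(&re);   /* release the compiled pattern */
        return count;
    }
    ```

    In awk, roughly the same count over input lines is `awk '/^[0-9]+rte$/ { n++ } END { print n+0 }'`, with no compile, free, or error-handling steps to write by hand.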
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Registered User
    Join Date
    Nov 2015
    Posts
    21
    Saw your first note and, with help from Google, came up with:

    Code:
    awk 'BEGIN{for(i=0;i<10000;i++) printf "123rte" > "/dev/null"; }'
    Which runs in 0.023 seconds: comparable to Python (0.026s), slower than C (0.004s), much faster than bash (0.414s), and naturally faster than the bash-to-awk handoff (18.566s).

    Thank you!

  7. #7
    Registered User
    Join Date
    Nov 2015
    Posts
    21
    While it's much faster than the combined version I'm still getting 0.023s (awk) vs 0.004s (C). Still, that's good info to pass on, I don't want to misrepresent.
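
    One possible explanation for the remaining gap, which is an assumption and not something measured in this thread, is that at 10,000 iterations most of the time goes to starting the program rather than to the loop. The 1,000,000-iteration run in post #5 points that way. A minimal sketch for checking it on the C side is to time only the loop from inside the program; the function name time_write_loop is hypothetical.

    ```c
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <time.h>

    /* Write `iterations` lines to `path` and return the elapsed wall-clock
     * seconds for the loop alone (process start-up is not included).
     * Returns -1.0 if the file cannot be opened. */
    double time_write_loop(const char *path, long iterations)
    {
        struct timespec start, end;
        FILE *fp = fopen(path, "w");
        if (fp == NULL)
            return -1.0;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (long i = 0; i < iterations; i++)
            fprintf(fp, "%s\n", "123rte");
        fflush(fp);   /* count the final buffer flush as part of the loop */
        clock_gettime(CLOCK_MONOTONIC, &end);

        fclose(fp);
        return (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    }
    ```

    If `time_write_loop("/dev/null", 10000)` returns only a small part of what `time` reports for the whole program, the rest is start-up cost, and a comparison at this iteration count says little about per-line speed.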

  8. #8
    Programming Wraith GReaper's Avatar
    Join Date
    Apr 2009
    Location
    Greece
    Posts
    2,738
    It stands to reason that its speed would be comparable to Python, since both are interpreted languages.
    Devoted my life to programming...

  9. #9
    Registered User
    Join Date
    Nov 2012
    Posts
    1,393
    Quote Originally Posted by Leam
    I was told someone's awk code ran almost as fast as C.

    ...

    Code:
    #!/bin/bash
    
    my_string="123rte"
    for (( n=0; n<10000; n++)) do
      echo $my_string | awk '{ print $1 }' > /dev/null 2>&1
    done
    The problem with this benchmark is that you're launching awk 10000 times to do a very small amount of work. I would imagine most of the execution time of this would be spent creating and destroying subprocesses.
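
    To put a number on that overhead by itself, something like the following sketch could be used. It starts a trivial program over and over and times the whole sequence. The function name time_spawns is hypothetical, and the choice of program to run (for example /bin/true) is an assumption about what is installed on the system.

    ```c
    #define _POSIX_C_SOURCE 200809L
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <time.h>
    #include <unistd.h>

    /* Start `count` child processes one after another, each running the
     * program at `path` with no arguments, and wait for each one to exit.
     * Returns the elapsed wall-clock seconds, or -1.0 if fork() fails. */
    double time_spawns(const char *path, int count)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (int i = 0; i < count; i++) {
            pid_t pid = fork();
            if (pid < 0)
                return -1.0;
            if (pid == 0) {
                execl(path, path, (char *)NULL);
                _exit(127);   /* only reached if exec failed */
            }
            waitpid(pid, NULL, 0);   /* parent: wait before starting the next */
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        return (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    }
    ```

    The quoted bash loop starts at least one awk process per iteration, so the result of `time_spawns("/bin/true", 10000)` gives a lower bound on how much of its run time is process management rather than awk doing any work. The exact figure will depend on the machine.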
