Ways to speed up a bruteforce-style program

This is a discussion on Ways to speed up a bruteforce-style program within the C Programming forums, part of the General Programming Boards category; On an optional assignment for a recent C class, we were given a binary file in which some text had ...

  1. #1
    Registered User inequity's Avatar
    Join Date
    Nov 2010
    Location
    Seattle, Washington
    Posts
    59

    Ways to speed up a bruteforce-style program

    On an optional assignment for a recent C class, we were given a binary file in which some text had been jumbled up using a really simple algorithm.

    Here's an example of how the text was encoded,
    using a password "foobar" and string "THIS IS THE ENCODED STRING"

    Code:
         THIS IS THE ENCODED STRING
    +   foobarfoobarfoobarfoobarfoob
    
         ----------------------------------
    
    =   UUxrOaO_xO
    So our professor gave us a binary file with a bunch of text in that has been "encrypted" like this, using a five letter lowercase word.

    We have to go through the file trying to different strings to decode it. We also aren't given a "solved" string, so we have to determine if the result we get from decoding with any given string is in fact a valid message.

    I've finished the assignment successfully, so that's not why I'm here, don't worry guys. But I'm just really interested in this, and think it would be interesting to see how much faster I could get the program to accomplish it's task.

    Current I am running through 5 loops, each representing one character of the password, and inside them I run through the entire file, converting each character and making a note of when I encounter a BAD character ( any character not A-Z, a-z, or ! ? . , " ), and also making note of how many spaces I encounter.

    If the current decoded attempt has the least bad characters and most spaces, then I store it as the current "most correct" string. If I read through the file and find less than two "weird" characters, I assume that we've found the string, and return the value.

    Are there faster methods for doing this? Are there better methods for detecting whether or not a string is a valid string in the English language? I just think it would be fun to get this working better, so I don't have to try 8 million things to decode one sentence, but I dont really know where to begin.

    Below is my code for the program, and a sample output

    Code:
    /**************************************************************************
    filename    decode_jumbledstring.c
    author      inequity
    date        12/13/2010
    
    Brief Description:
      This program reads in a binary file which as been jumbled using a
      a very simple algorithm. The original text has been encoded using
      a 5 digit string of lowercase characters, so this program attempts
      to decode it as fast as possible using different string (which 
      isn't very fast at the moment)
    **************************************************************************/
    #include <stdio.h>  /* printf, fopen, fclose, fseek, fgetc, feof */
    #include <stdlib.h> /* malloc, free                              */
    #include <string.h>
    
    #define DEBUG /* comment this out to turn off unnecessary printfs */
    #define PASSWORD_LENGTH 5
    
    int decode(const unsigned char secret_phrase[], /* encrypted string  */
               unsigned char decoded_phrase[],      /* plain text string */
               unsigned char password[],            /* password used     */
               int length,                          /* string length     */
               int *passes                          /* amount of attempts*/
              );
    
    int main(void)
    {
    
        /* This file contains the encrypted message */
      const char *filename = "secret_phrase.bin";
    
        /* Account for NULL terminator */
      unsigned char password[PASSWORD_LENGTH + 1] = {0};
      unsigned char *secret_phrase; /* Encoded phrase */
      unsigned char *plain_text;    /* Decoded phrase */
    
      FILE *infile; /* File pointer we're reading from     */
      size_t size;  /* Number of bytes in the file         */
      size_t count; /* Number of bytes fread from the file */
      int found;    /* Was the password found              */
      int length;   /* Length of the string                */
      int *passes;  /* ++'d each time we try a new string  */
    
      /* initially set number of passes (attempts) we've made to zero */
      *passes = 0;
    
        /* Open in binary (raw) since we don't want any translations */
        infile = fopen(filename, "rb");
        if (!infile)
        {
          printf("Can't open %s for reading.\n", filename);
          return -1;
        }
    
        /* Count bytes in the file */
      size = 0;
      fgetc(infile);
      while (!feof(infile))
      {
        size++;
        fgetc(infile);
      }
    
        /* The file was empty */
      if (!size)
      {
        printf("Nothing to decode.\n");
        return -2;
      }
    
        /* Set cursor back to beginning of the file */
      fseek(infile, 0, SEEK_SET);
    
      length = sizeof(unsigned char)*(size+1);
        /* Allocate the proper amount of memory */
      secret_phrase = (unsigned char *)malloc(size + 1);
      plain_text = (unsigned char *)malloc(size + 1);
    
        /* Now read in the secret phrase from the file */
      count = fread(secret_phrase, 1, size, infile);
      if (count != size)
      {
        printf("Expected %i bytes, but only read %i.\n", size, count);
        return -3;
      }
    
        /* Terminate the string of bytes */
      secret_phrase[count] = 0;
    
        /* Done with the file reading */
      fclose(infile);
    
      printf("%s\n",secret_phrase);
        /* Attempt to decode the characters */
      found = decode(secret_phrase, plain_text, password, length, passes);
        /* Did we find the password or not? */
      if (found)
      {
        printf("\n\nFOUND!\n");
        printf("The password is: %s\n", password);
        printf("The message was: %s\n", plain_text);
        printf("Tried %d combinations.\n",*passes);
      }
      else
        printf("\n\nThe secret phrase was not discovered.\n");
        /* Free the memory that we allocated. */
      free(secret_phrase);
      free(plain_text);
    
      return 0;
    }
    
    int decode(const unsigned char secret_phrase[], /* encrypted string  */
               unsigned char decoded_phrase[],      /* plain text string */
               unsigned char password[],            /* password used     */
               int length,                          /* length of string  */
               int *passes                          /* number of attempts*/
              )
    {
      int i, j, k, l, m, n;         /* Loop counters                                  */
      unsigned char charPtr;        /* Pointer to read values of characters           */
      unsigned char tempStr[200];   /* Temporary to hold string from current          */
                                    /* decode attempt.                                */
      int weirdChars = 0;           /* How many bad characters we've encountered      */
      int leastWeirdChars = 40;     /* Least bad characters we've found in an attempt */
      int spacesFound = 0;          /* Spaces found in current attempt                */
      int mostSpacesFound = 0;      /* Most spaces we've found in an attempt yet      */
      
      /* Increment through the alphabet for each of the letters in the password */
      for(i = 'a'; i <= 'z'; i++)
      {
        for(j = 'a'; j <= 'z'; j++)
        {
          for(k = 'a'; k <= 'z'; k++)
          {
            for(l = 'a'; l <= 'z'; l++)
            {
              for(m = 'a'; m <= 'z'; m++)
              {
                /* Initialize variables */
                *passes += 1;
                weirdChars = 0;
                spacesFound = 0;
                /* Set password characters to loop counter values */
                password[0] = i;
                password[1] = j;
                password[2] = k;
                password[3] = l;
                password[4] = m;
                
                /* For each character in the string */
                for(n = 0; n < length; n++)
                {
                  /* Set charPtr to the secret_phrase minus the matching password char */
                  charPtr = secret_phrase[n] - (password[n % PASSWORD_LENGTH]);
                  /* uppercase 65-90, lowercase 97-122, 32, 33, 44, 46, 63, 59 */
                  /* regular char, uppercase */
                  if( charPtr >= 65 && charPtr <= 90)
                  {
                  
                  }
                  /* regular char, lowercase */
                  else if( charPtr >= 97 && charPtr <= 122)
                  {
                  
                  }
                  /* check if character is ? ! . , ; */
                  else if( charPtr == 33 || charPtr == 44 || charPtr == 46 
                        || charPtr == 59 || charPtr == 63 )
                  {
    
                  }
                  else if( charPtr == 34 )
                  {
                  
                  }
                  else if( charPtr == 32 )
                  {
                    spacesFound++;
                  }
                  else
                  {
                    weirdChars++;
                  }
                    tempStr[n] = charPtr;
                    
                    if(weirdChars > leastWeirdChars)
                    {
                      n = length;
                    }
                }
                
                
                    
                if(weirdChars <= leastWeirdChars && spacesFound >= mostSpacesFound)
                {
                  tempStr[length-1] = ' ';
                  tempStr[length] = ' ';
                  for(n = 0; n <= length; n++)
                  {
                    decoded_phrase[n] = tempStr[n];
                  }
                  decoded_phrase[length] = 0;
                  #ifdef DEBUG
                  printf("%s - password: %s\n",decoded_phrase,password);
                  #endif
                  leastWeirdChars = weirdChars;
                  mostSpacesFound = spacesFound;
                  
                  if(weirdChars < 2)
                  {
                    return 1;
                  }
                }
              }
            }
          }
        }
      } 
      return 0;
    }
    Code:
    ',ZUaIxO"A-s"a%OO,YO"UOx"+YI<ZOE_O"YOa"DO_3,AY"UIO"OO"IUOx"EO"O_"Ex?%,xaOUYOO"xO"
    S?"I y,kng2`CD,2,t d{w"thw3yorv3$th{?m" w+eapw3{ou,3nip.R"Yo╪3yerw3pot2{krev3hor2Oqur2utai?+. y?^"hi,qpo+toic2⌂cnd2?css@  - password: aabin
    S?!I y,jng2`BD,2,s d{w!thw3xorv3#th{?l" w+dapw3zou,3mip.R!Yo╪3xerw3oot2{jrev3gor2Opur2usai?+- y?^!hi,ppo+tnic2⌂bnd2?bss@  - password: aacin
    S? K y,ipg2`AF,2,r"d{w vhw3wqrv3"vh{?k$ w+ccpw3yqu,3lkp.R [o╪3wgrw3nqt2{itev3fqr2Oowr2urci?+,"y?^ ji,oro+tmkc2⌂apd2?aus@  - password: aadgn
    S? J y,iog2`AE,2,r!d{w uhw3wprv3"uh{?k# w+cbpw3ypu,3ljp.R Zo╪3wfrw3npt2{isev3fpr2Oovr2urbi?+,!y?^ ii,oqo+tmjc2⌂aod2?ats@  - password: aadhn
    S? I"y,ini2`AD.2,r f{w tjw3wotv3"tj{?k""w+carw3yow,3lir.R Yq╪3wetw3nov2{irgv3fot2Oout2urak?+, {?^ hk,opq+tmie2⌂anf2?asu@  - password: aadil
    S? I!y,inh2`AD-2,r e{w tiw3wosv3"ti{?k"!w+caqw3yov,3liq.R Yp╪3wesw3nou2{irfv3fos2Oous2uraj?+, z?^ hj,opp+tmid2⌂ane2?ast@  - password: aadim
    S? I y,ing2`AD,2,r d{w thw3worv3"th{?k" w+capw3you,3lip.R Yo╪3werw3not2{irev3for2Oour2urai?+, y?^ hi,opo+tmic2⌂and2?ass@  - password: aadin
    S⌂ I y?ing2_AD,2?r d{v thw2worv2"th{?k" w.capw2you,2lip.Q Yo╪2werw2not2zirev2for2<our2trai?., y?╪ hi,,opo+smic2~and2⌂ass@  - password: abdin
    S~ I y?ing2^AD,2?r d{u thw1worv1"th{⌂k" w,capw1you,1lip.P Yo╪1werw1not2yirev1for2Sour2srai?,, y?+ hi,?opo+rmic2}and2~ass@  - password: acdin
    S} I y⌂ing2]AD,2⌂r d{t thw0worv0"th{~k" wcapw0you,0lip.O Yo╪0werw0not2xirev0for2%our2rrai?, y?. hi,?opo+qmic2|and2}ass@  - password: addin
    S| I y~ing2\AD,2~r d{s thw/worv/"th{}k" w,capw/you,/lip.N Yo╪/werw/not2wirev/for2^our2qrai?,, y?, hi,⌂opo+pmic2{and2|ass@  - password: aedin
    S{ I y}ing2[AD,2}r d{r thw.worv."th{|k" w?capw.you,.lip.M Yo╪.werw.not2virev.for2╪our2prai??, y? hi,~opo+omic2zand2{ass@  - password: afdin
    Sy I y{ing2YAD,2{r d{p thw,worv,"th{zk" w⌂capw,you,,lip.K Yo╪,werw,not2tirev,for2.our2nrai?⌂, y?? hi,|opo+mmic2xand2yass@  - password: ahdin
    So I yqing2OAD,2qr d{f thw"worv""th{pk" wucapw"you,"lip.A Yo╪"werw"not2jirev"for2{our2drai?u, y?w hi,ropo+cmic2nand2oass@  - password: ardin
    Sn I yping2NAD,2pr d{e thw!worv!"th{ok" wtcapw!you,!lip.@ Yo╪!werw!not2iirev!for2zour2crai?t, y?v hi,qopo+bmic2mand2nass@  - password: asdin
    Sm J yoiog2MAE,2or!d{d uhw wprv "uh{nk# wscbpw ypu, ljp.? Zo╪ wfrw npt2hisev fpr2yovr2brbi?s,!y?u ii,poqo+amjc2laod2mats@  - password: atdhn
    Sm I"yoini2MAD.2or f{d tjw wotv "tj{nk""wscarw yow, lir.? Yq╪ wetw nov2hirgv fot2yout2brak?s, {?u hk,popq+amie2lanf2masu@  - password: atdil
    Sm I!yoinh2MAD-2or e{d tiw wosv "ti{nk"!wscaqw yov, liq.? Yp╪ wesw nou2hirfv fos2yous2braj?s, z?u hj,popp+amid2lane2mast@  - password: atdim
    Sm I yoing2MAD,2or d{d thw worv "th{nk" wscapw you, lip.? Yo╪ werw not2hirev for2your2brai?s, y?u hi,popo+amic2land2mass@  - password: atdin
    Rm I xoing1MAD,1or dzd thv woru "thznk" vscapv you lip,? Yo+ werv not1hireu for1your1brai⌂s, y?u hi?popo.amic1land1mass?  - password: btdin
    Om I uoing.MAD,.or dwd ths worr "thwnk" sscaps you? lip?? Yo wers not.hirer for.your.brai|s, y}u hi~popo,amic.land.mass<  - password: etdin
    Mm I soing,MAD,,or dud thq worp "thunk" qscapq you~ lip? Yo? werq not,hirep for,your,braizs, y{u hi|popo?amic,land,mass:  - password: gtdin
    Cm I ioing"MAD,"or dkd thg worf "thknk" gscapg yout lipu? Yow werg not"hiref for"your"braips, yqu hirpopovamic"land"mass0  - password: qtdin
    Bm I hoing!MAD,!or djd thf wore "thjnk" fscapf yous lipt? Yov werf not!hiree for!your!braios, ypu hiqpopouamic!land!mass/  - password: rtdin
    Ao I gqing OAD, qr dif the"word""thipk" eucape"your"lipsA You"were"not jired"for {our drainu, yow hipropotcmic nand oass.  - password: srdin
    An I gping NAD, pr die the!word!"thiok" etcape!your!lips@ You!were!not iired!for zour craint, yov hipqopotbmic mand nass.  - password: ssdin
    Am J goiog MAE, or!did uhe wprd "uhink# escbpe ypur ljps? Zou wfre npt hised fpr yovr brbins,!you iippoqotamjc laod mats.  - password: stdhn
    Am I"goini MAD. or fid tje wotd "tjink""escare yowr lirs? Yqu wete nov hirgd fot yout brakns, {ou hkppopqtamie lanf masu.  - password: stdil
    Am I!goinh MAD- or eid tie wosd "tiink"!escaqe yovr liqs? Ypu wese nou hirfd fos yous brajns, zou hjppopptamid lane mast.  - password: stdim
    Am I going MAD, or did the word "think" escape your lips? You were not hired for your brains, you hippopotamic land mass.  - password: stdin
    
    
    FOUND!
    The password is: stdin
    The message was: Am I going MAD, or did the word "think" escape your lips? You were not hired for your brains, you hippopotamic land mass.
    Tried 8561762 combinations.
    Anyways, any help would be greatly appreciated. And any comments on my coding style are also greatly appreciated.

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,639
    First, don't use feof to control a loop: Cprogramming.com FAQ > Why it's bad to use feof() to control a loop. Other than that, I don't think your coding style is too bad.

    As for optimization, with a fixed password length that's as short as yours, and a relatively short cipher text, I don't know that 8 million possibilities is that bad, even if you multiply that by the length of your string (since you process characters individually). Should be reasonably quick on new computers. In anycase, here are a few things I would do:

    The general strategy here is to analyze your cipher text, determine what the most likely choices for password characters are for each slot and try those solutions first.

    1. Don't check the size of the file with fgetc in a loop. Use some arithmetic with fseek or use a platform specific function like stat on Unix/Linux (don't know the Windows equivalent). This is relatively small, but could still help if you had a large text to decode.
    2. Use standard library function like strcpy, they're super efficient compared to your for loop.
    3. Do a frequency analysis of the most common encoded characters. They should roughly map to the most frequent characters of your plain-text language. E.g. 'e' accounts for roughly 12% of all letters in regular English text, so if you find an encoded character that accounts for roughly 12% of encoded characters, it's probably the encoding of an 'e'.
    4. Go through the string once and build a list/range of possible characters for each letter of your password. That is, if the character in position 0 is , then you may be able to tell that subtracting 'a' through 'g' would still result in a weird character, as would, say, 'w' through 'z', thus your only valid character for the first letter of your password would be 'h' through 'v'. Since your password repeats so many times through out your cipher text, you can potentially narrow this down to a set of 5 or 10 likely characters. If you can't actually throw away letters as potential password characters , you could try sorting them in order of most likely to least likely so you're likely to come across the best solution first.
    5. Break out of your loop once your weirdCharacters gives you worse performance. This will compounded with #'s 3 and 4 can help a lot.
    6. You don't actually need to copy the whole cipher text each time you find a better password. Just copy the password, then do a full decode at the end, since your text is much longer than the password.

    Numbers 2 and 3 pay off more when your encoded text gets longer as letter frequencies become more accurate and you can better narrow down the range of possible characters for an individual password character.

    As for determining whether you have a legit English phrase, you could look at estimating string weirdness and spaciness. Say you are willing to accept .1% weird characters (1 in 1,000). If your string is 1,000 chars long and you find 2 weird chars in the first 10 positions, you can bail out after checking just 10 characters b/c the string is just too weird to be properly decoded. Do likewise for spaces.

    That's all I have for now.

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    This was a fun program! I spent a lot of time poking around with it.

    The one thing that made more difference than anything else I tried, was to short-circuit the for loops - just as soon as possible.

    When you want the trials letters that are being generated as a possible solution, you see a lot of junk being produced. That can steal a LOT of time. If you know the password can't have a $ in it, for instance. Then don't make it a candidate letter, only to be rejected by yet more logic, later!

    Probably because I was tired, but despite working with ranges quite a bit, I didn't find them a huge help. A few if statements made the code much more readable, and did the same work.

    Adding the "candidate letter guardian" if statements, and keeping it simple, gave me the best solving times, by far. I can out simple most anyone, but let's keep that between ourselves, ok?


    I don't know if it's still as good as it used to be, but there was a crypto group on usenet (the newsgroups) side of the internet, and they had a couple fantastic cryptographers occasionally visit the group. Did some amazing de-crypting. I'd run this by them, along with your program, and some actual solving times, if you're interested.

    In any case, a really fun assignment.

    P.S. The if statements were immediately after the for loop:
    Code:
    for(.........) {
      if(this letter makes junk) continue;
    
      Two more, just about like it
    
      for("next for loop) {
        //very similar if statements
    
         //etc.
    
      }
    
    }
    Last edited by Adak; 12-14-2010 at 01:07 PM.

  4. #4
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,639
    Nice Adak! I was wishing I had time to actually get some metrics on that stuff, but I threw it together just before leaving work yesterday and never got back to it. Any idea how your average- and worst-case big O stack up against the OP's? I'd be curious to know what kind of performance gain you got.

    Crypto stuff like this really is a blast. One of my favorite assignments (from a linguistics class of all things) was to decrypt a Huffman coded text without the dictionary. A fun little algorithm and not too complicated.

  5. #5
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Over half the Summer, I did a project looking for a 16 givens Sudoku grid with just a single solution. As odd as it might sound, the backbone of the project was about 16 nested loops. Trillion+ grids generated and solved. Cut offs or smart ranges to begin with, are the only thing that makes loops like that bearable on such programs.

    My idea here, was to try and NOT generate any "weird" char's, because you're basically working with an "odometer" here - the .1 mile wheel, won't bother you, but wait until you have a "weird" char up in the thousands wheel.

    With the clearer letters, I was hoping to tie the program into my special dictionary files. The words are all in files, according to their length, 3words.txt has all the common words with three letters only, etc. Some of these files also have the letter patterns included, like:

    123 cat
    122 ill
    121 mom
    etc.

    Once the first few words are found, the whole thing should "solve", like Sudoku when you get down to naked singles. That's my hope, anyway.

    I spent too much time with the range array of structs (lo, hi), for each letter. When I changed it over -- well, the program now looks like a grenade hit it.

    The big O for this, must be truly appalling. N! * N!... ?? Without some help, you could be here a very long time. Before the addition of the speed up's, I could really see it too.

    I'm going to tinker some more with this, and clean it up. Don't have a ton of time right now, but it's too good to let go of without a bit more.

    I have not met Mr. Huffman, but he sounds quite interesting.

  6. #6
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,263
    Instead of inventing your own method for determining the quality of your solution, use Index Of Coincidence.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  7. #7
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    The int pointer "*passes" is initialized with no address, and subsequently used. That needs fixing.

    Anyone else get it to run?

    I believe I also have a character set problem with this program. I need to generate my own binary of encrypted message.

    Thanks for that link, Brewbuck. That Index of Coincidence is pretty clever stuff!

    In the assignment, the length of the password was given, but how often is *that* going to occur?

  8. #8
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,639
    Quote Originally Posted by Adak View Post
    The int pointer "*passes" is initialized with no address, and subsequently used. That needs fixing.

    Anyone else get it to run?
    Yeah, I only had to change passes to a plain int (instead of int *), then stop dereferencing it throughout main. Otherwise, the OP's program worked as expected.

    I believe I also have a character set problem with this program. I need to generate my own binary of encrypted message.
    I wrote a little program (attached) that seems to encode everything fine, but vi has a habit of appending a new line onto the end of text files, so I had to fix that too.

    Your approach took me from 8,561,762 iterations down to 144! Granted each time I exited early, I still had to scan at least part of secret_phrase to count weird characters, but I could jump in 5 character increments and quit as soon as I exceeded the 1 weird character limit.

    My intuition says we went from 26^p to 26*p worst-case complexity (where p is the password length), which is pretty darn good. I have yet to play with the index of coincidence stuff yet, but I imagine the time complexity is the same as your algorithm if the password length is known.
    Attached Files Attached Files

  9. #9
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I got it going with:
    Code:
      /*  //creates a new encrypted text file 
      fp=fopen("crypt.txt", "wb");
      for(i=0;i<sizeof(str1);i++) {
        str3[i] = str1[i]+str2[i];
        fputc(str3[i], fp);
      }
      fclose(fp);
      putchar('*');
      return 0;
      */
    but I d/l'ed your encode.c anyway

    Running his program (the original one, but with the int variable passes being initialized to zero, his program is showing only 420k+ passes (far from 8 million+).

    Maybe that 8 million was wacko?

    I looked, at my word lists - "hippopotamic" was NOT in there!

    Weird word regards.

  10. #10
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,639
    I still got 8 million from his original code with just the passes modification. I think this fits since he was trying every combination (redoing his m loop for every l loop, etc). It breaks down like this:
    For the password stdin, the OP goes through every possible combination for start letters a-r, then on s he only does part of the next 4 loops when the first letter is s (stopping when the second letter is t). This means that for the letters(position in alphabet) s(19) t(20) d(4) i(9) n(14), you have the following number of iterations of your inner loop:

    aaaaa-rzzzz = 18 * 26^4 = 8,225,568
    saaaa-sszzz = 19 * 26^3 = 333,944
    staaa-stczz = 3 * 26^2 = 2,028
    stdaa-stdhz = 8 * 26 = 208
    stdia - stdin = 14

    If you add up all those numbers, you get 8,561,762, which is what the OP got as well. Maybe you have some mod in there you forgot about?

  11. #11
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    You nailed it. It was late; the program was close, but being cranky, so I took a monkey wrench to it, and forgot about it.

    It does that "run once too far (which I attribute to the !feof() file checking), so I knocked down the length of the message to just "size", instead of "size + 1".

    With size +1, I get a spurious char right on the end, and the program says the secret message is not discovered. I left the malloc's unchanged.

    Cutting the workload in half, by fixing a bug - I love it!

  12. #12
    Registered User \007's Avatar
    Join Date
    Dec 2010
    Posts
    179
    I didn't read the code too much, but can you run some of these loops in parallel? Take advantage of multiple CPUs.. or better yet, if it can run in parallel then try to run the program on the GPU instead of the CPU.

    You can also use your C language/code, but need to modify it a bit.

    CUDA - Wikipedia, the free encyclopedia

  13. #13
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    Well yes, you can run them in parallel (just split up the string across the other cores), but that's cheating.

    Ditto with GPU's. These newer GPU's with CUDA, are *racehorses*. Not the most durable racehorses, however.

    First thing is to get the algorithm down to it's best running time - and I'm sure we're not there yet (at least my version isn't there yet). You may want to add or subtract some features of your program, etc. Best to do that now

    THEN, you can take this best algorithm/program, and run it on whatever you have, knowing that within reason (reason being no optimizations for different hardware), you have the best program running.

    I'm an elitist in that way. A program with a mediocre algorithm, running on a fast NVidia GPU, is a total bore. I'd much prefer a program with a great algorithm, running on some puny cpu, for now. Because maybe next month, I'll get it ported over to that hot GPU or overclocked six core cpu.

    I smile just thinking about it.

  14. #14
    Registered User \007's Avatar
    Join Date
    Dec 2010
    Posts
    179
    Haha, it may be a poor algorithm but when the completion time is cut down 80% no one cares about the algorithm anymore

    However, it's best to max the algorithm and then split it up on multiple cores or the GPU etc.

    edit:

    Not to mention... splitting it up on multiple cores and lowering the time needed to complete the computation may help you design a better algorithm. It's easier to test something that completes in 40 seconds as opposed to 10 minutes etc.
    Last edited by \007; 12-15-2010 at 09:17 PM.

  15. #15
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    The OP may be long gone, but this is an interesting topic.

    This program uses the same nested-loop-per-letter-in-the-password, algorithm, with a few changes:

    1) Using an array of int's as indeces to the character set, it makes changes to the program, easier.

    2) It makes a lot of checks UPSTREAM, which short-circuits lots of processing later on.

    It decodes the encrypted text, in about a tenth of a second, at 2.66GHz, and covers the entire search space in about 5.3 seconds. The fast decode is primarily caused by luck, since the text begins with the letter 'A'. It tries 590,499 different permutations before finding the right answer. Note that the 590k number, doesn't include cut-offs made upstream - that count only includes those permutations where the password was "plausible".

    This is effective in this case, because the password was short (and known) at just five char's. Longer passwords would make it exponentially slower to solve, and show the entire algorithm to be inadequate.

    A better algorithm would take the first letter of the candidate password (cp), and check it's plausibility across the full length of the encrypted text:
    Code:
    the encrypted text  <-- the plain text we're looking for
    acornsacornsacorns  <-- the repeating password
    When starting out with the better algorithm, we'd do this:
    Code:
    odd = 0
    maxOdd = length of message/10 to 14 
    i=0
    while(i < length of message) 
      
      while(odd < maxOdd) 
        ch = encrypted[i] - cp[i]    
        test ch to see if it's an odd char across the entire encrypted text 
        the encrypted text
        a     a     a     a
        if ch is an impossibly odd char for your string (maybe '@') 
          break
        if slightly odd (maybe '.') odd++
      end while
      ++i;
    end while
    The above is rough (I haven't coded one like this), but I hope it shows the algorithm OK. The tremendous advantage is that the nested loops never get stuck doing millions of permutations, behind a first or second char, that can already be proven impossible.

    This short-circuits a LOT of possible work, but you'll notice it only with longer passwords.


    Code:
    /* When finished, it will encrypt and decrypt words. NOT designed for general cryptology.
    
      Status: not thoroughly tested.
    
      attempts to decrypt simple jumbled words where each letter has been
      added to a password:
       The encoded string   multiple words
     + passwordspasswords   (a repeating password)
       ===========================
      s[0] = 'T'+'p'; 
      s[1]='h'+'a'; 
      //etc.
    
    */
    
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <time.h>
    
    #define SPACE 32
    
    long fileSize(FILE *stream);
    
    int main() {
      unsigned int i,j, len, odd, maxOdd, minOdd;
      unsigned int oddb,oddc,oddd,odde, max;
      time_t start, stop;
      long fsize, count=0;
      //these are all the allowable char's for the password AND the text
      //the first letter of the password must be a regular letter
      char ok[]={
       32, 33, 34, 39, 44, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 63,\
       65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82,\
       83, 84, 85, 86, 87, 88, 89, 90, 97, 98, 99,100,101,102,103,104,105,106,\
      107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122 
      };
      unsigned int a,b,c,d,e; 
      unsigned char ch;
      
    /* str1 is used to create encrypted text, only. This is the phrase the OP's prof chose, not me. 
    I added some newline char's to it here, to fit it in under the forum display width. */
      //unsigned char str1[]="Am I going MAD, or did the word \"think\" escape your lips?
     You were not hired for your brains, you hippopotamic land mass.";  
      unsigned char str2[]="     ";  //was acorns repeating
      unsigned char str3[]="str3 should be empty, and as long as the plaintext in str1.
    Shortened here to stop breaking the forum display width ";
      unsigned char *s;
      FILE *fp, *fpwords;
    
      /*  //creates a new encrypted text file, if you don't have one
      fp=fopen("crypt.txt", "wb");
      for(i=0;i<sizeof(str1);i++) {
        str3[i] = str1[i]+str2[i];
        fputc(str3[i], fp);
      }
      fclose(fp);
      putchar('*');
      return 0;
      //end of block to create a new encrypted text file  
      */
      start=clock();
      max=sizeof(ok)/sizeof(ok[0]);
      fp=fopen("crypt.txt", "r");
      fsize=fileSize(fp);
      s=calloc(fsize, sizeof(char));
      
      if(s==NULL) {
        printf("\nError allocating memory\n");
        return 1;
      }
      printf("\nLength of encrypted string: %ld", fsize);
      fgets(s, fsize, fp);
      len=strlen(s);
      maxOdd=len/12;
      if(maxOdd<3) 
        maxOdd=3;
      minOdd = maxOdd;
      if(s[len-1]=='\n')
        s[len-1]='\0';
    
      printf("\nEncrypted Text:\n%s\n", s);
      fclose(fp);
    
      //start brute force checking
      oddb=oddc=oddd=odde=0;
      //first letter can't be an odd one: .',?!>-=", etc.
      for(a=18;a<max;a++) { //ok[18]=='A' 
        ch=s[0] - ok[a];
        if(ch > 122) //ch > 'z'
           continue;
        if((ch < 'A') || (ch>90 && ch<97)) { //1st letter must be at least 'A' 
          continue;                          //and not punctuation or odd char
        }
        str2[0]=ch;
        
        for(b=0;b<max;b++) { //could be a SPACE or odd char 
          ch=s[1] - ok[b];
          if(ch > 122) { //ch > 'z'
            continue;
          }
          if(ch < SPACE) {
            break;
          }
          oddb=0;
          if((ch>32 && ch<65) || (ch>90 && ch<97)) { 
            oddb++;
            if(oddb>1) {
              break; //continue;
            }
          }
          str2[1]=ch;
    
          for(c=0;c<max;c++) {  
            ch=s[2] - ok[c];
            if(ch > 122) //ch > 'z'
              continue;
            if(ch < SPACE) {
              break;
            }
            oddc=oddb;
            if((ch>32 && ch<65) || (ch>90 && ch<97)) {
              oddc++;
              if(oddc>1) {
                break; //continue;
              }
            }
            str2[2]=ch;
    
            for(d=0;d<max;d++) { 
              ch=s[3] - ok[d];
              if(ch > 122) //ch > 'z'
                continue;
              if(ch < SPACE) {
                break;
              }
              oddd=oddc;
              if((ch>32 && ch<65) || (ch>90 && ch<97)) {
                oddd++;
                if(oddd>1) {
                  break;  //was continue;
                }
              }
              str2[3]=ch;
    
              for(e=0;e<max;e++) {  
                ch=s[4] - ok[e];
                if(ch > 122) //ch > 'z'
                  continue;
                if(ch < SPACE) {
                  break; //continue;
                }
                odde=oddd;
                if((ch>32 && ch<65) || (ch>90 && ch<97)) {
                  odde++;
                  if(odde>2) {
                    continue;
                  }
                }
                str2[4]=ch;
                ++count;
                /*
                  install a trial password, full length of text and 
                  try to find a plain text with the fewest odd char's
                */ 
                odd=0;
                for(i=0, j=0;i<len;i++,j++) { //
                  if(j>4) {
                    j=0;
                  }
                  ch=(s[i]-str2[j]);
                  if(ch > 'z' || ch < SPACE) {
                    break; //continue;
                  }
                  if((ch>32 && ch<65) || (ch>90 && ch<97)) {
                    odd++;
                    if(odd>maxOdd) {
                      break;  //continue;
                    }
                  }
                  str3[i]=ch;
                }
                str3[i-1]='\0';
                if(i == len && odd < minOdd) {  
                  minOdd=odd;
                  stop=clock();
                  printf("\nTrials: %ld  Password: %s  Odd: %u Time: %.3f seconds\nText: %s\n",
                  count, str2, odd, (stop-start)/CLK_TCK, str3);
                  //getchar();
                }
              }
            }
          }
        }
      }
      free(s);
      stop=clock();
      printf("\nElapsed time was: %.3f seconds\n", (stop-start)/CLK_TCK);
      printf("\n\t\t\t     press enter when ready");
    
      (void) getchar(); 
      return 0;
    }
    long fileSize(FILE *stream)
    {
       long curpos, length;
    
       curpos = ftell(stream);
       fseek(stream, 0L, SEEK_END);
       length = ftell(stream);
       fseek(stream, curpos, SEEK_SET);
       return length;
    }
    Last edited by Adak; 12-24-2010 at 08:50 PM.

Page 1 of 2 12 LastLast
Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Keeping program running at constant speed.
    By twinzen in forum C++ Programming
    Replies: 2
    Last Post: 07-31-2010, 04:15 AM
  2. different ways to program
    By herWter in forum Windows Programming
    Replies: 3
    Last Post: 07-07-2008, 01:41 AM
  3. Creating a Carriage-Return style Break in a program
    By JoeMomma5000 in forum C Programming
    Replies: 2
    Last Post: 10-29-2002, 01:06 AM
  4. Tab Controls - API
    By -KEN- in forum Windows Programming
    Replies: 7
    Last Post: 06-02-2002, 10:44 AM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21