Ways to speed up a bruteforce-style program

**inequity** · 12-13-2010

On an optional assignment for a recent C class, we were given a binary file in which some text had been jumbled up using a really simple algorithm.

Here's an example of how the text was encoded,
using a password "foobar" and string "THIS IS THE ENCODED STRING"

Code:

     THIS IS THE ENCODED STRING
+   foobarfoobarfoobarfoobarfoob

     ----------------------------------

=   îUUñ¿xí±åæärßßOíaO_±äòñxèO

So our professor gave us a binary file with a bunch of text in that has been "encrypted" like this, using a five letter lowercase word.

We have to go through the file trying to different strings to decode it. We also aren't given a "solved" string, so we have to determine if the result we get from decoding with any given string is in fact a valid message.

I've finished the assignment successfully, so that's not why I'm here, don't worry guys. But I'm just really interested in this, and think it would be interesting to see how much faster I could get the program to accomplish it's task.

Current I am running through 5 loops, each representing one character of the password, and inside them I run through the entire file, converting each character and making a note of when I encounter a BAD character ( any character not A-Z, a-z, or ! ? . , " ), and also making note of how many spaces I encounter.

If the current decoded attempt has the least bad characters and most spaces, then I store it as the current "most correct" string. If I read through the file and find less than two "weird" characters, I assume that we've found the string, and return the value.

Are there faster methods for doing this? Are there better methods for detecting whether or not a string is a valid string in the English language? I just think it would be fun to get this working better, so I don't have to try 8 million things to decode one sentence, but I dont really know where to begin.

Below is my code for the program, and a sample output

Code:

/**************************************************************************
filename    decode_jumbledstring.c
author      inequity
date        12/13/2010

Brief Description:
  This program reads in a binary file which as been jumbled using a
  a very simple algorithm. The original text has been encoded using
  a 5 digit string of lowercase characters, so this program attempts
  to decode it as fast as possible using different string (which 
  isn't very fast at the moment)
**************************************************************************/
#include <stdio.h>  /* printf, fopen, fclose, fseek, fgetc, feof */
#include <stdlib.h> /* malloc, free                              */
#include <string.h>

#define DEBUG /* comment this out to turn off unnecessary printfs */
#define PASSWORD_LENGTH 5

int decode(const unsigned char secret_phrase[], /* encrypted string  */
           unsigned char decoded_phrase[],      /* plain text string */
           unsigned char password[],            /* password used     */
           int length,                          /* string length     */
           int *passes                          /* amount of attempts*/
          );

int main(void)
{

    /* This file contains the encrypted message */
  const char *filename = "secret_phrase.bin";

    /* Account for NULL terminator */
  unsigned char password[PASSWORD_LENGTH + 1] = {0};
  unsigned char *secret_phrase; /* Encoded phrase */
  unsigned char *plain_text;    /* Decoded phrase */

  FILE *infile; /* File pointer we're reading from     */
  size_t size;  /* Number of bytes in the file         */
  size_t count; /* Number of bytes fread from the file */
  int found;    /* Was the password found              */
  int length;   /* Length of the string                */
  int *passes;  /* ++'d each time we try a new string  */

  /* initially set number of passes (attempts) we've made to zero */
  *passes = 0;

    /* Open in binary (raw) since we don't want any translations */
    infile = fopen(filename, "rb");
    if (!infile)
    {
      printf("Can't open %s for reading.\n", filename);
      return -1;
    }

    /* Count bytes in the file */
  size = 0;
  fgetc(infile);
  while (!feof(infile))
  {
    size++;
    fgetc(infile);
  }

    /* The file was empty */
  if (!size)
  {
    printf("Nothing to decode.\n");
    return -2;
  }

    /* Set cursor back to beginning of the file */
  fseek(infile, 0, SEEK_SET);

  length = sizeof(unsigned char)*(size+1);
    /* Allocate the proper amount of memory */
  secret_phrase = (unsigned char *)malloc(size + 1);
  plain_text = (unsigned char *)malloc(size + 1);

    /* Now read in the secret phrase from the file */
  count = fread(secret_phrase, 1, size, infile);
  if (count != size)
  {
    printf("Expected %i bytes, but only read %i.\n", size, count);
    return -3;
  }

    /* Terminate the string of bytes */
  secret_phrase[count] = 0;

    /* Done with the file reading */
  fclose(infile);

  printf("%s\n",secret_phrase);
    /* Attempt to decode the characters */
  found = decode(secret_phrase, plain_text, password, length, passes);
    /* Did we find the password or not? */
  if (found)
  {
    printf("\n\nFOUND!\n");
    printf("The password is: %s\n", password);
    printf("The message was: %s\n", plain_text);
    printf("Tried %d combinations.\n",*passes);
  }
  else
    printf("\n\nThe secret phrase was not discovered.\n");
    /* Free the memory that we allocated. */
  free(secret_phrase);
  free(plain_text);

  return 0;
}

int decode(const unsigned char secret_phrase[], /* encrypted string  */
           unsigned char decoded_phrase[],      /* plain text string */
           unsigned char password[],            /* password used     */
           int length,                          /* length of string  */
           int *passes                          /* number of attempts*/
          )
{
  int i, j, k, l, m, n;         /* Loop counters                                  */
  unsigned char charPtr;        /* Pointer to read values of characters           */
  unsigned char tempStr[200];   /* Temporary to hold string from current          */
                                /* decode attempt.                                */
  int weirdChars = 0;           /* How many bad characters we've encountered      */
  int leastWeirdChars = 40;     /* Least bad characters we've found in an attempt */
  int spacesFound = 0;          /* Spaces found in current attempt                */
  int mostSpacesFound = 0;      /* Most spaces we've found in an attempt yet      */
  
  /* Increment through the alphabet for each of the letters in the password */
  for(i = 'a'; i <= 'z'; i++)
  {
    for(j = 'a'; j <= 'z'; j++)
    {
      for(k = 'a'; k <= 'z'; k++)
      {
        for(l = 'a'; l <= 'z'; l++)
        {
          for(m = 'a'; m <= 'z'; m++)
          {
            /* Initialize variables */
            *passes += 1;
            weirdChars = 0;
            spacesFound = 0;
            /* Set password characters to loop counter values */
            password[0] = i;
            password[1] = j;
            password[2] = k;
            password[3] = l;
            password[4] = m;
            
            /* For each character in the string */
            for(n = 0; n < length; n++)
            {
              /* Set charPtr to the secret_phrase minus the matching password char */
              charPtr = secret_phrase[n] - (password[n % PASSWORD_LENGTH]);
              /* uppercase 65-90, lowercase 97-122, 32, 33, 44, 46, 63, 59 */
              /* regular char, uppercase */
              if( charPtr >= 65 && charPtr <= 90)
              {
              
              }
              /* regular char, lowercase */
              else if( charPtr >= 97 && charPtr <= 122)
              {
              
              }
              /* check if character is ? ! . , ; */
              else if( charPtr == 33 || charPtr == 44 || charPtr == 46 
                    || charPtr == 59 || charPtr == 63 )
              {

              }
              else if( charPtr == 34 )
              {
              
              }
              else if( charPtr == 32 )
              {
                spacesFound++;
              }
              else
              {
                weirdChars++;
              }
                tempStr[n] = charPtr;
                
                if(weirdChars > leastWeirdChars)
                {
                  n = length;
                }
            }
            
            
                
            if(weirdChars <= leastWeirdChars && spacesFound >= mostSpacesFound)
            {
              tempStr[length-1] = ' ';
              tempStr[length] = ' ';
              for(n = 0; n <= length; n++)
              {
                decoded_phrase[n] = tempStr[n];
              }
              decoded_phrase[length] = 0;
              #ifdef DEBUG
              printf("%s - password: %s\n",decoded_phrase,password);
              #endif
              leastWeirdChars = weirdChars;
              mostSpacesFound = spacesFound;
              
              if(weirdChars < 2)
              {
                return 1;
              }
            }
          }
        }
      }
    }
  } 
  return 0;
}

Code:

'á,²ZUaIxO"A¥-s"aÖ%OÜO,YÖO"UOàx"+YÖÜâI<ZOçÇE_O"YOaå"DO_æ3,AYè"UIàO"OOâ"ÜIUOx"EOà"íO_à"ÖÖExáç?%çâé,ÑxaäOUYçOÑOÑ"àÅxO"áÅÜá¡
S?"I y,kng2`CD,2,t d{w"thw3yorv3$th{?m" w+eapw3{ou,3nip.R"Yo╪3yerw3pot2{krev3hor2Oqur2utai?+. y?^"hi,ƒqpo+toic2⌂cnd2?css@  - password: aabin
S?!I y,jng2`BD,2,s d{w!thw3xorv3#th{?l" w+dapw3zou,3mip.R!Yo╪3xerw3oot2{jrev3gor2Opur2usai?+- y?^!hi,ƒppo+tnic2⌂bnd2?bss@  - password: aacin
S? K y,ipg2`AF,2,r"d{w vhw3wqrv3"vh{?k$ w+ccpw3yqu,3lkp.R [o╪3wgrw3nqt2{itev3fqr2Oowr2urci?+,"y?^ ji,ƒoro+tmkc2⌂apd2?aus@  - password: aadgn
S? J y,iog2`AE,2,r!d{w uhw3wprv3"uh{?k# w+cbpw3ypu,3ljp.R Zo╪3wfrw3npt2{isev3fpr2Oovr2urbi?+,!y?^ ii,ƒoqo+tmjc2⌂aod2?ats@  - password: aadhn
S? I"y,ini2`AD.2,r f{w tjw3wotv3"tj{?k""w+carw3yow,3lir.R Yq╪3wetw3nov2{irgv3fot2Oout2urak?+, {?^ hk,ƒopq+tmie2⌂anf2?asu@  - password: aadil
S? I!y,inh2`AD-2,r e{w tiw3wosv3"ti{?k"!w+caqw3yov,3liq.R Yp╪3wesw3nou2{irfv3fos2Oous2uraj?+, z?^ hj,ƒopp+tmid2⌂ane2?ast@  - password: aadim
S? I y,ing2`AD,2,r d{w thw3worv3"th{?k" w+capw3you,3lip.R Yo╪3werw3not2{irev3for2Oour2urai?+, y?^ hi,ƒopo+tmic2⌂and2?ass@  - password: aadin
S⌂ I y?ing2_AD,2?r d{v thw2worv2"th{?k" w.capw2you,2lip.Q Yo╪2werw2not2zirev2for2<our2trai?., y?╪ hi,,opo+smic2~and2⌂ass@  - password: abdin
S~ I y?ing2^AD,2?r d{u thw1worv1"th{⌂k" w,capw1you,1lip.P Yo╪1werw1not2yirev1for2Sour2srai?,, y?+ hi,?opo+rmic2}and2~ass@  - password: acdin
S} I y⌂ing2]AD,2⌂r d{t thw0worv0"th{~k" wƒcapw0you,0lip.O Yo╪0werw0not2xirev0for2%our2rrai?ƒ, y?. hi,?opo+qmic2|and2}ass@  - password: addin
S| I y~ing2\AD,2~r d{s thw/worv/"th{}k" w,capw/you,/lip.N Yo╪/werw/not2wirev/for2^our2qrai?,, y?, hi,⌂opo+pmic2{and2|ass@  - password: aedin
S{ I y}ing2[AD,2}r d{r thw.worv."th{|k" w?capw.you,.lip.M Yo╪.werw.not2virev.for2╪our2prai??, y?ƒ hi,~opo+omic2zand2{ass@  - password: afdin
Sy I y{ing2YAD,2{r d{p thw,worv,"th{zk" w⌂capw,you,,lip.K Yo╪,werw,not2tirev,for2.our2nrai?⌂, y?? hi,|opo+mmic2xand2yass@  - password: ahdin
So I yqing2OAD,2qr d{f thw"worv""th{pk" wucapw"you,"lip.A Yo╪"werw"not2jirev"for2{our2drai?u, y?w hi,ropo+cmic2nand2oass@  - password: ardin
Sn I yping2NAD,2pr d{e thw!worv!"th{ok" wtcapw!you,!lip.@ Yo╪!werw!not2iirev!for2zour2crai?t, y?v hi,qopo+bmic2mand2nass@  - password: asdin
Sm J yoiog2MAE,2or!d{d uhw wprv "uh{nk# wscbpw ypu, ljp.? Zo╪ wfrw npt2hisev fpr2yovr2brbi?s,!y?u ii,poqo+amjc2laod2mats@  - password: atdhn
Sm I"yoini2MAD.2or f{d tjw wotv "tj{nk""wscarw yow, lir.? Yq╪ wetw nov2hirgv fot2yout2brak?s, {?u hk,popq+amie2lanf2masu@  - password: atdil
Sm I!yoinh2MAD-2or e{d tiw wosv "ti{nk"!wscaqw yov, liq.? Yp╪ wesw nou2hirfv fos2yous2braj?s, z?u hj,popp+amid2lane2mast@  - password: atdim
Sm I yoing2MAD,2or d{d thw worv "th{nk" wscapw you, lip.? Yo╪ werw not2hirev for2your2brai?s, y?u hi,popo+amic2land2mass@  - password: atdin
Rm I xoing1MAD,1or dzd thv woru "thznk" vscapv youƒ lip,? Yo+ werv not1hireu for1your1brai⌂s, y?u hi?popo.amic1land1mass?  - password: btdin
Om I uoing.MAD,.or dwd ths worr "thwnk" sscaps you? lip?? Yoƒ wers not.hirer for.your.brai|s, y}u hi~popo,amic.land.mass<  - password: etdin
Mm I soing,MAD,,or dud thq worp "thunk" qscapq you~ lip? Yo? werq not,hirep for,your,braizs, y{u hi|popo?amic,land,mass:  - password: gtdin
Cm I ioing"MAD,"or dkd thg worf "thknk" gscapg yout lipu? Yow werg not"hiref for"your"braips, yqu hirpopovamic"land"mass0  - password: qtdin
Bm I hoing!MAD,!or djd thf wore "thjnk" fscapf yous lipt? Yov werf not!hiree for!your!braios, ypu hiqpopouamic!land!mass/  - password: rtdin
Ao I gqing OAD, qr dif the"word""thipk" eucape"your"lipsA You"were"not jired"for {our drainu, yow hipropotcmic nand oass.  - password: srdin
An I gping NAD, pr die the!word!"thiok" etcape!your!lips@ You!were!not iired!for zour craint, yov hipqopotbmic mand nass.  - password: ssdin
Am J goiog MAE, or!did uhe wprd "uhink# escbpe ypur ljps? Zou wfre npt hised fpr yovr brbins,!you iippoqotamjc laod mats.  - password: stdhn
Am I"goini MAD. or fid tje wotd "tjink""escare yowr lirs? Yqu wete nov hirgd fot yout brakns, {ou hkppopqtamie lanf masu.  - password: stdil
Am I!goinh MAD- or eid tie wosd "tiink"!escaqe yovr liqs? Ypu wese nou hirfd fos yous brajns, zou hjppopptamid lane mast.  - password: stdim
Am I going MAD, or did the word "think" escape your lips? You were not hired for your brains, you hippopotamic land mass.  - password: stdin


FOUND!
The password is: stdin
The message was: Am I going MAD, or did the word "think" escape your lips? You were not hired for your brains, you hippopotamic land mass.
Tried 8561762 combinations.

Anyways, any help would be greatly appreciated. And any comments on my coding style are also greatly appreciated.

**anduril462** · 12-13-2010

First, don't use feof to control a loop: Cprogramming.com FAQ > Why it's bad to use feof() to control a loop. Other than that, I don't think your coding style is too bad.

As for optimization, with a fixed password length that's as short as yours, and a relatively short cipher text, I don't know that 8 million possibilities is that bad, even if you multiply that by the length of your string (since you process characters individually). Should be reasonably quick on new computers. In anycase, here are a few things I would do:

The general strategy here is to analyze your cipher text, determine what the most likely choices for password characters are for each slot and try those solutions first.

1. Don't check the size of the file with fgetc in a loop. Use some arithmetic with fseek or use a platform specific function like stat on Unix/Linux (don't know the Windows equivalent). This is relatively small, but could still help if you had a large text to decode.
2. Use standard library function like strcpy, they're super efficient compared to your for loop.
3. Do a frequency analysis of the most common encoded characters. They should roughly map to the most frequent characters of your plain-text language. E.g. 'e' accounts for roughly 12% of all letters in regular English text, so if you find an encoded character that accounts for roughly 12% of encoded characters, it's probably the encoding of an 'e'.
4. Go through the string once and build a list/range of possible characters for each letter of your password. That is, if the character in position 0 is î, then you may be able to tell that subtracting 'a' through 'g' would still result in a weird character, as would, say, 'w' through 'z', thus your only valid character for the first letter of your password would be 'h' through 'v'. Since your password repeats so many times through out your cipher text, you can potentially narrow this down to a set of 5 or 10 likely characters. If you can't actually throw away letters as potential password characters , you could try sorting them in order of most likely to least likely so you're likely to come across the best solution first.
5. Break out of your loop once your weirdCharacters gives you worse performance. This will compounded with #'s 3 and 4 can help a lot.
6. You don't actually need to copy the whole cipher text each time you find a better password. Just copy the password, then do a full decode at the end, since your text is much longer than the password.

Numbers 2 and 3 pay off more when your encoded text gets longer as letter frequencies become more accurate and you can better narrow down the range of possible characters for an individual password character.

As for determining whether you have a legit English phrase, you could look at estimating string weirdness and spaciness. Say you are willing to accept .1% weird characters (1 in 1,000). If your string is 1,000 chars long and you find 2 weird chars in the first 10 positions, you can bail out after checking just 10 characters b/c the string is just too weird to be properly decoded. Do likewise for spaces.

That's all I have for now.

**Adak** · 12-14-2010

This was a fun program! I spent a lot of time poking around with it.

The one thing that made more difference than anything else I tried, was to short-circuit the for loops - just as soon as possible.

When you want the trials letters that are being generated as a possible solution, you see a lot of junk being produced. That can steal a LOT of time. If you know the password can't have a $ in it, for instance. Then don't make it a candidate letter, only to be rejected by yet more logic, later!

Probably because I was tired, but despite working with ranges quite a bit, I didn't find them a huge help. A few if statements made the code much more readable, and did the same work.

Adding the "candidate letter guardian" if statements, and keeping it simple, gave me the best solving times, by far. I can out simple most anyone, but let's keep that between ourselves, ok?

I don't know if it's still as good as it used to be, but there was a crypto group on usenet (the newsgroups) side of the internet, and they had a couple fantastic cryptographers occasionally visit the group. Did some amazing de-crypting. I'd run this by them, along with your program, and some actual solving times, if you're interested.

In any case, a really fun assignment.

P.S. The if statements were immediately after the for loop:

Code:

for(.........) {
  if(this letter makes junk) continue;

  Two more, just about like it

  for("next for loop) {
    //very similar if statements

     //etc.

  }

}

**anduril462** · 12-14-2010

Nice Adak! I was wishing I had time to actually get some metrics on that stuff, but I threw it together just before leaving work yesterday and never got back to it. Any idea how your average- and worst-case big O stack up against the OP's? I'd be curious to know what kind of performance gain you got.

Crypto stuff like this really is a blast. One of my favorite assignments (from a linguistics class of all things) was to decrypt a Huffman coded text without the dictionary. A fun little algorithm and not too complicated.

**Adak** · 12-14-2010

Over half the Summer, I did a project looking for a 16 givens Sudoku grid with just a single solution. As odd as it might sound, the backbone of the project was about 16 nested loops. Trillion+ grids generated and solved. Cut offs or smart ranges to begin with, are the only thing that makes loops like that bearable on such programs.

My idea here, was to try and NOT generate any "weird" char's, because you're basically working with an "odometer" here - the .1 mile wheel, won't bother you, but wait until you have a "weird" char up in the thousands wheel.

With the clearer letters, I was hoping to tie the program into my special dictionary files. The words are all in files, according to their length, 3words.txt has all the common words with three letters only, etc. Some of these files also have the letter patterns included, like:

123 cat
122 ill
121 mom
etc.

Once the first few words are found, the whole thing should "solve", like Sudoku when you get down to naked singles. That's my hope, anyway.

I spent too much time with the range array of structs (lo, hi), for each letter. When I changed it over -- well, the program now looks like a grenade hit it.

The big O for this, must be truly appalling. N! * N!... ?? Without some help, you could be here a very long time. Before the addition of the speed up's, I could really see it too.

I'm going to tinker some more with this, and clean it up. Don't have a ton of time right now, but it's too good to let go of without a bit more.

I have not met Mr. Huffman, but he sounds quite interesting.

**brewbuck** · 12-14-2010

Instead of inventing your own method for determining the quality of your solution, use Index Of Coincidence.

**Adak** · 12-15-2010

The int pointer "*passes" is initialized with no address, and subsequently used. That needs fixing.

Anyone else get it to run?

I believe I also have a character set problem with this program. I need to generate my own binary of encrypted message.

Thanks for that link, Brewbuck. That Index of Coincidence is pretty clever stuff!

In the assignment, the length of the password was given, but how often is *that* going to occur?

**anduril462** · 12-15-2010

Originally Posted by Adak

The int pointer "*passes" is initialized with no address, and subsequently used. That needs fixing.

Anyone else get it to run?

Yeah, I only had to change passes to a plain int (instead of int *), then stop dereferencing it throughout main. Otherwise, the OP's program worked as expected.

I believe I also have a character set problem with this program. I need to generate my own binary of encrypted message.

I wrote a little program (attached) that seems to encode everything fine, but vi has a habit of appending a new line onto the end of text files, so I had to fix that too.

Your approach took me from 8,561,762 iterations down to 144! Granted each time I exited early, I still had to scan at least part of secret_phrase to count weird characters, but I could jump in 5 character increments and quit as soon as I exceeded the 1 weird character limit.

My intuition says we went from 26^p to 26*p worst-case complexity (where p is the password length), which is pretty darn good. I have yet to play with the index of coincidence stuff yet, but I imagine the time complexity is the same as your algorithm if the password length is known.

**Adak** · 12-15-2010

I got it going with:

Code:

  /*  //creates a new encrypted text file 
  fp=fopen("crypt.txt", "wb");
  for(i=0;i<sizeof(str1);i++) {
    str3[i] = str1[i]+str2[i];
    fputc(str3[i], fp);
  }
  fclose(fp);
  putchar('*');
  return 0;
  */

but I d/l'ed your encode.c anyway

Running his program (the original one, but with the int variable passes being initialized to zero, his program is showing only 420k+ passes (far from 8 million+).

Maybe that 8 million was wacko?

I looked, at my word lists - "hippopotamic" was NOT in there!

Weird word regards.

**anduril462** · 12-15-2010

I still got 8 million from his original code with just the passes modification. I think this fits since he was trying every combination (redoing his m loop for every l loop, etc). It breaks down like this:
For the password stdin, the OP goes through every possible combination for start letters a-r, then on s he only does part of the next 4 loops when the first letter is s (stopping when the second letter is t). This means that for the letters(position in alphabet) s(19) t(20) d(4) i(9) n(14), you have the following number of iterations of your inner loop:

aaaaa-rzzzz = 18 * 26^4 = 8,225,568
saaaa-sszzz = 19 * 26^3 = 333,944
staaa-stczz = 3 * 26^2 = 2,028
stdaa-stdhz = 8 * 26 = 208
stdia - stdin = 14

If you add up all those numbers, you get 8,561,762, which is what the OP got as well. Maybe you have some mod in there you forgot about?

**Adak** · 12-15-2010

You nailed it. It was late; the program was close, but being cranky, so I took a monkey wrench to it, and forgot about it.

It does that "run once too far (which I attribute to the !feof() file checking), so I knocked down the length of the message to just "size", instead of "size + 1".

With size +1, I get a spurious char right on the end, and the program says the secret message is not discovered. I left the malloc's unchanged.

Cutting the workload in half, by fixing a bug - I love it!

**\007** · 12-15-2010

I didn't read the code too much, but can you run some of these loops in parallel? Take advantage of multiple CPUs.. or better yet, if it can run in parallel then try to run the program on the GPU instead of the CPU.

You can also use your C language/code, but need to modify it a bit.

CUDA - Wikipedia, the free encyclopedia

**Adak** · 12-15-2010

Well yes, you can run them in parallel (just split up the string across the other cores), but that's cheating.

Ditto with GPU's. These newer GPU's with CUDA, are *racehorses*. Not the most durable racehorses, however.

First thing is to get the algorithm down to it's best running time - and I'm sure we're not there yet (at least my version isn't there yet). You may want to add or subtract some features of your program, etc. Best to do that now

THEN, you can take this best algorithm/program, and run it on whatever you have, knowing that within reason (reason being no optimizations for different hardware), you have the best program running.

I'm an elitist in that way. A program with a mediocre algorithm, running on a fast NVidia GPU, is a total bore. I'd much prefer a program with a great algorithm, running on some puny cpu, for now. Because maybe next month, I'll get it ported over to that hot GPU or overclocked six core cpu.

I smile just thinking about it.

**\007** · 12-15-2010

Haha, it may be a poor algorithm but when the completion time is cut down 80% no one cares about the algorithm anymore

However, it's best to max the algorithm and then split it up on multiple cores or the GPU etc.

edit:

Not to mention... splitting it up on multiple cores and lowering the time needed to complete the computation may help you design a better algorithm. It's easier to test something that completes in 40 seconds as opposed to 10 minutes etc.

**Adak** · 12-24-2010

The OP may be long gone, but this is an interesting topic.

This program uses the same nested-loop-per-letter-in-the-password, algorithm, with a few changes:

1) Using an array of int's as indeces to the character set, it makes changes to the program, easier.

2) It makes a lot of checks UPSTREAM, which short-circuits lots of processing later on.

It decodes the encrypted text, in about a tenth of a second, at 2.66GHz, and covers the entire search space in about 5.3 seconds. The fast decode is primarily caused by luck, since the text begins with the letter 'A'. It tries 590,499 different permutations before finding the right answer. Note that the 590k number, doesn't include cut-offs made upstream - that count only includes those permutations where the password was "plausible".

This is effective in this case, because the password was short (and known) at just five char's. Longer passwords would make it exponentially slower to solve, and show the entire algorithm to be inadequate.

A better algorithm would take the first letter of the candidate password (cp), and check it's plausibility across the full length of the encrypted text:

Code:

the encrypted text  <-- the plain text we're looking for
acornsacornsacorns  <-- the repeating password

When starting out with the better algorithm, we'd do this:

Code:

odd = 0
maxOdd = length of message/10 to 14 
i=0
while(i < length of message) 
  
  while(odd < maxOdd) 
    ch = encrypted[i] - cp[i]    
    test ch to see if it's an odd char across the entire encrypted text 
    the encrypted text
    a     a     a     a
    if ch is an impossibly odd char for your string (maybe '@') 
      break
    if slightly odd (maybe '.') odd++
  end while
  ++i;
end while

The above is rough (I haven't coded one like this), but I hope it shows the algorithm OK. The tremendous advantage is that the nested loops never get stuck doing millions of permutations, behind a first or second char, that can already be proven impossible.

This short-circuits a LOT of possible work, but you'll notice it only with longer passwords.

Code:

/* When finished, it will encrypt and decrypt words. NOT designed for general cryptology.

  Status: not thoroughly tested.

  attempts to decrypt simple jumbled words where each letter has been
  added to a password:
   The encoded string   multiple words
 + passwordspasswords   (a repeating password)
   ===========================
  s[0] = 'T'+'p'; 
  s[1]='h'+'a'; 
  //etc.

*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>

#define SPACE 32

long fileSize(FILE *stream);

int main() {
  unsigned int i,j, len, odd, maxOdd, minOdd;
  unsigned int oddb,oddc,oddd,odde, max;
  time_t start, stop;
  long fsize, count=0;
  //these are all the allowable char's for the password AND the text
  //the first letter of the password must be a regular letter
  char ok[]={
   32, 33, 34, 39, 44, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 63,\
   65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82,\
   83, 84, 85, 86, 87, 88, 89, 90, 97, 98, 99,100,101,102,103,104,105,106,\
  107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122 
  };
  unsigned int a,b,c,d,e; 
  unsigned char ch;
  
/* str1 is used to create encrypted text, only. This is the phrase the OP's prof chose, not me. 
I added some newline char's to it here, to fit it in under the forum display width. */
  //unsigned char str1[]="Am I going MAD, or did the word \"think\" escape your lips?
 You were not hired for your brains, you hippopotamic land mass.";  
  unsigned char str2[]="     ";  //was acorns repeating
  unsigned char str3[]="str3 should be empty, and as long as the plaintext in str1.
Shortened here to stop breaking the forum display width ";
  unsigned char *s;
  FILE *fp, *fpwords;

  /*  //creates a new encrypted text file, if you don't have one
  fp=fopen("crypt.txt", "wb");
  for(i=0;i<sizeof(str1);i++) {
    str3[i] = str1[i]+str2[i];
    fputc(str3[i], fp);
  }
  fclose(fp);
  putchar('*');
  return 0;
  //end of block to create a new encrypted text file  
  */
  start=clock();
  max=sizeof(ok)/sizeof(ok[0]);
  fp=fopen("crypt.txt", "r");
  fsize=fileSize(fp);
  s=calloc(fsize, sizeof(char));
  
  if(s==NULL) {
    printf("\nError allocating memory\n");
    return 1;
  }
  printf("\nLength of encrypted string: %ld", fsize);
  fgets(s, fsize, fp);
  len=strlen(s);
  maxOdd=len/12;
  if(maxOdd<3) 
    maxOdd=3;
  minOdd = maxOdd;
  if(s[len-1]=='\n')
    s[len-1]='\0';

  printf("\nEncrypted Text:\n%s\n", s);
  fclose(fp);

  //start brute force checking
  oddb=oddc=oddd=odde=0;
  //first letter can't be an odd one: .',?!>-=", etc.
  for(a=18;a<max;a++) { //ok[18]=='A' 
    ch=s[0] - ok[a];
    if(ch > 122) //ch > 'z'
       continue;
    if((ch < 'A') || (ch>90 && ch<97)) { //1st letter must be at least 'A' 
      continue;                          //and not punctuation or odd char
    }
    str2[0]=ch;
    
    for(b=0;b<max;b++) { //could be a SPACE or odd char 
      ch=s[1] - ok[b];
      if(ch > 122) { //ch > 'z'
        continue;
      }
      if(ch < SPACE) {
        break;
      }
      oddb=0;
      if((ch>32 && ch<65) || (ch>90 && ch<97)) { 
        oddb++;
        if(oddb>1) {
          break; //continue;
        }
      }
      str2[1]=ch;

      for(c=0;c<max;c++) {  
        ch=s[2] - ok[c];
        if(ch > 122) //ch > 'z'
          continue;
        if(ch < SPACE) {
          break;
        }
        oddc=oddb;
        if((ch>32 && ch<65) || (ch>90 && ch<97)) {
          oddc++;
          if(oddc>1) {
            break; //continue;
          }
        }
        str2[2]=ch;

        for(d=0;d<max;d++) { 
          ch=s[3] - ok[d];
          if(ch > 122) //ch > 'z'
            continue;
          if(ch < SPACE) {
            break;
          }
          oddd=oddc;
          if((ch>32 && ch<65) || (ch>90 && ch<97)) {
            oddd++;
            if(oddd>1) {
              break;  //was continue;
            }
          }
          str2[3]=ch;

          for(e=0;e<max;e++) {  
            ch=s[4] - ok[e];
            if(ch > 122) //ch > 'z'
              continue;
            if(ch < SPACE) {
              break; //continue;
            }
            odde=oddd;
            if((ch>32 && ch<65) || (ch>90 && ch<97)) {
              odde++;
              if(odde>2) {
                continue;
              }
            }
            str2[4]=ch;
            ++count;
            /*
              install a trial password, full length of text and 
              try to find a plain text with the fewest odd char's
            */ 
            odd=0;
            for(i=0, j=0;i<len;i++,j++) { //
              if(j>4) {
                j=0;
              }
              ch=(s[i]-str2[j]);
              if(ch > 'z' || ch < SPACE) {
                break; //continue;
              }
              if((ch>32 && ch<65) || (ch>90 && ch<97)) {
                odd++;
                if(odd>maxOdd) {
                  break;  //continue;
                }
              }
              str3[i]=ch;
            }
            str3[i-1]='\0';
            if(i == len && odd < minOdd) {  
              minOdd=odd;
              stop=clock();
              printf("\nTrials: %ld  Password: %s  Odd: %u Time: %.3f seconds\nText: %s\n",
              count, str2, odd, (stop-start)/CLK_TCK, str3);
              //getchar();
            }
          }
        }
      }
    }
  }
  free(s);
  stop=clock();
  printf("\nElapsed time was: %.3f seconds\n", (stop-start)/CLK_TCK);
  printf("\n\t\t\t     press enter when ready");

  (void) getchar(); 
  return 0;
}
long fileSize(FILE *stream)
{
   long curpos, length;

   curpos = ftell(stream);
   fseek(stream, 0L, SEEK_END);
   length = ftell(stream);
   fseek(stream, curpos, SEEK_SET);
   return length;
}

Thread: Ways to speed up a bruteforce-style program

Thread Tools

Search Thread

Display

Ways to speed up a bruteforce-style program

Similar Threads

Keeping program running at constant speed.

different ways to program

Creating a Carriage-Return style Break in a program

Tab Controls - API