Thread: Hashing (non-technical question)

  1. #1
    Registered User
    Join Date
    May 2006
    Posts
    35

    Hashing (non-technical question)

    I'm a little unclear as to what hashing is exactly. I'm in the process of writing a word jumbling program and I came up with a scheme to compare a large number of words fairly quickly. I want to know if the following is a true form of hashing.

    I got the idea from a letter frequency count algorithm. It seemed like there would be far too many for loops in it for me. So I came up with this. Instead of using arrays, I would use bit fields. I reserve 28 bits. 26 for each letter of the alphabet, and 2 unused bits just to make it a multiple of four (4bits/byte).

    The word "cats" would be encoded like this:
    Code:
    **abcdefghijklmnopqrstuvwxyz
    0010100000000000000011000000
    Now, with a simple bitwise &, I can match all possible permutations AND lengths in one if(). (i.e. cats, cat, cast, a, act, at, sac, etc);

    In case you want to tell me this won't work, forget about the actual frequency of each letter. All I'm concerned with here is if contains AT LEAST one occurence of the letter. It will still match no matter how many occurences there are, but will have to be verified after the match is made.

    Would this be considered hashing or not?

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    I'd say it is.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    May 2006
    Posts
    35
    sorry for posting in the wrong section....i was originallyl going to ask a c++ question, but changed my mind.

    so when i read about hashing and it says "formula", it doesn't necessarily mean a traditional mathematical formula? ( i say "traditional" because i guess technically everything about programming could *possibly* be considered a formula)

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Anything which transforms a string (say "cats") into some numeric representation of that string is a hash function.

    You're taking a potentially large and sparse space (the sequence of all possible letters which happen to be English words) and reducing it to a much smaller space, with possible collisions (your scheme for example collides on anagrams - maybe this is important, then again, maybe it isn't).
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. another do while question
    By kbpsu in forum C++ Programming
    Replies: 3
    Last Post: 03-23-2009, 12:14 PM
  2. Technical Specialist - Update Delivery - C Required
    By ISCRecruiter in forum Projects and Job Recruitment
    Replies: 0
    Last Post: 07-21-2008, 09:50 AM
  3. callbacks and threads - technical question
    By Jumper in forum Windows Programming
    Replies: 2
    Last Post: 07-11-2004, 12:09 AM
  4. Question...
    By TechWins in forum A Brief History of Cprogramming.com
    Replies: 16
    Last Post: 07-28-2003, 09:47 PM
  5. Double Hashing
    By carrja99 in forum C++ Programming
    Replies: 1
    Last Post: 03-28-2003, 08:36 AM