Thread: non byte aligned memory access

  1. #1
    Registered User
    Join Date
    Nov 2007
    Posts
    3

    non byte aligned memory access

    I have a situation where I need to store a large number of long case-insensitive alphabetic strings in RAM. I need to conserve as much space as possible.

    My thinking is that since I only have 27 values (26 letters plus null) to represent for each character, I can get away with having each character represented by only 5 bits instead of the usual 7 (actually 8) bit ASCII.

    It's easy enough to reduce an 8-bit ASCII char to 5 bits by doing a bitwise AND of each char with 0001 1111; a left shift by 3 then moves those 5 bits to the top of the byte, primed for packing.
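
    A minimal sketch of that masking step, assuming plain ASCII input. The names `encode5` and `decode5` are hypothetical, not from the thread. Because 'A' (0x41) and 'a' (0x61) differ only in bit 5, the mask folds both cases onto the same code, and NUL stays 0. The sketch keeps the code in the low 5 bits; the shift by 3 is only needed if the bits should sit at the top of a byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map an ASCII letter (either case) or NUL
   to a 5-bit code. 'A'/'a' -> 1 ... 'Z'/'z' -> 26, NUL -> 0.
   Assumes the caller only passes letters or NUL. */
static uint8_t encode5(char c)
{
    return (uint8_t)((unsigned char)c & 0x1F);
}

/* Hypothetical inverse: 0 -> NUL, 1..26 -> 'a'..'z'.
   Case is lost by the encoding, so lowercase is an arbitrary choice. */
static char decode5(uint8_t code)
{
    assert(code <= 26);
    return code == 0 ? '\0' : (char)('a' + code - 1);
}
```

    Codes 27 to 31 are unused by this scheme, so they stay free for anything else the application might need.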

    However, I have no idea how I would go about writing this to RAM and then having the next character start at the next unused bit, and so on. The problem is that single characters will span byte boundaries.

    Essentially, I want RAM to look like this,
    e.g. for eight characters named a, b, c, d, e, f, g, h:

    aaaaabbb bbcccccd ddddeeee efffffgg ggghhhhh

    (the above shows five bytes according to which character each bit is part of)

    Except that I won't always be writing in neat blocks of 8 characters that fill exactly 5 bytes.
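
    For string lengths that are not a multiple of 8, the storage needed is the bit count rounded up to whole bytes. A small sketch of that arithmetic, with `packed_bytes` as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed to hold nchars 5-bit codes,
   i.e. ceil(5 * nchars / 8). The last byte may be partly unused. */
static size_t packed_bytes(size_t nchars)
{
    assert(nchars <= (SIZE_MAX - 7) / 5); /* guard against overflow */
    return (nchars * 5 + 7) / 8;
}
```

    Eight characters give exactly 5 bytes, as in the diagram; three characters need 15 bits, so 2 bytes with one bit left over.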

    I'm not sure if it is possible to do this, perhaps I might be able to inline assembly code if it can't be done in C proper?

    Also, I don't really care how the characters are stored; the data type doesn't matter. All I want is to be able to read and write these raw numeric values easily.

    Thanks. (I'm new to programming, so if this is a completely crazy question I apologize :-) )

  2. #2
    Hurry Slowly vart's Avatar
    Join Date
    Oct 2006
    Location
    Rishon LeZion, Israel
    Posts
    6,788
    Make a buffer wrapper: a struct that holds a pointer to the first byte, a byte offset, and a bit offset within the current byte,

    plus access functions that take a pointer to this struct and a request to read or write several bits from the buffer.

    To read, load (for example) 2 successive bytes as a 16-bit value, then use the current bit offset and the required bit count to calculate the right shift that removes the unneeded bits at the end,
    and then mask the result by the required bit count to remove the already-read bits at the beginning...

    After that, update the byte and bit offsets as needed...

    The write operation works in a similar way.
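
    The steps above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: `BitBuf`, `bitbuf_write`, `bitbuf_read` and `bitbuf_advance` are hypothetical names, fields are stored most significant bit first, each field is assumed to be 1 to 8 bits wide, and the caller is assumed to supply a buffer that is large enough. The second byte of the 16-bit window is only touched when the field actually spills into it, so the last byte of the buffer is never overrun.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper around a caller-owned byte buffer. */
typedef struct {
    uint8_t *data;     /* first byte of the buffer */
    size_t   byte_off; /* index of the current byte */
    unsigned bit_off;  /* bits already used in the current byte, 0..7 */
} BitBuf;

/* Move the position forward by nbits. */
static void bitbuf_advance(BitBuf *b, unsigned nbits)
{
    unsigned total = b->bit_off + nbits;
    b->byte_off += total / 8;
    b->bit_off   = total % 8;
}

/* Write the low nbits (1..8) of value at the current position. */
static void bitbuf_write(BitBuf *b, unsigned value, unsigned nbits)
{
    assert(nbits >= 1 && nbits <= 8);
    /* Position of the field inside a 16-bit window over two bytes. */
    unsigned shift = 16 - b->bit_off - nbits;
    uint16_t mask  = (uint16_t)(((1u << nbits) - 1u) << shift);
    uint16_t bits  = (uint16_t)((value << shift) & mask);
    uint8_t  m_hi  = (uint8_t)(mask >> 8);
    uint8_t  m_lo  = (uint8_t)(mask & 0xFF);
    uint8_t *p     = b->data + b->byte_off;

    p[0] = (uint8_t)((p[0] & (uint8_t)~m_hi) | (uint8_t)(bits >> 8));
    if (m_lo != 0) /* field spills into the next byte */
        p[1] = (uint8_t)((p[1] & (uint8_t)~m_lo) | (uint8_t)(bits & 0xFF));
    bitbuf_advance(b, nbits);
}

/* Read nbits (1..8) from the current position. */
static unsigned bitbuf_read(BitBuf *b, unsigned nbits)
{
    assert(nbits >= 1 && nbits <= 8);
    const uint8_t *p = b->data + b->byte_off;
    unsigned window = (unsigned)p[0] << 8;
    if (b->bit_off + nbits > 8) /* field continues in the next byte */
        window |= p[1];
    /* Shift drops the bits after the field; mask drops the ones before. */
    unsigned shift = 16 - b->bit_off - nbits;
    unsigned value = (window >> shift) & ((1u << nbits) - 1u);
    bitbuf_advance(b, nbits);
    return value;
}
```

    Writing the 5-bit codes 1 through 8 in order into a zeroed 5-byte buffer with this sketch reproduces the aaaaabbb bbcccccd ... layout from the first post, and reading them back with a fresh `BitBuf` at offset 0 returns the same codes.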
    All problems in computer science can be solved by another level of indirection,
    except for the problem of too many layers of indirection.
    – David J. Wheeler

  3. #3
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Are you sure 5 bits is all you need? That gives you 32 possible characters (or 31 + NULL). You can't have mixed case letters, numbers, or a very large number of other characters like comma, period, brackets...

  4. #4
    Registered User
    Join Date
    Nov 2007
    Posts
    3
    That's correct, five bits is all I need, as I will not be representing any punctuation or mixed-case letters. Case sensitivity does not matter for my application, so all I need is the 26 letters + null.

  5. #5
    Woof, woof! zacs7's Avatar
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    3,459
    BTW, zero-valued characters are referred to as NUL or nul, not NULL; NULL is for pointers -- see the FAQ.

  6. #6
    Registered User
    Join Date
    Nov 2007
    Posts
    3
    Aha, I will keep that in mind in the future. Thanks.
