Thread: What checksum would you recommend for this particular scenario?

  1. #1
    Registered User awsdert
    Join Date
    Jan 2015
    Posts
    1,735

    I'll start with the code snippet that matters plus a few comments added into it:
    Code:
    /* pawrd/pawru are expected to always be the width of a CPU register and since this is to
    contain pids and tids it seemed the right size for any system  */
    typedef pawrd   pawexe;
    #define PAWEXE_WIDTH PAWRD_WIDTH
    
    /* Gonna shove signals, exceptions, errors and win32 messages input through this type */
    #if PAWEXE_WIDTH >= 64
    #define PAWMSG_WIDTH PAWEXE_WIDTH
    #else
    /* compiles down to roughly (((A / B) + !!(A % B)) * B) */
    #define PAWMSG_WIDTH PAWINT_MULCIEL(64,PAWCD_WIDTH)
    #endif
    typedef unsigned _BitInt(PAWMSG_WIDTH) pawmsg;
    
    /* (msg << (PAWMSG_WIDTH*3)) | ((tid << PAWMSG_WIDTH) | index)
     * Empty bytes/register will have something else dumped in it at some point,
     * maybe a checksum? */
    #define PAWSYS_WIDTH (PAWMSG_WIDTH*4)
    typedef unsigned _BitInt(PAWSYS_WIDTH) pawsys;
    That last one, pawsys, has an unused slot, and a checksum to verify that the index etc. are not corrupt (for example, because malicious code constructed the mask) seems like a good use of that space. I can mix some data from the object the index references into the checksum, so I wouldn't need to deconstruct the value to verify it; that part is optional, but it would be a nice bonus.

    What would people recommend, given that the internal object will typically be small and is copied into a buffer where the data is safe to modify without corrupting the source object, with the thread object prefixed to that data too (all the varying data is skipped during the copy)?
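To make the question concrete, here is a minimal sketch of the layout being described, under several assumptions: four plain 64-bit lanes stand in for the wide `_BitInt` type, and every name in it (`ROUND_UP_MULTIPLE`, `struct sys_handle`, `lane_checksum`, `handle_make`, `handle_is_intact`) is hypothetical and not part of the original code. The rotate-and-XOR routine is only a placeholder for whichever checksum ends up being chosen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a up to the next multiple of b, mirroring the formula in the
 * comment above: (((A / B) + !!(A % B)) * B). Hypothetical name. */
#define ROUND_UP_MULTIPLE(a, b) ((((a) / (b)) + !!((a) % (b))) * (b))

/* Stand-in for pawsys: four 64-bit lanes instead of one wide _BitInt. */
struct sys_handle {
    uint64_t index; /* lane 0 */
    uint64_t tid;   /* lane 1 */
    uint64_t check; /* lane 2: the otherwise unused slot */
    uint64_t msg;   /* lane 3 */
};

/* Placeholder checksum: rotate the 64-bit state left by one bit,
 * then XOR in the next byte. */
static uint64_t lane_checksum(const void *data, size_t size, uint64_t seed)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < size; ++i)
        seed = ((seed << 1) | (seed >> 63)) ^ p[i];
    return seed;
}

/* Checksum over the three used lanes plus some bytes of the object
 * that the index refers to. The check lane itself is not included. */
static uint64_t handle_checksum(const struct sys_handle *h,
                                const void *obj, size_t obj_size)
{
    uint64_t sum = lane_checksum(&h->index, sizeof h->index, 0);
    sum = lane_checksum(&h->tid, sizeof h->tid, sum);
    sum = lane_checksum(&h->msg, sizeof h->msg, sum);
    return lane_checksum(obj, obj_size, sum);
}

/* Build a handle and fill the spare lane with the checksum. */
static struct sys_handle handle_make(uint64_t msg, uint64_t tid, uint64_t index,
                                     const void *obj, size_t obj_size)
{
    struct sys_handle h = { .index = index, .tid = tid, .check = 0, .msg = msg };
    h.check = handle_checksum(&h, obj, obj_size);
    return h;
}

/* Returns nonzero when the stored checksum still matches. */
static int handle_is_intact(const struct sys_handle *h,
                            const void *obj, size_t obj_size)
{
    return h->check == handle_checksum(h, obj, obj_size);
}
```

A caller would build the value once with `handle_make` and call `handle_is_intact` before trusting the index. Because the object bytes are mixed in, verification needs the object to hand, which matches the "copied into a buffer" flow described above.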
    Last edited by awsdert; 1 Week Ago at 03:06 AM. Reason: Forgot to add a comment on the PAWINT_MULCIEL macro. Also forgot to rename the last macro used to match its new name

  2. #2
    Registered User
    Join Date
    Sep 2022
    Posts
    57
    So the reason you're considering a checksum is that you have a few unused bits in a structure? Hmm. Would you also consider enlarging the structure to make room for a checksum if there were no free space left over? If the answer is no, don't add complexity to your code just because you can.


    Besides, any checksum would be weak anyway, so at least stick to very basic, cheap operations. The example below uses only bit rotation and XOR. The function supports feeding in more data through subsequent calls, and it lets you define the bit width used for the checksum (e.g. to fill your whole free space even if an odd number of bits is unused in your structure). That width, together with the circular shift, ensures that no information already processed is lost when the checksum is extracted from the return value, which is always 8 bytes wide.
    Code:
    #include <inttypes.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    
    /// @brief Calculate a basic checksum with defined bit width from any data.
    ///
    /// The function relies on both `pData` != NULL and 0 < `useBits` <= 64.
    ///
    /// @param pData     Pointer to any data.
    /// @param dataSize  Size of the data as number of bytes.
    /// @param useBits   Number of bits to be used for the resulting checksum.
    /// @param initial   Initial value of the checksum.
    ///
    /// @return The calculated checksum.
    static uint64_t GetChecksum(const void *pData, size_t dataSize, unsigned useBits, uint64_t initial)
    {
      const unsigned shift = useBits - 1;
      for (const unsigned char *dataIter = pData, *const end = dataIter + dataSize; dataIter < end; ++dataIter)
        initial = ((initial << 1) | ((initial >> shift) & 1)) ^ (uint64_t)*dataIter;
    
    
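      // Build the mask without shifting by 64 bits, which would be undefined
      // behavior when `useBits` == 64.
      const uint64_t mask = (useBits >= 64) ? UINT64_MAX : (((uint64_t)1 << useBits) - 1);
      return initial & mask;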
    }
    
    
    struct foo
    {
      int32_t a;
      char b; // followed by 3 bytes (24 bits) of padding on typical ABIs
      int64_t c;
    };
    
    
    int main(void)
    {
      void *const raw = malloc(sizeof(struct foo));
      if (raw == NULL)
        return 1;
    
    
      struct foo *pFoo = raw;
      uint8_t *buf = raw;
    
    
      pFoo->a = 543210;
      pFoo->b = 'x';
      pFoo->c = -1234567890000;
    
    
      // The resulting checksum occupies only `patternWidth` bits in the returned
      // value. This meets the width of the unused padding in `struct foo`.
      const unsigned patternWidth = 24;
    
    
      // Members can be shuffled to obfuscate the algorithm a little.
      // In the first call, 0 or a suitable veil is passed to `initial`.
      uint64_t checksum = GetChecksum(&pFoo->b, sizeof(pFoo->b), patternWidth, 5316907);
      // The previously returned `checksum` is passed to `initial` in subsequent calls.
      checksum = GetChecksum(&pFoo->a, sizeof(pFoo->a), patternWidth, checksum);
      checksum = GetChecksum(&pFoo->c, sizeof(pFoo->c), patternWidth, checksum);
      printf("checksum: 0x%016" PRIX64 "\n\n", checksum);
    
    
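      // Insert the checksum via byte array. Offsets 5..7 assume the typical layout
      // described above. Note that a later store to a member may leave the padding
      // bytes with unspecified values, so the checksum must be read back before that.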
      buf[5] = (uint8_t)checksum;
      buf[6] = (uint8_t)(checksum >> 8);
      buf[7] = (uint8_t)(checksum >> 16);
    
    
      printf("pFoo:\n->a = %" PRIi32 "\n->b = '%c'\n->c = %" PRIi64 "\n\n", pFoo->a, pFoo->b, pFoo->c);
    
    
      const uint64_t readChecksum = (uint64_t)buf[5] | ((uint64_t)buf[6] << 8) | ((uint64_t)buf[7] << 16);
      printf("readChecksum: 0x%016" PRIX64 "\n", readChecksum);
    
    
      free(raw);
    }
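The example above stores and reads back the checksum but never recomputes it, so here is a hedged sketch of the verification step. It assumes an explicit `check` field in place of the padding, because C lets padding bytes take unspecified values whenever a member is stored. `rot_xor`, `struct foo_checked`, `foo_seal` and `foo_verify` are hypothetical names, and `rot_xor` restates the same rotate-and-XOR idea only so that the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate-and-XOR checksum restricted to `use_bits` bits (1..64). */
static uint64_t rot_xor(const void *data, size_t size, unsigned use_bits, uint64_t state)
{
    assert(use_bits > 0 && use_bits <= 64);
    /* Avoid the undefined 64-bit shift when the full width is requested. */
    const uint64_t mask = (use_bits == 64) ? UINT64_MAX
                                           : (((uint64_t)1 << use_bits) - 1);
    const unsigned char *p = data;
    state &= mask;
    for (size_t i = 0; i < size; ++i)
        state = (((state << 1) | (state >> (use_bits - 1))) ^ p[i]) & mask;
    return state;
}

struct foo_checked {
    int32_t a;
    char    b;
    uint8_t check[3]; /* explicit field where the padding would otherwise be */
    int64_t c;
};

#define FOO_CHECK_BITS 24u
#define FOO_CHECK_SEED 5316907u /* same initial value as in the example above */

/* Checksum over the payload members only, never over `check` itself. */
static uint32_t foo_checksum(const struct foo_checked *f)
{
    uint64_t sum = rot_xor(&f->b, sizeof f->b, FOO_CHECK_BITS, FOO_CHECK_SEED);
    sum = rot_xor(&f->a, sizeof f->a, FOO_CHECK_BITS, sum);
    sum = rot_xor(&f->c, sizeof f->c, FOO_CHECK_BITS, sum);
    return (uint32_t)sum;
}

/* Store the 24-bit checksum, least significant byte first. */
static void foo_seal(struct foo_checked *f)
{
    const uint32_t sum = foo_checksum(f);
    f->check[0] = (uint8_t)sum;
    f->check[1] = (uint8_t)(sum >> 8);
    f->check[2] = (uint8_t)(sum >> 16);
}

/* Returns nonzero when the stored checksum matches a fresh one. */
static int foo_verify(const struct foo_checked *f)
{
    const uint32_t stored = (uint32_t)f->check[0]
                          | ((uint32_t)f->check[1] << 8)
                          | ((uint32_t)f->check[2] << 16);
    return stored == foo_checksum(f);
}
```

Calling `foo_seal` after every legitimate change and `foo_verify` before use catches any single-bit flip in the payload. It stays weak against paired changes: flipping bit 0 of one byte and bit 1 of the byte processed right after it cancels out, which fits the "cheap check for accidental corruption" goal but nothing stronger.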

  3. #3
    Registered User awsdert
    Join Date
    Jan 2015
    Posts
    1,735
    I really wish Firefox would reload pinned tabs each time I open them. Anyway, thanks; yeah, I get that it's a cheap check. It's supposed to be a cheap check, since the more expensive variant is just a globally assigned index via semaphore. This is just for in-thread checking, where the dev has full control over safety.

    There's also going to be an index variant for in-thread use only, so the checksum is really just there to catch accidental corruption of the value. It'll be a while before I get round to actually implementing any checksum for it, but I'll probably go with your suggestion unless I see a better one before then.

    In either case thank you very much

  4. #4
    Registered User
    Join Date
    Sep 2022
    Posts
    57
    There are a lot of cheap hash algorithms in the wild that you could lean on to calculate a checksum (FNV-1a, for example). However, they don't offer much added value for your use case because they serve different needs, such as distributing indices evenly across the buckets of a hash table.
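For comparison, a minimal sketch of 64-bit FNV-1a with its published offset basis and prime. The function names and the `xor_fold` helper, which squeezes the result into a narrower field such as a 24-bit slot, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV1A64_OFFSET_BASIS 0xcbf29ce484222325ULL
#define FNV1A64_PRIME        0x00000100000001b3ULL

/* 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
 * Pass FNV1A64_OFFSET_BASIS as `hash` for the first call, or the previous
 * result to continue hashing more data. */
static uint64_t fnv1a64(const void *data, size_t size, uint64_t hash)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < size; ++i) {
        hash ^= p[i];
        hash *= FNV1A64_PRIME;
    }
    return hash;
}

/* XOR-fold a 64-bit hash down to `bits` bits (1..63) by XORing together
 * successive `bits`-wide chunks, so every input bit affects the result. */
static uint64_t xor_fold(uint64_t hash, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    const uint64_t mask = ((uint64_t)1 << bits) - 1;
    uint64_t folded = 0;
    while (hash != 0) {
        folded ^= hash & mask;
        hash >>= bits;
    }
    return folded;
}
```

Something like `xor_fold(fnv1a64(buf, len, FNV1A64_OFFSET_BASIS), 24)` would yield a value that fits a 24-bit slot. The multiply per byte costs more than a rotate and XOR, and as noted above, the better bit mixing mainly pays off for hash tables rather than for a simple corruption check.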
