Thread: array manipulation (shifting data)

  1. #1
    Registered User
    Join Date
    May 2009
    Posts
    72

    array manipulation (shifting data)

    Hi All

    I have to process a double array of 4096 elements. When processing is done, I need to shift the data by 512 samples
    Code:
     double *window = malloc(4096 * sizeof(*window)) ;
    .....
     for(j = 0; j < 4096 - 512; j++) {
           window[j] = window[j + 512] ;
        }
    after this process, I add 512 new values add the end of the array. This process is repeated a million times, so performance is very important.

    So my question is, does C has a library that can do certain array manipulation (like shifting)
    If not, what would be the fastest way to do this? I can imaging that there might be certain pointer tricks to speedup this process!

    Any help would be appreciated!
    thnx
    LuCa

  2. #2
    and the Hat of Guessing tabstop's Avatar
    Join Date
    Nov 2007
    Posts
    14,336
    You can use allocate new memory then use memmove, I suppose.

    Your name "window" is suggestive. Are you always looking at the most recent 4096 data entries? If so, no re-allocation is needed, just memmove.

  3. #3
    pwning noobs Zlatko's Avatar
    Join Date
    Jun 2009
    Location
    The Great White North
    Posts
    132
    I don't think you have to do any memory copying or reallocation. Just treat your memory as a circular buffer. Here's what I mean.
    You have a block of memory divided into chunks. You have a ChunkSize of 512 bytes. Your have an array of some multiple of ChunkSize.

    const int ChunkCount = 8;
    const int ChunkSize = 512;

    You have a chunkNumber which indicates the virtual start of your array. Every time you shift your data, you increment chunkNumber modulo the ChunkCount

    int chunkNumber = 0;

    You have a block of memory holding your array

    double* block = malloc(ChunkCount * ChunkSize * sizeof(*block));

    You have a pointer to the virtual start of your array

    double* pStart = block;

    Your index, j is an index into your data. It goes from 0 to ChunkCount*ChunkSize - 1. It needs to be translated to an index into the allocated block with this function:
    Code:
    int calcBlockIndex(int j)
    {
            if (j < 0 || j >= ChunkSize * ChunkCount) {
                //handle the error
            }
    
            int result = (j + pStart - block) % (ChunkCount * ChunkSize);
            return result;
    }
    When you want to shift, you need to adjust the pStart and the chunkNumber

    Code:
    void shiftData(void)
    {
    chunkNumber = (chunkNumber + 1) % ChunkCount; pStart = block + chunkNumber * ChunkSize
    }
    I ran a simulation of this with ChunkCount = 3 and 4 shifts
    Here is the result

    --ChunkNumber 0
    j of 0 -> blockIx of 0
    j of 511 -> blockIx of 511
    j of 512 -> blockIx of 512
    j of 1023 -> blockIx of 1023
    j of 1024 -> blockIx of 1024
    j of 1535 -> blockIx of 1535
    ==Shifting
    --ChunkNumber 1
    j of 0 -> blockIx of 512
    j of 511 -> blockIx of 1023
    j of 512 -> blockIx of 1024
    j of 1023 -> blockIx of 1535
    j of 1024 -> blockIx of 0
    j of 1535 -> blockIx of 511
    ==Shifting
    --ChunkNumber 2
    j of 0 -> blockIx of 1024
    j of 511 -> blockIx of 1535
    j of 512 -> blockIx of 0
    j of 1023 -> blockIx of 511
    j of 1024 -> blockIx of 512
    j of 1535 -> blockIx of 1023
    ==Shifting
    --ChunkNumber 0
    j of 0 -> blockIx of 0
    j of 511 -> blockIx of 511
    j of 512 -> blockIx of 512
    j of 1023 -> blockIx of 1023
    j of 1024 -> blockIx of 1024
    j of 1535 -> blockIx of 1535


    When chunkNumber is 0, the index into block is the same as j. When ChunkNumber is 2, as j goes from 0 to 1535, the block index goes from 1024 to 1535 then wraps around and goes to 1023.



    Edit:
    As I look at it again, I think you don't need the value of block in the calculations, because in shiftData, pStart is offset by block, and in calcBaseIx, the offset is removed so it's pretty silly to have it, but I left it in for the record.
    Last edited by Zlatko; 06-24-2009 at 09:41 PM.

  4. #4
    Registered User
    Join Date
    May 2009
    Posts
    72
    thnx a lot for both replies!! I was looking for something like memcpy, but the circular array might work for me too!

    cheers
    LuCa

  5. #5
    pwning noobs Zlatko's Avatar
    Join Date
    Jun 2009
    Location
    The Great White North
    Posts
    132
    LuCa, if you're going to move bytes around then use memmove, not memcpy. The memmove is safe when source and destination memory regions overlap. As I understand it, you want to move bytes 512 to 4095 into bytes 0 to 3583. The memcpy behavior is not defined for overlapping regions.

    Best regards.
    Zlatko

  6. #6
    Registered User
    Join Date
    May 2009
    Posts
    72
    Zlatko, thnx a lot for pointing this out, because there is indeed overlap (as you describe). Where do you find this kind of information ?

    UPDATE: I found it: http://www.codecogs.com/reference/c/string.h/memcpy.php

  7. #7
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    I'd like to know what subsequent process you are doing that requires the contiguous placement of 4096 elements? Perhaps there is a better way if the process is able to access discontinuous chunks of 512 elements at a time. Then you would need a reusable stack of 4 x 512 elements, no shifting whatsoever. Just a top-of-stack pointer which increments modulo 4.

  8. #8
    Registered User
    Join Date
    May 2009
    Posts
    72
    I think a contiguous array is not really necessary. I have to loop the array a couple of times. But how would you do this with your top-of-stack pointer, use an if-statement, to switch from block to block ? If so, is this approach still faster than use memmove ?

  9. #9
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    Something like... (just typing ... too lazy to build a prototype and letting a compiler check...)
    Code:
    double array[8][512];
    int sp;
    
    // initialize every element
    
    memcpy(array[0], ... <input array>, 512 * sizeof(double));
    memcpy(array[1], ... <input array>, 512 * sizeof(double));
    memcpy(array[2], ... <input array>, 512 * sizeof(double));
    memcpy(array[3], ... <input array>, 512 * sizeof(double));
    ...
    memcpy(array[7], ... <input array>, 512 * sizeof(double));
    
    sp = 0;
    
    for (;;) {
        // do whatever processing on the entire 4096 element array
    
        memcpy(array[sp], ... <input new chunk>, 512 * sizeof(double));
        sp = (sp + 1) % 8; // get ready for next slot
    
        }
    Each new set of 512 elements is slotted into the next "stack" position. There's never any physical shifting of old data. One of eight ( 8 ) 512-element slots are merely overwritten in turn.

    Sorry, my previous post said 4 x 512 when it's really 8 x 512 = 4096.

    If you can elaborate on what algorithm is doing the actual processing it may influence the way the data is organized.
    Last edited by nonoob; 06-26-2009 at 12:02 PM.

  10. #10
    Registered User
    Join Date
    May 2009
    Posts
    72
    your solution would work too, it has a lot in common with the circular array solution, although this one is easier to understand

    thnx a lot!
    LuCa

  11. #11
    Registered User
    Join Date
    Sep 2008
    Location
    Toronto, Canada
    Posts
    1,834
    I think it's pretty much what Zlatko meant by 'circular buffer'.

    I'd still like to know what you're trying to do with those 4096 elements? Some sort of moving average? FFT? It may be possible to get away with not needing to store anything at all except 512 elements' worth. Oh well. As long as we've done away with a lot of shifting of 4096 - 512 elements millions of times which was our goal.

  12. #12
    Registered User
    Join Date
    May 2009
    Posts
    72
    What I need to do is not easy to explain, so I'll skip the details. I have a couple of these windows. which I stack together like (after I've applied FFT)
    Code:
    int i, j ;
    for(i = 0; i < 4096; i++) {
      for(j = 0; j < 6; j++) {
        result[i] =window[j][i] ;
      }
    }
    Here I assume contiguous windows/arrays. Furthermore, I have (in this example) 6 huge data files, over which I move the 6 windows (reading 512 samples each time for each window), so the amount of data I've to process eacht time remains small. For each set of windows I have to run the above code +/- 10000 times (not the FFT). Depending on the size of the data file, I can get >>1000 windows. That can take a while to process

    Eventually I need to move all this processing to my NVidia graphics card (CUDA). for which I have to copy the windows to the cards memory.

    What I'm trying to do here is find signals hidden in noise!

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 19
    Last Post: 12-17-2007, 02:57 AM
  2. Bitmasking Problem
    By mike_g in forum C++ Programming
    Replies: 13
    Last Post: 11-08-2007, 12:24 AM
  3. Replies: 4
    Last Post: 06-14-2005, 05:45 AM
  4. [question]Analyzing data in a two-dimensional array
    By burbose in forum C Programming
    Replies: 2
    Last Post: 06-13-2005, 07:31 AM
  5. Template Array Class
    By hpy_gilmore8 in forum C++ Programming
    Replies: 15
    Last Post: 04-11-2004, 11:15 PM