Thread: Self-Updating Data Structures...

  1. #1
    Registered User
    Join Date
    Sep 2004
    Posts
    124

    Self-Updating Data Structures...

    I need something that updates itself, like for instance a linked-list or a chain or something. When you set a parameter let's say x = 1, then the x in the entire chain updates to 1. I know I can do this using functions and all...but is there something native that will be more efficient in terms of processing?

    If not in C++, can this be done in some other language. Or like in Excel where I could set properties for each cell, so if I set something on one cell, it updates the entire row or something...

    Thanks!

  2. #2
    Registered User
    Join Date
    May 2005
    Posts
    73
    you've got me confused... whats wrong with a function and a loop to iterate through a linked list to update each value? Seems O.K. to me..

    perhaps someone can help u..as i'm quite lost..

  3. #3
    Registered User
    Join Date
    Sep 2004
    Posts
    124
    hmm...i don't really know i'm confused myself...it's just that the computation involved is HUGE...sequence comparisions with upto 10,000 letters in each sequence (made up of 6 different types of letters only)...looping and iterating is one thing im trying to go around cause it just seems it'll load the program pretty bad especially considering that this is just the preliminary comparision of data and other more useful functions will be applied on this data later on...

    i guess it would be ok if someone would atleast comment whethere iterating so many values stores in vectors (or maps) or even structs will have significant performance hit or not...the worst ive ever worked with is iterating through 1000 objects...and that too a single pass...

  4. #4
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    You clearly have no idea whether this will be a performance sink or not, so don't waste your time with premature optimization. Write the simplest code that works, then profile it to see if it's fast enough. If not, then you can optimize the hotspots or change your algorithms as necessary.

    >looping and iterating is one thing im trying to go around
    If you want to update every x in a sequence of unknown length, no matter how deeply you hide it in abstraction, there's always going to be a loop. You can cache common searches into a container of pointers and get O(M) where M is the number of items that match the criteria for those common searches and O(N) where N is the total number of items in the sequence for uncommon searches, but that's an optimization that you may not need.
    My best code is written with the delete key.

  5. #5
    Toaster Zach L.'s Avatar
    Join Date
    Aug 2001
    Posts
    2,686
    Well, you can make replacing all instances of a given piece of data faster, but it will be at the price of making single substitutions much more tedious. Although, if this is for the type of application I think (by the original post, I'm guessing manipulation of mathematical formulae), then single substitutions are not an issue.

    Consider that you have the following structure:
    Code:
    struct variable {
       char name; // e.g. 'x'
       int value; // e.g. 1
    };
    Now, instead of keeping a list of 'variable' objects, you keep a list of pointers to 'variable' objects. Now, for each differently named variable ('x', 'y', etc), you keep only a single structure, and store it in an array or map, etc. So, every occurence of 'x' in your list actually points to the same structure, so when you want to change the value of all of them, you make only one change. Of course, changing an 'x' variable to a 'y' variable now becomes a matter of relinking, and you have the issue of storing the structures, so this will be more efficient when you are dealing with long lists, and probably relatively inefficient for short lists.
    The word rap as it applies to music is the result of a peculiar phonological rule which has stripped the word of its initial voiceless velar stop.

  6. #6
    Registered User
    Join Date
    Sep 2004
    Posts
    124

    Exclamation

    thanks Zach L., I never thought about using pointers in my case...my program is a sequence comparision algorithm...Prelude I know where the performance hit is coming from so I want to get around that...ill give you some details...check out this pseudocode (using only 4 letters here)...Code bottlenecks are marked with ASTERICKS (*)

    seq1: ABBADDA (rep. as x-axis)
    seq2: ABBCCDDAC (rep. as y-axis)

    The result of this comparision is shown in the attachment picture. Please look at it to get a better idea of what i'm doing. If you notice, the first part in the algo. is to produce the match points in the sequences, and the second part is to sort them according to their c value (from y = mx + c) to get points which lie on lines with gradient = 1.

    Code:
    Define: vector_A
    Define: vector_B
    Define: vector_C
    Define: vector_D
    
    struct COORD
    {
       int x, y;            //for the coordinates
       char name;      //for which letter it represents 
    };
    
    //c_value = (y-cord - x-cord)
    Define: MULTIMAP with COORD and c_value (key) as pairs
    Define: MAP_OF_KEYS of c_value
    
    int counter = 1;
    
    Read seq1  //char by char
    {
       If A then push counter on vector_A
       If B then push counter on vector_B
       If C then push counter on vector_C
       If D then push counter on vector_D
       counter ++; //x-value
    }
    
    counter = 1;  //reset counter
    
    Read seq2     //char by char
    {
       If A then
       {
          for all x in vector_A        ***1***
          {
             c_value = counter - x;   //counter is now the y-value 
             
             initialize new COORD
             coord.x = x;
             coord.y = counter;
             coord.name = A;
             push the pair coord and c_value (as key) on the map;
             push c_value on the MAP_OF_KEYS;   ***2***
          }
       }
    
    //similarly for the other letters
    
    }
    e.g. At the end of it the 'maps' will look like this:

    Code:
    MULTIMAP:
    [(x=1,y=1) Name=A ] ; Key=0
    [(x=2,y=2) Name=B ] ; Key=0
    [(x=3,y=3) Name=B ] ; Key=0
    [(x=5,y=6) Name=D ] ; Key=1
    [(x=6,y=7) Name=D ] ; Key=1
    [(x=7,y=8) Name=A ] ; Key=1
    
    MAP_OF_KEYS:  //not considering the other element in the pair
    Key = 0
    Key = 1
    ***1***: Each time I get an A on the y-axis, I'll have to go through this vector...It's ok right now but when I have let's say 1000 elements in this vector and I get let's say 500 A's on the y-axis (it won't be pretty)

    ***2***: Why make this map. If I could just collate the same key values from the multimap somehow using map.find(), it would save me a lot of resources.

  7. #7
    Registered User
    Join Date
    Sep 2004
    Posts
    124
    so Zach, given this situation...how could i get help through pointers...i actually tried doing a bit of research on this but i can't nail it down...but i made this figure (an extension of the graph above) that should help. I'm treating the entire space as a giant matrix now. So, notice in the first pass (red), the program knows that it's handling all the column cells for A.

    On the second pass (green), it knows that it does not need to go through all the elements. It knows to just update the blue colored cells. How to quantify this in terms of pseudocode is something I'm having a problem with.

    Prelude...I can see the caching thing helping me out with problem ***1***. Do you think it would be better than what I'm doing currently. How would I accomplish the caching to memory?

    Cutting down on computation costs is very vital. I'll eventually be comparing large sequences. So, I could have a 5000 vs. 6000 letter sequence. If there are 1000 A's in seq1 and 2000 in seq2, then we're talking about 1000 x 2000 iterations just for A by normal looping!
    Last edited by alvifarooq; 06-02-2005 at 02:46 AM.

  8. #8
    Registered User
    Join Date
    Sep 2004
    Posts
    124
    hi guys...here's another graph with another comparision...this is another method i investigated...where i would have structs for A,B,C, and D and then and x-vector and a y-vector in each of them. If I read the characters one by one and store their positions then I'll be filling up the row first and then the column (e.g. in A, with seq1 I'll get 3, 7 and then with seq2 I'll get 1, 5, and 9) The matrix cells are the positions I want. The color codes are:

    A- match: blue
    B- match: green
    C- match: red
    D- match: brown

    Thanks.

  9. #9
    Toaster Zach L.'s Avatar
    Join Date
    Aug 2001
    Posts
    2,686
    Unfortunately, you will not be able to use the pointer method (flyweights) to save time in this case. That particular pattern would allow you to defer the costs of the computations until you needed to access each element, but you will not save anything here, because you must access each element of the array immediately to compute the key. To access every element, there is simply no way to avoid a loop like that.

    Cheers
    The word rap as it applies to music is the result of a peculiar phonological rule which has stripped the word of its initial voiceless velar stop.

  10. #10
    Registered User
    Join Date
    Sep 2004
    Posts
    124
    thanks a lot Zach. I'm trying other things right now...most notably replacing STL maps with has_maps...you think there's a performance benefit in that?

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Bitmasking Problem
    By mike_g in forum C++ Programming
    Replies: 13
    Last Post: 11-08-2007, 12:24 AM
  2. What's a good generalized data structures book?
    By indigo0086 in forum A Brief History of Cprogramming.com
    Replies: 12
    Last Post: 11-16-2006, 01:01 PM
  3. i need advice about data structures
    By sawer in forum C Programming
    Replies: 2
    Last Post: 04-22-2006, 03:40 AM
  4. Data Structures C++ Book
    By Cprogrammer in forum C++ Programming
    Replies: 1
    Last Post: 09-16-2005, 12:50 AM
  5. Warnings, warnings, warnings?
    By spentdome in forum C Programming
    Replies: 25
    Last Post: 05-27-2002, 06:49 PM