Thread: Saving trees in files

  1. #1
    Confused Magos's Avatar
    Join Date
    Sep 2001
    Location
    Sweden
    Posts
    3,145

    Saving trees in files

    What would the best (most efficient) way to store a binary tree in a file? I've been thinking of one way, but if there is a better one I'd like to know.

    Imagine a tree like this:
    Code:
        A
       / \
      B   C
     /   / \
    D   E   F
           / \
          G   H
    When storing it in a file it is stored in level order. Since it's not neccessarily balanced you must use some special character for empty branches. Except for the first character, which is the root, the characters are in pairs (siblings to a node). The first pair is the children to the first character, the second pair to the second character, etc... with the exception of a special character that cannot have any children

    The above tree would turn into:

    A B C D ¤ E F ¤ ¤ G H
    with pairs: A (B C) (D ¤) (E F) (¤ ¤) (¤ ¤) (G H)

    This means that (B C) is the children to A, (D ¤) is the children to B, (E F) is the children to C, (¤ ¤) is the children to D (two special nodes in a pair means no children at all, but it needs to be stored anyway), the next (¤ ¤) would be the children to ¤, but that's invalid so they will be the children to the next character instead, E. Finally (G H) will be the children to F. No more pairs exists so the tree is finsihed. No need to store the last 2 empty pairs (for G and H).

    Hope you understands my thinkings. Any thoughts? Is this a pretty ok approach?

    EDIT: Fixed a mistake in the algorithm
    Last edited by Magos; 08-28-2004 at 04:45 AM.
    MagosX.com

    Give a man a fish and you feed him for a day.
    Teach a man to fish and you feed him for a lifetime.

  2. #2
    unleashed alphaoide's Avatar
    Join Date
    Sep 2003
    Posts
    696
    What's the purpose of this? Why don't you just store is as:
    A B C D E F G H
    and just contruct the tree when you read them in??
    source: compsci textbooks, cboard.cprogramming.com, world wide web, common sense

  3. #3
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    Quote Originally Posted by alphaoide
    What's the purpose of this? Why don't you just store is as:
    A B C D E F G H
    and just contruct the tree when you read them in??
    indeed.

    index the first element as node "1", and sequentially number the nodes as they are read in. (starting at 1 is important) This method also assumes a "full" (i think thats the term, damn, its been a while) tree.

    for any node i, its left child is 2*i and its right child is 2*i + 1. voila
    example
    Code:
         1
        / \
       2    3
       /\   /\
      4 5  6  7
      /\
     8 9

  4. #4
    C++ n00bie :D
    Join Date
    Jul 2004
    Posts
    63
    I think he means so he can construct the tree through the file, without having to have the tree premade and use the file to fill it. I think your idea should work, but if that edit means you found a way to make it work, then ok! If not, I think the idea is good, and so you know, if you want to use null instead of ¤ its the smily face(its unicode-263A) I believe, so put the in the file if you use it instead of the star. You could always also use a basic symbol for it(like - or #)

  5. #5
    Crazy Fool Perspective's Avatar
    Join Date
    Jan 2003
    Location
    Canada
    Posts
    2,640
    Quote Originally Posted by LloydUzari
    I think he means so he can construct the tree through the file, without having to have the tree premade and use the file to fill it.
    as long as you write the tree to the file with the same conventions you read it with there is no difference.

  6. #6
    Confused Magos's Avatar
    Join Date
    Sep 2001
    Location
    Sweden
    Posts
    3,145
    Why don't you just store is as:
    A B C D E F G H
    and just contruct the tree when you read them in??
    How would you know where everything would go then? If you just construct it from top left it would end up as:
    Code:
           A
         /   \
        B     C
       / \   / \
      D   E F   G
     /
    H
    But the original tree (see first post) is:
    Code:
        A
       / \
      B   C
     /   / \
    D   E   F
           / \
          G   H
    Notice the branchings, which is NOT the same.

    This method also assumes a "full" tree.
    as long as you write the tree to the file with the same conventions you read it with there is no difference
    Well, since the trees are arbitrary you have no idea how it looked like when it was written into the file. I was looking for a method not only to store the values but also the structure of it, meaning the branching of the tree, which is NOT always a balanced full tree.

    if you want to use null instead of ¤ its the smily face
    The ¤ was just a placeholder symbol. In my real application a character is an entire byte (256 possible values), not just a letter, so I prolly need 2 bytes do distinct the special character from a normal character. Like 0 followed by 0 is special, 0 followed by something else is a 0.
    MagosX.com

    Give a man a fish and you feed him for a day.
    Teach a man to fish and you feed him for a lifetime.

  7. #7
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Output the tree's nodes to file using a preorder traversal such that the nodes would be output as such:

    ABDCEFGH

    When you read them back in the original form of the tree should be preserved. This way you don't need to write any placeholders to the file at all. This should work for any tree full/sparse etc...
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  8. #8
    Confused Magos's Avatar
    Join Date
    Sep 2001
    Location
    Sweden
    Posts
    3,145
    hk_mp5kpdw:
    How would you then differ between these trees:
    Code:
      A
     / \
    B   C
    
        A
       /
      B
     /
    C
    Both gives the sequence: A B C
    MagosX.com

    Give a man a fish and you feed him for a day.
    Teach a man to fish and you feed him for a lifetime.

  9. #9
    C++ n00bie :D
    Join Date
    Jul 2004
    Posts
    63
    Me and Magos are only ones that get it Hes not talking about a premade tree and putting the data into it from file, but making the tree entirely from the file so eveything is in the right order. I say your method seems good enough, you just need to find a digit that will resemble an empty node, and use if statements while constructing it so that it will ignore any of that digit. Maybe I can throw together a way of doing this later, but this doesnt seem too difficult to do aslong as you understand the building from file.

  10. #10
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    The ¤ was just a placeholder symbol. In my real application a character is an entire byte (256 possible values), not just a letter, so I prolly need 2 bytes do distinct the special character from a normal character. Like 0 followed by 0 is special, 0 followed by something else is a 0.
    Isn't there a character like the NULL character that there is no chance of being stored in the tree? That way you could just stick to using one byte and save space, not that that matters that much these days

    As far as your algorithm goes, I don't see a better way of saving the tree and preserving the branching than what you have without getting unnecassarily complicated to save a nanosecond or a couple of bytes.

  11. #11
    unleashed alphaoide's Avatar
    Join Date
    Sep 2003
    Posts
    696
    Well, to me it seems that you wanna do the hard work instead of letting your program do all the processing. I mean why you want a tree in your file? So that when you computer is down you could do the searching manually?
    Anyway, I'm just messing around :P
    source: compsci textbooks, cboard.cprogramming.com, world wide web, common sense

  12. #12
    Code Goddess Prelude's Avatar
    Join Date
    Sep 2001
    Posts
    9,897
    From what I can tell, your insertion algorithm tries to create a complete tree yet you expect to read the data from a file and restore an incomplete structure. That's not really going to work well unless you save the data with corresponding structure locations that a specialized rebuild algorithm can use to place the nodes correctly. You'll need an unpacking algorithm that reads structure information and inserts accordingly and a packing algorithm that saves the tree data along with structure information from the top down.

    The biggest problem is that you can only unpack what was previously packed with the sibling function. This is a very specialized operation.
    My best code is written with the delete key.

  13. #13
    Registered User
    Join Date
    Aug 2003
    Posts
    470
    If you store the nodes in order, you will get a unbalanced tree. So it's probably best to breath first algorithm and then insert the nodes. For instance, if you have the tree

    Code:
          6
      4     10
    2   5      11
    You could store
    6, 4, 10, 2, 5, 11.

    You can use use a serialization algorithm for pointers to objects and objects. It's a little more complex however, and it requires you to store mapping data at the beginning of the file.

  14. #14
    Registered User hk_mp5kpdw's Avatar
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,817
    Quote Originally Posted by Magos
    hk_mp5kpdw:
    How would you then differ between these trees:
    Code:
      A
     / \
    B   C
    
        A
       /
      B
     /
    C
    Both gives the sequence: A B C
    Yes the preorder traversal results in the same ABC sequence, but if we say that nodes to the right of its parent are "greater than" the parent and nodes that are left of its parent are "less than" the parent, then doesn't the first tree demonstrate that C is "greater than" A and B, and as demonstrated by the second tree that C is "less than" both A and B. These two trees contradict eachother if the meanings of A, B, and C are the same in both trees.

    Lets say the two tree have nothing at all to do with each other. In both examples we store A, B, C (the preorder traversal of each) into a file.

    In the first case, B < A < C. So when we read in A, we build a root node and copy A into it. We then read in B and since it is less than A we construct a left child and stuff/copy B into it. We then read C and since it is more than A we go construct a right child and stuff/copy C into that. This results in the first tree you have in your above example.

    In the second case, C < B < A. So we read in A and build a root node and copy/stuff A into the root. We then read in B and since it is less than A we construct a left child and stuff/copy B into it. We then read in C which is less than A so we go to its left child which is B. C is also less than B so we construct another left child for B and copy/stuff C into it. This results in the second tree you have in your example.

    So, in both cases we can still duplicate your tree exactly as long as you don't change the strick weak ordering of A, B, and C.

    Of course this all be the alcohol I happen to be drinking at the moment talking.
    Last edited by hk_mp5kpdw; 08-28-2004 at 01:30 PM.
    "Owners of dogs will have noticed that, if you provide them with food and water and shelter and affection, they will think you are god. Whereas owners of cats are compelled to realize that, if you provide them with food and water and shelter and affection, they draw the conclusion that they are gods."
    -Christopher Hitchens

  15. #15
    Registered User
    Join Date
    Aug 2003
    Posts
    470
    No, the preorder should work as well as the breath first search. The preorder algorithm looks like
    print node
    preorder(node->left)
    preorder(node->right)

    The possible trees with A < B < C are
    Code:
           C
         /
       B
     / 
    A
    
        B
      /   \
    A     C
    
    A
      \
        B
         \
          C
    The preorder of them is C, B, A; B, A, C; and A, B, C.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. saving name characters to files
    By clonedhero in forum C Programming
    Replies: 5
    Last Post: 01-16-2008, 03:20 PM
  2. Saving numbers to files
    By cpudaman in forum C++ Programming
    Replies: 5
    Last Post: 01-10-2008, 01:05 AM
  3. Help with loading files into rich text box
    By blueparukia in forum C# Programming
    Replies: 3
    Last Post: 10-19-2007, 12:59 AM
  4. Folding@Home Cboard team?
    By jverkoey in forum A Brief History of Cprogramming.com
    Replies: 398
    Last Post: 10-11-2005, 08:44 AM
  5. Batch file programming
    By year2038bug in forum Tech Board
    Replies: 10
    Last Post: 09-05-2005, 03:30 PM