Speeding up file loading

This is a discussion on Speeding up file loading within the C++ Programming forums, part of the General Programming Boards category; I was wondering if there are ways to speed up loading from a file if that file is very large. ...

  1. #1
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728

    Speeding up file loading

    I was wondering if there are ways to speed up loading from a file if that file is very large. I'm writing a game program that stores a boards information one character at a time, and in the end I have several thousands (if not more!) of boards saved into a text file.

    Saving is no problem, but the problem comes when the game starts up because the program then loads each and every board to be placed into a hash file. It reads the file one character at a time recreating a board, and when it hits the end of the line the board is complete and it places it to the hash, then goes to the next line. This is starting to take a long time and will only get longer as the text file grows. I wasn't sure if there was possibly a different way to save and load that might possibly be faster like for example maybe saving in binary? Any ideas would be appreciated!

  2. #2
    ggs
    ggs is offline
    C > C++ duders ggs's Avatar
    Join Date
    Aug 2001
    Posts
    435
    do you need _all_ the boards in memory at once? you could keep all of the stuff in a file indexed by hash key and only load up the table of keys when the game starts. keep numbers and boards in fixed sized records, saved in binary.
    .sect signature

  3. #3
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    I thought about doing something like that, but the problem though would be that it would slow down when the artificial intelligence *thinks* because instead of looking at boards in memory it has to constantly load from file every time it looks at a position, and since it looks at thousands if not millions of positions per move, it would slow down thinking time considerably... at least I assume, maybe I'm wrong.

    The other problem is that the computer updates the hash file while its thinking, and I'm not sure how to do that your way unless I had a separate file for each hash key. I only know how to append at the end of a file.

  4. #4
    ggs
    ggs is offline
    C > C++ duders ggs's Avatar
    Join Date
    Aug 2001
    Posts
    435
    struct htable {
    uint32 size, left;
    uint32 *keys;
    uint32 *offsets;
    uint32 chain;
    };

    as long as your file only grows, this is no problem.
    you assign a fixed size to the hash key table lookup (maybe a nice power of 2, like 8192) and then whenever the table is used up (left == size) append a new table to the end of the file and use chain to point to it.

    as for your computation problem, you can keep a cache of considered positions, but that probably won't help much. all this does is distribute the loading out over the program instead of making the user impatient at start (and it has considerably lower memory cost). programming is full of compromises.. maybe somebody has a silver-bullet algorithm here to help you, though

    good luck
    .sect signature

  5. #5
    Blank
    Join Date
    Aug 2001
    Posts
    1,034
    You probably should store the boards using a binary file.
    If I understand it correctly all you would have to do would store
    the hash (a 32/64 bit integer) and the correct move (maybe
    two integers). There's no point in storing the entire board.
    This is all loaded into the hash table at
    the start for the opening book. Also you want to have the hash table updated in memory during the course of the game.

  6. #6
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    Thats true, didn't occur to me to save binary numbers or something similar instead of whole board objects. I'll try that and see how much of a difference it makes. Thanks!

  7. #7
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    Actually, I'm having problems even starting this idea since I've never worked with binary before in programs. How would I save a board position to a 32 or 64 bit integer? This is for a connect four program (a warmup before I begin a chess program) that has a 8x6 board.

  8. #8
    Registered User
    Join Date
    Jan 2002
    Location
    Vancouver
    Posts
    2,220
    Originally posted by PJYelton
    Actually, I'm having problems even starting this idea since I've never worked with binary before in programs. How would I save a board position to a 32 or 64 bit integer? This is for a connect four program (a warmup before I begin a chess program) that has a 8x6 board.
    This should help:
    http://www.howstuffworks.com/c17.htm

  9. #9
    Blank
    Join Date
    Aug 2001
    Posts
    1,034
    What is your hash function? I havn't ever implemented Zorbrist
    hashing but it looks like all you do is just create something like
    hash_val[3][NUM_ROW][NUM_COL] then fill this with random
    integers preferably with your own predictable random function.
    Once you do that you turn the board into hashed number with
    something like this

    Code:
    uint_64 hash(int board[NUM_ROW][NUM_COL]) 
    {
          uint_64 h = 0;
    
          for (int i = 0; i < NUM_ROW; ++i) {
                  for (int j = 0; j < NUM_COL; ++j) {
                            h ^= has_val[board[i][j]][i][j];
                  }
          }
    
           return h;
    }
    You might want to try implementing it with text files first.
    Binary files will be unportable unless you add extra code
    to deal with it.

  10. #10
    Blank
    Join Date
    Aug 2001
    Posts
    1,034
    I don't really see the point of even using a opening book.
    Just implement it without an openning book and then add
    one if you want to.

  11. #11
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    Thanks, that should be enough of a start to get me going. Like I said, I've never worked with bit manipulation so I'm sure to come across more questions!

    I know I don't need an opening book and it plays pretty well without one. Basically its just practice. I'm having the comp search the hash table every move and it actually speeds things up since it can tell right away whether a move will lose 7 moves deep instead of actually having to check every position 7 moves deep. Plus I want to add a feature where the computer *learns* and refuses to lose the same way twice since every way it has lost before will be in the hash. Just practice really before I venture into the world of chess programs!

  12. #12
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    Okay, I've fiddled about with this this morning and unfortunately have two problems. One is how do I use the identifier uint_64 (or uint64, UINT_64, uint32 for that matter)? Do I have to typedef this myself? My compiler doesn't recognize it. So instead I used unsigned long.

    So I tried creating a hashfunction that puts random prime numbers over a million into the 3x8x6 array and then using your function Nick to create a hash number. Works great for hash keys BUT there are enough collisions with some boards creating the same hash key that I wouldn't be able to store JUST the hash key but still have to store something that is unique to each board. Or am I missing something? All I want is a way to store a board position so that if the computer looks in the hash table and finds a position exists within, that means DON'T move there since it will lose.

  13. #13
    Blank
    Join Date
    Aug 2001
    Posts
    1,034
    Just do something like
    typedef unsigned long long uint_64;
    this will depend on what compiler your using.

    This code is sort of untested but I wanted to see how
    zobrist hashing works.

    Code:
    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    using namespace std;
    
    typedef unsigned long long int uint_64;
    const int HASH_TABLE_SIZE = 100003;
    const int NUM_ROW         = 8;
    const int NUM_COL         = 6;
    const int NUM_BOARD       = 2000;
    
    struct Hash_node {
        uint_64 key;
        bool empty;
        int row;
        int col;
    };
    
    Hash_node hash_table[HASH_TABLE_SIZE];
    
    uint_64 rand64(void)
    {
        return rand() ^ ((uint_64) rand() << 15) ^ ((uint_64)rand() << 30)
            ^ ((uint_64) rand() << 45) ^ ((uint_64)rand() << 60);
    }
    
    void load_board(int board[NUM_BOARD][NUM_ROW][NUM_COL])
    {
        for (int i = 0; i < NUM_BOARD; ++i) {
            for (int j = 0; j < NUM_ROW; ++j) {
                for (int k = 0; k < NUM_COL; ++k) {
                    board[i][j][k] = rand() % 3;
                }
            }
        }
    }
    
    void print_board(int board[NUM_ROW][NUM_COL])
    {
        for (int i = 0; i < NUM_ROW; ++i) {
            for (int j = 0; j < NUM_COL; ++j) {
                cout << board[i][j] << " ";
            }
            cout << endl;
        }
        cout << endl;
    }
    
    uint_64 hash_board(int board[NUM_ROW][NUM_COL],
                       uint_64 zobrist_key[3][NUM_ROW][NUM_COL])
    {
        uint_64 h = 0;
    
        for (int i = 0; i < NUM_ROW; ++i) {
            for (int j = 0; j < NUM_COL; ++j) {
                h ^= zobrist_key[board[i][j]][i][j];
            }
        }
    
        return h;
    }
    
    void fill_keys(uint_64 zobrist_key[3][NUM_ROW][NUM_COL])
    {
        for (int i = 0; i < 3; ++i) {
            for (int j = 0; j < NUM_ROW; ++j) {
                for (int k = 0; k < NUM_COL; ++k) {
                    zobrist_key[i][j][k] = rand64();
                }
            }
        }
    }
    
    void init_hash_table()
    {
        for (int i = 0; i < HASH_TABLE_SIZE; ++i) {
            hash_table[i].key = 0;
            hash_table[i].row = 0;
            hash_table[i].col = 0;
            hash_table[i].empty = true;
        }
    }
    
    void hash_insert(int row, int col, uint_64 key)
    {
        int i = key % HASH_TABLE_SIZE;
        int n = 0;
        int probes = 0;
        
        while(!hash_table[i].empty &&
              hash_table[i].key != key && n < HASH_TABLE_SIZE) {
            i++;
            if (i == HASH_TABLE_SIZE)
                i = 0;
            n++;
            probes++;
        }
    
        if (probes > 0)
            cout << "number of probes = " << probes << endl;
        if (hash_table[i].empty) {
            hash_table[i].empty = false;
            hash_table[i].row = row;
            hash_table[i].col = col;
            hash_table[i].key = key;
        }
    }
    
    Hash_node hash_search(uint_64 key)
    {
        int i = key % HASH_TABLE_SIZE;
        int n = 0;
    
        while(!hash_table[i].empty && hash_table[i].key != key
              && n < HASH_TABLE_SIZE) {
            i++;
            if (i == HASH_TABLE_SIZE)
                i = 0;
            n++;
        }
    
        if (!hash_table[i].empty && hash_table[i].key == key) {
            return hash_table[i];
        }
    
    
        Hash_node hash_node;
        hash_node.empty = true;
        hash_node.key = 0;
        hash_node.row = 0;
        hash_node.col = 0;
        return hash_node;
    }
       
            
    
    int main(int argc, char* argv[])
    {
        int board[NUM_BOARD][NUM_ROW][NUM_COL];
        uint_64 zobrist_key[3][NUM_ROW][NUM_COL];
    
        init_hash_table();
        load_board(board);
        for (int i = 0; i < NUM_BOARD; ++i) {
            print_board(board[i]);
        }
        
        fill_keys(zobrist_key);
        for (int i = 0; i < NUM_BOARD; ++i) {
            hash_insert(4, 5, hash_board(board[i], zobrist_key));
        }
    
        int new_board[NUM_ROW][NUM_COL];
        for (int i = 0; i < NUM_ROW; ++i) {
            for (int j = 0; j < NUM_COL; ++j) {
                new_board[i][j] = (i * j) % 3;
            }
        }
    
        hash_insert(3, 2, hash_board(new_board, zobrist_key));
    
        Hash_node hn = hash_search(hash_board(new_board, zobrist_key));
        if (!hn.empty) {
            cout << "move = (" << hn.row << ", " << hn.col << ")"  << endl;
        } else {
            cout << "position not in hash_table" << endl;
        }
    
        return 0;
    }

  14. #14
    Cheesy Poofs! PJYelton's Avatar
    Join Date
    Sep 2002
    Location
    Boulder
    Posts
    1,728
    MSVC++ is saying that I can't write "long long", but I'll search the internet to see what I'm supposed to do about that. Also I'll ponder over your code tonight when I have time to really sit down and look at it. Thanks Nick!

  15. #15
    Blank
    Join Date
    Aug 2001
    Posts
    1,034
    Don't look too hard. My bet is that there are lots of bugs...
    MS uses __int64

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. To find the memory leaks without using any tools
    By asadullah in forum C Programming
    Replies: 2
    Last Post: 05-12-2008, 07:54 AM
  2. gcc link external library
    By spank in forum C Programming
    Replies: 6
    Last Post: 08-08-2007, 03:44 PM
  3. Replies: 3
    Last Post: 03-04-2005, 01:46 PM
  4. Unknown Memory Leak in Init() Function
    By CodeHacker in forum Windows Programming
    Replies: 3
    Last Post: 07-09-2004, 09:54 AM
  5. archive format
    By Nor in forum A Brief History of Cprogramming.com
    Replies: 0
    Last Post: 08-05-2003, 07:01 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21