Thread: How to Improve?

  1. #1
    Registered User
    Join Date
    Mar 2007

    How to Improve?

    So I'm working on a project that solves the Kevin Bacon problem. The code works fine for a small file size, but I have to read in the entire IMDB movie database file, which contains thousands of movies. I have a graph where the vertices are the actors. I then create edges between the actors if they were in the same movie. However, I think my problem lies in creating the vertices because it is creating a string for each vertex, and so it quickly runs out of memory.

    #ifndef GRAPHSTUFF_H_
    #define GRAPHSTUFF_H_
    #include <iostream>
    #include <string>
    #include <sstream>
    #include <fstream>
    #include <list>
    #include <vector>
    using namespace std;
    struct Vertex;
    struct Edge
                         // First vertex in edge is implicit
        Vertex  *dest;   // Second vertex in edge
        double   cost;   // Edge cost
        Edge( Vertex *d = 0, double c = 0.0 )
          : dest( d ), cost( c ) { }
    // Structure stored in priority queue for Dijkstra's algorithm.
    struct Path
        Vertex *dest;   // w
        double  cost;   // d(w)
        Path( Vertex *d = 0, double c = 0.0 )
          : dest( d ), cost( c ) { }
        bool operator> ( const Path & rhs ) const
          { return cost > rhs.cost; }
        bool operator< ( const Path & rhs ) const
          { return cost < rhs.cost; }
    const int INFINITY = std::numeric_limits<int>::max();
    // Basic info for each vertex.
    struct Vertex
        std::string name;    // Vertex name
        vector<Edge>  adj;     // Adjacent vertices (and costs)
        double        dist;    // Cost
        Vertex       *prev;    // Previous vertex on shortest path
        int           scratch; // Extra variable used in algorithm
        Vertex( const std::string & nm ) : name( nm )
          { reset( ); }
        void reset( )
          { dist = INFINITY; prev = NULL; scratch = 0; }
    #endif /*GRAPHSTUFF_H_*/

    Is there any way to improve this?
    Last edited by tallguy; 04-03-2007 at 08:21 AM.

  2. #2
    Lean Mean Coding Machine KONI's Avatar
    Join Date
    Mar 2007
    Luxembourg, Europe
    The question is what you really expect to optimize? I mean, you somewhere have to store the information and that space is limited (of course at some point the processor starts page swapping etc).

    If you are sure that you're not creating a vertex twice for the same actor or that your program is allocating unnecessary space, then I can't see what you could optimize.

    I hope you realize that you don't have to read "the entire imdb database", but simply if you start with Kevin bacon (for example) and build your graph from there, you will encounter a situation where you can't add any more actors because you've got them already.

    Unfortunately for you, I heard some vague theory that it is possible to link any two actors together with a chain of only 4 intermediate nodes (with a working proof of concept on some website).

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 11
    Last Post: 03-29-2009, 12:27 PM
  2. What do I need to improve on for the real world?
    By vhunterd in forum A Brief History of
    Replies: 3
    Last Post: 04-08-2007, 07:49 PM
  3. why page based I/O can improve performance?
    By George2 in forum C Programming
    Replies: 1
    Last Post: 06-12-2006, 07:42 AM
  4. Replies: 6
    Last Post: 06-09-2006, 12:44 AM