Shortest path problem for undirected graph, with no weights per edge

This is a discussion on Shortest path problem for undirected graph, with no weights per edge within the Tech Board forums, part of the Community Boards category; In my application, I have a large graph (I am not sure that it will fit in the memory, but ...

  1. #1
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,675

    Shortest path problem for undirected graph, with no weights per edge

    In my application, I have a large graph (I am not sure that it will fit in the memory, but let's assume it will), which **I think** will have |E| > |V|.

    I want to find the shortest path between two nodes. The graph is undirected and the edges have no weights. Do you happen to know anything on this topic? Any algorithm(s)/paper(s)?

    //on the background
    I am thinking of setting the graph (I am in the designing phase) with every node holding its linking with its neighbors data (instead for a global structure, like a list or matrix). That way, I am almost sure that searching will become more efficient than in the case with the list or matrix. You see the graph is going to be large and the goal is to perform as fast queries as possible. By holding local indexes per node of graph, I will access only the indices of the neighbors per node.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,444
    Some preliminary info that may lead to more good stuff. Worth reading even if it isn't exactly what you need.
    Shortest path problem - Wikipedia, the free encyclopedia
    Dijkstra's algorithm - Wikipedia, the free encyclopedia

  3. #3
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,444
    Also, it sounds like the way you want to store your nodes is in an adjacency matrix/list. Or maybe, some partial version of that, only concerned with nodes that are close to the current node.

    One more good reference, to help get you using the right language: Glossary of graph theory - Wikipedia, the free encyclopedia.

    Note, if time is of little concern, but space is, you can use DFS to exhaustively search every possible path.

  4. #4
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,675
    Quote Originally Posted by anduril462 View Post
    Note, if time is of little concern, but space is, you can use DFS to exhaustively search every possible path.
    Why prefer DFS over BFS?

    The goal is the time. The faster, the better!

    As for the algorithms, I think that I should avoid those that have E in their time complexity and only check the ones left and find the "optimal" from them (the ones that have only V inside them).
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  5. #5
    Registered User
    Join Date
    Jun 2005
    Posts
    6,200
    Quote Originally Posted by std10093 View Post
    Why prefer DFS over BFS?
    I'd query that too, without knowing more about the specific problem. Each has tradeoffs.

    Quote Originally Posted by std10093 View Post
    As for the algorithms, I think that I should avoid those that have E in their time complexity and only check the ones left and find the "optimal" from them (the ones that have only V inside them).
    I'd look a bit more carefully than that. Some of the algorithm with complexity only in terms of V have higher powers (for example, because the number of edges is a more-than-linear function of number of vertices).
    Right 98% of the time, and don't care about the other 3%.

  6. #6
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,675
    Quote Originally Posted by grumpy View Post
    I'd query that too, without knowing more about the specific problem. Each has tradeoffs.
    Didn't I state the specific problem? What more should I state?

    Quote Originally Posted by grumpy View Post
    Some of the algorithm with complexity only in terms of V have higher powers (for example, because the number of edges is a more-than-linear function of number of vertices).
    What do you mean by "higher powers"? Probably I am lost in the language here. Also the example is not clear to me too. Sorry :/
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  7. #7
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,444
    Quote Originally Posted by std10093 View Post
    Why prefer DFS over BFS?


    The goal is the time. The faster, the better!
    You made no mention of runtime needs, yet did mention that the size of the graph was large, and that it may exceed memory. That, combined with your "background" paragraph -- particularly the part about now wanting to hold the entire graph in memory -- led me to believe you were more concerned with memory than speed, which you didn't mention at all. Note, most people optimize for time nowadays, but optimizing for space is still something that people do.
    Quote Originally Posted by std10093 View Post
    Didn't I state the specific problem? What more should I state?
    Given that grumpy found it unclear, and I misunderstood what you were getting at, it may be underspecified. Now we know you're optimizing for speed. But we don't know how much bigger |E| is than |V|.
    Quote Originally Posted by std10093 View Post
    As for the algorithms, I think that I should avoid those that have E in their time complexity and only check the ones left and find the "optimal" from them (the ones that have only V inside them).
    Quote Originally Posted by std10093 View Post
    What do you mean by "higher powers"? Probably I am lost in the language here. Also the example is not clear to me too. Sorry :/
    By higher power, grumpy means "order" or "exponent". What grumpy is saying is that, you may find an algorithm where there is only |V| in the time complexity, but it may still be slower than one with |V| and |E|, even if |E| > |V|. For example, assume |V| is large (millions of nodes) and |E| is on the order of 100 |V|.
    Now, imagine you have two algorithms, one with just |V| in the time complexity:
    O(|V|3)
    and another with both |E| and |V| in it
    O(|E| * |V|)
    Which is faster in this case? The second one. The better algorithm will depend on just how much bigger |E| is than |V|. In the above, as long as |E| < |V|2 (which I think it must be -- you cannot have |V|2 edges in an undirected graph), the second algorithm is faster. This is one area where you could have specified better: just how big is |V| and how much bigger is |E| than |V|?

  8. #8
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,675
    I wanted to emphasize that the graph is big, that's it why I stated the "does not fit in memory" case. However, I cause confusion. Moreover, the people that make the rules, are not clear in this part (if the graph fits in memory or not, but they tend to let as assume it will fit in the memory). As a result, let's take as granted that the graph fits in the memory.

    Of course considering the memory is important too, but since speed is the "success criterion" here, I want to focus on that first. Then if I see that I have memory problems, I will consider the tradeoffs and make a desicion.

    >how much bigger is |E| than |V|?
    I wish I knew. The graph is supposed to model a social network. But there are no details for that aspect. I feel that E is surely bigger than V, but I do not know how much.

    I understood the example now, thanks.

    >|E| < |V|^2 (which I think it must be -- you cannot have |V|^2 edges in an undirected graph)
    This is a key aspect here.
    "The maximum number of edges in an undirected graph without a self-loop is n(n - 1)/2." from wiki
    That means I must start checking all the algorithms, whether they contain E or V in their time complexity.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  9. #9
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,444
    Quote Originally Posted by std10093 View Post
    >how much bigger is |E| than |V|?
    I wish I knew. The graph is supposed to model a social network. But there are no details for that aspect. I feel that E is surely bigger than V, but I do not know how much.

    I understood the example now, thanks.
    Good thing there are several social network sites, for which you could search for statistics on number of connections. Note, I would wager on the order of a few hundred: look up "Dunbar's number" or "monkey sphere".
    Quote Originally Posted by std10093 View Post
    >|E| < |V|^2 (which I think it must be -- you cannot have |V|^2 edges in an undirected graph)
    This is a key aspect here.
    "The maximum number of edges in an undirected graph without a self-loop is n(n - 1)/2." from wiki
    That means I must start checking all the algorithms, whether they contain E or V in their time complexity.
    There's really only a handful of algorithms to consider here, and their time and space complexities are well known, so it's not difficult to evaluate all of them. Once you find out a reasonable distribution for number of connections, you can make a good guess

  10. #10
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,675
    >Dunbar's number
    great info
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  11. #11
    Registered User MutantJohn's Avatar
    Join Date
    Feb 2013
    Posts
    1,008
    What's this E and V thing? Are we talking about Lagrangians here? The only time I've seen such terms is when discussion potential energy vs. energy as L = T - V.

  12. #12
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,444
    Quote Originally Posted by MutantJohn View Post
    What's this E and V thing? Are we talking about Lagrangians here? The only time I've seen such terms is when discussion potential energy vs. energy as L = T - V.
    Click some of the links about graphing algorithms and theories, you'll figure it out pretty quickly .

  13. #13
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,428
    Quote Originally Posted by MutantJohn View Post
    What's this E and V thing? Are we talking about Lagrangians here? The only time I've seen such terms is when discussion potential energy vs. energy as L = T - V.
    V = vertices (or nodes)
    E = edges (or paths between nodes)
    Did you study graph theory?

    Btw, DFS vs BFS really depends on the layout of the graph. Without knowing how the graph looks like, there's just no way to tell which one will be faster.
    But remember that BFS generally uses more memory than DFS. DFS can search seemingly unbounded graphs, while BFS will struggle.
    Last edited by Elysia; 02-06-2014 at 10:14 AM.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  14. #14
    SAMARAS std10093's Avatar
    Join Date
    Jan 2011
    Location
    Nice, France
    Posts
    2,675
    Quote Originally Posted by Elysia View Post
    Btw, DFS vs BFS really depends on the layout of the graph.
    Undirected graph with no weights. |E| is greater than |V|, but I do not know how much. That's all I know. However, if someone wants to ask something, he or she is welcome.
    Code - functions and small libraries I use


    It’s 2014 and I still use printf() for debugging.


    "Programs must be written for people to read, and only incidentally for machines to execute. " —Harold Abelson

  15. #15
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,428
    Well, that's not the problem (I'm after). The problem is where the solution is.
    Assume a tree from top-down perspective. If the solution is far to the right in the tree, then BFS might be faster. If the solution is deep down in the left part of the tree, DFS might be faster.
    The more edges you have near the root, the more expensive DFS becomes if the solution is far right in the tree.

    You should try analysing some graphs of your problem a little bit. This could help you determine which algorithm is likely to be faster.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

Page 1 of 2 12 LastLast
Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 7
    Last Post: 05-02-2012, 06:44 AM
  2. shortest path problem
    By bananorama in forum C Programming
    Replies: 7
    Last Post: 05-24-2011, 12:38 PM
  3. Undirected graph and Binary Tree problem
    By arafat21 in forum C Programming
    Replies: 3
    Last Post: 05-02-2008, 05:52 AM
  4. Shortest path problem
    By Digitalxero in forum C++ Programming
    Replies: 0
    Last Post: 10-25-2005, 05:32 PM
  5. Help!!! Shortest path graph
    By hansy32 in forum C Programming
    Replies: 5
    Last Post: 03-01-2002, 05:39 PM

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21