in reply to H of A optimization

This question has resulted in quite a mess of answers.

The purpose is to read an edge list and convert it into a hash, there's about 30,000 nodes in this graph, but 90,000 lines. So there are loops which need to be eliminated. (Graph is undirected.)

I am a little confused about what it is that you want. It seems there has been some confusion in the answers as well. The fact that the graph is undirected doesn't have anything to do with how many "loops" it has. An undirected graph may certainly have loops. It may even have multiple edges with the same vertices or edges where both ends are the same vertex. The undirected graph: 0-1,1-0,1-2,2-0,2-2 has loops with 1, 2, and 3 vertices.

All "undirected" means is that the edges don't have a direction so it doesn't matter what order you list the vertices when defining an edge. In an undirected graph, a-b is the same as b-a. That's not the case in a directed graph.

Your code seems to suggest that you don't want any cycles in the graph. Really, all it does is not allow the same node to show up more than twice on the right hand side of your input. If the graph is undirected, the side that the vertex shows up on shouldn't even matter. Most of the solutions presented to you are optimizations of your own code including those by dws, ferrency, fruiture, and Arien.

With your example input:

0 1 1 2 1 3 1 4 2 4 4 5

Your code will simply eliminate the edge "2 4". Because 4 shows up more than once on the right.

If this is what you want, then great; you've got lots of answers. This doesn't jive with the idea that they are undirected graphs though.

When I first read it, I suspected that runrig had a better idea of what you wanted to do. I thought that maybe you wanted to keep only one of sets of duplicate edges. His (runrig's) code would do that for a directed graph but it would still allow two edges like a-b and b-a to exist together. I'm guessing Zaxo had the same thought and was under the impression that the Graph module would do what you wanted. It doesn't. (In fact, it seems to have a bug(?) such that I can use Zaxo's code to add 7 edges and actually end up with 9. I'll demonstrate that below in a readmore section.)

In order to get a good answer here, I think you are going to have to more clearly explain what it is that you are trying to do. Is the graph really undirected? Are you trying to eliminate all duplicate edges? (You could have a 30K node graph with 90K unique edges, by the way.) Are you trying to eliminate all cycles? (There are likely to be many ways of doing that which result in totally different graphs.) Or are you trying to do something else entirely?

If you want to know more about that bug(?) I encountered in Graph...

I ran this code:

#!/usr/bin/perl -w use strict; use Graph::Undirected; my $net = Graph::Undirected->new; while (<DATA>) { $net = $net->add_edge( split ); } my @e = $net->edges(); print shift@e,shift@e,"\n" while @e; __END__ 0 1 1 0

I got as output:

01 01 10 10

For some reason the edges, 0-1 and 1-0, seem to be repeated. My testing seems to indicate that this only happens with the edges 0-1 and 1-0 and that it only seems to happen when they are both present. If anyone can shed some light on this or dig into it, that would be really cool. Otherwise, I'll dig into it when I have some time and submit a bug report to the author once I can offer more insight.

To be clear, there are no blank lines after the _END_ tag in the code given above.

Update: Zaxo indirectly pointed out that this was the case with other input as well. The edge set "1 2" and "2 1", for example, will also result in four edges. I'm still not sure why this should be the case as "1 2" results in one edge not two.

-sauoq
"My two cents aren't worth a dime.";

Replies are listed 'Best First'.
Re: Re: H of A optimization
by abitkin (Monk) on Aug 31, 2002 at 15:40 UTC
    Actually I was turning an aciclic undirected graph into a spanning tree, so removing 2 4 was the correct approach. The code I posted works correctly for me, with that and some help I got on the CB, I went from 80+ minutes of run time, to 15.

    Thanks for your thoughts though
      Actually I was turning an aciclic undirected graph into a spanning tree,

      But you weren't.

      Firstly, the data you started with was not acyclic (the edges 1-2, 1-4, and 2-4 composed a cycle) and you treated it as if it were directed by paying special attention to the nodes on the right side of your input edges. Secondly and more importantly, your algorithm won't guarantee that the resulting graph is connected or acyclic, the two requirements for a spanning tree.

      so removing 2 4 was the correct approach.

      If the nodes had been listed as 2-1, 1-4, 4-2 the cycle would not have been eliminated.

      The code I posted works correctly for me, with that and some help I got on the CB, I went from 80+ minutes of run time, to 15.

      I'm glad you got an answer that worked for you. Certainly the code you posted was not very efficient. But efficiency is meaningless without correctness and your statement of the problem leaves some question about the solutions.

      Maybe you could summarize the help you got on the CB for those monks who may stumble upon this node while seeking enlightenment on their own graph related problems.

      Good luck!

      -sauoq
      "My two cents aren't worth a dime.";