in reply to print lines which are not reverse duplicates

What you've got sounds like a directed graph, and CPAN has the module Graph. I'm also using Text::CSV to make parsing and output more robust (also install Text::CSV_XS for speed).

input.csv:

personA,personB,10
personB,personA,190
personA,personC,23
personA,personD,43
personE,personF,10

Code:

#!/usr/bin/env perl
use warnings;
use strict;
use Text::CSV;
use Graph;

my $filename = 'input.csv';

my $g = Graph->new(directed=>1);
open my $fh, '<', $filename or die "$filename: $!";
my $csv = Text::CSV->new({ binary=>1, auto_diag=>2, eol=>$/ });
while ( my $row = $csv->getline($fh) ) {
    my ($author1, $author2, $interactions) = @$row;
    $g->set_edge_attribute( $author1, $author2, # auto-creates edge
        'interactions', $interactions );
}
$csv->eof or $csv->error_diag;
close $fh;

for my $e ($g->edges) {
    my ($author1, $author2) = @$e;
    next if $g->has_edge($author2, $author1);
    my $interactions = $g->get_edge_attribute(
        $author1, $author2, 'interactions' );
    $csv->print(select, [ $author1, $author2, $interactions ]);
}

Output:

personE,personF,10
personA,personD,43
personA,personC,23

Update: The above does not handle duplicate pairs in the input. Is that a concern for you, and if so, what should happen with the "interactions"? Are they supposed to be summed up?
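If duplicates did need to be summed, here is one possible sketch. Note that it swaps the Graph module for a plain hash of hashes, and it works on rows that have already been parsed (for example by Text::CSV), so it is not a drop-in change to the code above. The name filter_pairs and the extra duplicate row in the sample data are made up for illustration.

```perl
#!/usr/bin/env perl
use warnings;
use strict;

# Hypothetical helper (not part of the original code): sum the
# interactions of repeated (author1, author2) pairs, then keep only
# the pairs whose reverse (author2, author1) never appears.
sub filter_pairs {
    my @rows = @_;
    my (%sum, @order);
    for my $row (@rows) {
        my ($author1, $author2, $interactions) = @$row;
        # remember each pair once, in first-seen order
        push @order, [ $author1, $author2 ]
            unless exists $sum{$author1}{$author2};
        $sum{$author1}{$author2} += $interactions;
    }
    return map  { [ @$_, $sum{ $_->[0] }{ $_->[1] } ] }
           grep { !exists $sum{ $_->[1] }{ $_->[0] } }
           @order;
}

# Sample rows from input.csv plus one invented duplicate.
my @rows = (
    [qw(personA personB 10)],
    [qw(personB personA 190)],
    [qw(personA personC 23)],
    [qw(personA personC 2)],    # invented duplicate of the row above
    [qw(personA personD 43)],
    [qw(personE personF 10)],
);
print join(',', @$_), "\n" for filter_pairs(@rows);
```

Unlike the Graph version, this keeps the surviving pairs in the order they first appear in the input. As in the Graph version, a row whose two names are identical counts as its own reverse and is dropped.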

Re^2: print lines which are not reverse duplicates
by Maire (Scribe) on Nov 26, 2018 at 07:58 UTC
    Brilliant, thank you very much for this! Duplicates are not an issue for me this time: an earlier script summed them up, and those sums form the interactions column. Thanks again.