Let me see if I have this straight.
- Your input is divided into blocks.
- Each block starts with a weird line saying ENERGY.
- Within each block you have lines with a vertex, G, then vertices that the first vertex is connected to.
- You don't want to count a vertex as being connected to 0 or itself. (In your data this solves your last column issue, and I think is the real requirement.)
- Every block has the same vertices. (This is a big assumption that is implicit in how you want to process everything.)
- First you want to go through the first block, then print off all of the vertices.
- Then for each block, at the end of the block:
  - For each vertex
    - For each vertex it is connected to
      - print off first vertex, second vertex, and the total multiplicity you have found.
Note in particular that, in my understanding, if a pair of vertices does not appear in a later block, nothing is printed for that pair in *edges for that block.
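To make that reading concrete, here is a small sketch of the counting rule on made-up data. The sample lines and the helper name edge_lines are my own inventions for illustration, not anything from your file, so treat it as a check of my understanding rather than the final program.

```perl
use strict;
use warnings;

# Made-up sample input: two blocks, each introduced by an ENERGY line.
my @sample = (
    "ENERGY block 1",
    "1 G 0 2",
    "2 G 1 3",
    "3 G 2 0",
    "ENERGY block 2",
    "1 G 0 2",
    "2 G 1 0",
    "3 G 0 3",
);

# Hypothetical helper: returns the "*edges" lines for a list of input lines.
sub edge_lines {
    my @lines = @_;
    my (%connect, @out);
    for my $line (@lines) {
        next if $line =~ /energy/i;    # block header, no edges on it
        my ($this_vertex, $type, @row) = split ' ', $line;
        for my $other_vertex (@row) {
            # Skip "connected to 0" and self-connections.
            next if $other_vertex == 0 or $other_vertex == $this_vertex;
            my $key = "$this_vertex $other_vertex";
            $connect{$key}++;          # running total across all blocks
            push @out, "$key $connect{$key}";
        }
    }
    return @out;
}

print "$_\n" for edge_lines(@sample);
```

With this sample, the pairs "1 2" and "2 1" are printed in both blocks with totals 1 and then 2, while "2 3" and "3 2" are printed only for the first block, because they do not appear in the second.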
If this is what you are asking for, the following code should do it:
#!/usr/bin/perl -w
use strict;

# This should really be passed in on the command line or something.
my $data_file = "HIVgag.ct";

# Find the vertices by reading the first block only.
my @vertices;
open(my $fh, "<", $data_file) or die "Can't open '$data_file': $!";
while (<$fh>) {
    if (/energy/i) {
        if (not @vertices) {
            # This is the first line of the first block. Do nothing.
            next;
        }
        else {
            # We have completed the first block.
            last;
        }
    }
    # split ' ' also copes with leading whitespace, which split /\s+/
    # would turn into an empty first field.
    my @row = split ' ', $_;
    push @vertices, $row[0];
}

# Scalar context turns @vertices into the number of elements it has.
print "*vertices " . @vertices . "\n";
for my $vertex (@vertices) {
    print "$vertex G\n";
}

print "*edges\n";
# Rewind and go through every block, counting connections as we go.
seek($fh, 0, 0);
# seek does not reset the line counter, so do it by hand to keep the
# line numbers in the error messages below accurate.
$. = 0;
my %connect;
# Start as if a full block had just ended, so that the first ENERGY
# line passes the check below.
my $position = @vertices - 1;
while (<$fh>) {
    $position++;
    if (/energy/i) {
        if ($position != @vertices) {
            die "In line $., too few vertices found";
        }
        $position = -1;
        next;
    }
    my ($this_vertex, $type, @row) = split ' ', $_;
    if ($this_vertex ne $vertices[$position]) {
        die "Unexpected vertex '$this_vertex' at line $.";
    }
    for my $other_vertex (@row) {
        # Ignore connections to 0 and to the vertex itself.
        if (0 == $other_vertex or $this_vertex == $other_vertex) {
            next;
        }
        my $key = "$this_vertex $other_vertex";
        $connect{$key}++;
        print "$key $connect{$key}\n";
    }
}
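As the comment near the top of the script says, the file name really belongs on the command line. One possible way to do that is sketched below. The helper name data_file_from_args is hypothetical, and falling back to "HIVgag.ct" when no argument is given is my assumption about what you would want.

```perl
use strict;
use warnings;

# Hypothetical helper: take the data file from the argument list,
# falling back to the name that is hard-coded in the script.
sub data_file_from_args {
    my @args = @_;
    return @args ? $args[0] : "HIVgag.ct";
}

my $data_file = data_file_from_args(@ARGV);
print "Reading from '$data_file'\n";
```

You would then replace the hard-coded assignment to $data_file with the call above and run the script with the file name as its first argument.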
Things to note.