the desired output is to have the exact name of the module : net : and gate where the change has taken place.

What change? Without knowing which values you are trying to compare, I can't offer a complete solution.

Ah, you have two files that you want to compare. Ok. So you turn each one into a data structure, then iterate over one, and report differences against the other.

    my $info1 = grok_file $file1;
    my $info2 = grok_file $file2;

    MODULE:
    while ( my ($module, $nets_href) = each %$info1 )
    {
        unless ( $info2->{$module} )
        {
            warn "$file2 has no module $module";
            next MODULE;
        }

        NET:
        while ( my ($net, $gates_href) = each %$nets_href )
        {
            unless ( $info2->{$module}->{$net} )
            {
                warn "$file2 has no net $module/$net";
                next NET;
            }

            GATE:
            while ( my ( $gate, $href1 ) = each %$gates_href )
            {
                my $href2 = $info2->{$module}->{$net}->{$gate};
                unless ( $href2 )
                {
                    warn "$file2 has no gate $module/$net/$gate";
                    next GATE;
                }

                diff_gates "$module/$net/$gate", $href1, $href2;
            } # GATE
        } # NET
    } # MODULE

Where diff_gates is now pretty simple:

    sub diff_gates ( $ $ $ )
    {
        my ( $path, $g1, $g2 ) = @_;

        my %all_keys = ( %$g1, %$g2 );

        foreach my $key ( sort keys %all_keys )
        {
            if ( ! exists $g2->{$key} )
            {
                print "$path: g2 missing $key\n";
            }
            elsif ( ! exists $g1->{$key} )
            {
                print "$path: g1 missing $key\n";
            }
            elsif ( $g1->{$key} ne $g2->{$key} )
            {
                print "$path: $key diff between g1 and g2\n";
            }
        }
    }
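To see what diff_gates would report, here is a small self-contained sketch. It uses a variant I am calling gate_differences (a made-up name for this example) that does the same comparison but returns the messages instead of printing them. The two gate hashes and the path are invented sample data, not taken from your files.

```perl
use strict;
use warnings;

# Hypothetical variant of diff_gates: same comparison, but it returns
# the messages as a list instead of printing them.
sub gate_differences {
    my ( $path, $g1, $g2 ) = @_;

    # union of the keys found in either gate
    my %all_keys = ( %$g1, %$g2 );

    my @msgs;
    foreach my $key ( sort keys %all_keys ) {
        if ( !exists $g2->{$key} ) {
            push @msgs, "$path: g2 missing $key";
        }
        elsif ( !exists $g1->{$key} ) {
            push @msgs, "$path: g1 missing $key";
        }
        elsif ( $g1->{$key} ne $g2->{$key} ) {
            push @msgs, "$path: $key diff between g1 and g2";
        }
    }
    return @msgs;
}

# Made-up sample data, for illustration only.
my %gate1 = ( type => 'NAN2D1', cap_max => '0.61', cap_net => '6.04' );
my %gate2 = ( type => 'NAN2D1', cap_max => '0.61', cap_net => '6.10',
              cap_viol => '-5.49' );

print "$_\n" for gate_differences( 'mod/net/bc_0:A1', \%gate1, \%gate2 );
```

Note that the comparison uses ne, so values are compared as strings: "0.61" and "0.610" would be reported as different. If you would rather compare numerically, != is the operator to swap in for the numeric keys.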

Just turning the format you give into a data structure is easy. Although, if the actual input is as large as you say it is, note that this might not be the right answer. If this simulation takes a substantial part of your computer's memory when done in C, trying to create a Perl data structure for the entire output will likely fail. Picking just certain gates should be ok, though.

Here's the basic pattern you want to follow. Each line is either introducing a new thing to describe, or it is continuing the description of the most recently introduced thing. Either way, we store the description in a data structure (potentially creating a new "slot" if it is a new thing being introduced.)
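Stripped down to its bones, that pattern might look like the sketch below. The two-level "Item:" / "key = value" format, the sub name parse_records, and the sample text are all made up for illustration; they are not your report format.

```perl
use strict;
use warnings;

# Minimal sketch of the "new thing vs. continuation" pattern, using a
# made-up two-level format (not the real report format).
sub parse_records {
    my ($fh) = @_;
    my %info;
    my $last;    # hashref for the most recently introduced thing

    while ( my $line = <$fh> ) {
        $line =~ s/\s+\z//;      # remove trailing whitespace
        next if $line eq '';     # skip blank lines

        if ( $line =~ /^Item:\s+(\S+)$/ ) {
            # a new thing: create its slot (if needed) and remember it
            $last = $info{$1} ||= {};
        }
        elsif ( $line =~ /^\s+(\w+)\s+=\s+(\S+)$/ ) {
            # a continuation: attach it to the most recent thing
            die "line $.: value before any Item\n" unless $last;
            $last->{$1} = $2;
        }
        else {
            warn "could not parse line $.: $line\n";
        }
    }
    return \%info;
}

# Made-up input, read through an in-memory filehandle.
my $text = <<'END';
Item: a
  size = 3
  color = red
Item: b
  size = 5
END

open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my $info = parse_records($fh);
close $fh;

print "$_: size $info->{$_}{size}\n" for sort keys %$info;
```

The only state carried between lines is $last, which is why each continuation line can be handled without looking back at the input.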

In this case, it is easy enough to make patterns to match either a new object, or further information about the most recent object. I am going to make some assumptions about what is what on each line; hopefully the code will be clear enough for you to modify appropriately.

    sub grok_file ( $ )
    {
        my ( $fh ) = @_;
        $fh ||= \*STDIN;

        # somewhere to store all the info:
        #
        #   $info{module}{net}{gate_id}{...} = $value
        #
        # where "..." can be one of these keys:
        #
        #   type
        #   cap_max
        #   cap_net
        #   cap_viol
        #   trans_max
        #   trans_worst
        #   trans_viol
        my %info;

        # keep track of the last thing we saw
        my $last_module;
        my $last_net;
        my $last_stat;          # cap or trans
        my $last_gate_href;

        # common regex subexpressions
        my $float = qr/[\d.\-]+/;   # floating point value
        my $ind   = qr/ /;          # indent

        # ok, now the main loop.
        while (<$fh>)
        {
            # remove trailing whitespace
            s/\s+\z//;

            # new module?
            # |Module: gg_PCI_PORT_bc_unit_221 [bc_pci_u3/bc_unit_u46]
            if ( /^Module: (.*)$/ )
            {
                $last_module = $1;
                print STDERR "module: $last_module\n";
                $info{$last_module} ||= {};
            }

            # new net?
            # | Net: inp_c_1
            elsif ( /^ Net: (.*)$/ )
            {
                $last_net = $1;
                print STDERR " net: $last_module / $last_net\n";
                $info{$last_module}{$last_net} ||= {};
            }

            # look for individual gates now
            # | Max Capacitance = 0.61 (bc_0:A1 [NAN2D1])
            # | Max Transition = 1.50 (bc_3:A2 [NAN2D1])
            elsif ( / ^ $ind $ind
                      Max \s (Capacitance| Transition) \s+   # what
                      = \s+ ($float) \s+                     # amt
                      \( (\S+) \s+                           # id
                      \[ ([^\]]+) \] \)                      # type
                      $ /x )
            {
                my ( $what, $amt, $id, $type ) = ( $1, $2, $3, $4 );
                print STDERR " gate: $id\n";
                $last_stat = $what eq 'Capacitance' ? 'cap' : 'trans';
                my $key = $last_stat . '_max';
                $info{$last_module}{$last_net}{$id} ||= {};
                $last_gate_href = $info{$last_module}{$last_net}{$id};
                $last_gate_href->{$key} = $amt;
                $last_gate_href->{type} = $type;
            }

            # finally, a bunch of other values that apply to
            # the most recent gate:
            # | Pin Worst Transition = 7.48
            # | Net Capacitance = 6.04
            # | VIOLATION = -5.98
            elsif ( / ^ $ind $ind
                      (?: ( Pin \s Worst \s Transition ) |
                          ( Net \s Capacitance ) |
                          VIOLATION )
                      \s+ = \s+ ($float) $ /x )
            {
                my ( $pin, $net, $amt ) = ( $1, $2, $3 );
                my $key = $pin ? 'trans_worst'
                        : $net ? 'cap_net'
                        :        $last_stat . '_viol';
                $last_gate_href->{$key} = $amt;
            }

            else
            {
                print STDERR "$0: could not parse line $.: $_\n";
            }
        }

        return \%info;
    }
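One thing to watch: grok_file as written reads from a filehandle (falling back to STDIN), so if $file1 and $file2 hold file names you need to open them first. Here is a minimal sketch of that step, assuming file names; open_report is a made-up helper name, and the temporary file exists only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of the code above): open a report file
# for reading and return the filehandle, dying with a clear message.
sub open_report {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot open $path: $!\n";
    return $fh;
}

# Demonstration with a throwaway file holding one made-up line.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "Module: example_module\n";
close $tmp_fh or die "cannot close $tmp_name: $!\n";

my $fh    = open_report($tmp_name);
my $first = <$fh>;
close $fh;
print $first;

# With the parser above, the calls would then look like:
#   my $info1 = grok_file( open_report($file1) );
#   my $info2 = grok_file( open_report($file2) );
```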

In reply to Re^3: Parsing A File by tkil
in thread Parsing A File by Anonymous Monk
