Hi, All,

Thanks for all of the great suggestions. I have reduced the processing time by a whopping 40% after rolling in a number of them. The new code is below.

I don't know how to parallelize something like this. I think splitting the file will be too time-consuming. (I am running on a system with 32 CPUs, though, so it is tempting.) I think the file sizes will have to be much bigger before I consider that.
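For reference, one common way to parallelize a line-oriented parse without physically splitting the file is to hand each worker a byte range and let it resynchronize on the next line boundary. The following is only a minimal sketch under assumptions, not tested against the real data: `count_nets_in_range` and `count_nets_parallel` are hypothetical names, and the workers only count `net '...':` header lines instead of building the stats table. A real worker would also have to keep reading past the end of its range to finish its last record, and would need some way to send its results back to the parent.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: count net header lines whose first byte lies in
# the byte range [$start, $end) of $file.
sub count_nets_in_range {
    my ($file, $start, $end) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    if ($start > 0) {
        # Back up one byte and discard through the next newline. A line that
        # begins exactly at $start is kept; a line that straddles $start is
        # left to the previous worker.
        seek($fh, $start - 1, 0) or die "seek failed: $!";
        my $discard = <$fh>;
    }
    my $count = 0;
    while (1) {
        last if tell($fh) >= $end;      # next line belongs to the next worker
        my $line = <$fh>;
        last unless defined $line;      # end of file
        $count++ if $line =~ /^net '.*':\s*$/;
    }
    close($fh);
    return $count;
}

# Hypothetical driver: fork one child per byte range and sum the counts
# that the children report back over pipes.
sub count_nets_parallel {
    my ($file, $workers) = @_;
    my $size  = (-s $file) || 0;
    my $chunk = int($size / $workers) + 1;
    my @children;
    for my $i (0 .. $workers - 1) {
        my $start = $i * $chunk;
        my $end   = $start + $chunk;
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: report the count for this range, then leave without
            # running the parent's END blocks or flushing its buffers.
            close($reader);
            print {$writer} count_nets_in_range($file, $start, $end), "\n";
            close($writer);
            POSIX::_exit(0);
        }
        close($writer);
        push @children, [ $pid, $reader ];
    }
    my $total = 0;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $n = <$reader>;
        close($reader);
        waitpid($pid, 0);
        die "worker $pid returned no result" unless defined $n;
        chomp $n;
        $total += $n;
    }
    return $total;
}
```

Called as `count_nets_parallel($input_file, 8)`, this would start eight children. Whether it pays off depends on how much of the run time is regex work rather than disk I/O.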

Again, thanks everyone!!

Fiddler42

open (NETSTATS,"$input_file");
$TotalNets = 0;
while (<NETSTATS>) {
    if (/^net \'(.*)'\:\s*$/) {
        $NetName = $1;
        $c = 1;
        $TotalNets++;
        if ($TotalNets % 100_000 == 0 && $TotalNets > 0) {
            print ("Parsed $TotalNets nets...\n");
        }
        do {
            if (/^\s+wire capacitance\:\s+(\d+.*\d*)\s*$/) {
                $NetCapRaw = $1;
                $NetCap = $CapMultiplier*$NetCapRaw;
                $c++;
            } elsif (/^\s+wire resistance\:\s+(\d+.*\d*)\s*$/) {
                $NetRes = $1;
                $c++;
            } elsif (/^\s+number of loads\:\s+(\d+)\s*$/) {
                $NetFanout = $1;
                $c++;
            } elsif (/^\s+total wire length\:\s+(\d+.*\d*)\s*/) {
                $NetLength = $1;
                $c++;
            }
            $_ = <NETSTATS>;
        } until ((/Driver Pins/) || ($_ eq "" ));
        if (/Driver Pins/) {
            $_ = <NETSTATS>;
            $_ = <NETSTATS>;
            ($FirstDriver) = $_ =~ /^\s*(\S.*)\s*/;
            $c++;
        }
        $AddToCustomTable = 0;
        if (($NetName ne "")
            && (($NetCap ne "") && ($NetCap ne "NaN"))
            && (($NetRes ne "") && ($NetRes ne "NaN"))
            && (($NetFanout ne "") && ($NetFanout ne "NaN"))
            && (($NetLength ne "") && ($NetLength ne "NaN"))
            && ($FirstDriver ne "") && ($c == 6)) {
            if ($NetFanout <= $UpperFanoutLimitOfTable) {
                if (($UseNetPattern == 0) && ($UseDriverCell == 0) && ($TopLevelOnly == 0)) {
                    $AddToCustomTable = 1;
                } elsif (($UseNetPattern == 0) && ($UseDriverCell == 0) && ($TopLevelOnly == 1)) {
                    $DriverForwardSlashCount = $FirstDriver =~ s/(\/)/$1/gs; # Simple command to count characters...
                    $NetNameForwardSlashCount = $NetName =~ s/(\/)/$1/gs;
                    if (($DriverForwardSlashCount <= 1) && ($NetNameForwardSlashCount <= 1 )) {$AddToCustomTable = 1;}
                    if ($DebugMode == 1) {
                        print ("Adding net $NetName (driver = $FirstDriver)...\n");
                        print DEBUG_VERBOSE ("$NetFanout $NetRes\n");
                    }
                } elsif (($UseNetPattern == 0) && ($UseDriverCell == 1) && ($TopLevelOnly == 0)) {
                    if ($FirstDriver =~ qr/$DriverPattern/x) {$AddToCustomTable = 1;} # to regard variable as a regular expression...
                } elsif (($UseNetPattern == 1) && ($UseDriverCell == 0) && ($TopLevelOnly == 0)) {
                    if ($NetName =~ qr/$NetPattern/x) {$AddToCustomTable = 1;}
                } elsif (($UseNetPattern == 1) && ($UseDriverCell == 1) && ($TopLevelOnly == 0)) {
                    if ($NetName =~ qr/$NetPattern/x) {
                        $AddToCustomTable = 1;
                    } elsif ($FirstDriver =~ qr/$DriverPattern/x) {$AddToCustomTable = 1;}
                }
                # These conditions are not allowed per input argument parsing...
                #} elsif (($UseNetPattern == 0) && ($UseDriverCell == 1) && ($TopLevelOnly == 1)) {
                #} elsif (($UseNetPattern == 1) && ($UseDriverCell == 0) && ($TopLevelOnly == 1)) {
                #} elsif (($UseNetPattern == 1) && ($UseDriverCell == 1) && ($TopLevelOnly == 1)) {
            }
            if ($AddToCustomTable == 1) {push @{$NetStats[ $NetFanout ] ||= []}, [ $NetName, $NetCap, $NetRes, $NetLength, $FirstDriver ];}
        } else {
            if ($DebugMode == 1) {
                print DEBUG_VERBOSE ("ERROR: Problem deriving stats for net $NetName!\n");
                print DEBUG_VERBOSE ("ERROR: c=$c NetName=$NetName NetFanout=$NetFanout NetCap=$NetCap NetRes=$NetRes NetLength=$NetLength FirstDriver=$FirstDriver\n\n");
            }
        }
    }
    $NetName = "";
    $NetCap = "NaN";
    $NetRes = "NaN";
    $NetFanout = "NaN";
    $NetLength = "NaN";
    $FirstDriver = "";
}
print ("Parsed $TotalNets nets...\n\n");
close (NETSTATS);
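The slash counting in the TopLevelOnly branch uses `s///g` purely for its return value, which rewrites every slash with itself on each net. A possibly cheaper alternative is `tr///`, which returns the number of matching characters and, with an empty replacement list, leaves the string unchanged. This is a small sketch only: `count_slashes` is a hypothetical helper name, and the change has not been benchmarked on the real input.

```perl
use strict;
use warnings;

# Hypothetical helper: count forward slashes in a string.
# tr with an empty replacement list only counts; nothing is rewritten.
sub count_slashes {
    my ($s) = @_;
    return ($s =~ tr{/}{});
}

# Mirrors the "at most one slash" top-level test from the loop above.
sub is_top_level {
    my ($driver, $net_name) = @_;
    return (count_slashes($driver) <= 1 && count_slashes($net_name) <= 1) ? 1 : 0;
}
```

Inside the loop the call could be inlined as `$FirstDriver =~ tr{/}{}` to avoid the subroutine overhead.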

In reply to Re^2: Looking for ways to speed up the parsing of a file... by fiddler42
in thread Looking for ways to speed up the parsing of a file... by fiddler42
