in reply to Re: Looking for ways to speed up the parsing of a file...
in thread Looking for ways to speed up the parsing of a file...
Thanks for all of the great suggestions. I have reduced the processing time by a whopping 40% after rolling in a number of them. The new result is below.
I don't know how to parallelize something like this, and I think splitting the file would be too time-consuming. (I am running on a system with 32 CPUs, though, so it is tempting.) I think the file sizes will have to get much bigger before I consider that.
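For what it's worth, here is a rough sketch of one way this might be done without physically splitting the file: each forked worker gets a byte range, seeks to it, and skips ahead to the next line boundary. The names `byte_ranges`, `count_nets_in_range`, and `parallel_count` are made up for illustration, and the workers only count `net '...':` header lines. A real version would have to keep reading past the end of its range until the current record is finished, and would have to pass the parsed stats back to the parent.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: cut [0, $size) into at most $workers byte ranges.
sub byte_ranges {
    my ($size, $workers) = @_;
    my $chunk = int($size / $workers) + 1;
    my @ranges;
    for (my $start = 0; $start < $size; $start += $chunk) {
        my $end = $start + $chunk;
        $end = $size if $end > $size;
        push @ranges, [$start, $end];
    }
    return @ranges;
}

# Count the net header lines whose first byte lies inside [$start, $end).
sub count_nets_in_range {
    my ($file, $start, $end) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    if ($start > 0) {
        # Back up one byte and discard through the next newline, so that
        # reading resumes at the first line that begins at or after $start.
        seek($fh, $start - 1, 0) or die "seek failed: $!";
        my $discard = <$fh>;
    }
    my $count = 0;
    while (tell($fh) < $end) {
        my $line = <$fh>;
        last unless defined $line;
        $count++ if $line =~ /^net '(.*)':\s*$/;
    }
    close($fh);
    return $count;
}

# Fork one child per range; each child reports its count over a pipe.
sub parallel_count {
    my ($file, $workers) = @_;
    my $size = -s $file;
    my @readers;
    for my $range (byte_ranges($size, $workers)) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                       # child
            close($reader);
            my $n = count_nets_in_range($file, @$range);
            print {$writer} "$n\n";
            close($writer);
            POSIX::_exit(0);                   # skip parent's END blocks/buffers
        }
        close($writer);                        # parent keeps only the read end
        push @readers, $reader;
    }
    my $total = 0;
    for my $reader (@readers) {
        my $n = <$reader>;
        die "worker returned nothing" unless defined $n;
        chomp $n;
        $total += $n;
        close($reader);
    }
    wait() for @readers;                       # reap the children
    return $total;
}
```

Something like `parallel_count($input_file, 32)` would then use one worker per CPU. Whether it actually beats the single-process loop depends on how much of the run time is disk I/O rather than regex matching.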
Again, thanks everyone!!
Fiddler42
    open (NETSTATS,"$input_file");
    $TotalNets = 0;
    while (<NETSTATS>) {
        if (/^net \'(.*)'\:\s*$/) {
            $NetName = $1;
            $c = 1;
            $TotalNets++;
            if ($TotalNets % 100_000 == 0 && $TotalNets > 0) {
                print ("Parsed $TotalNets nets...\n");
            }
            do {
                if (/^\s+wire capacitance\:\s+(\d+.*\d*)\s*$/) {
                    $NetCapRaw = $1;
                    $NetCap = $CapMultiplier*$NetCapRaw;
                    $c++;
                } elsif (/^\s+wire resistance\:\s+(\d+.*\d*)\s*$/) {
                    $NetRes = $1;
                    $c++;
                } elsif (/^\s+number of loads\:\s+(\d+)\s*$/) {
                    $NetFanout = $1;
                    $c++;
                } elsif (/^\s+total wire length\:\s+(\d+.*\d*)\s*/) {
                    $NetLength = $1;
                    $c++;
                }
                $_ = <NETSTATS>;
            } until ((/Driver Pins/) || ($_ eq "" ));
            if (/Driver Pins/) {
                $_ = <NETSTATS>;
                $_ = <NETSTATS>;
                ($FirstDriver) = $_ =~ /^\s*(\S.*)\s*/;
                $c++;
            }
            $AddToCustomTable = 0;
            if (($NetName ne "")
                && (($NetCap ne "") && ($NetCap ne "NaN"))
                && (($NetRes ne "") && ($NetRes ne "NaN"))
                && (($NetFanout ne "") && ($NetFanout ne "NaN"))
                && (($NetLength ne "") && ($NetLength ne "NaN"))
                && ($FirstDriver ne "") && ($c == 6)) {
                if ($NetFanout <= $UpperFanoutLimitOfTable) {
                    if (($UseNetPattern == 0) && ($UseDriverCell == 0) && ($TopLevelOnly == 0)) {
                        $AddToCustomTable = 1;
                    } elsif (($UseNetPattern == 0) && ($UseDriverCell == 0) && ($TopLevelOnly == 1)) {
                        $DriverForwardSlashCount = $FirstDriver =~ s/(\/)/$1/gs; # Simple command to count characters...
                        $NetNameForwardSlashCount = $NetName =~ s/(\/)/$1/gs;
                        if (($DriverForwardSlashCount <= 1) && ($NetNameForwardSlashCount <= 1 )) {$AddToCustomTable = 1;}
                        if ($DebugMode == 1) {
                            print ("Adding net $NetName (driver = $FirstDriver)...\n");
                            print DEBUG_VERBOSE ("$NetFanout $NetRes\n");
                        }
                    } elsif (($UseNetPattern == 0) && ($UseDriverCell == 1) && ($TopLevelOnly == 0)) {
                        if ($FirstDriver =~ qr/$DriverPattern/x) {$AddToCustomTable = 1;} # to regard variable as a regular expression...
                    } elsif (($UseNetPattern == 1) && ($UseDriverCell == 0) && ($TopLevelOnly == 0)) {
                        if ($NetName =~ qr/$NetPattern/x) {$AddToCustomTable = 1;}
                    } elsif (($UseNetPattern == 1) && ($UseDriverCell == 1) && ($TopLevelOnly == 0)) {
                        if ($NetName =~ qr/$NetPattern/x) {
                            $AddToCustomTable = 1;
                        } elsif ($FirstDriver =~ qr/$DriverPattern/x) {$AddToCustomTable = 1;}
                    }
                    # These conditions are not allowed per input argument parsing...
                    #} elsif (($UseNetPattern == 0) && ($UseDriverCell == 1) && ($TopLevelOnly == 1)) {
                    #} elsif (($UseNetPattern == 1) && ($UseDriverCell == 0) && ($TopLevelOnly == 1)) {
                    #} elsif (($UseNetPattern == 1) && ($UseDriverCell == 1) && ($TopLevelOnly == 1)) {
                }
                if ($AddToCustomTable == 1) {push @{$NetStats[ $NetFanout ] ||= []}, [ $NetName, $NetCap, $NetRes, $NetLength, $FirstDriver ];}
            } else {
                if ($DebugMode == 1) {
                    print DEBUG_VERBOSE ("ERROR: Problem deriving stats for net $NetName!\n");
                    print DEBUG_VERBOSE ("ERROR: c=$c NetName=$NetName NetFanout=$NetFanout NetCap=$NetCap NetRes=$NetRes NetLength=$NetLength FirstDriver=$FirstDriver\n\n");
                }
            }
        }
        $NetName = "";
        $NetCap = "NaN";
        $NetRes = "NaN";
        $NetFanout = "NaN";
        $NetLength = "NaN";
        $FirstDriver = "";
    }
    print ("Parsed $TotalNets nets...\n\n");
    close (NETSTATS);
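Two small things in the loop above could probably be tightened further. The sketch below is illustrative only: `slash_count` and `driver_matches` are hypothetical helper names, and the pattern value is a placeholder. It counts slashes with `tr` instead of a substitution, and compiles the user-supplied pattern once outside the loop instead of building a `qr//` object for every net.

```perl
use strict;
use warnings;

# Count "/" characters with tr. With an empty replacement list and no /d
# modifier, tr leaves the string unchanged and just returns the count,
# with no regex engine and no capture group involved.
sub slash_count {
    my ($s) = @_;
    return ($s =~ tr{/}{});
}

# Placeholder value; in the real script this comes from the command line.
my $DriverPattern = 'BUF | INV';

# Compile the pattern once, before the while loop starts.
my $driver_re = qr/$DriverPattern/x;

# Inside the loop, match against the precompiled object.
sub driver_matches {
    my ($driver) = @_;
    return $driver =~ $driver_re ? 1 : 0;
}
```

In the top-level-only branch, the test would then read `slash_count($FirstDriver) <= 1 && slash_count($NetName) <= 1`. How much this saves will depend on how often those branches are actually taken.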
Replies are listed 'Best First'.
Re^3: Looking for ways to speed up the parsing of a file...
by samtregar (Abbot) on May 18, 2008 at 17:01 UTC
by sgifford (Prior) on May 18, 2008 at 18:59 UTC