Hello monks,

Now that my program runs, I need to optimize it. It currently reads about 120MB of data (spread
across 11 files), parses out the information, and prints it in the format the user needs to see. It takes about 3 minutes, and when I run dprofpp, I see that main() is taking most of the time.

I have copied my main routine below, and as you can see, it's not doing anything fancy.
My questions are:
1) Would rewriting just this parsing part in C be much faster? If so, how much faster?
2) Is there anything I can do to improve my main function?
[root@myserver]# dprofpp
$Monfile is tmon.out
Exporter::Heavy::heavy_export
AutoLoader::__ANON__[/usr/lib/perl5/5.8.8/AutoLoader.pm:96]
Total Elapsed Time = 192.5636 Seconds
  User+System Time = 193.6643 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 42.5   82.48 198.71      1  82.486 198.71  main::main
 29.8   57.81 56.779 194654  0.0003 0.0003  Text::CSV_XS::Parse
 12.9   24.98 23.929 194654  0.0001 0.0001  Text::CSV_XS::fields
 9.65   18.68 17.619 194654  0.0001 0.0001  Text::CSV_XS::new
 8.11   15.71 70.368 194654  0.0001 0.0004  Text::CSV_XS::parse
 2.42   4.687  3.660 187290  0.0000 0.0000  main::extract
 0.89   1.719  0.649 194654  0.0000 0.0000  AutoLoader::__ANON__[/usr/lib/perl5/5.8.8/AutoLoader.pm:96]
 0.03   0.060  0.079      4  0.0150 0.0198  main::BEGIN
 0.01   0.010  0.010      1  0.0100 0.0099  Exporter::as_heavy
 0.01   0.010  0.010     20  0.0005 0.0005  Getopt::Long::BEGIN
 0.00   0.000 -0.000      5  0.0000      -  strict::import
 0.00   0.000 -0.000      3  0.0000      -  Text::CSV_XS::BEGIN
 0.00   0.000 -0.000      7  0.0000      -  vars::import
 0.00   0.000 -0.000      1  0.0000      -  DynaLoader::bootstrap
 0.00   0.000 -0.000      1  0.0000      -  DynaLoader::dl_load_flags

sub main {
    for (@files) {
        open ( NOW , "$directory/$_" ) || die "you suck\n";
        while (<NOW>) {
            my (%rec,%HoH);
            my $p;
            chomp;
            $t_counter++;
            my $csv = Text::CSV_XS->new;
            $csv->parse($_);
            my @fields = $csv->fields;
            if (/^STOP/) {
                @rec{@attrs_sto} = @fields[0,1,13,14,16,20,33,34,36,67];
                if ($rec{_i_pstn_trunk}) {
                    $p = extract($rec{_i_pstn_circuit}, $rec{_i_pstn_trunk});
                    $HoH{$p} = {%rec};
                } elsif ($rec{_e_pstn_trunk}) {
                    $p = extract($rec{_e_pstn_circuit}, $rec{_e_pstn_trunk});
                    $HoH{$p} = {%rec};
                } else {
                    $ncounter++;
                }
            } elsif (/^START/) {
                @rec{@attrs_sta} = @fields[0,1,11,15,28,29,31,53];
                if ($rec{_i_pstn_trunk}) {
                    $p = extract($rec{_i_pstn_circuit}, $rec{_i_pstn_trunk});
                    $HoH{$p} = {%rec};
                } elsif ($rec{_e_pstn_trunk}) {
                    $p = extract($rec{_e_pstn_circuit}, $rec{_e_pstn_trunk});
                    $HoH{$p} = {%rec};
                } else {
                    $ncounter++;
                }
            } elsif (/^ATTEMPT/) {
                @rec{@attrs_att} = @fields[0,1,11,13,17,30,31,33,57];
                if ($rec{_i_pstn_trunk}) {
                    $p = extract($rec{_i_pstn_circuit}, $rec{_i_pstn_trunk});
                    $HoH{$p} = {%rec};
                } elsif ($rec{_e_pstn_trunk}) {
                    $p = extract($rec{_e_pstn_circuit}, $rec{_e_pstn_trunk});
                    $HoH{$p} = {%rec};
                } else {
                    $ncounter++;
                }
            } else {
                $ncounter++;
            }
            push @data, {%HoH};
        }
        close NOW;
    }
}

sub extract {
    return join('-', (split(/:/, $_[0]))[1], $_[1]);
}

1653222 -rw-r--r-- 1 1036 101 10773472 Feb 14 12:00 100A8B9.F
1653223 -rw-r--r-- 1 1036 101 11110758 Feb 14 12:05 100A8BA.F
1653224 -rw-r--r-- 1 1036 101 11106128 Feb 14 12:10 100A8BB.F
1653225 -rw-r--r-- 1 1036 101 10851079 Feb 14 12:15 100A8BC.F
1653226 -rw-r--r-- 1 1036 101 10758864 Feb 14 12:20 100A8BD.F
1653227 -rw-r--r-- 1 1036 101 10665272 Feb 14 12:25 100A8BE.F
1653228 -rw-r--r-- 1 1036 101 10722126 Feb 14 12:30 100A8BF.F
1653229 -rw-r--r-- 1 1036 101 10204733 Feb 14 12:35 100A8C0.F
1653230 -rw-r--r-- 1 1036 101 10292893 Feb 14 12:40 100A8C1.F
1653231 -rw-r--r-- 1 1036 101  9990122 Feb 14 12:45 100A8C2.F
1653232 -rw-r--r-- 1 1036 101 10073364 Feb 14 12:50 100A8C3.F
1653233 -rw-r--r-- 1 1036 101 10188466 Feb 14 12:55 100A8C4.F
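
One thing the profile suggests: Text::CSV_XS::new is called 194654 times, once per input line, and accounts for 18.68 exclusive seconds. Below is a minimal sketch, not a measured fix, of building the parser once and reusing it for every line. The helper name parse_lines and the in-memory sample data are made up for illustration, and the binary option is an assumption about the input rather than something taken from the original code.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper: parse every line read from $fh with one shared parser.
sub parse_lines {
    my ($fh) = @_;

    # Built once, outside the read loop, instead of once per line.
    my $csv = Text::CSV_XS->new({ binary => 1 });

    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $csv->parse($line);    # skip lines that fail to parse
        push @rows, [ $csv->fields ];      # one array ref of fields per line
    }
    return \@rows;
}

# Made-up sample data standing in for one of the real files.
my $sample = "STOP,a,b\nSTART,c,d\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $rows = parse_lines($fh);
close $fh;

printf "%d rows parsed\n", scalar @$rows;
```

The same idea would apply inside main() by moving the Text::CSV_XS->new call above the for (@files) loop; whether that recovers all of the time attributed to new would need to be confirmed by profiling again.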

In reply to need to optimize my sub routine by convenientstore
