Bama_Perl has asked for the wisdom of the Perl Monks concerning the following question:

I am fairly new to Perl, and I am struggling to get the right output from a for loop with a nested for loop. I have an input file that looks like this:
    MCCC processed: unknown event at: Tue, 21 Oct 2014 13:39:56 CST
    station, mccc delay, std, cc coeff, cc std, pol , t0_times , delay_times
    ZJ.APRL 0.5735 0.0270 0.8548 0.1060 0 APRL.BHZ 301.0824 -1.0954
    ZJ.BEBP 0.0431 0.0173 0.8982 0.0495 0 BEBP.BHZ 300.6827 -1.2262
    ZJ.DUBY -0.3951 0.0242 0.8635 0.0550 0 DUBY.BHZ 300.9965 -1.9781
    ZJ.FOOT 0.4722 0.0570 0.7965 0.0987 0 FOOT.BHZ 301.2407 -1.3550
    ZJ.GRAW -0.2962 0.0203 0.8875 0.0789 0 GRAW.BHZ 300.5646 -1.4473
    ZJ.KNYN 0.2933 0.0428 0.7879 0.1305 0 KNYN.BHZ 301.3060 -1.5992
    ZJ.LEON 0.5243 0.0235 0.8996 0.0634 0 LEON.BHZ 300.4850 -0.5473
    ZJ.MICH -0.1824 0.0165 0.8599 0.0713 0 MICH.BHZ 300.1649 -0.9339
    ZJ.RAPH 0.3076 0.0422 0.8096 0.0954 0 RAPH.BHZ 300.4645 -0.7435
    ZJ.RKST -0.7187 0.0401 0.8060 0.0827 0 RKST.BHZ 300.3940 -1.6992
    ZJ.SAMH -0.0702 0.0260 0.8930 0.0465 0 SAMH.BHZ 301.0272 -1.6839
    ZJ.SHRD -0.3952 0.0319 0.8343 0.0938 0 SHRD.BHZ 300.8002 -1.7819
    ZJ.SPLN -0.1563 0.0306 0.8653 0.0878 0 SPLN.BHZ 300.5314 -1.2742
    Mean_arrival_time: 299.4135
    No weighting of equations.
    Window: 3.19 Inset: 1.28 Shift: 0.25
    Variance: 0.03135 Coefficient: 0.85047 Sample rate: 40.000
    Taper: 0.40
    Phase: P
    PDE 2013 4 20 3 47 55.02 -5.002 152.111 65.3 0.0 5.6
The script below is meant to read the last line of the file, the one beginning with PDE, and print it in a specific format. Let's call this the event information. Next, for EACH event, I need to read the station name (column 1, e.g. ZJ.DONT) and the delay time (column 9). I then need to output the station name, the delay time and the number 1, six stations to a row; the remaining stations (and their delay times) continue on the next row, and if the last row has fewer than six entries, the rest of that row is padded with zeroes, as in this output:
    97 42121 2 27.38 0.00-12.544 0.000 166.815 0.000 29.90 0.00 83 7.6 0.00
    44 -1.45 1135 0.70 1105 -0.13 1547 0.04 1184 0.91 1168 -1.07 1
    209 -1.28 1 41 -0.79 1163 0.72 1134 -0.59 1254 -0.95 1148 0.31 1
    40 -0.24 1322 -0.68 1276 1.09 1338 0.11 1321 0.15 1132 -0.80 1
    442 1.08 1107 -1.28 1 39 -0.09 1196 -0.04 1 31 -0.76 1 78 0.20 1
    38 -1.43 1 80 0.45 1131 1.07 1164 0.19 1274 -0.29 1526 1.29 1
    186 0.15 1108 0.45 1277 0.83 1 91 0.83 1554 0.45 1160 -0.30 1
    225 0.33 1505 -0.11 1154 0.75 1204 -0.18 1228 0.94 1143 -0.60 1
    243 -1.82 1229 0.18 1 93 -0.29 1247 -0.94 1227 -0.47 1 76 0.10 1
    123 0.58 1 96 0.78 1 84 -0.03 1242 0.51 1182 -0.26 1244 0.37 1
    232 -0.25 1246 0.70 1226 -0.22 1245 0.71 1189 1.05 1165 0.21 1
    230 0.17 1444 -0.95 1272 0.51 1234 1.20 1 32 0.34 1 77 -1.90 1
    150 0.34 1124 0.47 1157 -0.33 1 34 -0.58 1 28 -0.59 1199 -0.37 1
    185 -0.58 1119 0.04 1490 0.03 1463 -0.06 1330 0.50 1255 -0.04 1
    231 -0.17 1 30 0.16 1331 0.77 1523 -0.43 1191 0.58 1 0 0.00 0
Where the first number represents the station number, the second number is the delay time (column 9) and the third column is just a 1. What I have thus far is below:
    open(TABLEA, "mcp_list");
    @tablea = <TABLEA>;
    # Specify the corresponding output file
    open(OUT,">output_inversion");
    for ($i = 0; $i < @tablea; $i++) {
        chomp ($tablea[$i]);
        ($mcpFile) = (split /\s+/,$tablea[$i])[0];
        system("wc $mcpFile > crap");
        open(TABLEB,'crap');
        @tableb = <TABLEB>;
        chomp ($tableb[0]);
        ($count) = (split /\s+/,$tableb[0])[1];
        $numObs = $count - 9;
        close(TABLEB);
        unlink('crap');
        #print $mcpFile," ",$numObs,"\n";
        $numLines = int($numObs/6);
        $remainder = $numObs - ($numLines*6);
        if ($numLines eq 0) {
            $numLines = $numLines + 1;
        }
        #print $numLines," ",$remainder,"\n";
        # Now begin with the output file
        open(TABLEB, $mcpFile);
        @tableb = <TABLEB>;
        for ($j = 0; $j < @tableb; $j++) {
            chomp ($tableb[$j]);
            ($PDE,$year,$month,$day,$hour,$minute,$second,$eqlat,$eqlong,$eqdepth,$mag) = (split /\s+/,$tableb[$j])[0,1,2,3,4,5,6,7,8,9,11];
            if ($PDE eq "PDE") {
                printf OUT "%2d%2d%2d%2d%2d %s %s%7.3f %s%8.3f %s%6.2f %s %s %s %s \n",
                    $year%100,$month,$day,$hour,$minute,$second,"0.00",$eqlat,"0.00",$eqlong,"0.00",$eqdepth,"0.00",$numObs,$mag, "0.00", "\n";
            }
            for ($k = 0; $k < @tableb; $k++) {
                chomp ($tableb[$k]);
                ($netsta, $delay_time) = (split /\s+/,$tableb[$j])[1,9];
                ($net, $sta) = (split /\./, $netsta)[0,1];
                print $net, " ", $sta, "\n";
In summary, I need to figure out a way to print, underneath the event line of each $mcpFile, the first and 9th columns of that $mcpFile along with the number "1", six entries per line, with the remainder on the next line. It's long, I know, but I hope someone here can provide wisdom to send me on my way! Cheers.

Replies are listed 'Best First'.
Re: For Loop Output Errors
by NetWallah (Canon) on May 03, 2015 at 02:12 UTC
    Apologies in advance for the brutality, but your code is absolutely abominable, and does not even compile.

    Your sample output data does not match your description, so it is hard to figure out what you want done.

    That said, here is a starting point for parsing the file.

    Feel free to post questions if you do not understand the code, or the reasons it is coded that way.

    use strict;
    use warnings;

    open(my $inp, "<", "test.mcp_list") or die "Could not open test.mcp_list: $!";
    my @station_cols = qw|station mcccdelay std cccoeff ccstd pol t0_times delay_time1 delay_time2|;
    my @stations;
    while (<$inp>){
        my @line = split;
        if (9 == @line){
            # A station record: store it as a hash keyed by column name
            push @stations, {map {$station_cols[$_] => $line[$_]} 0..$#station_cols };
            next;
        }
        my $numObs = @stations;
        if ($line[0] eq "PDE"){
            my ($PDE,$year,$month,$day,$hour,$minute,$second,$eqlat,$eqlong,$eqdepth,undef, $mag) = @line;
            # The trailing "\n" argument of the original call had no matching
            # conversion in the format string, so it is dropped here.
            printf "%2d%2d%2d%2d%2d %s %s%7.3f %s%8.3f %s%6.2f %s %s %s %s \n",
                $year%100,$month,$day,$hour,$minute,$second,"0.00",$eqlat,"0.00",
                $eqlong,"0.00",$eqdepth,"0.00",$numObs,$mag, "0.00";
        }
    }
    print "Done\n";

            "You're only given one little spark of madness. You mustn't lose it."         - Robin Williams

Re: For Loop Output Errors
by Laurent_R (Canon) on May 03, 2015 at 10:09 UTC
    I also do not really understand your requirement, nor how it relates to your code.

    A few comments on your code, though.

    for ($i = 0; $i < @tablea; $i++) {
    It would be more perlish, clearer and probably faster (although this probably does not matter much) to write:
    for my $i (0..$#tablea) {
    or, better yet, to drop the $i counter altogether and get the content of the line directly:
    for my $line(@tablea) {
    Please consult http://perldoc.perl.org/perlsyn.html#Compound-Statements for more information.

    Counting the lines of a file:

    system("wc $mcpFile > crap");
    open(TABLEB,'crap');
    @tableb = <TABLEB>;
    chomp ($tableb[0]);
    ($count) = (split /\s+/,$tableb[0])[1];
    $numObs = $count - 9;
    close(TABLEB);
    unlink('crap');
    This is probably the most contrived and unnatural way of counting the lines of a file that I have ever seen. Why don't you open the file and just count the lines? Something like this:
    my $count = 0;
    open my $IN, "<", $mcpFile or die "unable to open $mcpFile $!";
    $count++ while <$IN>;
    close $IN;
    Well, actually, I made an explicit counter for the sake of clarity, but you don't even need the $count variable here, since Perl is maintaining a line counter for you in the $. special variable (see perlvar).
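As a minimal sketch of the `$.` approach mentioned above: the helper name `count_lines` is made up for this illustration and is not part of anyone's posted code. The value of `$.` has to be saved before the handle is closed, because closing the handle resets it.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines of a file by reading it to the end
# and then looking at $. (the line number of the last handle read from).
sub count_lines {
    my ($path) = @_;
    open my $in, '<', $path or die "unable to open $path: $!";
    local $_;               # do not clobber the caller's $_
    1 while <$in>;          # read and discard every line
    my $count = $.;         # save it now: close() resets $.
    close $in;
    return $count;
}

# Usage: perl count_lines.pl some_file
print count_lines($ARGV[0]), " lines\n" if @ARGV;
```

For the input format shown in the question, the number of station records would then be this count minus the nine non-station lines, which is what the original `$count - 9` appears to be aiming for.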

    If you really want to use a system call, still don't use this crap file:

    my $wc_output = `wc -l $mcpFile`;
    Also observe the good practice way to open a file, with a three-argument syntax and a lexical file handle. Check open for more information.
    $numLines = int($numObs/6);
    $remainder = $numObs - ($numLines*6);
    There is a modulo operator in Perl (check perlop, especially http://perldoc.perl.org/perlop.html#Multiplicative-Operators):
    my $remainder = $numObs % 6;
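The same operator also tells you how many zero entries are needed to fill the last row. The following is only a sketch: the function name `row_layout` is invented for this example, and the figures 83 and 6 are taken from the sample output in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: given a number of observations and a row width,
# return the number of rows and the number of padded slots in the last row.
sub row_layout {
    my ($num_obs, $per_row) = @_;
    my $remainder = $num_obs % $per_row;                     # entries in a partial last row
    my $padding   = $remainder ? $per_row - $remainder : 0;  # slots left to fill
    my $rows      = ($num_obs + $padding) / $per_row;        # divides exactly after padding
    return ($rows, $padding);
}

my ($rows, $padding) = row_layout(83, 6);
print "$rows rows, $padding padded slot(s)\n";   # prints "14 rows, 1 padded slot(s)"
```

This matches the sample output, which has 14 rows of station entries and a single `0 0.00 0` entry at the end.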
    You are now opening the file whose lines you just counted:
    open(TABLEB, $mcpFile);
    I don't understand the logic of what you are trying to do, but it seems to me that you could probably open it only once. Ditto on the open syntax.
    @tableb = <TABLEB>;
    Nothing wrong here, but it is usually better to use the while operator to iterate line by line over the file (especially if the file is large):
    while (my $line = <TABLEB>) {
    which means you don't need the for loop afterwards. Ditto on the more perlish for syntax.
    ($PDE,$year,$month,$day,$hour,$minute,$second,$eqlat,$eqlong,$eqdepth,$mag) = (split /\s+/,$tableb[$j])[0,1,2,3,4,5,6,7,8,9,11];
    can be written more simply:
    my ($PDE,$year,$month,$day,$hour,$minute,$second,$eqlat,$eqlong,$eqdepth,$mag) = (split /\s+/,$tableb[$j])[0..9,11];
    or
    my ($PDE,$year,$month,$day,$hour,$minute,$second,$eqlat,$eqlong,$eqdepth,undef, $mag) = split /\s+/,$tableb[$j];
    Then, you are iterating a second time on the array:
    for ($k = 0; $k < @tableb; $k++) {
    Ditto on the for syntax. But why don't you do everything within the same loop?

    Now to the final and probably most important piece of advice. You may have noticed that I used the my operator throughout to declare new lexical variables. This is very important: declare your lexical variables with my. And you should use the

    use strict; use warnings;
    pragmas at the top of every program having more than one single line. This will enable the compiler to warn you about many of your errors, typos, dangerous or deprecated constructs, etc., and you will end up saving a lot of time.
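To illustrate the kind of mistake `use strict` catches, here is a small sketch. The snippet being compiled is invented for the demonstration (it misspells `$numObs` as `$numobs`), and a string `eval` is used only so that the compile error can be captured and printed instead of stopping the program.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet of Perl source and return the
# error message, or an empty string if it compiled and ran cleanly.
sub compile_error {
    my ($source) = @_;
    eval $source;        # string eval: compiles and runs $source
    return $@;
}

# $numObs is declared, but the next statement misspells it as $numobs.
my $typo  = 'use strict; my $numObs = 13; my $total = $numobs + 1; 1;';
my $error = compile_error($typo);
print "strict reported: $error" if $error;
```

Without `use strict`, the misspelled variable would silently be treated as a new, empty global, and the program would run with a wrong result instead of failing early.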

    I hope this helps.

    Update, 11:14 UTC: it appears that I read one of the OP's code lines too quickly: the OP is counting not the lines but the words of the $mcpFile. That does not really change the underlying idea that I wanted to get across, though.

    Je suis Charlie.
Re: For Loop Output Errors
by aaron_baugher (Curate) on May 02, 2015 at 23:04 UTC

    I'm having trouble understanding your requirements. You say you want to output the station name, but I don't see any station names in your sample output. I also don't understand where many of the numbers in your sample output are coming from, like the "97" and "42121" in the first line. Can you show us the correct output you want to get from the sample input you've provided?

    Aaron B.
    Available for small or large Perl jobs and *nix system administration; see my home node.

Re: For Loop Output Errors
by pme (Monsignor) on May 03, 2015 at 10:36 UTC
    Hi Bama_Perl,

    I suggest collecting the station data lines into a hash (%station_data) and then using the collected data right after the 'PDE' line has been read.

    use strict;
    use warnings;
    use diagnostics;
    use Data::Dumper;

    my $num_obs;
    my %station_data;
    while (<DATA>) {
        chomp;
        my @line = split;
        if (@line == 9) {
            $num_obs++;
            $station_data{$line[0]} = \@line;
        } elsif (/^PDE/) {
            # The extra undef skips the 0.0 field so that $mag receives the
            # last field (5.6), matching index 11 in the original code.
            my (undef, $year, $month, $day, $hour, $minute, $second,
                $eqlat, $eqlong, $eqdepth, undef, $mag) = @line;
            #        y  m   d    h   m   s    la lo de no mag
            printf "%02d%02d%02d %02d%02d%.2f %f %f %f %d %f\n",
                $year%100, $month, $day, $hour, $minute, $second,
                $eqlat, $eqlong, $eqdepth, $num_obs, $mag;
            # process station data according to your needs
            print Dumper(\%station_data) . "\n";
        }
    }
    __DATA__
    MCCC processed: unknown event at: Tue, 21 Oct 2014 13:39:56 CST
    station, mccc delay, std, cc coeff, cc std, pol , t0_times , delay_times
    ZJ.APRL 0.5735 0.0270 0.8548 0.1060 0 APRL.BHZ 301.0824 -1.0954
    ZJ.BEBP 0.0431 0.0173 0.8982 0.0495 0 BEBP.BHZ 300.6827 -1.2262
    ZJ.DUBY -0.3951 0.0242 0.8635 0.0550 0 DUBY.BHZ 300.9965 -1.9781
    ZJ.FOOT 0.4722 0.0570 0.7965 0.0987 0 FOOT.BHZ 301.2407 -1.3550
    ZJ.GRAW -0.2962 0.0203 0.8875 0.0789 0 GRAW.BHZ 300.5646 -1.4473
    ZJ.KNYN 0.2933 0.0428 0.7879 0.1305 0 KNYN.BHZ 301.3060 -1.5992
    ZJ.LEON 0.5243 0.0235 0.8996 0.0634 0 LEON.BHZ 300.4850 -0.5473
    ZJ.MICH -0.1824 0.0165 0.8599 0.0713 0 MICH.BHZ 300.1649 -0.9339
    ZJ.RAPH 0.3076 0.0422 0.8096 0.0954 0 RAPH.BHZ 300.4645 -0.7435
    ZJ.RKST -0.7187 0.0401 0.8060 0.0827 0 RKST.BHZ 300.3940 -1.6992
    ZJ.SAMH -0.0702 0.0260 0.8930 0.0465 0 SAMH.BHZ 301.0272 -1.6839
    ZJ.SHRD -0.3952 0.0319 0.8343 0.0938 0 SHRD.BHZ 300.8002 -1.7819
    ZJ.SPLN -0.1563 0.0306 0.8653 0.0878 0 SPLN.BHZ 300.5314 -1.2742
    Mean_arrival_time: 299.4135
    No weighting of equations.
    Window: 3.19 Inset: 1.28 Shift: 0.25
    Variance: 0.03135 Coefficient: 0.85047 Sample rate: 40.000
    Taper: 0.40
    Phase: P
    PDE 2013 4 20 3 47 55.02 -5.002 152.111 65.3 0.0 5.6
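One possible way to turn collected station data into the six-per-row layout the question asks for is sketched below. Several things here are assumptions rather than anything stated in the thread: the function name `format_rows`, the field widths in the `sprintf` format, and the use of the station name where the sample output shows a station number (the mapping from names to numbers is not given). The sketch takes an array rather than a hash, because a hash does not preserve the order in which stations appear in the file.

```perl
use strict;
use warnings;

# Hypothetical helper: format [station, delay] pairs $per_row to a line,
# each followed by the constant 1, padding a short last line with zeroes.
# The widths in the format string are guesses, not the OP's real layout.
sub format_rows {
    my ($per_row, @obs) = @_;          # each element of @obs is [station, delay]
    my @lines;
    while (@obs) {
        my @chunk = splice @obs, 0, $per_row;     # take up to $per_row entries
        my @cells = map { sprintf '%4s %6.2f %2d', $_->[0], $_->[1], 1 } @chunk;
        push @cells, sprintf('%4s %6.2f %2d', 0, 0, 0)   # zero padding
            for 1 .. $per_row - @chunk;
        push @lines, join(' ', @cells);
    }
    return @lines;
}

# The first seven stations of the sample input: name and delay time (column 9).
my @obs = (
    [ 'APRL', -1.0954 ], [ 'BEBP', -1.2262 ], [ 'DUBY', -1.9781 ],
    [ 'FOOT', -1.3550 ], [ 'GRAW', -1.4473 ], [ 'KNYN', -1.5992 ],
    [ 'LEON', -0.5473 ],
);
print "$_\n" for format_rows(6, @obs);
```

With seven stations and six entries per row, this prints one full row followed by a row holding the seventh station and five zero entries.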