velocitymodel has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I have a file with about 17,000 "events" for earthquake arrival times. One event looks something like this.

82 2 22 1043 54.7 48.020 114.037 17.5 3.2 2.9 13 177 84.3 0.20 +1.6 2.7 C MBMG * 3.1 KALISPELL VALLEY; FELT 3.07 82022210 BUT EP4432.804ES60.7 LRM IPD4435.40 IS67.2 180. AMM IPD4429.50 ES57.3 133. MSO EPC4415.90 ES32.3 CMT EP4430.50 IS58.3 LDM IPC4412.20 ES24.3 3 RXF EPC4414.3 CLX IPC4408.70 ES19.7

I want to take the header of each event, break it up, and assign the pieces I need to variables. I then want to keep some of that information at the head of the event and move the rest to the end of the event in a specified order. For the body of the event I simply want to delete extraneous information. I then want to print this new format out to a separate file so that it can be used in another software program. So my overall goal is to convert my original file into a new format that the software can read. This is what my events should look like:

13 BUT 4432.804 60.7 LRM 4435.40 67.2 AMM 4429.50 57.3 MSO 4415.90 32.3 CMT 4430.50 58.3 LDM 4412.20 24.3 3 RXF 4414.3 CLX 4408.70 19.7 1043 54.7 114.037 48.020 17.5 3.2 177

This is my current code:

    use strict;
    use warnings;

    my $fin = "SELECTDAT2";
    my $fout = "myfile";
    open my $ih, '<', $fin or die "cannot open $fin for reading, $!";
    open my $oh, '>', $fout or die "cannot open $fout for writing, $!";

    sub events {
        my @tokens = split;
        my $header = $tokens[0];
        my ($n, @list);
        ($n, @list) = @_;
        my (@list);
        foreach $_ (@list) {
            my @tokens = split;
            my @list = @tokens[0,3,4,7,8];
            if ($_ > my $header) {
                push (@list, $_);
            }
            return @list;
        }
    }

    while (<$ih>) {
        chomp;
        my @tokens = split;
        my $header = $tokens[0];
        my @records = @tokens [3,4,5,6,10,11];
        return @records;
        my $y = $tokens[0];
        my $m = $tokens[1];
        my $d = $tokens[2];
        my $h = $tokens[3];
        events;
    }

    print $oh events(), "\n", @$_ for sort {$a->[3] <=> $b->[3]} @records;
    close ($oh);
    close ($ih);

Right now I am just trying to rearrange my variables and place them correctly. I am having a lot of trouble with lexical scope and with variables being canceled within blocks.

Re: Lexical scope
by GrandFather (Saint) on Apr 14, 2011 at 06:08 UTC
    1. return should always (except in a few special situations) be within a sub. Where do you expect the line return @records; to return to?
    2. The first two lines of sub events seem to be misplaced.
    3. my ($n, @list); ($n, @list) = @_;

      is better written

      my ($n, @list) = @_;
    4. foreach $_ (@list) is just wrong. Use either foreach (@list) or
      foreach my $item (@list) { my @tokens = split ' ', $item; ...
    5. Reusing the same name for different variables in the same area of code (@list, for example) is at best confusing and at worst completely wrong. Don't do that.
    True laziness is hard work
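Points 1, 3 and 4 can be seen together in a small sketch. The sub name station_fields, its arguments and the sample lines are hypothetical, chosen only for illustration; this is not the original poster's intended logic.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the first $n whitespace-separated fields
# of each line. Name and behaviour are illustrative assumptions.
sub station_fields {
    my ($n, @lines) = @_;          # unpack the arguments in one statement
    my @kept;
    foreach my $line (@lines) {    # a named lexical instead of "foreach $_"
        my @tokens = split ' ', $line;
        next if @tokens < $n;      # skip lines with too few fields
        push @kept, [ @tokens[0 .. $n - 1] ];
    }
    return @kept;                  # return is inside the sub
}

my @rows = station_fields(2, 'LRM IPD4435.40 IS67.2 180.', 'RXF');
print "@$_\n" for @rows;           # prints "LRM IPD4435.40"
```

The second sample line has only one field, so it is skipped and a single row comes back.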

      Thank you for your advice. I am still lost, however, on how to handle my array @records and use it outside the while block so it can be printed. I tried using our but encountered an error, and I don't think I fully understand how to use our. Would using a package variable be okay in this situation? Thanks again.
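On the scoping question itself: a variable declared with my is visible from its declaration to the end of the enclosing block, so declaring @records before the loop rather than inside it keeps it available afterwards, and our is not needed for that. Below is a minimal sketch, assuming the wanted header fields are the ones at the end of the desired output shown in the question. The second header line is invented purely to give the sort something to do.

```perl
use strict;
use warnings;

# First line: start of the sample header from the question.
# Second line: made up for this example.
my @headers = (
    '82 2 22 1043 54.7 48.020 114.037 17.5 3.2 2.9 13 177',
    '82 2 23 0915 12.1 47.500 113.900 10.0 2.1 2.0 9 150',
);

my @records;                       # declared outside the loop, so it outlives it
for my $line (@headers) {
    my @tokens = split ' ', $line; # a fresh @tokens on every iteration
    push @records, [ @tokens[3, 4, 6, 5, 7, 8, 11] ];
}

# @records is still in scope here; @tokens and $line are not.
for my $rec (sort { $a->[0] <=> $b->[0] } @records) {
    print join(' ', @$rec), "\n";
}
```

Each element of @records is an array reference, which is what a sort block such as $a->[0] <=> $b->[0] expects.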

        Your code is a complete train wreck so I'm not going to even try and guess where your head is at. Instead, working from your specification, I guess you want something like:

        use strict;
        use warnings;

        while (defined (my $header = <DATA>)) {
            my @parts = split ' ', $header, 17;
            next if @parts < 12;
            processEvent ();
            print join (' ', @parts[3, 4, 6, 5, 7, 8, 11]), "\n";
        }

        sub processEvent {
            <DATA>;
            for (1 .. 8) {
                my $line = <DATA>;
                last if ! defined $line;
                my @parts = split ' ', $line;
                splice @parts, 3, (@parts - 3) if @parts > 3;
                s/[^\d\s.]+/ /g for @parts[1 .. $#parts];
                s/^\s+|\s+$//g for @parts;
                print join (' ', @parts), "\n";
            }
        }

        __DATA__
        82 2 22 1043 54.7 48.020 114.037 17.5 3.2 2.9 13 177 84.3 0.20 +1.6 2.7 C MBMG * 3.1 KALISPELL VALLEY; FELT 3.07
        82022210
        BUT EP4432.804ES60.7
        LRM IPD4435.40 IS67.2 180.
        AMM IPD4429.50 ES57.3 133.
        MSO EPC4415.90 ES32.3
        CMT EP4430.50 IS58.3
        LDM IPC4412.20 ES24.3 3
        RXF EPC4414.3
        CLX IPC4408.70 ES19.7

        which prints:

        BUT 4432.804 60.7
        LRM 4435.40 67.2
        AMM 4429.50 57.3
        MSO 4415.90 32.3
        CMT 4430.50 58.3
        LDM 4412.20 24.3
        RXF 4414.3
        CLX 4408.70 19.7
        1043 54.7 114.037 48.020 17.5 3.2 177

        Note that this code is not at all robust and depends on the event records' layout being consistent.
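One way to make the layout assumption explicit is to test a line before treating it as a header. The sketch below is only an illustration: the sub name looks_like_header and the rule it applies (at least 12 fields, with the seven printed fields being plain unsigned numbers) are assumptions, not part of the file format's definition.

```perl
use strict;
use warnings;

# Hypothetical check: at least 12 whitespace-separated fields, and the
# seven fields that end up in the output must look like unsigned numbers.
sub looks_like_header {
    my ($line) = @_;
    my @parts = split ' ', $line;
    return 0 if @parts < 12;
    for my $field (@parts[3, 4, 5, 6, 7, 8, 11]) {
        return 0 unless $field =~ /^\d+(?:\.\d*)?$/;
    }
    return 1;
}

for my $line ('82 2 22 1043 54.7 48.020 114.037 17.5 3.2 2.9 13 177',
              'LRM IPD4435.40 IS67.2 180.') {
    print looks_like_header($line) ? "header\n" : "not a header\n";
}
```

A line that fails the check could be reported with its line number ($.) instead of being skipped silently.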

        True laziness is hard work
Re: Lexical scope
by jwkrahn (Abbot) on Apr 14, 2011 at 10:01 UTC
    my @tokens = split; my $header = $tokens[0];

    Or just:

    my $header = ( split )[ 0 ];


    my ($n, @list); ($n, @list) = @_;

    Or just:

    my ( $n, @list ) = @_;


    my (@list); foreach $_ (@list) {

    You just created @list so there is nothing in it and the foreach loop will not iterate over an empty list.
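A small sketch of that effect follows. The sub name count_outer_and_inner and the sample values are hypothetical. The second declaration is placed in an inner block here; in the original code it sits in the same scope, where use warnings would additionally report that the variable masks an earlier declaration.

```perl
use strict;
use warnings;

sub count_outer_and_inner {
    my ($n, @list) = @_;           # @list holds the items passed in
    my $outer = @list;             # number of elements before the block
    my $inner = 0;
    {
        my @list;                  # a second, empty @list hides the first one
        $inner++ foreach @list;    # the loop body never runs
    }
    return ($outer, $inner);
}

my ($outer, $inner) = count_outer_and_inner(2, 'BUT', 'LRM', 'AMM');
print "outer=$outer inner=$inner\n";   # outer=3 inner=0
```

Dropping the second my (@list) is enough to make the loop see the arguments again.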



    foreach $_ (@list) { ... my @list = @tokens[0,3,4,7,8]; ... push (@list, $_);

    You shouldn't modify an array that you are iterating over in a foreach loop. As perlsyn warns: "If any part of LIST is an array, foreach will get very confused if you add or remove elements within the loop body, for example with splice. So don't do that."
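A common way to stay clear of that problem is to collect results in a separate array, or to let grep build it. The sketch below uses tokens from the sample event; the selection rule (digits followed by a dot) is made up for the example and says nothing about what those tokens mean.

```perl
use strict;
use warnings;

my @fields = ('BUT', 'EP4432.804ES60.7', 'LRM', 'IPD4435.40', 'IS67.2', '180.');

# Collect matches in a second array rather than growing or shrinking
# @fields while foreach is still walking it.
my @numeric;
foreach my $field (@fields) {
    push @numeric, $field if $field =~ /^\d+\.$/;
}

# grep produces the same list without an explicit loop.
my @also_numeric = grep { /^\d+\.$/ } @fields;

print "@numeric\n";                # prints "180."
```

In both versions @fields itself is left untouched.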



    my @tokens = split; my $header = $tokens[0]; my @records = @tokens [3,4,5,6,10,11];

    Or just:

    my ( $header, @records ) = ( split )[ 0, 3, 4, 5, 6, 10, 11 ];


    my $y = $tokens[0]; my $m = $tokens[1]; my $d = $tokens[2]; my $h = $tokens[3];

    Or just:

    my ( $y, $m, $d, $h ) = @tokens;
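Both shortcuts can be tried on the start of the sample header from the question. This is a sketch only: the string is truncated to its first 12 fields, and the variable names simply follow the original code.

```perl
use strict;
use warnings;

# First 12 fields of the sample header line from the question.
my $line = '82 2 22 1043 54.7 48.020 114.037 17.5 3.2 2.9 13 177';

# A list slice on split picks fields without a temporary array.
my ($header, @records) = (split ' ', $line)[0, 3, 4, 5, 6, 10, 11];

# List assignment fills the scalars in order and drops the extra fields.
my ($y, $m, $d, $h) = split ' ', $line;

print "header=$header records=@records\n";
print "y=$y m=$m d=$d h=$h\n";
```

The first print shows header=82 followed by the six selected fields; the second shows y=82 m=2 d=22 h=1043.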
Re: Lexical scope
by anonymized user 468275 (Curate) on Apr 14, 2011 at 09:35 UTC
    You might be happy with "random" indentation of braces and statements and mixing main and subroutine code throughout a source file, but it seems to me you should start with that issue.

    One world, one people