Wes_M has asked for the wisdom of the Perl Monks concerning the following question:

The log files look like this:

Date Time <stuff> 3344 123456789
Date Time could be good
Date Time <otherstuff> 4321
Date Time <sohappy> this is really great

This keeps repeating for several hundred MB of data in the file.

I am reading each line of the filehandle into $_, "parsing" that into @line, and then trying to test $line[2] to see if it is "<stuff>" and print that...

Here is the code that doesn't work...
    #!/usr/bin/perl
    use feature ':5.10';
    use 5.010;
    use strict;
    use warnings;

    open (DATA, 'text.log') or die print $!;
    binmode DATA #I am trying everything I can
    while (<DATA>) {
        @line = split(/ /,$_);
        $test = $line[2];
        if ($test == '<stuff>') {
            say "$test\n";
        }
    }
    close DATA;
    #end script.

I can edit this script and have it print every incoming line of text with a print $_, but trying to operate on each line seems to be failing. Thanks, Monks!

Re: String evaluation of log files
by GrandFather (Saint) on Jul 19, 2009 at 02:36 UTC

    I suggest you rewrite your sample to include a little of the problematic data you have and to show us what you actually want to see in the output. Consider:

        use strict;
        use warnings;

        while (my $line = <DATA>) {
            next if $line !~ /<stuff>/;
            print $line;
        }

        __DATA__
        Date Time <stuff> 3344 123456789
        Date Time could be good
        Date Time <otherstuff> 4321
        Date Time <sohappy> this is really great

    Prints:

    Date Time <stuff> 3344 123456789

    True laziness is hard work

      Could you please provide the link for strict and warnings compliance?


      Code runs if I comment them out, doesn't run if I leave them active.

      I do want clean and efficient code, but to see something working makes me want to jump ahead.


      After changing the == to eq and commenting out use feature ':5.10';, my errors look like this...

      Global symbol "@line" requires explicit package name at ./five.pl line 11

      and it just keeps going... :(

        I completely understand the temptation to comment out strict and warnings so you can get that "feel good it works" experience. Resist that temptation with all your might. The first reason to do so is because "it works" is often an illusion. The only thing you've really gained is the ego boost of not having your mistakes shoved into your face (there is a reason why coders live in a monastery - coding can be a humbling experience!).

        The second reason is that you rob yourself of valuable feedback on your coding habits. As you code more and more you will find yourself developing a coding style that is warning proof. Your error rates and warning rates will go down. Just be patient. This learning process takes time.

        One way to learn about what the different error messages mean is to put use diagnostics -verbose at the top of your script, i.e.

            #!/usr/bin/perl
            use diagnostics -verbose;
            use 5.010;
            use strict;
            use warnings;

            open (DATA, 'text.log') or die print $!;
            binmode DATA #I am trying everything I can
            while (<DATA>) {
                @line = split(/ /,$_);
                $test = $line[2];
                if ($test == '<stuff>') {
                    say "$test\n";
                }
            }
            close DATA;

        When diagnostics are turned on, you are likely to get a lot of messages and only the first one or two errors will get a full explanation. If the list is overwhelming, the strategy I find most useful is to:

        • Look at the warning output and find the location of the first syntax error, if any. Syntax errors can totally confuse the parser so it is best to deal with those first before approaching any other warning or error.
        • If there are lots of errors in lines above the syntax error, you may not get an explanation. If so, try commenting out the lines with non-syntax errors until the first error found is a syntax error. If you do this, be careful you don't create errors by commenting out too much. A common mistake is to comment out the open curly of a block but not the close curly brace (or vice versa).
        • Once you have dealt with syntax errors, then start dealing with the other warnings, uncommenting code as needed.

        Following these steps, your first attempt to run the script would have gotten something like this:

            DESCRIPTION OF DIAGNOSTICS
                These messages are classified as follows (listed in increasing order of
                desperation):

                    (W) A warning (optional).
                    (D) A deprecation (optional).
                    (S) A severe warning (default).
                    (F) A fatal error (trappable).
                    (P) An internal error you should never see (trappable).
                    (X) A very fatal error (nontrappable).
                    (A) An alien error message (not generated by Perl).

                The majority of messages from the first three classifications above
                (W, D & S) can be controlled using the warnings pragma.

                If a message can be controlled by the warnings pragma, its warning
                category is included with the classification letter in the description
                below.

                Default warnings are always enabled unless they are explicitly disabled
                with the warnings pragma or the -X switch.

                Trappable errors may be trapped using the eval operator. See
                perlfunc/eval. In almost all cases, warnings may be selectively
                disabled or promoted to fatal errors using the warnings pragma. See
                warnings.

            syntax error at Monks/Snippet.pm line 9, near ") {"
            Global symbol "@line" requires explicit package name at Monks/Snippet.pm line 10.
            ... lots more stuff ...

        The syntax error on line 9 is due to the binmode line above. It needs to end in a semi-colon, but it does not. If we fix that problem, our diagnostic output looks like this:

            DESCRIPTION OF DIAGNOSTICS
                ...same as above...

                Trappable errors may be trapped using the eval operator. See
                perlfunc/eval. In almost all cases, warnings may be selectively
                disabled or promoted to fatal errors using the warnings pragma. See
                warnings.

            Global symbol "@line" requires explicit package name at Monks/Snippet.pm line 9.
            Global symbol "$test" requires explicit package name at Monks/Snippet.pm line 10.
            Global symbol "@line" requires explicit package name at Monks/Snippet.pm line 10.
            Global symbol "$test" requires explicit package name at Monks/Snippet.pm line 11.
            Global symbol "$test" requires explicit package name at Monks/Snippet.pm line 13.
            Execution of Monks/Snippet.pm aborted due to compilation errors (#1)
                (F) You've said "use strict vars", which indicates that all variables
                must either be lexically scoped (using "my"), declared beforehand using
                "our", or explicitly qualified to say which package the global variable
                is in (using "::").

        Now we get a fuller explanation for that cryptic error message about an "explicit package name". Note that this is the same explanation that GrandFather gave you, but you didn't even have to come to the monastery and wait for an answer to get it. It came straight to you.

        Best, beth

        I suggest you start with the sample code I posted and work from there. Links for online documentation of strictures are: strict and warnings.

        The error you are seeing is generally due to using a variable that you haven't declared using my.
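
A minimal sketch of what declaring the variables with my might look like, under a few assumptions that are not in the original script: the helper name third_field and the @sample lines are hypothetical stand-ins for reading text.log, split ' ' (whitespace splitting) is swapped in for split(/ /, $_), and the whole matching line is printed rather than just the field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the third whitespace-separated field of a
# log line, or undef if the line has fewer than three fields.
sub third_field {
    my ($text) = @_;
    my @line = split ' ', $text;   # 'my' makes @line a lexical variable
    return $line[2];
}

# Hypothetical sample lines standing in for the contents of text.log.
my @sample = (
    'Date Time <stuff> 3344 123456789',
    'Date Time could be good',
    'Date Time <otherstuff> 4321',
);

for my $text (@sample) {
    my $test = third_field($text);   # 'my' again, so strict is satisfied
    # String comparison with eq; guard against short lines first.
    print "$text\n" if defined $test && $test eq '<stuff>';
}
```

With every variable declared, the "Global symbol ... requires explicit package name" errors should no longer appear, and strict and warnings can stay switched on.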


        True laziness is hard work
Re: String evaluation of log files
by Anonymous Monk on Jul 19, 2009 at 02:29 UTC
    • Your code is not strict/warnings compliant, you should fix those first.
    • use eq for string comparison, not ==.
    • say already adds a newline, maybe stick to print ? :)
    • use any::feature 'say';
    • For examples, stuff your example data into __DATA__ aka __END__
    That all adds up to
        #!/usr/bin/perl --
        use strict;
        use warnings;
        use any::feature 'say';

        while (<DATA>) {
            my @line = split(/ /,$_);
            say $line[2] if $line[2] eq '<stuff>';
        }
        __END__
        Date Time <stuff> 3344 123456789
        Date Time could be good
        Date Time <otherstuff> 4321
        Date Time <sohappy> this is really great
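
To illustrate the eq versus == point from the list above, here is a small sketch. The helper name compare_both_ways is hypothetical, and the numeric warning is silenced only so the demonstration runs quietly. Because == compares numbers, two non-numeric strings are both treated as 0 and therefore compare as equal.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two values numerically (==) and as
# strings (eq), returning 1 or 0 for each comparison.
sub compare_both_ways {
    my ($left, $right) = @_;
    no warnings 'numeric';   # silence "isn't numeric" for this demo only
    my $numeric = ($left == $right) ? 1 : 0;
    my $string  = ($left eq $right) ? 1 : 0;
    return ($numeric, $string);
}

my ($num, $str) = compare_both_ways('could', '<stuff>');
print "== says: $num, eq says: $str\n";   # prints "== says: 1, eq says: 0"
```

This is why the original test with == would have accepted every line whose third field is non-numeric text.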
Re: String evaluation of log files
by momo33 (Beadle) on Jul 19, 2009 at 08:54 UTC
    replace

        @line = split(/ /,$_);
        $test = $line[2];
        if ($test == '<stuff>')

    by

        if ( $_ =~ m/<stuff>/ )

    then remove the '$_ =~' part, since m/ / matches against $_ by default, and put your regular expression in m/ /
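
A short sketch of that suggestion, assuming a hypothetical helper lines_with_stuff and made-up sample lines. Note that the pattern matches "<stuff>" anywhere in the line, not only in the third field.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that contain the text "<stuff>".
sub lines_with_stuff {
    my @lines = @_;
    my @hits;
    for (@lines) {                      # each element is aliased to $_
        push @hits, $_ if m/<stuff>/;   # same as: $_ =~ m/<stuff>/
    }
    return @hits;
}

print "$_\n" for lines_with_stuff(
    'Date Time <stuff> 3344 123456789',
    'Date Time <otherstuff> 4321',
);
```

Only the first sample line is printed, because "<otherstuff>" does not contain the exact text "<stuff>".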