in reply to Perl = Greek to me

This is the first step in something much bigger that's probably way more than I can chew. Thank you for all the replies so far.


Here are a few lines of data:


2011-12-21 00:24:20.422318%%0f-1f-15%%TRF%% Received event <lmevtPORT_DIGIT_PATTERN_MATCH> cc<lmccSTATE_CHANGED> R<otLIF_PORT:0x4E06016> D<otPARTY_OBJECT:0x80B4D> T<otUNKNOWN:0x0> SRC<otLIF_PORT:0x4E06016> Addr<01-15-0c>
2011-12-21 00:24:20.422888%%0f-1f-16%%TRF%% Received event <paevDIGIT_PATTERN_RECEIVED> cc<0x0> R<otPARTY_OBJECT:0x80B4D> D<otCC_PARTY_OBJECT:0x31A9A> T<otUNKNOWN:0x0> SRC<otPARTY_OBJECT:0x80B4D> Addr
2011-12-21 00:24:20.423024%%0f-1f-16%%DB1%% ccParty<0x31A9A> Port<1-14-7-23> DTMF<4>
2011-12-21 00:24:20.423138%%0f-1f-16%%DB2%% ccParty<0x31A9A> next CF<21>
2011-12-21 00:24:20.423229%%0f-1f-16%%DB2%% ccParty<0x31A9A> CF<21> I<0>


Here's one example of what I'm trying:

my $slurp;
{
    local $/ = undef;
    open my $textfile, '<', 'filename.txt' or die $!;
    $slurp = <$textfile>;
    close $textfile;
}
while($line = m/ccParty<(.+)> DTMF<(.+)>/)
    print "$line" if $line =~ /_NN/;
}

Re^2: Perl = Greek to me
by Not_a_Number (Prior) on Dec 22, 2011 at 20:56 UTC
    my $slurp;
    {
        local $/ = undef;
        open my $textfile, '<', 'filename.txt' or die $!;
        $slurp = <$textfile>;
        close $textfile;
    }

    The above code is unnecessary. It is syntactically correct if you want to read an entire file into a single scalar variable ($slurp), but then your code needs to use this variable, which it doesn't.

    Far better to do this:

    open my $textfile, '<', 'filename.txt' or die $!;
    while ( my $line = <$textfile> ) {
        ## Do something with $line
    }

    Update: Do you see how much better formatted my post looks than yours? That's because I followed the guidelines, wrapping code (and data) in <c>...</c> tags and paragraphs of text in <p>...</p> tags.

    There's nothing stopping you from going back and correcting your posts now...

    Update: Corrected minor tpyos.

Re^2: Perl = Greek to me
by JavaFan (Canon) on Dec 22, 2011 at 20:58 UTC
    A few things:
    • It's easier to process the file line by line.
    • = is assignment, something you don't want when matching.
    • The text of the file is in $slurp, which you then never use.
    I'd do something like (untested, but complete program):
    @ARGV = "filename.txt";
    while (<>) {
        print if /ccParty<.+>DTMF<.+>/ && /_NN/;
    }
    Or rather, forget about Perl, and just use a shell one liner (again, untested):
    $ grep 'ccParty<..*>DTMF<..*>' filename.txt | grep _NN
    I am assuming your regexp is indeed what you want; I haven't checked whether it matches your description.
Re^2: Perl = Greek to me
by ww (Archbishop) on Dec 22, 2011 at 22:08 UTC
    Your last four lines have several problems.

    The showstopper? "=" is an assignment. You're looking for "=~"
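To make the difference concrete, here is a minimal sketch (not the original poster's code). The sample string and the names $assigned, $bound and $digit are illustrative assumptions, and $_ is set explicitly only so that the bare m// has something defined to test against.

```perl
use strict;
use warnings;

# Illustrative sample record (shortened from the data in the thread).
my $line = 'ccParty<0x31A9A> Port<1-14-7-23> DTMF<4>';

# A bare m// tests $_, not $line. Give $_ a value that will not match.
local $_ = 'a different string with no digit field';

# "=" assigns: the match runs against $_, and its true/false result
# overwrites whatever was in the variable on the left.
my $assigned = $line;
$assigned = m/DTMF<(.+)>/;

# "=~" binds: the match runs against $line, which is left unchanged.
my $bound = $line =~ m/DTMF<(.+)>/;

# Only trust $1 after a successful match.
my $digit;
if ( $line =~ m/DTMF<(.+)>/ ) {
    $digit = $1;
}

print 'with "=":  result is ', ( $assigned ? 'true' : 'false' ), "\n";
print 'with "=~": result is ', ( $bound    ? 'true' : 'false' ), "\n";
print "captured digit: $digit\n";
```

With "=" the original contents of the variable are lost and the pattern never looks at them, which is why the posted while condition could not do what was intended.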

    And once you get your regex right, you can combine the ideas behind your while and print lines, like this:

    if ( $line =~ /(regex)/ ) {
        print "$line\n";
    } else {
        print "whoops. no match!\n";
    }

    And this may be my bad (for eyes being shot?) but what the heck is your

    if $line =~ /_NN/;

    supposed to be doing for you? I don't see that string anywhere in your data.

    Update: Fixed markup & typos -- thanks to GrandFather for his attention and advice.

Re^2: Perl = Greek to me
by aaron_baugher (Curate) on Dec 22, 2011 at 23:48 UTC

    It looks like you pasted together a couple bits of code that do something similar to what you're trying to do, but together they don't actually work, and they're really not what you want anyway. You could do this task by slurping the entire file into a single string and then stepping through it by using a //g regex in a while test, but it'd look more like this:

    while($huge_multi_line_string =~ /ccParty<(\S+)> Port<(\S+)> DTMF<(\S+)>/g){
        # do something with $1, $2, and $3
    }

    But don't do that, because that's an ugly way to do it, especially with large files, since it means pulling the whole file into memory. You don't want to do that unless it's necessary, and it's not in your case. Start by thinking through your problem, and how you'd solve it logically, before getting into the code. Write pseudo-code if you have to. You've got a bunch of lines, and you want to pick out the lines that contain certain strings. So your logic will be:

    open the file
    while there are more lines, get a line
        if the line matches certain strings
            do something with it
    close the file

    Then it's just a matter of turning it into code. In perl, "while there are more lines, get a line" is normally done by using <angle brackets> around a filehandle in a while test. This returns one line from the filehandle to a scalar variable, or to $_ if you didn't specify a variable. Then within your while loop, you can use a regex (or regexes) to see if your line matches your requirements. A little time spent reading about using a while loop to read a file line-by-line, and some more time spent in perlretut, and you should be close, at least.
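As a rough illustration of how that pseudo-code might translate, here is a minimal sketch rather than a finished program. It assumes the three DB1/DB2 sample records from the thread, reads them through an in-memory filehandle so it runs without the real log file, and collects hits in a hypothetical @matches array; the regex is the one suggested earlier in this post.

```perl
use strict;
use warnings;

# Sample records copied from the thread. The in-memory filehandle below is
# an assumption made to keep the sketch self-contained; a real script would
# open the log file by name instead.
my $data = <<'END_LOG';
2011-12-21 00:24:20.423024%%0f-1f-16%%DB1%% ccParty<0x31A9A> Port<1-14-7-23> DTMF<4>
2011-12-21 00:24:20.423138%%0f-1f-16%%DB2%% ccParty<0x31A9A> next CF<21>
2011-12-21 00:24:20.423229%%0f-1f-16%%DB2%% ccParty<0x31A9A> CF<21> I<0>
END_LOG

my @matches;    # hypothetical: keep the captured fields for later use

# open the file
open my $fh, '<', \$data or die "Cannot open: $!";

# while there are more lines, get a line
while ( my $line = <$fh> ) {
    chomp $line;

    # if the line matches certain strings, do something with it
    if ( $line =~ /ccParty<(\S+)> Port<(\S+)> DTMF<(\S+)>/ ) {
        push @matches, [ $1, $2, $3 ];
        print "party=$1 port=$2 digit=$3\n";
    }
}

# close the file
close $fh;
```

Only one line is held in memory at a time, so the same loop should behave the same way on a large log as on these three lines.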

    Aaron B.
    My Woefully Neglected Blog, where I occasionally mention Perl.

      First, thank you, everyone, for your tips, suggestions, etc.

      I'm getting extremely frustrated, which isn't helping solve anything. Here's my latest bit of code.

      {
          open my $textfile, '<', 'C:\Users\msmolik\Desktop\PERL test stuff\testfile.log' or die $!;
          while ( my $line =~ m/ccParty<(.+)> DTMF<(.+)>/ ) {
              print "$line\n";
          }
      }
      {
          close $textfile;
      }

      This returns nothing:

      C:\Users\msmolik\Desktop\PERL test stuff>testingperl.pl
      C:\Users\msmolik\Desktop\PERL test stuff>

      I've shortened my data file to 5 lines, 2 of which (the first and last) "should" match my search. My data file contains the following lines:

      2011-12-21 00:23:06.904520%%0f-1f-16%%DB1%% ccParty<0x31A99> Port<1-14-7-22> DTMF<4>
      2011-12-21 00:21:57.729881%%0f-1f-15%%TRF%% Received event <lmevtVOICE_PATH_MODE_CHANGE> cc<lmccSTATE_CHANGED> R<otLIF_PORT:0x4107003> D<otPARTY_OBJECT:0x80B4B> T<otUNKNOWN:0x0> SRC<otLIF_PORT:0x4107003> Addr<01-15-0c>
      2011-12-21 00:23:06.904667%%0f-1f-16%%DB2%% ccParty<0x31A99> next CF<21>
      2011-12-21 00:23:06.904765%%0f-1f-16%%DB2%% ccParty<0x31A99> CF<21> I<0>
      2011-12-21 00:24:20.423024%%0f-1f-16%%DB1%% ccParty<0x31A9A> Port<1-14-7-23> DTMF<4>

      At this point I'm done. I have to take a timeout or I'll just leave this endeavor because I'm so frustrated.

        You're opening the file, but you're not reading it. $line never gets any data in it, yet you're trying to do a pattern match against it. The first thing you should do is use strict and warnings, and you should get uninitialized warnings about $line (Update: That is, after you fix the scoping problem with $textfile). The basic pattern of your program should be something like:
        use strict;
        use warnings;
        open my $fh, .... or die "Err: $!";
        while (my $line = <$fh>) {
            if ( $line =~ /...some pattern.../ ) {
                print $line;
            }
        }
        close $fh;
        Although you really should have done what I suggested in my first post and started with something like:
        use strict;
        use warnings;
        open my $fh, .... or die "Err: $!";
        while (my $line = <$fh>) {
            print $line;
        }
        close $fh;
        Because until you get past that, you can't expect to run with scissors before you can walk :-)
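For anyone who wants to see why the posted loop printed nothing, the following minimal sketch reproduces just the while condition with no file involved. The @warnings array, the $iterations counter and the __WARN__ handler are illustrative additions used to capture what Perl reports; they are not part of the original script.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR,
# so they can be printed in order afterwards (illustrative only).
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $iterations = 0;

# Same condition as in the posted code: a brand-new, undefined $line is
# matched against the pattern. Nothing is ever read from a filehandle.
while ( my $line =~ m/ccParty<(.+)> DTMF<(.+)>/ ) {
    $iterations++;
}

print "loop body ran $iterations time(s)\n";
print "warning: $_" for @warnings;
```

The match against an undefined value fails on the first test, so the loop body never runs, and with warnings enabled Perl complains about the uninitialized $line.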

        How do you expect this

        m/ccParty<(.+)> DTMF<(.+)>/
        #..............^

        to match this?

        2011-12-21 00:23:06.904520%%0f-1f-16%%DB1%% ccParty<0x31A99> Port<1-14-7-22> DTMF<4>
        #............................................................^^^^^^^^^^^^^^^

        If you changed your regex to m/ccParty<(.+)> .+DTMF<(.+)>/ it stands a chance of working.
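A quick way to check what each pattern captures is to run them side by side against the sample line. This is a minimal sketch: the "tighter" pattern using [^>]+ is a hypothetical alternative added for comparison, not something proposed in the thread, and the %captures hash is just a holder for the results.

```perl
use strict;
use warnings;

# First line of the shortened data file from the thread.
my $line = '2011-12-21 00:23:06.904520%%0f-1f-16%%DB1%% '
         . 'ccParty<0x31A99> Port<1-14-7-22> DTMF<4>';

my @patterns = (
    [ original => qr/ccParty<(.+)> DTMF<(.+)>/ ],           # as posted
    [ relaxed  => qr/ccParty<(.+)> .+DTMF<(.+)>/ ],         # suggested fix
    [ tighter  => qr/ccParty<([^>]+)> .*?DTMF<([^>]+)>/ ],  # hypothetical variant
);

my %captures;    # pattern name => [ $1, $2 ], or [] when there is no match

for my $entry (@patterns) {
    my ( $name, $re ) = @$entry;

    # In list context a successful match returns the captured groups.
    my @got = $line =~ $re;
    $captures{$name} = \@got;

    print "$name: ",
        ( @got ? "party=<$got[0]> digit=<$got[1]>" : 'no match' ), "\n";
}
```

One thing this shows is that a greedy (.+) is free to run across "> Port<1-14-7-22", so the pattern as posted can still match this line, but its first capture then holds "0x31A99> Port<1-14-7-22" rather than just the party id. The suggested pattern captures "0x31A99" and "4", as does the variant that excludes ">" from each capture.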


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        The start of some sanity?