Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, people.
I have a problem with a regexp. I have these lines:
[Fri Sep 30 14:02:22 2005]Local/ESSBASE0///Info(1051001)
Received client request: Logout (from user Procbat)

My regexp gets all the info from the first line, but the second line doesn't return anything; my prints stop at $9.
I run C:>type teste.txt|perl ini.pl, where teste.txt contains the 2 lines.
Thanks, and sorry for the bad English ;D

while (<>) {
    if (m{ \[ (\w{3}) \s* (\w{3}) \s* (\d{2}) \s* (\d{2}:\d{2}:\d{2})\s* (\d{4}) \]
           (\w*) \/ (\w*) /// Info(\(\d*\)) (\s*) }xm) {
        print "regexp ok\n";
        print "$1\n";
        print "$2\n";
        print "$3\n";
        print "$4\n";
        print "$5\n";
        print "$6\n";
        print "$7\n";
        print "$8\n";
        print "$9\n";
        print "$10\n";
        print "$11\n";
    }
}

20051116 Janitored by Corion: Moved from snippets, added formatting

Replies are listed 'Best First'.
Re: regexp in win32
by ptum (Priest) on Nov 16, 2005 at 18:55 UTC
    Your regular expression only seems to have nine parenthetical expressions, hence only up to $9 will print.
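One way to check the group count is to run the match in list context, which returns one element per capture group. This is only a sketch: the pattern is copied from the original post, while the names $re, $line, @captures and $group_count are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the original post, unchanged.
my $re = qr{
    \[ (\w{3}) \s* (\w{3}) \s* (\d{2}) \s* (\d{2}:\d{2}:\d{2}) \s* (\d{4}) \]
    (\w*) \/ (\w*) /// Info(\(\d*\)) (\s*)
}xm;

# The first input line, including its trailing newline.
my $line = "[Fri Sep 30 14:02:22 2005]Local/ESSBASE0///Info(1051001)\n";

# In list context a successful match returns one element per capture group.
my @captures    = $line =~ $re;
my $group_count = scalar @captures;

print "capture groups: $group_count\n";    # prints "capture groups: 9"
```

Since there are only nine groups, $10 and $11 are never set, and the ninth group, (\s*), holds nothing but the trailing newline.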

      The prints go up to the $8 variable. For the second line, \s*, (.*) or \s* (.*) don't print anything. I don't understand why print doesn't work on $9, at least...

      Lorn

      -www.slackwarezine.com.br-

        You don't seem to understand how the while (<FILEHANDLE>) and regex comparison code works. Maybe you've already moved past this, but just in case, I'll explain. Your while loop reads in a line at a time from the file, using \n to detect the end of a line. The string representing that line is placed in $_, which is used as input for your regular expression. So you are testing your regex first against:
        [Fri Sep 30 14:02:22 2005]Local/ESSBASE0///Info(1051001)
        ... and then are printing out the $1 through $9 components you have extracted by the use of parentheses. Then you finish and do it all over again with the second line:
        Received client request: Logout (from user Procbat)
        The second line doesn't have any bracket characters and so your regex doesn't work on that line at all, and the print statements inside your if block aren't executed.
        ikegami's recommendation (and mine, for that matter) involves joining the multiple lines in your file. Hope that helps.
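The line-at-a-time behaviour described above can be seen in a small sketch. It assumes the two input lines from the question and reads them from an in-memory filehandle instead of STDIN; the names $input, $in and @matched are hypothetical, and only the opening of the original pattern is tested.

```perl
use strict;
use warnings;

# The two input lines, as the while loop would see them one at a time.
my $input = "[Fri Sep 30 14:02:22 2005]Local/ESSBASE0///Info(1051001)\n"
          . "Received client request: Logout (from user Procbat)\n";

# Read from an in-memory filehandle so the sketch needs no external file.
open my $in, '<', \$input or die "cannot open in-memory file: $!";

my @matched;    # 1 or 0 for each line, in input order
while (my $line = <$in>) {
    # Only the start of the original pattern: a literal '[' and a 3-letter word.
    my $ok = $line =~ m{ \[ (\w{3}) }x ? 1 : 0;
    push @matched, $ok;
    print "line $.: ", ($ok ? "match" : "no match"), "\n";
}
close $in;
```

The second line never reaches the print statements because the pattern is applied to it on its own, and it contains no bracket.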
Re: regexp in win32
by ikegami (Patriarch) on Nov 16, 2005 at 18:43 UTC
    I think you want to replace
    (\s*) }xm
    with
    \s* (.*) }xm
    The 'm' switch has no effect in that regexp (it only changes what ^ and $ match, and the pattern uses neither), but it doesn't hurt either.

    Update: Does your input span two lines? I couldn't tell before Corion added <code> tags. If so, you'll need something like:

    while (<>) {
        my @fields;
        if (@fields = m{ \[ (\w{3}) \s* (\w{3}) \s* (\d{2}) \s* (\d{2}:\d{2}:\d{2})\s* (\d{4}) \]
                         (\w*) \/ (\w*) /// Info(\(\d*\)) }xm) {
            push(@fields, scalar <>);
            print(join("\n", @fields));
        }
    }

    Update: Fixed error in join.

      With

      \s* (.*) }xm

      it doesn't work; it doesn't show any data from the second line :/ and I don't understand your solution.

      Lorn -www.slackwarezine.com.br-

        OK, so you can either (a) parse both lines in succession with two different regular expressions (or one very flexible one) or (b) join the lines together (with or without the newline) and parse the result once. I recommend (b), but if you decide to leave the \n inside your string, you'll want to use the /s flag on your regex so that . also matches the newline. Maybe something like this:
        my $buffer = undef;
        while (<YOURFILE>) {
            $buffer .= $_;
        }
        if ($buffer =~ /your regex/s) {
            # do stuff with $1 ... etc.
        }
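A runnable sketch of option (b) might look like the following. It assumes the two lines from the question, reads them from an in-memory filehandle, and uses a shortened pattern; the variable names ($source, $app, $code, $message) are invented for illustration and are not from the thread.

```perl
use strict;
use warnings;

my $input = "[Fri Sep 30 14:02:22 2005]Local/ESSBASE0///Info(1051001)\n"
          . "Received client request: Logout (from user Procbat)\n";

open my $in, '<', \$input or die "cannot open in-memory file: $!";

# Accumulate every line, newlines included, into one string.
my $buffer = '';
while (my $line = <$in>) {
    $buffer .= $line;
}
close $in;

# /s lets '.' match newlines, so the final (.*) runs to the end of the buffer.
# \s* absorbs the newline that separates the two original lines.
my ($source, $app, $code, $message) =
    $buffer =~ m{ \] (\w*) / (\w*) /// Info \( (\d*) \) \s* (.*) }xs;

die "no match" unless defined $message;

# Drop the trailing newline that (.*) picked up under /s.
$message =~ s/\s+\z//;

print "source:  $source\n";
print "app:     $app\n";
print "code:    $code\n";
print "message: $message\n";
```

Matching in list context copies the captures straight into named variables, so the later s/// cannot clobber $1 through $4.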

        I had join's parameters backwards. Fixed. Tested. Works.

        The test:

        The output:

        Fri
        Sep
        30
        14:02:22
        2005
        Local
        ESSBASE0
        (1051001)
        Received client request: Logout (from user Procbat)

        (Changed <> to <DATA> for test.)
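The test code itself did not survive in this copy of the thread. A sketch of what such a test could look like is given below; it is not the original test. It assumes the two lines from the question and reads them from an in-memory filehandle rather than <DATA>, so that the snippet is self-contained. The names $input, $in, $next and @output are hypothetical.

```perl
use strict;
use warnings;

my $input = "[Fri Sep 30 14:02:22 2005]Local/ESSBASE0///Info(1051001)\n"
          . "Received client request: Logout (from user Procbat)\n";

open my $in, '<', \$input or die "cannot open in-memory file: $!";

my @output;    # every field that gets printed, kept for inspection
while (<$in>) {
    my @fields;
    if (@fields = m{ \[ (\w{3}) \s* (\w{3}) \s* (\d{2}) \s* (\d{2}:\d{2}:\d{2}) \s* (\d{4}) \]
                     (\w*) \/ (\w*) /// Info(\(\d*\)) }x) {
        # The header line matched; read the continuation line as one more field.
        my $next = <$in>;
        push @fields, $next if defined $next;

        # One field per line; the last field still carries its own newline.
        print join("\n", @fields);
        push @output, @fields;
    }
}
close $in;
```

The key step is the extra read inside the if block: it consumes the second line before the while loop gets a chance to test it against the pattern.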