Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I have the following input file, which contains:

    Timed out (reason: in while loop)
    ::expect_out(0,string) = >
    ::expect_out(1,string) = RF use CSWT#RF###
    dis qremote(MQSI.3PL846) RNAME
    1 : dis qremote(MQSI.3PL846) RNAME
    AMQ8409: Display Queue details.QUEUE(MQSI.3PL846)TYPE(QREMOTE)RNAME(MQSI.3PL846)
    No commands have a syntax error.
    AMQ8409: Display Queue details.QUEUE(MQSI.3PL944)TYPE(QREMOTE)RNAME(MQSI.3PL944)
    end
    2 : end

From this input file I am trying to capture the values QUEUE(#####) and RNAME(#####) from each line that starts with AMQ8409. I am able to find the line and capture it, but I am not sure how to split it, or what other trick to use, to get only the QUEUE and RNAME values. My output should look like:

    QUEUE(MQSI.3PL846) ------> RNAME(MQSI.3PL846)
    QUEUE(MQSI.3PL944) ------> RNAME(MQSI.3PL944)

Here is what I have so far:

    open (INPUT, "out") || die "couldn't open the file!";
    open (OUTPUT, ">outF") || die "couldn't open the file!";
    foreach $line (<INPUT>) {
        chomp $line;
        if ( $line =~ "AMQ8409" ) {
            print OUTPUT "$line\n";
        }
    }
    close(OUTPUT);
    close(INPUT);

However, this catches the whole line. Any advice is appreciated. Thanks

Replies are listed 'Best First'.
Re: capturing words
by kyle (Abbot) on Nov 08, 2007 at 16:06 UTC

    This should work.

        my $match_queue = qr{
            QUEUE   # literal word 'QUEUE'
            \(      # literal open paren
            (.*?)   # non-greedy capture of >=0 characters
            \)      # literal close paren
        }xms;

        my $match_rname = qr{
            RNAME   # literal word 'RNAME'
            \(      # literal open paren
            (.*?)   # non-greedy capture of >=0 characters
            \)      # literal close paren
        }xms;

        if ( $line =~ /AMQ8409/ ) {
            my ($queue) = ($line =~ $match_queue);
            my ($rname) = ($line =~ $match_rname);
            printf OUTPUT "QUEUE(%s) ------> RNAME(%s)\n", $queue, $rname;
        }

    It could be quite a bit shorter, but I thought some /x clarity would be good.

      Thanks guys, that should help a lot.
Re: capturing words
by gamache (Friar) on Nov 08, 2007 at 16:01 UTC
    Try:
        while (<INPUT>) {
            if (/^AMQ8409 .* QUEUE\( ([^\)]+) \) .* RNAME\( ([^\)]+) \)/x) {
                print OUTPUT "QUEUE($1) ------> RNAME($2)\n";
            }
        }
      Why not save a little typing and put the captures around the whole "QUEUE(...)" and "RNAME(...)"?

          while ( <INPUT> ) {
              print OUTPUT qq{$1 ------> $2\n}
                  if m{(?x) ^ AMQ8409 .*? ( QUEUE .*? \) ) .* ( RNAME .*? \) ) };
          }

      You don't need to escape the closing parenthesis in your negated character class, BTW.

      Cheers,

      JohnGG
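
As a small illustration of the point about the negated character class: inside a bracketed class the closing parenthesis is an ordinary character, so `[^\)]` and `[^)]` should behave identically. The sample line below is copied from the question's input; the variable names are only for this sketch.

```perl
use strict;
use warnings;

# Sample line taken from the input shown in the question.
my $line = 'AMQ8409: Display Queue details.QUEUE(MQSI.3PL846)TYPE(QREMOTE)RNAME(MQSI.3PL846)';

# ')' is not special inside [...], so both classes describe the same set:
# "any character except a closing parenthesis".
my ($escaped)   = $line =~ /QUEUE\(([^\)]+)\)/;
my ($unescaped) = $line =~ /QUEUE\(([^)]+)\)/;

print "escaped:   $escaped\n";
print "unescaped: $unescaped\n";
print "same capture\n" if $escaped eq $unescaped;
```

The backslash is harmless, so leaving it in is a matter of taste rather than correctness.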

Re: capturing words
by mwah (Hermit) on Nov 08, 2007 at 19:21 UTC
    Any advice is appreciated

    There have already been good, solid solutions, but on a regex thread I can't just sit on my hands ;-)

    I like johngg's solution and tried to fancify his code further:

        ...
        my $data = '
        Timed out (reason: in while loop)
        ::expect_out(0,string) = >
        ::expect_out(1,string) = RF use CSWT#RF###
        dis qremote(MQSI.3PL846) RNAME
        1 : dis qremote(MQSI.3PL846) RNAME
        AMQ8409: Display Queue details.QUEUE(MQSI.3PL846)TYPE(QREMOTE)RNAME(MQSI.3PL846)
        No commands have a syntax error.
        AMQ8409: Display Queue details.QUEUE(MQSI.3PL944)TYPE(QREMOTE)RNAME(MQSI.3PL944)
        end
        2 : end
        ';

        my $match = qr/^AMQ8409 .+? (Q.+?(\w+)\)) .+? (R.+?\2\)) /mx;

        for( split /\n/, $data ) {
            print "$1 ----> $3\n" if /$match/
        }
        ...

    ... but the question here is: Is there a speed requirement, e.g. do you have to parse some 100 MB and spit out the results in one second? Then, all the solutions so far (including mine) would look rather bad (slow), because they depend heavily on (.+?) or (.*?).

    Regards

    mwa
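
If speed did turn out to matter, one direction to try is sketched below: a cheap literal check with `index` to skip uninteresting lines, followed by negated character classes instead of lazy dot quantifiers. This is only a sketch under assumptions. `extract_pair` is a made-up helper name, it assumes the message id sits at the very start of the line, and whether it is actually faster on real data would have to be measured (for example with the core Benchmark module).

```perl
use strict;
use warnings;

# extract_pair() is a hypothetical helper, not something from the thread.
# It returns (queue, rname) for an AMQ8409 line, or an empty list otherwise.
sub extract_pair {
    my ($line) = @_;

    # Literal prefix test: no regex engine involved for lines we do not want.
    return unless index( $line, 'AMQ8409' ) == 0;

    # [^)]* simply runs up to the next ')', so no lazy dot has to be
    # extended one character at a time.
    my ($queue) = $line =~ /QUEUE\(([^)]*)\)/ or return;
    my ($rname) = $line =~ /RNAME\(([^)]*)\)/ or return;

    return ( $queue, $rname );
}

my @pair = extract_pair(
    'AMQ8409: Display Queue details.QUEUE(MQSI.3PL846)TYPE(QREMOTE)RNAME(MQSI.3PL846)'
);
print "QUEUE($pair[0]) ------> RNAME($pair[1])\n" if @pair;
```

Splitting the work into two small matches also removes the dependence on QUEUE appearing before RNAME in the line.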

      I'm pleased you liked my solution and wanted to take it further, but I think you've introduced a slight bugette. Because you have used .+? R.+? in your pattern, it matches from the first "R" it encounters after the "QUEUE(...)" sequence (and, likewise, Q.+? starts at the "Q" of "Queue"), so the output from your script is actually

          Queue details.QUEUE(MQSI.3PL846) ----> REMOTE)RNAME(MQSI.3PL846)
          Queue details.QUEUE(MQSI.3PL944) ----> REMOTE)RNAME(MQSI.3PL944)

      I am not familiar with whatever application produced the text, so I don't know if it is wise to rely on the "QUEUE" and "RNAME" values being the same. By the same token, I don't know if "QUEUE" and "RNAME" can appear more than once in one line. If they can, a different approach with a global match might be appropriate. However, if each is unique within a line, you can avoid non-greedy matching.

          my $data = '
          Timed out (reason: in while loop)
          ::expect_out(0,string) = >
          ::expect_out(1,string) = RF use CSWT#RF###
          dis qremote(MQSI.3PL846) RNAME
          1 : dis qremote(MQSI.3PL846) RNAME
          AMQ8409: Display Queue details.QUEUE(MQSI.3PL846)TYPE(QREMOTE)RNAME(MQSI.3PL846)
          No commands have a syntax error.
          AMQ8409: Display Queue details.QUEUE(MQSI.3PL944)TYPE(QREMOTE)RNAME(MQSI.3PL944)
          end
          2 : end
          ';

          my $match = qr{(?mx)
              ^ AMQ8409 .+
              (QUEUE\([^)]+\)) .+
              (RNAME\([^)]+\))
          };

          for ( split /\n/, $data ) {
              print "$1 ----> $2\n" if /$match/
          }

      This produces

          QUEUE(MQSI.3PL846) ----> RNAME(MQSI.3PL846)
          QUEUE(MQSI.3PL944) ----> RNAME(MQSI.3PL944)

      I hope this is of interest.

      Cheers,

      JohnGG
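
For the case raised above where "QUEUE" and "RNAME" might occur more than once in a line, a global match could look roughly like the sketch below. The sample line is invented for illustration (the thread's data has only one pair per line), and it assumes each QUEUE(...) is followed by its own RNAME(...) before the next QUEUE(...) appears.

```perl
use strict;
use warnings;

# Hypothetical line holding two QUEUE/RNAME pairs; not from the original data.
my $line = 'AMQ8409: QUEUE(A.ONE)TYPE(QREMOTE)RNAME(B.ONE) QUEUE(A.TWO)TYPE(QREMOTE)RNAME(B.TWO)';

my @pairs;

# With /g in scalar context, each pass of the loop resumes where the
# previous match ended, so every pair in the line is visited in turn.
# The lazy .*? keeps each QUEUE paired with the nearest following RNAME.
while ( $line =~ /QUEUE\(([^)]+)\) .*? RNAME\(([^)]+)\)/gx ) {
    push @pairs, "QUEUE($1) ----> RNAME($2)";
}

print "$_\n" for @pairs;
```

Here the non-greedy quantifier is doing real work: a greedy `.*` between the two parts would pair the first QUEUE with the last RNAME on the line.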