Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

hello monks,

I have the following data I'm parsing:

1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get
How can I grab just 1 [111.123.164.80]  [123.126.126.126]

I was using :

    while (<DATA>) {
        if ($_ =~ /\d+\s+\[\d+\.\d+\.\d+\.\d+\]/) {
            print $_;
        }
    }
... but this is grabbing the entire line.

any ideas?

Replies are listed 'Best First'.
Re: regex needed
by Fletch (Bishop) on Jan 30, 2003 at 18:14 UTC

    It's not `grabbing the entire line', because $_ is the entire line. Just matching against a regex doesn't change anything. You either need to use ()s to capture the parts of the data you're interested in, or use s/// to remove the parts you're not. See perldoc perlretut.
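    The two options mentioned above (capture with parentheses, or strip with s///) might look roughly like this. It is only a sketch: the sample line is the one from the question with the wrapped date rejoined, and the variable names $captured and $trimmed are made up for illustration.

```perl
use strict;
use warnings;

# Sample line from the question (wrapped date rejoined, which is an assumption).
my $line = '1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get';

# Option 1: capture the wanted part with parentheses and read it from $1.
my $captured;
if ($line =~ /^(\d+\s+\[[\d.]+\]\s+\[[\d.]+\])/) {
    $captured = $1;
}

# Option 2: work on a copy and delete everything after the last "]".
# The lookbehind keeps the closing bracket itself in the result.
(my $trimmed = $line) =~ s/(?<=\])[^\]]*\z//;

print "captured: $captured\n";
print "trimmed:  $trimmed\n";
```

    Option 2 assumes nothing after the second address contains a "]"; if that can happen, the capture in option 1 is the safer choice.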

Re: regex needed
by MZSanford (Curate) on Jan 30, 2003 at 18:17 UTC
    You have the matching correct, but you did not save any of the data you matched. You may want to take a look at the perlre man page, in particular what parentheses are for. Something like the following (untested) code:
        while (<DATA>) {
            if ($_ =~ /^(\d+\s+\[\d+\.\d+\.\d+\.\d+\]\s+\[\d+\.\d+\.\d+\.\d+\])/) {
                print "Match=$1\n";
            }
        }

    The addition of getting the SNMP: field is an exercise left to the reader.


    from the frivolous to the serious
Re: regex needed
by Abigail-II (Bishop) on Jan 31, 2003 at 01:28 UTC
    Well it isn't "grabbing the entire line". It's printing the entire line. But that's because you said it should print that. Try printing $&.
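    A small sketch of the $& suggestion, using the pattern from the question unchanged. The variable name $matched is only for illustration. Note that the original pattern covers just the index and the first bracketed address, so that is all $& will hold.

```perl
use strict;
use warnings;

# Sample line from the question (wrapped date rejoined, which is an assumption).
my $line = '1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get';

my $matched;
if ($line =~ /\d+\s+\[\d+\.\d+\.\d+\.\d+\]/) {
    $matched = $&;    # the text covered by the last successful match
    print "$matched\n";
}
```

    On older perls, mentioning $& anywhere in a program could slow down other regex matches, which is one reason explicit captures are often preferred.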

    I must say that both your title, and your question are poorly phrased. From your question, it's not at all clear what you want to "grab" if the line contains something else. Do you want to grab up to the second ], regardless what else is on the line? Do you want to "grab" up to the third whitespace character? Are you trying to match IP-numbers between the brackets, or version numbers, or just a bunch of numbers and dots? Or do you want the first 36 characters?

    Asking for regexes with no better specification than one example is a waste of everyone's time.

    Abigail

Re: regex needed
by CountZero (Bishop) on Jan 30, 2003 at 20:00 UTC

    Try this: m/(^\d+\s+\[.*\]).*(SNMP:\s+.*)$/

    $1 will contain the index number and the two IPs; $2 will contain "SNMP:" plus the string that follows it

    Notice how the greediness of the regex-engine allows you to capture from the first [ to the last ] in one go.
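    To illustrate the greediness point, here is a rough comparison. The trailing "[retry]" text is invented purely to show what happens when a later bracket appears on the line, and the bounded pattern is one possible alternative, not part of the suggestion above.

```perl
use strict;
use warnings;

# Sample line from the question (wrapped date rejoined, which is an assumption).
my $line = '1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get';

# Greedy .* runs from the first "[" to the LAST "]" on the line.
my ($greedy) = $line =~ /^(\d+\s+\[.*\])/;

# Hypothetical variant of the line with an extra bracketed note at the end.
my $noisy = "$line [retry]";
my ($greedy_noisy) = $noisy =~ /^(\d+\s+\[.*\])/;

# A negated character class stops at each closing bracket instead.
my ($bounded) = $noisy =~ /^(\d+\s+\[[^\]]*\]\s+\[[^\]]*\])/;

print "greedy:        $greedy\n";
print "greedy (noisy): $greedy_noisy\n";
print "bounded:       $bounded\n";
```

    On the original line both approaches give the same result; they only differ once another "]" shows up later in the line.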

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: regex needed
by glwtta (Hermit) on Jan 30, 2003 at 18:23 UTC
    You are matching what you want, you are just not capturing it. When you print $_; $_ still has the value assigned to it by while (<DATA>). You need to tell the regexp what you want out of what it matches, by putting parentheses around it: /\[([\d.]+?)\]\s+\[([\d.]+?)\]/ and then referencing it with $1, $2, etc.
    eg: print "Matched $1 and $2.\n";
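    As a variation on the above, a match in list context returns its captures directly, so they can be assigned to named variables instead of being read from $1 and $2. This is a sketch only; $first and $second are illustrative names.

```perl
use strict;
use warnings;

# Sample line from the question (wrapped date rejoined, which is an assumption).
my $line = '1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get';

# In list context the match returns ($1, $2), or an empty list on failure.
my ($first, $second) = $line =~ /\[([\d.]+?)\]\s+\[([\d.]+?)\]/;

print "Matched $first and $second.\n" if defined $first;
```

    Here the brackets themselves are left out of the captures, so only the dotted numbers are returned.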
Re: regex needed
by JamesNC (Chaplain) on Jan 30, 2003 at 20:46 UTC
    Try this: Map is pretty fast too...
        my @ips;
        while (<DATA>) {
            chomp;
            push @ips, map { /\[(\d+\.\d+\.\d+\.\d+)\]/ig } $_;
        }
        print "First IP = $ips[0], Second IP = $ips[1]\n";
        print "All IPs found: @ips\n";
        __DATA__
        1 [111.123.164.80] blah blah [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get
        2 [212.123.34.80] 283.120.123 [3.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get
        3 [201.0.164.80] 8.23.1.2.3 [51.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get
        4 [111.123.164.80] [12.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get
    I use this to grab them... cheers
Re: regex needed
by OM_Zen (Scribe) on Jan 30, 2003 at 20:52 UTC
    Hi ,

        # 1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get
        # Guess you have to ( pattern ) to store and use it with $1, $2, $3.
        # In your code you match it but you do not store it, and hence print the line by itself.
        if ($_ =~ /(\[.*\])\s+(\[.*\]).*(SNMP: Get$)/) {
            print "$1 $2 $3\n";
        }

        # $_ =~ /(\[.*\])       # match the first [ and the characters inside it ]
        #        \s+            # match whitespace after it
        #        (\[.*\]).*     # match again for the second [ and the characters inside it ]
        #        (SNMP: Get$)/  # match the SNMP part of the string
        # print "$1 $2 $3\n";


Re: regex needed
by Anonymous Monk on Jan 30, 2003 at 18:07 UTC
    actually I need:
    1 [111.123.164.80] [123.126.126.126]
    and the SNMP: field, i.e.
    SNMP: Get
    I need to be able to grab them both from the line
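    One possible way to pull out both pieces in a single match is sketched below. The helper name parse_line is hypothetical, and the pattern assumes the line starts with the index number, has exactly two bracketed dotted-number fields next to each other, and carries "SNMP:" followed by at least one non-space character.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (prefix, snmp_field), or an empty list
# when the line does not fit the assumed layout.
sub parse_line {
    my ($line) = @_;
    return $line =~ /^(\d+\s+\[[\d.]+\]\s+\[[\d.]+\]).*(SNMP:.*\S)\s*$/;
}

# Sample line from the question (wrapped date rejoined, which is an assumption).
my $sample = '1 [111.123.164.80] [123.126.126.126] 87 000:00:00.0000.000.000 06/07/2000 04:39:00 PM SNMP: Get';

my ($prefix, $snmp) = parse_line($sample);
print "$prefix\n$snmp\n" if defined $prefix;
```

    Inside a while (<DATA>) loop the same helper could be called as parse_line($_); lines that do not match simply yield an empty list.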