Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am really bad with regular expressions (I've been reading tutorials all day; that is how I found this site).

I was wondering if I could get you guys to help me out with a problem that I am having. I need to parse out one lease from a file that looks like this:

lease 10.2.56.40 {
    starts 3 2003/08/20 10:37:28;
    ends 3 2003/08/20 22:37:28;
    hardware ethernet 00:20:af:52:12:0f;
    uid 01:20:af:52:12:0f;
    client-hostname "Telephone";
}
lease 10.2.56.75 {
    starts 3 2003/08/20 09:57:11;
    ends 3 2003/08/20 21:57:11;
    hardware ethernet 00:ef:18:ae:52:83;
    uid 01:00:ef:18:ae:52:83;
    client-hostname "beast";
}
lease 10.2.56.77 {
    starts 2 2003/08/19 21:13:05;
    ends 3 2003/08/20 21:13:05;
    hardware ethernet 00:02:95:9b:78:18;
    uid 01:02:0b:95:9b:78:18;
}

I just can't seem to write a working regular expression to parse this out. Thus far I have /lease $search {.+?}/ but it doesn't work.

Replies are listed 'Best First'.
Re: newb regular expression question
by dreadpiratepeter (Priest) on Aug 20, 2003 at 15:24 UTC
    Try this:
    use strict;
    use warnings;

    my $str = <<'foo';
    lease 10.2.56.40 {
        starts 3 2003/08/20 10:37:28;
        ends 3 2003/08/20 22:37:28;
        hardware ethernet 00:20:af:52:12:0f;
        uid 01:20:af:52:12:0f;
        client-hostname "Telephone";
    }
    lease 10.2.56.75 {
        starts 3 2003/08/20 09:57:11;
        ends 3 2003/08/20 21:57:11;
        hardware ethernet 00:ef:18:ae:52:83;
        uid 01:00:ef:18:ae:52:83;
        client-hostname "beast";
    }
    lease 10.2.56.77 {
        starts 2 2003/08/19 21:13:05;
        ends 3 2003/08/20 21:13:05;
        hardware ethernet 00:02:95:9b:78:18;
        uid 01:02:0b:95:9b:78:18;
    }
    foo

    my $l = '10.2.56.77';
    if ($str =~ /lease\s+\Q$l\E\s+\{.*?\}/s) {
        print "$&\n";
    }
    The regexp expanded:
    /
        lease     # literal
        \s+       # one or more white space chars
        \Q$l\E    # the lease to find, escaped so that the .'s aren't special
        \s+       # more white space
        \{        # literal
        .*?       # slurp until } (non-greedy)
        \}        # literal
    /sx
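
    A small sketch of why the \Q...\E in the pattern above matters. The sample line and the variable names are made up for illustration; only the search address comes from the thread:

```perl
use strict;
use warnings;

# Hypothetical input line: note "56177" where the real address has "56.77".
my $line   = "lease 10.2.56177 {";
my $search = '10.2.56.77';

# Unescaped: every "." in $search matches any character, so this matches.
my $loose = $line =~ /lease\s+$search\s+\{/ ? 1 : 0;

# Escaped with \Q...\E: the dots must be literal dots, so this does not match.
my $exact = $line =~ /lease\s+\Q$search\E\s+\{/ ? 1 : 0;

print "unescaped: $loose, escaped: $exact\n";
```

    With this input the unescaped pattern reports a match for the wrong address, while the escaped one correctly does not.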


    -pete
    "Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
      The solution already posted might be improved by making a few small changes:
      use strict;
      use warnings;

      my $str = <<'_END_DATA_';
      lease 10.2.56.40 {
          starts 3 2003/08/20 10:37:28;
          ends 3 2003/08/20 22:37:28;
          hardware ethernet 00:20:af:52:12:0f;
          uid 01:20:af:52:12:0f;
          client-hostname "Telephone";
      }
      lease 10.2.56.75 {
          starts 3 2003/08/20 09:57:11;
          ends 3 2003/08/20 21:57:11;
          hardware ethernet 00:ef:18:ae:52:83;
          uid 01:00:ef:18:ae:52:83;
          client-hostname "beast";
      }
      lease 10.2.56.77 {
          starts 2 2003/08/19 21:13:05;
          ends 3 2003/08/20 21:13:05;
          hardware ethernet 00:02:95:9b:78:18;
          uid 01:02:0b:95:9b:78:18;
      }
      _END_DATA_

      my $l = '10.2.56.77';
      if ($str =~ /^(lease\s\Q$l\E[^\}]+?^\})/m) {
          print "$1\n";
      }

      First, I would make the end delimiter for the heredoc a bit more distinct, but that's a side issue.

      I would rewrite the main portion as:

      my $l = '10.2.56.77';
      if ($str =~ /^(lease\s\Q$l\E[^\}]+?^\})/m) {
          print "$1\n";
      }

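      The .* from the original can get you into trouble; you're better off explicitly stating which character you don't want to see (the "}" in this case).

      The $& variable also slows down the program considerably: once perl sees it anywhere in a program, it has to save the matched text for every pattern match, which is why the rewrite above captures the match in $1 instead. See the perlre section of the perl docs.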

Re: newb regular expression question
by broquaint (Abbot) on Aug 20, 2003 at 15:33 UTC
    What do you need exactly? The block or a specific line? How about
    use strict;
    use Data::Dumper;

    my($lease, %info);
    while(<DATA>) {
        # the flip-flop is true from a "lease ... {" line through its closing "}"
        next unless (($lease) = /^lease \s+ ((?:\d+\.?)+) \s+ \{$/x) .. /^\}$/;
        push @{ $info{$lease} } => { $1 => [split ' ', $2] }
            if /^\s+ (\S+) \s+ (.*);/x;
    }
    print Dumper(\%info);

    __DATA__
    lease 10.2.56.40 {
        starts 3 2003/08/20 10:37:28;
        ends 3 2003/08/20 22:37:28;
        hardware ethernet 00:20:af:52:12:0f;
        uid 01:20:af:52:12:0f;
        client-hostname "Telephone";
    }
    lease 10.2.56.75 {
        starts 3 2003/08/20 09:57:11;
        ends 3 2003/08/20 21:57:11;
        hardware ethernet 00:ef:18:ae:52:83;
        uid 01:00:ef:18:ae:52:83;
        client-hostname "beast";
    }
    lease 10.2.56.77 {
        starts 2 2003/08/19 21:13:05;
        ends 3 2003/08/20 21:13:05;
        hardware ethernet 00:02:95:9b:78:18;
        uid 01:02:0b:95:9b:78:18;
    }
    Which creates a hash of arrays of hashes of arrays, keyed by lease address, and prints it with Data::Dumper.

    See perlre and the ever helpful YAPE::Regex::Explain for more info on the regexes used above, and perlreftut, perldsc and tye's References quick reference for more info on data structures and dereferencing them.
    HTH

    _________
    broquaint

    update: modified code to store the per lease values as an array
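
    For anyone unsure how to get values back out of that structure, here is a rough sketch of walking it. The %info below is built by hand in the shape the loop above should produce for one lease (an assumption, since it is not taken from a real run), and @lines is just a name chosen for the example:

```perl
use strict;
use warnings;

# Hand-built sample: lease address => array of one-key hashes => array of values.
my %info = (
    '10.2.56.75' => [
        { starts            => [ '3', '2003/08/20', '09:57:11' ] },
        { ends              => [ '3', '2003/08/20', '21:57:11' ] },
        { hardware          => [ 'ethernet', '00:ef:18:ae:52:83' ] },
        { uid               => [ '01:00:ef:18:ae:52:83' ] },
        { 'client-hostname' => [ '"beast"' ] },
    ],
);

my @lines;
for my $ip (sort keys %info) {
    for my $entry (@{ $info{$ip} }) {          # each entry is a one-key hash ref
        my ($field) = keys %$entry;            # the field name, e.g. "starts"
        my @values  = @{ $entry->{$field} };   # the whitespace-split values
        push @lines, "$ip $field: @values";
    }
}
print "$_\n" for @lines;
```

    Each level is dereferenced in turn: @{ ... } for the array refs, %$entry and ->{...} for the hash refs.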

Re: newb regular expression question
by CombatSquirrel (Hermit) on Aug 20, 2003 at 15:32 UTC
    I assume that you mean you want to pull out a specified data set, which is surrounded by { }. If this is correct, how is your data set specified (i.e. by contents, or the name) and what kind of output would you like?
    For example, the following works:
    #!perl
    use strict;
    use warnings;

    my $lookFor = quotemeta('10.2.56.75');
    my $data;
    {
        local $/;
        $data = <DATA>;
    }

    if ($data =~ /lease\s+$lookFor\s+\{([^}]*)\}/) {
        print $1;
    } else {
        print "No match\n";
    }

    __DATA__
    lease 10.2.56.40 {
        starts 3 2003/08/20 10:37:28;
        ends 3 2003/08/20 22:37:28;
        hardware ethernet 00:20:af:52:12:0f;
        uid 01:20:af:52:12:0f;
        client-hostname "Telephone";
    }
    lease 10.2.56.75 {
        starts 3 2003/08/20 09:57:11;
        ends 3 2003/08/20 21:57:11;
        hardware ethernet 00:ef:18:ae:52:83;
        uid 01:00:ef:18:ae:52:83;
        client-hostname "beast";
    }
    lease 10.2.56.77 {
        starts 2 2003/08/19 21:13:05;
        ends 3 2003/08/20 21:13:05;
        hardware ethernet 00:02:95:9b:78:18;
        uid 01:02:0b:95:9b:78:18;
    }
    producing the output
        starts 3 2003/08/20 09:57:11;
        ends 3 2003/08/20 21:57:11;
        hardware ethernet 00:ef:18:ae:52:83;
        uid 01:00:ef:18:ae:52:83;
        client-hostname "beast";
    but unless you provide more specific information, it will be hard (at least for me) to answer this question.
Re: newb regular expression question
by Limbic~Region (Chancellor) on Aug 20, 2003 at 15:32 UTC
    Anonymous Monk,
    When possible, I find this method much nicer than a messy regex:
    #!/usr/bin/perl -w
    use strict;

    open (LEASE, "lease.file") or die "Unable to open input file : $!";
    $/ = "\n}\n";
    while (defined (my $lease = <LEASE>)) {
        next unless ($lease =~ /Telephone/);
        print $lease;
    }
    Now if you need to parse a lease further, you could do something like:
    while (defined (my $lease = <LEASE>)) {
        next unless ($lease =~ /Telephone/);
        foreach my $line (split /\n/ , $lease) {
            # Do something with the line?
            print $line, "\n";
        }
    }
    Hope this helps - L~R

    PS: Telephone could be replaced by just about any piece of information in the lease that you want to search for.
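
    As a self-contained variation on the record-separator approach above, the sketch below reads from a string opened as a file, so it runs without a lease.file on disk, and it searches by address rather than hostname. The shortened sample data and the names $text, $wanted and @found are assumptions made for the example:

```perl
use strict;
use warnings;

# Sample data from the thread, shortened to two leases with two fields each.
my $text = join "\n",
    'lease 10.2.56.40 {',
    '    starts 3 2003/08/20 10:37:28;',
    '    client-hostname "Telephone";',
    '}',
    'lease 10.2.56.75 {',
    '    starts 3 2003/08/20 09:57:11;',
    '    client-hostname "beast";',
    '}',
    '';

# Open the string as a file (in-memory filehandle).
open my $fh, '<', \$text or die "Unable to open in-memory file: $!";

my $wanted = '10.2.56.75';    # hypothetical search target
my @found;
{
    local $/ = "\n}\n";       # each read now returns one whole lease block
    while (defined(my $lease = <$fh>)) {
        push @found, $lease if $lease =~ /^lease\s+\Q$wanted\E\s/;
    }
}
close $fh;

print @found;
```

    Using local inside a bare block keeps the changed $/ from leaking into the rest of the program.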

Re: newb regular expression question
by Mr. Muskrat (Canon) on Aug 20, 2003 at 15:34 UTC

    I suspect that you forgot the s modifier for the regex.

    #!/usr/bin/perl

    my $search = '10.2.56.75'; # the ip we are looking for, could pull from @ARGV
    my $data;
    {
        # slurp in all of the info in the DATA section
        local $/ = undef;
        $data = <DATA>;
    }

    # find the first match; \Q...\E keeps the dots in the ip literal,
    # and the braces are escaped so they are always taken literally
    my ($lease) = $data =~ /lease \Q$search\E \{(.+?)\}/is;
    print $lease,"\n";

    __DATA__
    lease 10.2.56.40 {
        starts 3 2003/08/20 10:37:28;
        ends 3 2003/08/20 22:37:28;
        hardware ethernet 00:20:af:52:12:0f;
        uid 01:20:af:52:12:0f;
        client-hostname "Telephone";
    }
    lease 10.2.56.75 {
        starts 3 2003/08/20 09:57:11;
        ends 3 2003/08/20 21:57:11;
        hardware ethernet 00:ef:18:ae:52:83;
        uid 01:00:ef:18:ae:52:83;
        client-hostname "beast";
    }
    lease 10.2.56.77 {
        starts 2 2003/08/19 21:13:05;
        ends 3 2003/08/20 21:13:05;
        hardware ethernet 00:02:95:9b:78:18;
        uid 01:02:0b:95:9b:78:18;
    }
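
    To make the point about the s modifier concrete, here is a minimal sketch using a made-up two-line block (the variable names are chosen for the example only):

```perl
use strict;
use warnings;

# Small sample block; "\n" separates the lines, as in the lease file.
my $block = "lease 10.2.56.75 {\n    uid 01:00:ef:18:ae:52:83;\n}";

# Without /s, "." will not match "\n", so .+? can never reach the closing brace.
my ($without_s) = $block =~ /\{(.+?)\}/;

# With /s, "." also matches "\n" and the capture spans the lines.
my ($with_s) = $block =~ /\{(.+?)\}/s;

print defined $without_s ? "without /s: matched\n" : "without /s: no match\n";
print defined $with_s    ? "with /s: matched\n"    : "with /s: no match\n";
```

    The first pattern fails because the character right after the opening brace is a newline; the second succeeds and captures everything between the braces.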

Re: newb regular expression question
by dragonchild (Archbishop) on Aug 20, 2003 at 15:38 UTC
    Don't use a regex. Parse the file line by line.
    my %leases;
    my $curr_lease;
    while (<FH>) {
        # We're done with the current lease
        if (/}/) {
            $curr_lease = '';
        }
        elsif (/^lease (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) {
            $curr_lease = $1;
            $leases{$curr_lease} = {};
        }
        elsif (/^\s+(starts|ends) (\d) (\d{4}\/\d{2}\/\d{2}) (\d{2}:\d{2}:\d{2})/) {
            $leases{$curr_lease}{$1} = {
                num  => $2,
                date => $3,
                time => $4,
            };
        }
        elsif (/^\s+(hardware) (\w+) ([^;]+)/) {
            $leases{$curr_lease}{$1} = {
                type => $2,
                id   => $3,
            };
        }
        elsif ((my @vals = split) == 2) {
            $leases{$curr_lease}{$vals[0]} = $vals[1];
        }
        else {
            chomp;
            print "I don't know what to do with '$_'\n";
        }
    }
    Now, you have all the leases in %leases, keyed by IP number. :-)
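
    A rough sketch of looking up one lease afterwards. The %leases here is filled in by hand in the shape the loop above is meant to build (an assumption, since that code is untested), and $summary is just a name picked for the example:

```perl
use strict;
use warnings;

# Hand-built sample for one lease; the two-word lines keep their trailing ";"
# because the loop stores the second word from split as-is.
my %leases = (
    '10.2.56.75' => {
        starts            => { num => 3, date => '2003/08/20', time => '09:57:11' },
        ends              => { num => 3, date => '2003/08/20', time => '21:57:11' },
        hardware          => { type => 'ethernet', id => '00:ef:18:ae:52:83' },
        uid               => '01:00:ef:18:ae:52:83;',
        'client-hostname' => '"beast";',
    },
);

my $ip = '10.2.56.75';
my $summary;
if (my $lease = $leases{$ip}) {
    # Nested hash refs are reached with chained ->{...}{...} lookups.
    $summary = sprintf "%s (%s %s) ends %s %s",
        $ip,
        $lease->{hardware}{type}, $lease->{hardware}{id},
        $lease->{ends}{date},     $lease->{ends}{time};
}
else {
    $summary = "no lease for $ip";
}
print "$summary\n";
```

    Because the hash is keyed by address, finding a single lease is one hash lookup rather than another pass over the file.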

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: newb regular expression question
by Theo (Priest) on Aug 21, 2003 at 15:10 UTC
    As another newbie, let me recommend the Llama book, "Learning Perl". It has a good section on regular expressions.

    -ted-