in reply to In search of regex advice
```perl
use strict;
use Data::Dumper;

print "1ST TRY\n------\n";

my @db_list_one = ();
my $text_one = q(
    SELECT * FROM SYNERGEN.SA_ASSET@/*<DATABASE>*/SGENQA/*</DATABASE>*/
);
my $matches_one = 0;

my $start = '/\*<DATABASE>\*/';
my $end   = '/\*</DATABASE>\*/';

if ( $text_one =~ m/$start(.*?)$end/i ) {
    push @db_list_one, $1;
    $matches_one++;
}
print Dumper( \@db_list_one ), "\n\n";

#---------------------------------------
print "2ND TRY\n------\n";

my @db_list_two = ();
my $text_two = q(
    SELECT * FROM SYNERGEN.SA_ASSET@/*<DATABASE>*/SGENQA/*</DATABASE>*/,
    SYNERGEN.SA_WORK_ORDER@/*<DATABASE>*/SGENTEST/*</DATABASE>*/
);
my $matches_two = 0;

while ( $text_two =~ /$start(.*?)$end/gi ) {
    push @db_list_two, $1;
    $matches_two++;
}
print Dumper( \@db_list_two ), "\n\n";
```

Now a few comments:
I place a premium on the human-readable quotient of code, hence the separate $start and $end variables. That goes a long way toward making the regexen easier to make sense of.
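If the hand-escaped delimiters still look noisy, one possible variation (not from the code above, just a sketch) is to let the built-in quotemeta do the escaping, so the delimiters can be written exactly as they appear in the SQL. The sample text here is made up for illustration:

```perl
use strict;
use warnings;

# quotemeta backslash-escapes every non-word character, so the
# delimiters can be written literally and still be safe in a regex.
my $start = quotemeta '/*<DATABASE>*/';
my $end   = quotemeta '/*</DATABASE>*/';

# Illustrative input only.
my $text = 'SELECT * FROM SYNERGEN.SA_ASSET@/*<DATABASE>*/SGENQA/*</DATABASE>*/';

# List-context match: the single capture lands in $db.
my ($db) = $text =~ /$start(.*?)$end/i;
print "$db\n";    # prints SGENQA
```

Whether this is more readable than escaping by hand is a matter of taste; it mostly pays off when the delimiters contain a lot of punctuation.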
The non-greedy .*? (with the question mark) prevents spanning multiple $start...$end pairs if they occur on one line. It is a bit safer than the dreaded dot-star.
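To see the difference, here is a small sketch that runs both forms against one line holding two tagged names. The line itself is invented for the demonstration:

```perl
use strict;
use warnings;

my $start = '/\*<DATABASE>\*/';
my $end   = '/\*</DATABASE>\*/';

# Two $start...$end pairs on a single line (made-up sample).
my $line = 'A@/*<DATABASE>*/SGENQA/*</DATABASE>*/, B@/*<DATABASE>*/SGENTEST/*</DATABASE>*/';

# Greedy: .* runs to the LAST $end on the line.
my ($greedy) = $line =~ /$start(.*)$end/i;

# Non-greedy: .*? stops at the FIRST $end it can reach.
my ($lazy) = $line =~ /$start(.*?)$end/i;

print "greedy: $greedy\n";   # SGENQA/*</DATABASE>*/, B@/*<DATABASE>*/SGENTEST
print "lazy:   $lazy\n";     # SGENQA
```

The greedy version swallows the closing tag of the first pair and the opening tag of the second, which is exactly the spanning problem described above.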
The question you pose about how to capture several strings satisfying the same regex is answered by the /g modifier on the second regex. The while loop conditional puts the regex in scalar context, which in turn puts it in "progressive match" mode: it walks through the string, returning a true value for each match and picking up where the previous match left off. This lets you handle each captured value one at a time inside the while loop.
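A minimal sketch of that progressive matching, using a throwaway string rather than the SQL above. The built-in pos() reports where the last /g match ended, which is where the next attempt will start:

```perl
use strict;
use warnings;

# Throwaway sample data for illustration.
my $text = 'x=1, y=22, z=333';
my @seen;

# Scalar context + /g: each pass finds the NEXT match and returns true;
# when no match is left the regex returns false and the loop ends.
while ( $text =~ /(\d+)/g ) {
    push @seen, "$1 ends at " . pos($text);
}

print "$_\n" for @seen;
# 1 ends at 3
# 22 ends at 9
# 333 ends at 16
```

Leave off the /g and the regex would match the first number again on every pass, so the loop would never terminate.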
Update: In a list context, the /g regex returns a list of all the values found. So you can replace your `my` declaration and the while loop with:

```perl
my @db_list3 = $text_two =~ /$start(.*?)$end/gi;
my $matches3 = @db_list3;
```

And if you want to make yourself really nuts the next time you come back to this code and try to make sense of it, you could replace the entire mess with:

```perl
my $text = 'whatever...';
my ($start, $end) = qw( /\*<DATABASE>\*/ /\*</DATABASE>\*/ );
my $matches = my @db_list = $text =~ /$start(.*?)$end/gi;
```

But that would be tempting the fates wouldn't it. ;-)