Here's another approach based on the extraction regex used by haj here. The line-by-line while-loop processing approach used in haj's example will scale to handle enormous input files, but if your input files can be guaranteed never to grow larger than, say, a few million lines, it may be easier to "slurp" the data of the entire file into a scalar (i.e., a single string) and process it all at once, as in the example below. (If you are uncertain about the file slurping process, please ask for more info.) This example needs Perl version 5.10+ for the \K regex operator, but this can easily be worked around.
Defining $separator separately allows finer control of this aspect of the match IMHO. Please see perlre, perlretut, and perlrequick. Also see the core module Data::Dumper.

c:\@Work\Perl\monks>perl -wMstrict -le
"use 5.010;
 ;;
 use Data::Dumper qw(Dumper);
 ;;
 my $data =
   qq{Foo bar -baz boff eid- 1234 gkn 12-34_loanmaster\n} .
   qq{Fizz :faz foz6 eid - 4532 gkn 34-21-hostmasfer\n} .
   qq{Do :not capture xeid - 999 gkn 34-21-xxx\n} .
   qq{Also do :not capture eid999 gkn 34-21-xxx\n} .
   qq{eid 762 biff bam1 zot@\n}
   ;
 print qq{[[$data]] \n};
 ;;
 my $separator = qr{ \s* - \s* | \s+ }xms;
 ;;
 my $captured_eids = my @EIDs = $data =~ m{ \b eid $separator \K \d+ }xmsg;
 ;;
 if ($captured_eids) {
     print 'captured EID(s): ', Dumper \@EIDs;
     }
 else {
     print 'no EIDs captured';
     }
"
[[Foo bar -baz boff eid- 1234 gkn 12-34_loanmaster
Fizz :faz foz6 eid - 4532 gkn 34-21-hostmasfer
Do :not capture xeid - 999 gkn 34-21-xxx
Also do :not capture eid999 gkn 34-21-xxx
eid 762 biff bam1 zot@
]]

captured EID(s): $VAR1 = [
          '1234',
          '4532',
          '762'
        ];
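The one-liner above builds $data inline. If the data lives in a file, slurping it might look something like the sketch below. The names slurp_file and extract_eids are my own hypothetical helpers, not anything from haj's code, and the file path is assumed to come from the command line.

```perl
use 5.010;    # needed for the \K regex operator
use strict;
use warnings;

# Hypothetical helper: read an entire file into a single string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    local $/;              # undef record separator: one read returns the whole file
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Hypothetical helper: same separator and match regex as in the example above.
sub extract_eids {
    my ($data) = @_;
    my $separator = qr{ \s* - \s* | \s+ }xms;
    # In list context, m//g returns every match; \K keeps only the digits.
    return $data =~ m{ \b eid $separator \K \d+ }xmsg;
}

# Runs only when a file name is given on the command line.
if (@ARGV) {
    print "$_\n" for extract_eids(slurp_file($ARGV[0]));
}
```

Localizing $/ inside the sub restores the normal line-at-a-time behavior as soon as the sub returns, so the rest of the program is unaffected.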
Update: For pre-5.10 version Perls, in place of the
m{ \b eid $separator \K \d+ }xmsg
match regex use the work-around (tested)
m{ \b eid $separator (\d+) }xmsg
(no \K operator).
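As a rough sketch, the capture-group form also drops into the line-by-line while-loop style mentioned at the top of this post, for files too big to slurp. The sub name eids_from_handle and the shortened sample data are my own assumptions; an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: collect EIDs from an open filehandle, one line at a time.
sub eids_from_handle {
    my ($fh) = @_;
    my $separator = qr{ \s* - \s* | \s+ }xms;
    my @eids;
    while (my $line = <$fh>) {
        # In list context, m//g with one capture group returns every $1.
        push @eids, $line =~ m{ \b eid $separator (\d+) }xmsg;
    }
    return @eids;
}

# Shortened sample data, read through an in-memory filehandle.
my $sample = join '',
    qq{Foo bar -baz boff eid- 1234 gkn 12-34_loanmaster\n},
    qq{Also do :not capture eid999 gkn 34-21-xxx\n},
    qq{eid 762 biff bam1 zot\n};

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print "$_\n" for eids_from_handle($fh);    # prints 1234 then 762
close $fh;
```

One difference to keep in mind: because \s also matches a newline, the slurped version can match an "eid" whose number falls on the following line, while the line-by-line version cannot.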
Give a man a fish: <%-{-{-{-<
In reply to Re: How to search an substring and eliminate before and after the substring (updated)
by AnomalousMonk
in thread How to search an substring and eliminate before and after the substring
by Murali_Newbee