Pattern Matching request

Novice_1 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Pattern Matching request by Random_Walk (Prior) on Dec 15, 2004 at 10:06 UTC
If STRUCT marks the start of each record and occurs nowhere else, you would be better off splitting on STRUCT and just ignoring or shifting off the first element. split is mostly (always?) faster than a regex match. Cheers, R.
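A minimal sketch of the split approach described above, assuming STRUCT appears only as the record marker. The sample string is borrowed from the other replies in this thread, not from the original question, and the variable names are illustrative.

```perl
use strict;
use warnings;

# Sample data borrowed from elsewhere in this thread (an assumption about the real input).
my $dat = "0011222STRUCTdata1...............02121STRUCTdata2..........02342STRUCTdata3";

# split returns the text before the first STRUCT as element 0;
# that leading chunk is not a record, so shift it off.
my @records = split /STRUCT/, $dat;
shift @records;

print "$_\n" for @records;
```

Note that each record except the last still ends with the digits that precede the next STRUCT; whether those belong to the record depends on the data format.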
Re: Pattern Matching request by si_lence (Deacon) on Dec 15, 2004 at 08:39 UTC
I'm not too sure what you want to do, but if you want all matches except the first, I would get all the matches and just get rid of the first one. If you really want to match the word "STRUCT", use something like this:

```perl
use strict;
use warnings;

my $dat = "0011222STRUCTdata1...............02121STRUCTdata2..........02342STRUCTdata3";
my @m;

# Collect every occurrence of STRUCT, then drop the first one.
@m = $dat =~ /STRUCT/g;
shift(@m);
foreach (@m) { print "$_\n" }
```

If you want to match the data between the "STRUCT" markers, try one of these (both exclude the data1 part; if you need it, just delete one of the shifts):

```perl
# Option 1: capture everything up to the next STRUCT or the end of the string.
# .+? rather than .*? avoids a trailing empty match at the end of the string.
@m = $dat =~ /(.+?)(?:STRUCT|$)/g;
shift(@m);    # drop the leading digits
shift(@m);    # drop the data1 part
foreach (@m) { print "$_\n" }

# Option 2: split on STRUCT.
@m = split(/STRUCT/, $dat);
shift(@m);    # drop the leading digits
shift(@m);    # drop the data1 part
foreach (@m) { print "$_\n" }
```

si_lence
Re: Pattern Matching request by zejames (Hermit) on Dec 15, 2004 at 08:41 UTC
Some things aren't clear: do the numbers belong to the data, or are they separate information? Here, I've assumed that the numbers before the STRUCT keyword are not part of the data. I've also taken into account that the data may itself contain numbers. What makes those numbers different is that they do not precede a STRUCT keyword. To sum up the proposed solution to your problem: match everything and then remove what you are not interested in.

```perl
# Text wrapped to fit on the screen
my $text = "0011222STRUCTdata..........6ab.."
         . "02121STRUCTdata.........."
         . "021232STRUCTdata......"
         . "02342STRUCTdata......";

my @matches = $text =~ m/(\d+)          # First match some numbers
                         STRUCT         # Then the STRUCT keyword
                         (.+?)          # Then the data, which has to
                                        # be followed by
                         (?=            # (look-ahead assertion)
                           (?:\d+STRUCT # - numbers and 'STRUCT'
                             |          #   or
                             $          # - end of line
                           )
                         )/xg;

# Then remove the first two matches, that is, the first numbers and
# the first data
@matches = splice @matches, 2;

{
    local $, = ",";
    print @matches, "\n";
}
```

The look-ahead trick makes it easy to handle numbers inside the data text.

-- zejames
Re: Pattern Matching request by ysth (Canon) on Dec 15, 2004 at 08:54 UTC
It would be helpful if you said what you want to do for each match. I'd do a scalar-context //g match first, then loop to get the rest (untested):

```perl
# Consume the first marker; a scalar-context //g match leaves pos($data) just past it.
$data =~ /^\d+STRUCT/g or warn "gang aft agley";

# Each further match resumes from where the previous one ended.
while ($data =~ /\d+STRUCT/g) {
    # process match
}
```
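To make the scalar-context idea concrete, here is a small runnable sketch. It assumes the sample string used in the other replies; the capture group and the `@prefixes` array are added purely for illustration. It relies on the fact that a successful `//g` match in scalar context records its end position in `pos($data)`, and the next `//g` match on the same string starts there.

```perl
use strict;
use warnings;

# Sample data borrowed from elsewhere in this thread (an assumption about the real input).
my $data = "0011222STRUCTdata1...............02121STRUCTdata2..........02342STRUCTdata3";

my @prefixes;    # illustrative: digits found in front of each later STRUCT

# Skip the first marker. If this match fails, pos($data) is reset and
# the loop below would start from the beginning of the string.
$data =~ /^\d+STRUCT/g
    or warn "input does not start with digits followed by STRUCT\n";

# Every remaining marker is found in turn, starting after the first one.
while ($data =~ /(\d+)STRUCT/g) {
    push @prefixes, $1;
    print "prefix $1, marker ends at offset ", pos($data), "\n";
}
```

With this input the loop sees only the second and third markers, because the first was consumed by the anchored match before the loop started.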