pattern matching

Spooky has asked for the wisdom of the Perl Monks concerning the following question:

Re: pattern matching by FunkyMonk (Bishop) on Apr 10, 2009 at 12:07 UTC
`/[A-Z]{2,5}\d{5}/` See perlretut for a tutorial and perlre for the details.
Re^2: pattern matching by JavaFan (Canon) on Apr 10, 2009 at 13:04 UTC
Curious. You restrict yourself to the Western, unaccented letters (ASCII), but you allow hundreds of different digits. I would have used either `/[A-Z]{2,5}[0-9]{5}/ # ASCII ranges` or `/\p{Lu}{2,5}\p{Nd}{5}/ # Full Unicode set`
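To make the `\d` versus `[0-9]` point concrete, here is a minimal sketch. The sample strings and the helper names `matches_d` and `matches_ascii` are made up for illustration; the patterns are anchored only so that each helper tests the whole string.

```perl
use strict;
use warnings;

# Hypothetical samples: two ASCII capitals followed by five digits.
# The second string uses ARABIC-INDIC DIGIT ONE..FIVE (U+0661..U+0665).
my $ascii_sample  = "AB12345";
my $arabic_sample = "AB\x{0661}\x{0662}\x{0663}\x{0664}\x{0665}";

# \d matches any Unicode decimal digit on a character string
sub matches_d     { return $_[0] =~ /^[A-Z]{2,5}\d{5}$/    ? 1 : 0 }

# [0-9] matches only the ten ASCII digits
sub matches_ascii { return $_[0] =~ /^[A-Z]{2,5}[0-9]{5}$/ ? 1 : 0 }

print "ascii sample:  d=",  matches_d($ascii_sample),
      " ascii=", matches_ascii($ascii_sample),  "\n";
print "arabic sample: d=",  matches_d($arabic_sample),
      " ascii=", matches_ascii($arabic_sample), "\n";
```

Both patterns accept the ASCII sample; only the `\d` version accepts the Arabic-Indic one. On perl 5.14 and later, the `/a` modifier is another way to restrict `\d` to ASCII.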
Re^3: pattern matching by Porculus (Hermit) on Apr 10, 2009 at 13:15 UTC
Premature generalisation is the root of much evil. In practice, since the OP did not mention any character-set complications, it is a reasonable assumption that the task in question does not require matching non-ASCII capital letters or require excluding non-ASCII numerals. So there is no need to worry about Unicode ranges, and distinguishing between `\d` and `[0-9]` is splitting hairs.
Re^4: pattern matching by JavaFan (Canon) on Apr 10, 2009 at 13:49 UTC
Re: pattern matching by Bloodnok (Vicar) on Apr 10, 2009 at 12:14 UTC
Supplementing FunkyMonk's reply: assuming you want a one-liner to scan a file and print any matching line, you want `perl -ne 'print if /[A-Z]{2,5}\d{5}/' some_file` In addition to the references provided by FunkyMonk, also look at perlrun. A user level that continues to overstate my experience :-))
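For anyone who wants the one-liner spelled out: perlrun documents `-n` as wrapping the code in a `while (<>) { ... }` loop. The sketch below does roughly the same thing as a script. The sub name `grep_ids` and the sample data are made up, and it reads from an in-memory filehandle rather than a real `some_file` so it can be run as-is.

```perl
use strict;
use warnings;

# Return every line read from $fh that contains the pattern,
# mirroring what `perl -ne 'print if /.../'` prints.
sub grep_ids {
    my ($fh) = @_;
    my @hits;
    while (my $line = <$fh>) {
        push @hits, $line if $line =~ /[A-Z]{2,5}\d{5}/;
    }
    return @hits;
}

# Hypothetical input, opened as an in-memory file
my $data = "first AB12345 line\nnothing here\nXYZAB99999\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
print grep_ids($fh);   # prints the first and third lines
close $fh;
```

To run it against a real file, open that file instead of `\$data`.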
Re^2: pattern matching by Spooky (Beadle) on Apr 10, 2009 at 12:19 UTC
..excellent ..thanks!
Re: pattern matching by leocharre (Priest) on Apr 10, 2009 at 14:06 UTC
I often match things like these within text chunks. The question and answers here are great for validating or checking a value, but what if you're fishing these out of a larger chunk of text?

For example, if the text you are matching against is `$text = 'ABCDE67890';` then, yes: `$text =~ /([A-Z]{2,5}\d{5})/ or die; print "got $1";` (The parentheses are for "remembering" what we matched; we can get it with `$1` later.)

However, if your text chunk is `$text = 'ADABCDE6789023424'; $text =~ /([A-Z]{2,5}\d{5})/ or die; print $1;` this will print ~~'DE67890'~~ ABCDE67890. Thanks gwadej (see below). So you still get a match here. Is this the behaviour you want? I don't know the context in which you are matching.

If you want to check that the entire string fits the pattern, you need to do this instead: `$text =~ /^([A-Z]{2,5}\d{5})$/ or die;` (The `^` anchors the match at the beginning of the string; the `$` marks the end.)

If you want to match within a large text chunk such as "This YU123456 is a piece of text and the id code would be AG12345 orAH12345", then this pattern will match AG12345 but NOT AH12345, and NOT YU12345: `/\b[A-Z]{2}\d{5}\b/` And this pattern will match all of them: `/[A-Z]{2}\d{5}/`

Just something to keep in mind.
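Here is a small sketch of the three cases, using the sample sentence from this post. The sub name `is_valid_id` is made up; `/g` in list context is used to collect every match rather than just the first.

```perl
use strict;
use warnings;

my $text = 'This YU123456 is a piece of text and the id code would be AG12345 orAH12345';

# With no capture groups, /g in list context returns each whole match.
my @bounded   = $text =~ /\b[A-Z]{2}\d{5}\b/g;   # word boundaries on both sides
my @unbounded = $text =~ /[A-Z]{2}\d{5}/g;       # anywhere in the text

print "bounded:   @bounded\n";     # bounded:   AG12345
print "unbounded: @unbounded\n";   # unbounded: YU12345 AG12345 AH12345

# Validation: the whole string must be 2-5 capitals followed by 5 digits.
sub is_valid_id { return $_[0] =~ /^[A-Z]{2,5}\d{5}$/ ? 1 : 0 }

print is_valid_id('ABCDE67890'), "\n";          # 1
print is_valid_id('ADABCDE6789023424'), "\n";   # 0
```

Note that `$` also allows a trailing newline before the end of the string; use `\z` if that matters for your data.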
Re^2: pattern matching by gwadej (Chaplain) on Apr 10, 2009 at 14:20 UTC
Although I don't disagree with the overall thrust of your argument, there is a tiny mistake. However.. if your text chunk is: `$text='ADABCDE6789023424'; $text=~/([A-Z]{2,5}\d{5})/ or die; print $1; # will print 'DE67890'` It actually would print `ABCDE67890`. Remember the `{2,5}` matches the longest string it can, unless you make it non-greedy. G. Wade
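A quick check of the greedy versus non-greedy behaviour, as a sketch. One caveat worth seeing in action: in this particular string the non-greedy form captures the same text, because the regex engine takes the leftmost position where a match is possible (the `A` at offset 2), and from there all five letters are needed to reach the digits. Greediness only changes the result when nothing forces the letters to stretch, as in the second pair below.

```perl
use strict;
use warnings;

my $text = 'ADABCDE6789023424';

# Greedy and non-greedy quantifiers on the thread's example string
my ($greedy) = $text =~ /([A-Z]{2,5}\d{5})/;
my ($lazy)   = $text =~ /([A-Z]{2,5}?\d{5})/;

print "greedy: $greedy\n";   # greedy: ABCDE67890
print "lazy:   $lazy\n";     # lazy:   ABCDE67890

# With nothing required after the letters, the difference shows up
my ($long)  = 'ABCDE' =~ /([A-Z]{2,5})/;
my ($short) = 'ABCDE' =~ /([A-Z]{2,5}?)/;

print "long:  $long\n";      # long:  ABCDE
print "short: $short\n";     # short: AB
```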