in reply to Pattern Identification
"Fast and efficient" means, of course, Benchmark, which I'll leave up to you. Also, I'm not sure exactly how many a "whole load" is. At some point this will fail, as will anything else for that matter: everything has a breaking point.
#!/usr/bin/perl

# http://perlmonks.org/?node_id=1200434

use strict;
use warnings;
use re 'eval';

my $patterns = <<'END';
^\d{2}.\d{2}.\d{2}$ date
^\d{2}.\d{2}.\d{4}$ date
^[A-Z]{2}\d{9}[A-Z]{2}$ Royal Mail Track & Trace code
^\d{16}$ visa card
^\d{13}$ EAN-13 barcode
END

my $regex;

sub patternidentification
  {
  if( not defined $regex ) ##################### build a single regex just once
    {
    my $all = join '|', map {
      /^(\S+)\s++(.+)/ ? "(?:$1(?{'$2'}))" : die "bad pattern $_"
      } split /\n/, $patterns;
    $regex = qr/$all/;
    }
  return /$regex/ ? $^R : "unknown";
  }

##################### then try all matches

while(<DATA>)
  {
  chomp;
  my $answer = patternidentification($_);
  print "$_ is a $answer\n";
  }

__DATA__
12 12 17
09 30 2O17
09 30 2017
09 30 12017
123123123123123
1231231231231231
12312312312312312
456456456456
4564564564567
45645645645678
QW123456789WQ
Outputs:
12 12 17 is a date
09 30 2O17 is a unknown
09 30 2017 is a date
09 30 12017 is a unknown
123123123123123 is a unknown
1231231231231231 is a visa card
12312312312312312 is a unknown
456456456456 is a unknown
4564564564567 is a EAN-13 barcode
45645645645678 is a unknown
QW123456789WQ is a Royal Mail Track & Trace code
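Since the benchmarking is left as an exercise, here is a minimal sketch of how it might be set up with the core Benchmark module. It restates the combined-regex idea (each alternative ends in a `(?{...})` block whose value lands in `$^R` on a successful match) and compares it with a plain loop over separately compiled patterns. The loop baseline and the names `@table`, `by_combined`, and `by_loop` are my own assumptions for illustration, not something from the thread, and the timings will depend on your perl, your data, and how many patterns you really have.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use re 'eval';                  # required: the combined regex interpolates (?{...}) blocks
use Benchmark qw(cmpthese);

# Same patterns and labels as in the post, as [ regex source, label ] pairs.
# (@table is a hypothetical name for this sketch.)
my @table = (
    [ '^\d{2}.\d{2}.\d{2}$',     'date' ],
    [ '^\d{2}.\d{2}.\d{4}$',     'date' ],
    [ '^[A-Z]{2}\d{9}[A-Z]{2}$', 'Royal Mail Track & Trace code' ],
    [ '^\d{16}$',                'visa card' ],
    [ '^\d{13}$',                'EAN-13 barcode' ],
);

# Combined approach: one alternation; the branch that matches leaves its label in $^R.
my $all      = join '|', map { sprintf "(?:%s(?{'%s'}))", @$_ } @table;
my $combined = qr/$all/;

sub by_combined {
    my ($s) = @_;
    return $s =~ $combined ? $^R : 'unknown';
}

# Assumed baseline: try each precompiled pattern in turn, first match wins.
my @compiled = map { [ qr/$_->[0]/, $_->[1] ] } @table;

sub by_loop {
    my ($s) = @_;
    for my $p (@compiled) {
        return $p->[1] if $s =~ $p->[0];
    }
    return 'unknown';
}

# Test inputs taken from the __DATA__ section of the post.
my @inputs = (
    '12 12 17', '09 30 2O17', '09 30 2017', '09 30 12017',
    '123123123123123', '1231231231231231', '12312312312312312',
    '456456456456', '4564564564567', '45645645645678', 'QW123456789WQ',
);

# Make sure both approaches agree before timing anything.
for my $s (@inputs) {
    die "mismatch on '$s'" unless by_combined($s) eq by_loop($s);
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    combined => sub { by_combined($_) for @inputs },
    loop     => sub { by_loop($_)     for @inputs },
} );
```

With only five patterns the difference may well be small either way; the comparison becomes more interesting if you generate a few hundred or a few thousand table entries and rerun it.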
Re^2: Pattern Identification
by WhiteTraveller (Novice) on Oct 01, 2017 at 12:24 UTC
by haukex (Archbishop) on Oct 01, 2017 at 15:47 UTC
by AnomalousMonk (Archbishop) on Oct 01, 2017 at 15:27 UTC