in reply to Using a regex to replace looping and splitting

Here's my take on this general problem:

c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
"my $rx_intro = qr{ special : \d{4} : }xms;
 ;;
 my $rx_key = qr{ \w+ }xms;
 my $rx_val = qr{ \d+ }xms;
 ;;
 my $rx_sep   = qr{ : }xms;
 my $rx_delim = qr{ [|] }xms;
 ;;
 my $s = 'special:1001:area_code:617|special:1001:zip_code:02205|special:1001:dow:0|special:1001:tod:14';
 ;;
 my $hashref = { $s =~ m{ \G $rx_intro ($rx_key) $rx_sep ($rx_val) (?: $rx_delim | \z) }xmsg };
 dd $hashref;
 "
{ area_code => 617, dow => 0, tod => 14, zip_code => "02205" }

By defining the patterns for the pieces of the string separately, it's easier to play with them and adjust for variations in the data. Might there be whitespace around the : or | delimiters? Might 'special' sometimes be 'general'? Might the pattern of a key be more complex than \w+?
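
As a rough sketch of that kind of adjustment, here is the same approach with the pieces loosened. The variations are assumptions made up for illustration, not taken from the original data: the intro word may be 'special' or 'general', a key may contain hyphens, and optional whitespace may surround the delimiters. The sample string is hypothetical too.

```perl
use strict;
use warnings;

# Assumed variations (not from the original data): the intro word may be
# 'special' or 'general', keys may contain hyphens, and optional
# whitespace may surround the ':' and '|' delimiters.
my $rx_intro = qr{ (?: special | general ) \s* : \s* \d{4} \s* : \s* }xms;
my $rx_key   = qr{ \w+ (?: - \w+ )* }xms;
my $rx_val   = qr{ \d+ }xms;
my $rx_sep   = qr{ \s* : \s* }xms;
my $rx_delim = qr{ \s* [|] \s* }xms;

# Hypothetical input exercising those variations.
my $s = 'special:1001:area_code : 617 | general:1001:zip-code:02205';

# Same extraction as before: in list context, m//g returns every
# (key, value) capture pair, which populates the hash directly.
my %field = $s =~ m{
    \G $rx_intro ($rx_key) $rx_sep ($rx_val) (?: $rx_delim | \z )
}xmsg;

print "$_ => $field{$_}\n" for sort keys %field;
```

Only the small qr// definitions changed; the extraction match itself is the same shape as in the one-liner.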

Also, defining patterns separately allows one to validate the entire string before trying to extract anything from it. (Working with known-valid data is always nice.) Once a string is known to be in a particular, valid format, it is often quite simple to extract data fields from it.
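
A minimal sketch of what validate-then-extract might look like, built from the same pieces. The names $rx_record, $rx_valid and parse_fields are my own inventions for this example, and the second test string is deliberately broken hypothetical data.

```perl
use strict;
use warnings;

my $rx_intro = qr{ special : \d{4} : }xms;
my $rx_key   = qr{ \w+ }xms;
my $rx_val   = qr{ \d+ }xms;
my $rx_sep   = qr{ : }xms;
my $rx_delim = qr{ [|] }xms;

# One record, and a whole string made of delimiter-separated records.
my $rx_record = qr{ $rx_intro $rx_key $rx_sep $rx_val }xms;
my $rx_valid  = qr{ \A $rx_record (?: $rx_delim $rx_record )* \z }xms;

# Hypothetical helper: returns a hash reference of the fields,
# or undef (in scalar context) if the string as a whole is not valid.
sub parse_fields {
    my ($s) = @_;
    return unless $s =~ $rx_valid;

    # The string is known to be valid, so extraction can be simple.
    my %field = $s =~ m{ $rx_intro ($rx_key) $rx_sep ($rx_val) }xmsg;
    return \%field;
}

for my $s (
    'special:1001:area_code:617|special:1001:tod:14',
    'special:1001:area_code:617|bogus|special:1001:tod:14',
) {
    my $fields = parse_fields($s);
    my $report = defined $fields
        ? join ', ', map { "$_=$fields->{$_}" } sort keys %$fields
        : 'invalid';
    print "$report\n";
}
```

Because the whole string is checked first, the extraction regex no longer needs the \G anchor or the trailing delimiter alternation to guard against junk in the middle of the data.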

Update: See also "regex is not working as I intended" for a recent discussion of what seems to be a similar problem, although you will have to drill down a way before you get to a specification of the structure of the data that fireblood is working with.


Give a man a fish:  <%-{-{-{-<