Hi AnomalousMonk,

Contrary to Laurent_R's aversion to using a single regex to extract data fields from a record ...

I have no aversion whatsoever to regexes; I actually use them very often, and I love them. ;-)

I was only saying that, in this specific case, using the split function (which, BTW, explicitly uses a regex here) would IMHO lead to more concise and probably clearer code. Your suggested code definitely achieves clarity and ease of maintenance, but not concision.
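For illustration, a split-based version might look roughly like the sketch below. This is an assumption about its general shape, not the exact code posted earlier in the thread, and the %field hash name is made up for the example.

```perl
use strict;
use warnings;

my $line = "ID=1 First=John Last=Doe AGE=42";

# Split the record on whitespace, then split each "key=value" token on "=".
# The resulting flat list of keys and values is collected into a hash.
# (%field is a hypothetical name, not taken from the original thread.)
my %field = map { split /=/, $_, 2 } split ' ', $line;

print "$field{ID} $field{First} $field{Last} $field{AGE}\n";
```

Note that this performs no validation at all: any key=value pairs in the line end up in the hash, whatever they look like.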

If the aim is concision, then the regex could be something like this (tested under the Perl debugger):

  DB<17> $line = "ID=1 First=John Last=Doe AGE=42";

  DB<18> $word = qr/[a-zA-Z]+/;

  DB<19> ($id, $first, $last, $age) = $line =~ /^ID=(\d+)\s+First=($word)\s+Last=($word)\s+AGE=(\d+)\s*$/;

  DB<20> x ($id, $first, $last, $age)
0  1
1  'John'
2  'Doe'
3  42

or even in one single line:
my ($id, $first, $last, $age) = $line =~ /^ID=(\d+)\s+First=([a-zA-Z]+)\s+Last=([a-zA-Z]+)\s+AGE=(\d+)\s*$/;
which is now quite concise, but arguably less clear and maintainable than the simple split I originally suggested. Admittedly, the above regex does a bit more data validation than the split version, but whether you actually need validation depends on the situation (essentially: where is the input data coming from?). Sometimes you don't need it (e.g. you produced the data yourself and really know what it looks like); sometimes you do, but it can be difficult to figure out how extensive the validation should be. Maybe the $word regex definition should be something like this:
$word = qr/[A-Z][a-z]+/;
or maybe simply:
$word = qr/[a-z]+/i;
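To see how these choices differ in practice, here is a small sketch that runs the three candidate definitions against a few sample names. The sample names, the %word hash and the accepts() helper are all made up for illustration.

```perl
use strict;
use warnings;

# The three candidate definitions of $word discussed above.
my %word = (
    any_letters => qr/[a-zA-Z]+/,
    capitalized => qr/[A-Z][a-z]+/,
    ignore_case => qr/[a-z]+/i,
);

# Hypothetical helper: true if $name consists entirely of one $word match.
sub accepts {
    my ($re, $name) = @_;
    return $name =~ /\A$re\z/ ? 1 : 0;
}

# Report, for each sample name, which definitions accept it.
for my $name ("John", "john", "McDonald", "O'Brien") {
    my @ok = grep { accepts($word{$_}, $name) } sort keys %word;
    print "$name: ", (@ok ? "@ok" : "rejected by all"), "\n";
}
```

Only the capitalized variant rejects "john" and "McDonald", and all three reject "O'Brien", which shows how quickly the "right" level of validation becomes a judgment call.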
Notice that this is opening an entirely different subject. Well, I'll leave it there, as this is getting slightly off-topic.


In reply to Re^4: Having an Issue updating hash of hashes by Laurent_R
in thread Having an Issue updating hash of hashes by perlguyjoe
