in reply to Proper use of split
Assuming for the moment that this is not a well-known format like JSON, for which a more complete solution exists, I can think of two general approaches to this problem. One is to split the string on commas into a list. The other is to use “global matching” (the /g and /c modifiers), as described in the section “Using regular expressions in Perl” in perldoc perlretut. Of the two, I like the second better, especially if the data is consistently numeric.
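For comparison, here is a rough sketch of the first approach. The sample string is my own invention, not the original poster's data, and it assumes that no value ever contains a comma or a colon.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real input may look different.
my $data = '"temp":70.00,"tmode":2';

my %value;
for my $pair (split /,/, $data) {
    # Limit of 2: split only on the first colon in each pair.
    my ($key, $val) = split /:/, $pair, 2;
    $key =~ tr/"//d;            # strip the surrounding double quotes
    $value{$key} = $val;
}

print "$_=$value{$_}\n" for sort keys %value;
```

This is fine for tidy data, but it treats every comma as a separator, which is one reason I lean toward the regex approach.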
“Global matching” lets you apply a regex more than once to the same string, so that you can take a “winnowing the wheat from the chaff” approach by using a regular expression that corresponds to the “wheat.” The position at which the next match will start is reported by the pos() function, which has one very important “gotcha”: when no match has been made yet, so that the search starts “from the start of the string,” pos() returns undef, not zero. It goes back to undef after a failed match, too, unless you use /c. (Uh huh... “ouch! it bit me!”)
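A small sketch of that gotcha, using a made-up string. It records what pos() reports before the first match, after each successful match, and after the match finally fails.

```perl
use strict;
use warnings;

my $str = 'aa bb';    # hypothetical sample string

my @trace;

# Before any /g match has run, pos() is undef, not 0.
push @trace, defined(pos($str)) ? pos($str) : 'undef';

# In scalar context, each /g match resumes where the last one ended.
while ($str =~ /\w+/g) {
    push @trace, pos($str);
}

# The failed match that ended the loop reset pos() to undef (no /c used).
push @trace, defined(pos($str)) ? pos($str) : 'undef';

print "@trace\n";    # undef 2 5 undef
```

So any code that does arithmetic on pos() needs a defined() check first, or it will trip an "uninitialized value" warning.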
As an extemporaneous example, a pattern such as \"([a-z_]+)\"\:([0-9.]+) could be applied with /g in scalar context, and each successful match would return the matched substrings as $1 and $2 ... I repeat, extemporaneous example ... so it would return $1='temp', $2='70.00' the first time, $1='tmode', $2='2' the second time, and so on (if I actually got it right). It would skip over anything that did not match in search of the next thing that did. This can be a useful technique, although as with everything else having to do with regular expressions it demands rigorous testing. (Beware that if the regular expression does not encompass all of the actual data, any data which doesn’t match will simply be skipped! For example, I had to edit this post to include an underscore character ...)
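Here is roughly how that could look in a loop. The input string is invented for illustration (only 'temp' and 'tmode' come from the discussion), and I threw in a mixed-case key, "tState", to show the silent skipping.

```perl
use strict;
use warnings;

# Hypothetical sample data; "t_heat" and "tState" are made up.
my $data = '{"temp":70.00,"tmode":2,"tState":1,"t_heat":68.50}';

my %value;
my @order;

# Each pass through the loop picks up where the previous match ended.
while ($data =~ /"([a-z_]+)":([0-9.]+)/g) {
    push @order, $1;
    $value{$1} = $2;
}

print "$_=$value{$_}\n" for @order;

# "tState" never shows up: [a-z_]+ does not match the capital S,
# so that pair is skipped without any warning.
print "tState was skipped\n" unless exists $value{tState};
```

The captured values are still strings ('70.00', not 70), so add 0 to them if you need numbers. And note how quietly "tState" vanished, which is exactly why this needs testing against real data.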
Re^2: Proper use of split
by AnomalousMonk (Archbishop) on Jun 02, 2012 at 17:59 UTC