http://qs1969.pair.com?node_id=569191


in reply to Parsing semi-erratic text

The data you submitted *does* have newlines.

    while (<DATA>) {
        my ($key, $val) = /^\s*([^:]*?)\s*:\s*(.*?)\s*$/
            or next;
        print("[$key:$val]\n");
    }

and

    while (<DATA>) {
        my ($key, $val) = split(/:/, $_, 2);
        next if not defined $val;
        s/^\s+//, s/\s+$// for $key, $val;
        print("[$key:$val]\n");
    }

both do the trick.
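As a rough, self-contained check of the two approaches above, the sketch below wraps each in a function and runs both over a few made-up lines. The names parse_regex and parse_split and the sample data are assumptions for illustration only; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: regex approach.
# Returns (key, value), or an empty list when the line has no colon.
sub parse_regex {
    my ($line) = @_;
    return $line =~ /^\s*([^:]*?)\s*:\s*(.*?)\s*$/;
}

# Hypothetical helper: split approach.
# Splits on the first colon only, then trims both halves.
sub parse_split {
    my ($line) = @_;
    my ($key, $val) = split /:/, $line, 2;
    return unless defined $val;
    s/^\s+//, s/\s+$// for $key, $val;
    return ($key, $val);
}

# Made-up sample lines (an assumption, not the poster's data).
my @sample = (
    "  Name :  John Smith  \n",
    "no separator here\n",
    "Time: 10:30\n",
);

for my $line (@sample) {
    my ($key, $val) = parse_regex($line) or next;
    my ($k2, $v2)   = parse_split($line);
    print "[$key:$val] [$k2:$v2]\n";
}
```

Both functions keep everything after the first colon as the value, so a value such as 10:30 survives intact, and lines without a colon are skipped.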

Re^2: Parsing semi-erratic text
by SamCG (Hermit) on Aug 23, 2006 at 19:50 UTC
    Hrmm...perhaps an effect of my cutting and pasting? The body of my email gets read into a variable (so it's like slurping a file). I can't seem to split on newlines, and if I use a regex to count I get only one in each email (which I presume is at the end).
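One possible explanation, offered as a guess rather than a diagnosis: if the body uses bare carriage returns between lines and a single line feed at the end, counting \n finds exactly one, and splitting on \n does nothing useful. The sketch below splits on any of the three common line endings. The helper name split_lines and the sample body are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a slurped body into lines,
# accepting CRLF, bare CR, or bare LF as the line ending.
sub split_lines {
    my ($body) = @_;
    return split /\r\n|\r|\n/, $body;
}

# Made-up body: bare CRs between lines, one LF at the very end.
my $body = "Name: John Smith\rTime: 10:30\rPlace: Room 4\n";

# Count line feeds the way the post describes; only the final one is found.
my $lf_count = () = $body =~ /\n/g;
print "LF count: $lf_count\n";

print "[$_]\n" for split_lines($body);
```

Dumping the variable with something like Data::Dumper with $Data::Dumper::Useqq set would show which characters actually separate the lines.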

    Thank you for the negated character class ([^:]) suggestion, by the way. I hate putting .* into regexes, even with the non-greedy modifier.



    -----------------
    s''limp';@p=split '!','n!h!p!';s,m,s,;$s=y;$c=slice @p1;so brutally;d;$n=reverse;$c=$s**$#p;print(''.$c^chop($n))while($c/=$#p)>=1;
      I agree. .* and .*? often assume the data is formatted correctly.
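A small sketch of that point, under assumed patterns and data: when a line is malformed, a non-greedy .*? can still backtrack across a colon to force a match, while a negated class simply fails. The function names match_dot and match_neg, the "key: number" format, and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical patterns for a "key: number" line.
# match_dot uses .*? for the key; match_neg uses a negated class.
sub match_dot { my ($l) = @_; return $l =~ /^(.*?):\s*(\d+)$/;   }
sub match_neg { my ($l) = @_; return $l =~ /^([^:]*):\s*(\d+)$/; }

# Made-up lines: one well-formed, one malformed (two colons).
for my $line ("port: 8080", "host: web01: 8080") {
    my @dot = match_dot($line);
    my @neg = match_neg($line);
    print "line: $line\n";
    print "  .*?   -> ", (@dot ? "[$dot[0]] [$dot[1]]" : "no match"), "\n";
    print "  [^:]* -> ", (@neg ? "[$neg[0]] [$neg[1]]" : "no match"), "\n";
}
```

On the malformed line the .*? version quietly accepts "host: web01" as the key, whereas the negated class rejects the line, which is usually the easier failure to notice.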