SamCG has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to parse the bodies of some automated emails. One would think it would be easy, but for some reason the generator of the emails does NOT break lines with a \n; it pads each line with spaces instead. The number of spaces added seems to vary.
Update: Ah, I've found a potential way. while ($bdy=~/(.*?):\s(.*?)\s\s/g) seems to work alright. Comments on this approach?
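A minimal sketch of the loop from the update, collecting the captures into a hash. The sample string and the %field name are made up for illustration, and the pattern is slightly adjusted on the assumption that fields are separated by at least two whitespace characters: it trims the space before the colon and also accepts end-of-string as a terminator, since the last field may have no trailing padding.

```perl
use strict;
use warnings;

# Hypothetical sample: fields separated by runs of two or more spaces.
my $bdy = 'Security : BULGY N V-          Item Overridden : Earnings Per Share          '
        . 'Overridden Value : 160 (USD)     Effective : 08/20/1999 through 08/20/2000';

my %field;
# Key: lazy run up to the colon (surrounding whitespace trimmed).
# Value: lazy run up to two or more whitespace characters, or end of string.
while ($bdy =~ /\s*(.+?)\s*:\s(.*?)(?:\s{2,}|\z)/g) {
    $field{$1} = $2;
}

printf "%-18s => %s\n", $_, $field{$_} for sort keys %field;
```

A value that itself contains a colon followed by whitespace would confuse this pattern, so it is only as robust as that assumption about the data.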
-----------------
s''limp';@p=split '!','n!h!p!';s,m,s,;$s=y;$c=slice @p1;so brutally;d;$n=reverse;$c=$s**$#p;print(''.$c^chop($n))while($c/=$#p)>=1;
Initially I broke the email into an array by splitting on \s{15,}, but this isn't ideal, and it drops some of the values. I'm considering splitting on the colons, but I'm not convinced that's a great idea, and it seems to lead to more headaches. Any ideas for a somewhat robust, straightforward way to parse this?
Security : BULGY N V- + + + + + + + + + + Item Overridden : Earnings Per Share + + + + + + + + + + Initial Value : (USD) + + + + + + + + + + Current Value : () + + + + + + + + + + Overridden Value : 160 (USD) + + + + + + + + + + Effective : 08/20/1999 through 08/20/2000 + + + + + + + + + + Override Type : Data SecurityID : 1076665 Sedol : 2451234 Cusip : N66696606 ISIN : NL0006122988
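One possible alternative to splitting on \s{15,} is a two-stage split: first on any run of two or more whitespace characters, then on the first colon of each chunk. This is only a sketch. parse_fields is a hypothetical helper name, the input is invented, and it assumes that single spaces occur only inside keys and values, never between fields.

```perl
use strict;
use warnings;

# Hypothetical helper: split the body into "key : value" chunks on runs of
# two or more whitespace characters, then split each chunk on its first colon.
sub parse_fields {
    my ($text) = @_;
    my %field;
    for my $chunk (split /\s{2,}/, $text) {
        my ($key, $value) = split /\s*:\s*/, $chunk, 2;
        next unless defined $value;    # skip empty chunks and chunks without a colon
        $field{$key} = $value;
    }
    return \%field;
}

# Made-up input in the shape of the sample above, padded with spaces.
my $body = 'Override Type : Data     SecurityID : 1076665   Sedol : 2451234  '
         . 'Cusip : N66696606      ISIN : NL0006122988';

my $fields = parse_fields($body);
print "$_ => $fields->{$_}\n" for sort keys %$fields;
```

The limit of 2 on the inner split keeps any later colon inside the value, so something like a timestamp would stay in one piece.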
Replies are listed 'Best First'.
Re: Parsing semi-erratic text
by ikegami (Patriarch) on Aug 23, 2006 at 19:39 UTC
by SamCG (Hermit) on Aug 23, 2006 at 19:50 UTC
by ikegami (Patriarch) on Aug 23, 2006 at 20:04 UTC
Re: Parsing semi-erratic text
by GrandFather (Saint) on Aug 23, 2006 at 21:07 UTC