in reply to Stripping non alphanumeric characters and leaving punctuation characters from a file
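Answer to the stated problem: there are character classes in Perl for control characters. In Unicode, s/\p{IsC}//g; or s/\p{IsCntrl}//g; in POSIX, s/[[:cntrl:]]//g;. Note that \p{IsC} is the broader "Other" general category, so it also strips format, private-use, surrogate, and unassigned characters, not just controls.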
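A minimal sketch of those substitutions in action. The sample string and variable names are made up for illustration; only the character classes come from the answer above. Bear in mind that tab, carriage return, and newline are control characters too, so the last substitution shows one way to spare tab and newline if the line structure matters.

```perl
use strict;
use warnings;

# Hypothetical sample input with embedded control characters.
my $line = "Tax\x00 rate:\x07 7.25%\x1B";

# POSIX class: removes every control character.
( my $posix = $line ) =~ s/[[:cntrl:]]//g;

# Unicode property: \p{Cntrl} is general category Cc.
( my $unicode = $line ) =~ s/\p{Cntrl}//g;

# Variant that keeps tab and newline: the negated class matches
# anything that is NOT a non-control character, a tab, or a newline.
my $text = "a\tb\x07\n";
( my $kept = $text ) =~ s/[^\P{Cntrl}\t\n]//g;

print "$posix\n";      # prints "Tax rate: 7.25%"
print "$unicode\n";    # prints "Tax rate: 7.25%"
```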
As for the real problem, it sounds very clunky to parse that information out of a PDF file on every run. Why not extract it once and place it in a small database or flat file? [Update] The same objection holds for parsing it from HTML each time. Scribble the tax rate data, and only the tax rate data, somewhere you can get at it easily.
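One way the extract-once idea might look, as a rough sketch only. The file name tax_rates.tsv, the sub names, and the sample rate are all hypothetical placeholders; extract_rates_from_source() stands in for whatever PDF or HTML parsing the original script does.

```perl
use strict;
use warnings;

my $cache = 'tax_rates.tsv';    # hypothetical cache file name

# Hypothetical stand-in for the expensive PDF/HTML parsing step.
sub extract_rates_from_source {
    return { 'Example County' => '7.25' };    # placeholder data, not real rates
}

# Return a hashref of name => rate, parsing the source only when
# no cache file exists yet.
sub load_rates {
    my ($file) = @_;

    if ( -e $file ) {
        my %rates;
        open my $in, '<', $file or die "Cannot read $file: $!";
        while ( my $row = <$in> ) {
            chomp $row;
            my ( $name, $rate ) = split /\t/, $row, 2;
            $rates{$name} = $rate;
        }
        close $in;
        return \%rates;
    }

    # First run: do the slow extraction, then save one "name<TAB>rate" per line.
    my $fresh = extract_rates_from_source();
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$_\t$fresh->{$_}\n" for sort keys %$fresh;
    close $out or die "Cannot close $file: $!";
    return $fresh;
}

my $rates = load_rates($cache);
print "$_: $rates->{$_}\n" for sort keys %$rates;
```

Deleting the cache file forces a fresh extraction the next time the script runs, which is a simple way to pick up new rates.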
Perl's binmode function may help with your file reading problem.
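A small sketch of what binmode does, assuming the trouble comes from text-mode translation. The temporary file and its contents are invented for the demonstration. On platforms such as Windows, a handle in text mode translates CRLF pairs and can treat a Ctrl-Z byte as end of file; binmode turns that off so the bytes arrive untouched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a throwaway sample file containing CRLF pairs and a Ctrl-Z byte.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
binmode $tmp;
print {$tmp} "rate\r\n\x1A7.25\r\n";
close $tmp or die "Cannot close $path: $!";

open my $fh, '<', $path or die "Cannot open $path: $!";
binmode $fh;    # raw bytes: no CRLF translation, Ctrl-Z is ordinary data
my $data = do { local $/; <$fh> };    # slurp the whole file
close $fh;

printf "%d bytes read\n", length $data;    # prints "13 bytes read"
```

If the file is text in a known encoding rather than raw bytes, binmode also accepts a layer, for example binmode $fh, ':encoding(UTF-8)'.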
After Compline,
Zaxo
Re: Re: Stripping non alphanumeric characters and leaving punctuation characters from a file
by Popcorn Dave (Abbot) on Jun 06, 2003 at 19:32 UTC