in reply to Re: removing redundantwhitespace
in thread removing redundantwhitespace
It doesn't preserve line breaks (i.e., remove only the lines that contain nothing but whitespace). Nor does it remove leading whitespace.
Below is a fairly simple, single-pass regex that handles all but leading spaces on the first line, so I've added a very simple regex before it:
s/^\s+//; s{ [^\S\n]* (?: (\n)\s* | [^\S\n]+ ) }{ $1 || ' ' }gex
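To make the behavior easier to check, here is a minimal sketch that wraps the two substitutions above in a helper. The name `tidy_ws` and the sample text are only illustrative assumptions, not anything from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper; only the two substitutions come from the post.
sub tidy_ws {
    my ($text) = @_;

    # Strip leading whitespace at the very start of the string.
    $text =~ s/^\s+//;

    $text =~ s{
        [^\S\n]*            # optional horizontal whitespace
        (?: (\n) \s*        # a newline plus all whitespace after it
          | [^\S\n]+        # or a run of horizontal whitespace
        )
    }{ $1 || ' ' }gex;      # keep one newline if one was seen, else one space

    return $text;
}

print tidy_ws("  foo   bar  \n\n \t baz\tqux\n");
```

One thing the sketch makes visible: trailing whitespace that is not followed by a newline collapses to a single space rather than disappearing, so `"a  "` comes back as `"a "`.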
OT, but the node title made me wonder if there was a reasonable single-pass regex for removing leading and trailing whitespace while collapsing internal whitespace. I can see a lot of approaches that will work, but most seem to get bogged down in unfortunate complexities. Ignoring warnings lets me do:
s{(?<=(\S))?\s+(?=(\S))?}{length($1.$2)<2?'':' '}gex
Requiring Perl 5.010 means I don't have to ignore warnings:
s{(?<=(\S))?\s+(?=(\S?))}{length(($1//'').$2)<2?'':' '}gex
Surely we can do better than that. Oh, again requiring 5.010, I can do this:
s{(^)?\s+(\z)?}{$1//$2//' '}gex
That's not too bad. (:
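For anyone who wants to try the anchor-based version, here is a small sketch. The wrapper name `squeeze_anchors` and the sample inputs are illustrative assumptions. It also assumes the `/e` flag, since the replacement has to be evaluated as code for `//` to act as defined-or:

```perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator and say

# Hypothetical wrapper; only the substitution comes from the post.
sub squeeze_anchors {
    my ($s) = @_;

    # $1 is '' (but defined) when the run starts at the beginning of the
    # string, $2 is '' when it reaches the end, and both are undef for
    # whitespace in the middle, which then falls through to ' '.
    $s =~ s{(^)?\s+(\z)?}{ $1 // $2 // ' ' }gex;

    return $s;
}

for my $in ("  foo \t bar\n", "foo", " \n ", "a  b   c") {
    say '[', squeeze_anchors($in), ']';
}
```

The brackets in the output are only there to make any leftover leading or trailing whitespace visible.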
- tye
Re^3: removing redundantwhitespace (too far)
by ikegami (Patriarch) on Sep 14, 2008 at 15:59 UTC
by tye (Sage) on Sep 14, 2008 at 19:04 UTC