in reply to Text File Parsing.

If it's not just a CR/LF (\r\n) issue carried along with transferring a file from a DOS/Windows machine to a Unix machine without proper line-end conversion, then the leading whitespace theory must be accurate. As others have mentioned, you can deal with that with a s/^\s+// substitution (note the ^ anchor; without it, the regex removes the first run of whitespace wherever it occurs in the line).

However, it also looks like there may be extra newlines in there as well. You apparently want only single newline characters at the end of each line, so that the text appears "single spaced".

Update: s/^\s+// will also wipe out lines that contain only "newline" characters, leaving them as empty strings: a desirable side-effect. However, after that, it may be helpful to completely eliminate elements from @array that have now become empty.
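
A minimal sketch of those two steps, assuming the file's lines have already been read into @array (the sample data here is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the lines read from the file.
my @array = ("   first line\n", "\n", "\t second line\n", "  \n", "third line\n");

# Strip leading whitespace from every element. A line that holds only
# whitespace (including a bare newline) becomes an empty string.
s/^\s+// for @array;

# Drop the elements that are now empty.
@array = grep { length } @array;

print @array;
```

The grep tests length rather than truth, so a line consisting of just "0" would not be thrown away by accident.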


Dave


"If I had my life to live over again, I'd be a plumber." -- Albert Einstein

Re: Re: Text File Parsing.
by graff (Chancellor) on Nov 25, 2003 at 03:34 UTC
    ... from a DOS/Windows machine to a Unix machine ...

    You mean, from a unix machine to a dos/windows machine -- that's the direction in which this sort of symptom appears, because the dos machine needs both the CR and the LF to display the text properly, and will display text in a manner just like the OP showed if the CR is missing. Meanwhile, unix uses only the LF in its text files, but will usually display a dos file (with the "extra" CR next to each LF) intelligibly.

    The example in the OP was caused either by this issue, or else it results from trying to do something like an X-windows "select/paste" operation from an html browser window to some plain-text window, where the selected lines in the browser happen to be part of a <table>. No way to be sure, given the information originally provided, but the html-paste seems more likely, and removing whitespace is the way to go (rather than adding CR's).
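
    If you'd rather check than guess, a rough sketch like the following counts the two kinds of line ending. The sample string is made up; with a real file you would read the contents into $text yourself, and on a Windows perl you would probably want binmode on the filehandle so the CRs are not translated away before you can count them.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample text: two CRLF-terminated lines and one bare-LF line.
    my $text = "one\r\ntwo\r\nthree\n";

    # Count the matches by assigning the match list through an empty list.
    my $crlf    = () = $text =~ /\r\n/g;       # LF preceded by CR
    my $lf_only = () = $text =~ /(?<!\r)\n/g;  # LF not preceded by CR

    printf "CRLF: %d, bare LF: %d\n", $crlf, $lf_only;
    ```

    If the file is consistently one style or the other, line endings are probably not the culprit, and stripping the leading whitespace is the thing to try.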