Spooky has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I'm reading what is apparently a binary file. I can view it in emacs, and along with valid alphanumeric characters there are plenty of unprintable characters and spaces (the spaces appear as \x20 when hexl-mode is turned on in emacs). When I read this file using Perl I don't want to lose those spaces, as they are important in determining field lengths. How do you suggest I go about doing this? Currently I seem to be dropping those spaces. Thanks....

Ok, here goes. My file, when opened in emacs, looks like this:

    EWH12345^A%^D 704_Barrington_Ctr Warrington_Pa
    DAH45678^A^%^D 705_Barrington_Ctr Warminster_Pa

When I turn on hexl-mode, the "^A%^D" above is "012504", but the spaces are, as you would guess, a hex 20. My question is: what is the best way to read something like this and not lose those spaces? Thanks!

Replies are listed 'Best First'.
Re: keeping spaces
by ikegami (Patriarch) on Feb 12, 2009 at 18:04 UTC
    None of read, sysread, readline (aka <>) and readpipe (aka qx``) collapse, trim or otherwise remove spaces. The problem is in code you haven't shown, assuming there were spaces in the first place.
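    As a rough illustration of that point, the sketch below reads a made-up record through read in binary mode and counts spaces before and after. The record contents and the number of spaces between fields are assumptions modelled on the sample in the question, and an in-memory filehandle stands in for the real file. The last lines show two later steps that commonly discard spaces; whether either is what happened in the unseen code is only a guess.

```perl
use strict;
use warnings;

# Hypothetical record modelled on the question's sample; the spacing between
# fields is an assumption, not taken from the real file.
my $record = "EWH12345\x01%\x04   704_Barrington_Ctr   Warrington_Pa\n";

# An in-memory filehandle stands in for the real file here.
open my $in, '<', \$record or die "Cannot open in-memory handle: $!";
binmode $in;                                  # treat the data as raw bytes
my $bytes_read = read $in, my $buffer, length $record;
close $in;

# read() hands the bytes back unchanged, spaces included.
my $spaces_in  = ( $record =~ tr/ // );       # tr with no flags just counts
my $spaces_out = ( $buffer =~ tr/ // );
print "spaces in source: $spaces_in, spaces after read: $spaces_out\n";

# Two common ways spaces get lost *after* the read:
my @words   = split ' ', $buffer;             # collapses runs of whitespace
my ($trim)  = unpack 'A11', "EWH12345   ";    # 'A' strips trailing spaces
my ($exact) = unpack 'a11', "EWH12345   ";    # 'a' keeps them
printf "split gave %d words; A11 kept %d chars, a11 kept %d chars\n",
    scalar @words, length $trim, length $exact;
```

    If field widths matter, the lowercase 'a' template in unpack is usually the safer choice, since it returns every byte of the field.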
Re: keeping spaces
by kyle (Abbot) on Feb 12, 2009 at 18:00 UTC

    Show us the code you're using. Give us some sample input, if possible. Tell us what you expect it to do and what it is doing instead.

Re: keeping spaces
by johngg (Canon) on Feb 12, 2009 at 23:43 UTC

    I wonder if tr might help you. In particular, the c flag complements the set of characters specified, so, combined with the d flag, you can get rid of anything that isn't a newline (0x0a) or in the range of printable characters from space (0x20) through ~ (0x7e). Also include carriage return (0x0d) if on Windows.

    $ perl -le '
    > $str = qq{\x00\x03abc xyz\n\x00\x00\x0912 34 56\x05\n};
    > print for map sprintf( q{0x%02x}, ord ), split m{}, $str;
    > print q{-} x 20;
    > $str =~ tr{\x0a\x20-\x7e}{}cd;
    > print for map sprintf( q{0x%02x}, ord ), split m{}, $str;'
    0x00
    0x03
    0x61
    0x62
    0x63
    0x20
    0x78
    0x79
    0x7a
    0x0a
    0x00
    0x00
    0x09
    0x31
    0x32
    0x20
    0x33
    0x34
    0x20
    0x35
    0x36
    0x05
    0x0a
    --------------------
    0x61
    0x62
    0x63
    0x20
    0x78
    0x79
    0x7a
    0x0a
    0x31
    0x32
    0x20
    0x33
    0x34
    0x20
    0x35
    0x36
    0x0a
    $

    I hope this is helpful.

    Cheers,

    JohnGG

    Update: Fixed typo.
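
    A small sketch of the same tr expression applied to a record shaped like the one in the question may help show what it keeps and what it moves. The record and its spacing are assumptions, not the poster's real data. It suggests that the spaces survive, but also that deleting control bytes shifts every later field to the left, which matters if fields are located by fixed offsets.

```perl
use strict;
use warnings;

# Hypothetical record modelled on the question's sample; spacing is assumed.
my $record = "EWH12345\x01%\x04   704_Barrington_Ctr   Warrington_Pa\n";

# Work on a copy so the original offsets can still be compared.
my $cleaned = $record;

# With the c and d flags, tr deletes everything outside the listed set
# and returns the number of characters it deleted.
my $deleted = ( $cleaned =~ tr{\x0a\x20-\x7e}{}cd );

my $spaces_before = ( $record  =~ tr/ // );
my $spaces_after  = ( $cleaned =~ tr/ // );
my $offset_before = index $record,  '704';
my $offset_after  = index $cleaned, '704';

printf "deleted %d bytes; spaces before %d, after %d\n",
    $deleted, $spaces_before, $spaces_after;
printf "address field offset before %d, after %d\n",
    $offset_before, $offset_after;
```

    If the layout is position-based, it may be safer to unpack the fields first and strip control characters from each field afterwards.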

Re: keeping spaces
by CountZero (Bishop) on Feb 12, 2009 at 19:54 UTC
    currently I seem to be dropping those spaces
    You are not sure? How did you check?

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: keeping spaces
by hbm (Hermit) on Feb 12, 2009 at 18:19 UTC

    I'manticipating"Spooky'sSpaces"tutorial!