PerlMonks  

Re^2: Module to read - modify - write text files in any unicode encoding

by Rudif (Hermit)
on May 20, 2008 at 22:22 UTC


in reply to Re: Module to read - modify - write text files in any unicode encoding
in thread Module to read - modify - write text files in any unicode encoding

ikegami

Your code just works, including when I apply it to UTF-8.

Apart from the lack of symmetry in my IO layers that you pointed out, I found another source of my confusion, which you probably noticed but did not comment on:

my_hexdump(), based on Data::HexDump, which I was using in my tests, gives wrong results on Windows.
Deep inside, Data::HexDump reads the file without applying '<:raw', as your code does. So, when it reads the UTF-8 or plain ASCII sequence "\r\n", it converts it to "\n", and the dump no longer shows the bytes that are actually in the file.
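
The effect can be reproduced on any platform by requesting the :crlf layer explicitly, since that is the layer perl on Windows applies by default to handles opened without '<:raw'. What follows is a minimal sketch, assuming a scratch file created with File::Temp; slurp_with() is a hypothetical helper written for this illustration, not part of Data::HexDump.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the four bytes "a\r\nb" to a temporary file, untranslated.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "a\r\nb";
close $out or die "Cannot close $path: $!";

# Hypothetical helper: read the whole file through the given PerlIO layer.
sub slurp_with {
    my ($layer, $file) = @_;
    open(my $fh, "<$layer", $file) or die "Cannot open $file: $!";
    local $/;    # slurp mode
    my $data = <$fh>;
    close $fh;
    return $data;
}

my $raw  = slurp_with(':raw',  $path);    # bytes exactly as stored
my $crlf = slurp_with(':crlf', $path);    # "\r\n" collapsed to "\n"

printf "raw:  %d bytes\n", length $raw;
printf "crlf: %d bytes\n", length $crlf;
```

The raw read keeps all four bytes, while the :crlf read returns only three, which is why a hexdump built on a text-mode read cannot show the 0D byte.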

In addition, I was using hdump.pl to dump my test files. It agreed with my_hexdump(), but they were both wrong!

Here is a correct file hexdump, based on your code:

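sub hexdump {
    my $file = shift;
    # Open in raw mode so that no CRLF translation is applied on Windows.
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    local $/;    # slurp mode: read the whole file at once
    my $data = <$fh>;
    # Hex-encode every byte in uppercase, with a space after each pair.
    (my $dump = uc unpack 'H*', $data) =~ s/(..)/$1 /g;
    return $dump;
}
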
Thank you for the insight.

Rudif
