Re^2: Module to read - modify - write text files in any unicode encoding

by Rudif (Hermit)
on May 20, 2008 at 22:22 UTC


in reply to Re: Module to read - modify - write text files in any unicode encoding
in thread Module to read - modify - write text files in any unicode encoding

ikegami

Your code just works, including when I apply it to UTF-8.

Apart from the lack of symmetry in my IO layers that you pointed out, I found another source of my confusion, which you probably noticed but did not comment on:

The my_hexdump() I was using in my tests, based on Data::Hexdump, is wrong on Windows.
Unlike your code, Data::Hexdump deep inside reads the file without applying '<:raw'. So when it reads the UTF-8 or plain ASCII sequence "\r\n", the default Windows text-mode translation turns it into "\n", and the dump shows one byte where the file has two.

In addition, I was using hdump.pl to dump my test files. It agreed with my_hexdump(), but they were both wrong!
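To make the effect visible on any platform, here is a minimal sketch that reads the same four bytes twice, once through ':raw' and once through an explicit ':crlf' layer. The ':crlf' read is only my stand-in for what a default text-mode open does on Windows, and slurp_hex() is a hypothetical helper written for this illustration, not part of Data::Hexdump or hdump.pl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the four bytes "a\r\nb" to a scratch file, byte for byte.
my ($out, $file) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "a\r\nb";
close $out or die "close failed: $!";

# Hypothetical helper: read the whole file through the given layer
# and return its contents as space-separated uppercase hex bytes.
sub slurp_hex {
    my ($layer, $path) = @_;
    open(my $fh, "<$layer", $path) or die "Cannot open $path: $!";
    local $/;    # slurp the whole file
    my $data = <$fh>;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $data;
}

my $raw  = slurp_hex(':raw',  $file);   # bytes exactly as stored
my $text = slurp_hex(':crlf', $file);   # assumed equivalent of Windows text mode

print "raw : $raw\n";    # raw : 61 0D 0A 62
print "crlf: $text\n";   # crlf: 61 0A 62
```

The 0D byte is missing from the second line, which is the kind of discrepancy a hexdump tool built on a text-mode read would show.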

Here is a correct file hexdump, based on your code:

sub hexdump {
    my $file = shift;
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    local $/;    # slurp the whole file
    my $data = <$fh>;
    (my $dump = uc unpack 'H*', $data) =~ s/(..)/$1 /g;
    return $dump;
}
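As a usage sketch, the routine can be checked against a small file whose bytes are known in advance. The scratch file, its content "A\r\n", and the choice of UTF-16LE are assumptions made for this illustration only. The hexdump() body is repeated so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# hexdump() as in the post, repeated here so the example is self-contained.
sub hexdump {
    my $file = shift;
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    local $/;    # slurp the whole file
    my $data = <$fh>;
    (my $dump = uc unpack('H*', $data)) =~ s/(..)/$1 /g;
    return $dump;
}

# Create a scratch file, then rewrite it as UTF-16LE without CRLF translation.
my ($tmp, $file) = tempfile(UNLINK => 1);
close $tmp or die "close failed: $!";

open(my $out, '>:raw:encoding(UTF-16LE)', $file)
    or die "Cannot open $file: $!";
print {$out} "A\r\n";
close $out or die "close failed: $!";

# Three characters, two bytes each, low byte first.
my $dump = hexdump($file);
print "$dump\n";    # 41 00 0D 00 0A 00
```

Note that the substitution leaves a trailing space after the last byte, which matters only when comparing the dump as a string.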
Thank you for the insight.

Rudif
