http://qs1969.pair.com?node_id=687680


in reply to Re: Module to read - modify - write text files in any unicode encoding
in thread Module to read - modify - write text files in any unicode encoding

ikegami

Your code just works, even when I apply it to UTF-8.

Apart from the lack of symmetry in my IO layers that you pointed out, I found another source of my confusion, which you probably noticed but did not comment on:

The my_hexdump() that I was using in my tests, based on Data::Hexdump, gives wrong results on Windows.
Deep inside, Data::Hexdump reads the file without applying '<:raw' the way your code does. So on Windows the default text-mode layer is in effect, and when it reads the UTF-8 or plain ASCII sequence "\r\n", it converts it to "\n" before the bytes are dumped.
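
To illustrate the effect, here is a minimal sketch (not taken from Data::Hexdump itself). It assumes that an explicit ':crlf' layer is a fair stand-in for the default text mode on Windows, so the difference also shows up on other platforms. The helper name slurp_hex and the temporary file are made up for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a CRLF line ending in binary mode so the bytes land on disk unchanged.
my ($out, $file) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "a\r\n";
close $out or die "close failed: $!";

# Hypothetical helper: slurp the file through the given layers
# and return its content as an uppercase hex string.
sub slurp_hex {
    my ($mode) = @_;
    open(my $fh, $mode, $file) or die "Cannot open $file: $!";
    local $/;                          # slurp mode
    my $data = <$fh>;
    close $fh;
    return uc(unpack('H*', $data));
}

my $raw_hex  = slurp_hex('<:raw');     # bytes exactly as stored
my $crlf_hex = slurp_hex('<:crlf');    # CRLF translated to LF on input

print "raw:  $raw_hex\n";              # 610D0A
print "crlf: $crlf_hex\n";             # 610A
```

The second read has silently lost the 0D byte, which is the kind of discrepancy a hexdump built on a text-mode read would show.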

In addition, I was using hdump.pl to dump my test files. It agreed with my_hexdump(), but they were both wrong!

Here is a correct file hexdump, based on your code:

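sub hexdump {
    my $file = shift;
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    local $/;    # slurp mode: read the whole file at once
    my $data = <$fh>;
    # two uppercase hex digits per byte, separated by spaces
    (my $dump = uc unpack 'H*', $data) =~ s/(..)/$1 /g;
    return $dump;
}
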
Thank you for the insight.

Rudif