The node A UTF8 round trip with MySQL, and specifically Re: A UTF8 round trip with MySQL, arrived at a coincidental time for me. With all the Conficker talk, someone here ran an nmap scan and it crashed my Perl server:
utf8 "\x80" does not map to Unicode at Queue.pm line 835, <GEN6234> line 1.
Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding start byte) in pattern match (m//) at Queue.pm line 836, <GEN6234> line 1.
utf8 "\xD7" does not map to Unicode at Queue.pm line 835, <GEN6238> line 1.
utf8 "\xA4" does not map to Unicode at Queue.pm line 835, <GEN6239> line 1.
Malformed UTF-8 character (overflow at 0xcd0b2000, byte 0x00, after start byte 0xff) in subroutine entry at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/Data/Dumper.pm line 179, <GEN6239> line 1.
Malformed UTF-8 character (overflow at 0xcd0b2000, byte 0x00, after start byte 0xff) in subroutine entry at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/Data/Dumper.pm line 179, <GEN6239> line 1.
Malformed UTF-8 character (overflow at 0xcd0b2000, byte 0x00, after start byte 0xff) in subroutine entry at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/Data/Dumper.pm line 179, <GEN6239> line 1.
Malformed UTF-8 character (overflow at 0xcd0b2000, byte 0x00, after start byte 0xff) in subroutine entry at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/Data/Dumper.pm line 179, <GEN6239> line 1.
Malformed UTF-8 character (overflow at 0xcd0b2000, byte 0x00, after start byte 0xff) in subroutine entry at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/Data/Dumper.pm line 179, <GEN6239> line 1.
Malformed UTF-8 character (overflow at 0xcd0b2000, byte 0x00, after start byte 0xff) in subroutine entry at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/Data/Dumper.pm line 179, <GEN6239> line 1.
utf8 "\x80" does not map to Unicode at Queue.pm line 835, <GEN6242> line 1.
utf8 "\xE0" does not map to Unicode at Queue.pm line 835, <GEN6245> line 1.
Segmentation fault
The code in question accepts UTF-8 encoded data (well, it is supposed to be encoded) from a socket, and it has set the :utf8 I/O layer on that socket. The code generating the warnings is the code reading from that socket.
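For comparison, here is a minimal sketch of one possible alternative to putting :utf8 on the handle: read raw bytes and decode each line explicitly with Encode, so malformed input is rejected before it reaches the rest of the program. This is not what Queue.pm does. The helper name decode_utf8_or_undef is hypothetical, and the in-memory filehandle only stands in for the socket.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode one chunk of raw bytes as strict UTF-8.
# Returns the decoded character string, or undef if the bytes are malformed.
sub decode_utf8_or_undef {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its input when a CHECK is given
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return $text;         # undef if decode() died on malformed input
}

# Stand-in for the socket: an in-memory handle read as raw bytes.
# The middle line starts with a stray continuation byte, like the scan traffic.
my $input = "ok line\n" . "\x80\xD7\xA4 scan junk\n" . "caf\xC3\xA9\n";
open my $fh, '<:raw', \$input or die "Cannot open in-memory handle: $!";

my @good;
my $bad = 0;
while (my $line = <$fh>) {
    my $text = decode_utf8_or_undef($line);
    if (defined $text) {
        push @good, $text;    # safe to pass on as a character string
    }
    else {
        $bad++;               # drop (or log) the malformed line
    }
}
close $fh;
```

With this approach the decision about what to do with bad input is made in one place, rather than surfacing later as warnings from whatever code touches the string.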
There are a few things about this and the nodes quoted I don't understand.
As a quick test I got hold of a jpg file (obviously not UTF-8 encoded) and did:
use strict;
use warnings;

my $fh;
open ($fh, "<:utf8", "schema.jpg");
my $img = '';
while (<$fh>) {
    $img .= $_;
}
which takes 0.123s to run and outputs a lot of warnings. Changing the layer to :encoding(UTF8) makes it take 27s and output hundreds of warnings.
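To make the comparison repeatable, a rough sketch of a timing harness follows. It reads the same file through each layer and counts warnings instead of printing them. The helper time_layer and the generated temp file of pseudo-random bytes are my own stand-ins (not schema.jpg), so the numbers it prints will not match the ones above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes qw(time);

# Hypothetical helper: read a whole file through the given PerlIO layer and
# return the elapsed seconds plus the number of warnings raised while reading.
sub time_layer {
    my ($layer, $path) = @_;
    my $warnings = 0;
    local $SIG{__WARN__} = sub { $warnings++ };    # count instead of printing
    my $start = time;
    open my $fh, "<$layer", $path or die "Cannot open $path: $!";
    my $data = '';
    while (my $line = <$fh>) {
        $data .= $line;
    }
    close $fh;
    return (time - $start, $warnings);
}

# Stand-in for the jpg: a temp file of pseudo-random bytes (not valid UTF-8).
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} join '', map { chr int rand 256 } 1 .. 20_000;
close $tmp_fh or die "Cannot close $tmp_path: $!";

for my $layer (':raw', ':utf8', ':encoding(UTF8)') {
    my ($secs, $warns) = time_layer($layer, $tmp_path);
    printf "%-16s %.3fs  %d warnings\n", $layer, $secs, $warns;
}
```

The :raw row is there as a baseline: it shows how long the read takes with no decoding at all.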
In reply to :utf8 I/O layer vs encoding(UTF8), segfault and speed by mje