Hi everyone,

@McA

I had already wondered whether the problem was at the buffer boundaries. But if it were, only the words at the boundary of the 4k buffer would be affected. Instead, every single accented letter is truncated, even one at byte 4k + (4k/2), which should be in the middle of the buffer.
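To make the boundary case concrete, here is a minimal sketch of how a multi-byte character split across two reads is usually handled with Encode. The function name decode_chunks, the chunk size and the sample string are only illustrative assumptions, not code from my script. It follows the buffered-read idiom from the Encode documentation: FB_QUIET decodes as much as it can and leaves the incomplete trailing bytes in the buffer for the next round.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode UTF-8 bytes that arrive in fixed-size chunks.
# Any incomplete multi-byte sequence at the end of a chunk is carried over
# and completed by the next chunk.
sub decode_chunks {
    my ($chunk_size, $bytes) = @_;
    my ($pending, $text) = ('', '');
    for (my $pos = 0; $pos < length $bytes; $pos += $chunk_size) {
        $pending .= substr($bytes, $pos, $chunk_size);
        # With FB_QUIET, decode returns the part it could process and
        # overwrites $pending with the unprocessed remainder.
        $text .= decode('UTF-8', $pending, Encode::FB_QUIET());
    }
    return $text;
}

# Sample data (assumption): accented words repeated so that many characters
# straddle a chunk boundary when the chunk size is tiny.
my $original = "a\x{e7}\x{e3}o " x 100;
my $decoded  = decode_chunks(3, encode('UTF-8', $original));
print $decoded eq $original ? "round trip ok\n" : "mismatch\n";
```

Note that this sketch would stall on genuinely malformed input, since the bad bytes would stay in the pending buffer. It only covers the split-character case, which would damage characters at the boundaries and not every accented letter as in my case.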

Calling binmode($fh, ":utf8"); on the filehandle before reading it into the array made the algorithm give no result at all. I'll have to run some tests before I have a better explanation of what happened.

@Anonymous Monk

I've tested both, as Random Walk explained, and got the same result:

my @list; while(my $l = <$fh>){ push(@list,split(/ /,$l)); }
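A variant of the loop above that decodes each line explicitly instead of relying on binmode, as a minimal sketch. The in-memory handle and the sample words only stand in for the real upload filehandle, and the name words_from_handle is hypothetical. Whether this behaves any differently with the CGI upload handle is untested.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: read raw bytes line by line, decode each line from
# UTF-8 into Perl characters, then split it into words.
sub words_from_handle {
    my ($fh) = @_;
    my @list;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        # A newline byte never falls inside a multi-byte UTF-8 sequence,
        # so each line can be decoded on its own.
        my $text = decode('UTF-8', $line);
        # split ' ' splits on any run of whitespace and skips leading blanks
        push @list, split ' ', $text;
    }
    return @list;
}

# Stand-in for the uploaded file (assumption): UTF-8 bytes in memory.
my $bytes = encode('UTF-8', "ma\x{e7}\x{e3} p\x{e3}o\ncaf\x{e9} \x{e1}gua\n");
open my $fh, '<', \$bytes or die "cannot open in-memory handle: $!";
my @list = words_from_handle($fh);
close $fh;
```

Each element of @list is then a character string, so length() counts letters and not bytes.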

I also checked the version of CGI.pm. It was 3.52. I updated it to version 3.63 from the cpan shell using the command 'r CGI.pm'. Unfortunately, that didn't solve the problem. I checked POST_MAX too and it was equal to -1. I think that means unlimited, right?

I have tried other files too, and they don't work if they are larger than 4k.

@Random Walk

Thanks for the explanation. I tried this and it didn't work :/ The code I tried is above.

@Another Anonymous Monk

I've thought about this, but I don't have any idea how I could do that. I know a few ways of reading files in Perl, but since I've been working with CGI I haven't found any other way to do this reading. Any tips?

Thank you all, guys,

Vieira.


In reply to Re: Problem with utf8 after nearly 4096 bytes by gvieira
in thread Problem with utf8 after nearly 4096 bytes by gvieira
