Hello.

I was trying to solve a problem about UTF-8, and the solution I went with is too slow. I would like suggestions on how to optimize my code, or on better approaches. Maybe pack/unpack would help, but I found perlpacktut not very clear.

I don't know how to copy the problem statement out of the .pdf, so I made some screenshots.
How would you solve this problem?

Test case #1:
C1 B3 E0 81 B3 F0 80 81 B3 F8 80 80 81 B3 FC 80 80 80 81 B3 80 C1 B3 E0 81 B3 F0 80 81 B3 F8 80 80 81 B3 FC 80 80 80 81 B3 E0 AF B5
Test case #2:
FF 00 FF D1 81 F0 80 91 81 00 D1 81 FF F0 80 91 81 FF FF 00 D1 81 F0 80 91 81
Output for TC #1:
73 73 73 73 73 73 73 73 73 73 BF5
Output for TC #2:
441 441 0 441 0 441 441
My code, which solves both tests correctly but is too slow for the bigger tests (which I can't see), and may also be incorrect(?):
#!/usr/bin/perl
use warnings;
use strict;

$\ = $/;
my $debug = 0;

$_ = do { local $/; <> };
@_ = split;
$_ = join '', map { sprintf "%08b", eval "0x$_" } @_, 'FF';

my @data = reverse <DATA>;
chomp @data;
y/x/./ for @data;
my @rxs = join '|', @data;

my @r;
my @R;

while( /\G(?:@rxs|.{8})/g ){
    my $c = $&;
    $debug and print "c:$c";
    $c =~ /@rxs/ or do {
        push @R, [ @r ] if @r > 2;
        @r = ();
        $debug and print " -F";
        next;
    };
    my $x = $c =~ s/.{8}/ ($&) =~ s!^1+0!!r /ger;
    $debug and print "x:",$x;
    push @r, ~~ reverse $x;
}

my @acc;
my @ACC;

for my $R (@R){
    @acc = ();
    for my $r ( @{ $R } ){
        my $acc = '';
        while( $r =~ /.{1,4}/g ){
            $acc .= ( 0 .. 9, 'A' .. 'F' )[ eval "0b" . reverse $& ];
        }
        $acc = reverse $acc;
        $acc =~ s/^0+\B//;
        push @acc, $acc;
    }
    push @ACC, "@acc";
}

print for @ACC;

__DATA__
0xxxxxxx
110xxxxx10xxxxxx
1110xxxx10xxxxxx10xxxxxx
11110xxx10xxxxxx10xxxxxx10xxxxxx
111110xx10xxxxxx10xxxxxx10xxxxxx10xxxxxx
1111110x10xxxxxx10xxxxxx10xxxxxx10xxxxxx10xxxxxx
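For comparison, here is a minimal sketch of a byte-oriented approach that works on integers instead of bit strings, so it needs no regex over a "%08b" string and no string eval. The rules it follows are assumptions inferred from the two test cases and the code above, not from the problem statement itself: sequences of 1 to 6 bytes are accepted (overlong forms included), any byte that cannot start or complete a sequence ends the current run, and only runs of at least 3 characters are printed. The names decode_runs and solve are hypothetical helpers for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list of byte values into runs of decoded code points.
# ASSUMPTION: rules are inferred from the two posted test cases only.
sub decode_runs {
    my ($bytes, $min_len) = @_;
    my (@runs, @run);
    my $i = 0;
    while ($i < @$bytes) {
        my $b = $bytes->[$i];
        # Sequence length and initial payload bits, taken from the lead byte.
        my ($len, $cp) =
              $b < 0x80 ? (1, $b)
            : $b < 0xC0 ? (0, 0)            # stray continuation byte
            : $b < 0xE0 ? (2, $b & 0x1F)
            : $b < 0xF0 ? (3, $b & 0x0F)
            : $b < 0xF8 ? (4, $b & 0x07)
            : $b < 0xFC ? (5, $b & 0x03)
            : $b < 0xFE ? (6, $b & 0x01)
            :             (0, 0);           # FE and FF never start a sequence
        my $ok = $len > 0 && $i + $len <= @$bytes;
        for my $j ($i + 1 .. $i + $len - 1) {
            last unless $ok;
            my $c = $bytes->[$j];
            if (($c & 0xC0) == 0x80) { $cp = ($cp << 6) | ($c & 0x3F) }
            else                     { $ok = 0 }
        }
        if ($ok) {
            push @run, $cp;
            $i += $len;
        }
        else {
            # Close the current run and skip only the offending byte.
            push @runs, [@run] if @run >= $min_len;
            @run = ();
            $i++;
        }
    }
    push @runs, [@run] if @run >= $min_len;
    return @runs;
}

# Turn whitespace-separated hex bytes into one output line per run.
# If every token is exactly two hex digits, the byte list could also be
# built with: unpack 'C*', pack 'H*', join '', split ' ', $text
sub solve {
    my ($text) = @_;
    my @bytes = map { hex($_) } split ' ', $text;
    my @lines;
    for my $run (decode_runs(\@bytes, 3)) {
        push @lines, join ' ', map { sprintf '%X', $_ } @$run;
    }
    return @lines;
}

# Test case #2 from above; for real input use solve(do { local $/; <STDIN> }).
print "$_\n" for solve(
    'FF 00 FF D1 81 F0 80 91 81 00 D1 81 FF F0 80 91 81 FF FF 00 D1 81 F0 80 91 81'
);
```

Under these assumptions, test case #2 gives two runs, "441 441 0 441" and "0 441 441", and the lone 80 in test case #1 splits that input into "73 73 73 73 73" and "73 73 73 73 73 BF5". If the real statement rejects overlong encodings or uses a different minimum run length, only the lead-byte table and the 3 passed to decode_runs would need to change.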

In reply to contest problem about UTF-8 by rsFalse
