zentara has asked for the wisdom of the Perl Monks concerning the following question:

This is a side track I got onto while trying to eliminate the need for placing a bunch of hex numbers in the DATA section of the Tk Symbola Font viewer. The hex numbers rise in value from the initial data point to the final data point, but there are gaps at various places. I can detect the gaps, but I'm not sure of the best way of generating an array of multiple hex ranges to replace it.

I wonder if anyone has seen this problem before.

What I would like to do is eliminate the DATA section and replace it with a series of hex ranges, like

#pseudo code
my @hex = (0x00a0 .. $somehex, $somehex .. $nextbreak, etc etc)
but generated from the code.

For the amount of time I spent on this, I could have just done it manually. :-)

Here is as far as I got. Has anyone seen a module that handles creating hex ranges from DATA? Most of the range modules go in the other direction. Or is there a better way to do this? The problem is compounded by the difficulty of getting the range operator to recognize hex strings.

#!/usr/bin/perl use warnings; use strict; my @hexv; while (<DATA>) { chomp; my @vals = split ' ', $_; @hexv = (@hexv,@vals) } my ($current, $next); my $endex = scalar @hexv - 1; # try to find where gaps in the array are # count in hex by 0x1, and see if consecutive values # exist in the data, if not define the gaps foreach my $index ( 0 .. $endex ){ last if $index == $endex; $current = sprintf "%05x", hex $hexv[$index]; $next = sprintf "%05x", hex $hexv[$index + 1]; #print "$current $next "; if ( (hex $current) + 0x1 != hex $next ){ print "$current to $next Not Ok\n" } } # now I don't know the best way to make a set of ranges to emulate the # data in the DATA section. Should I subtract from the @fullarray, or # just build a set of ranges? __END__ 00A0 00A1 00A2 00A3 00A4 00A5 00A6 00A7 00A8 00A9 00AA 00AB 00AC 00AD 00AE 00AF 00B0 00B1 00B2 00B3 00B4 00B5 00B6 00B7 00B8 00B9 00BA 00BB +00BC 00BD 00BE 00BF 00D7 00E6 00E7 00F0 00F7 00F8 0127 0131 014B 0153 0192 +019B 01C0 01C1 01C2 01C3 0237 0238 0239 023C 0240 0250 0251 0252 0253 0254 +0255 0256 0257 0258 0259 025A 025B 025C 025D 025E 025F 0260 0261 0262 0263 +0264 0265 0266 0267 0268 0269 026A 026B 026C 026D 026E 026F 0270 0271 0272 +0273 0274 0275 0276 0277 0278 0279 027A 027B 027C 027D 027E 027F 0280 0281 +0282 0283 0284 0285 0286 0287 0288 0289 028A 028B 028C 028D 028E 028F 0290 +0291 0292 0293 0294 0295 0296 0297 0298 0299 029A 029B 029C 029D 029E 029F +02A0 02A1 02A2 02A3 02A4 02A5 02A6 02A7 02A8 02A9 02AA 02AB 02AC 02AD 02AE +02AF 02B0 02B1 02B2 02B3 02B4 02B5 02B6 02B7 02B8 02B9 02BA 02BB 02BC 02BD +02BE 04D6 04D7 04D8 04D9 04DA 04DB 04DC 04DD 04DE 04DF 04E0 04E1 04E2 04E3 +04E4 04E5 04E6 04E7 04E8 04E9 04EA 04EB 04EC 04ED 04EE 04EF 04F0 04F1 04F2 +04F3 04F4 04F5 04F6 04F7 04F8 04F9 04FA 04FB 04FC 04FD 04FE 04FF 0500 0501 +0502 225B 225C 225D 225E 225F 2260 2261 2262 2263 2264 2265 2266 2267 2268 +2269 226A 226B 226C 226D 226E 226F 2270 2271 2272 2273 2274 2275 2276 2277 +2278 2279 227A 227B 227C 227D 227E 
227F 2280 2281 2282 2283 2284 2285 2286 +2287 2288 2289 228A 228B 228C 228D 228E 228F 2290 2291 2292 2293 2294 2295 +2296 2297 2298 2299 229A 229B 229C 229D 229E 229F 22A0 22A1 22A2 22A3 22A4 +22A5 22A6 22A7 22A8 22A9 22AA 22AB 22AC 22AD 22AE 22AF 22B0 22B1 22B2 22B3 +22B4 22B5 22B6 22B7 22B8 22B9 22BA 22BB 22BC 22BD 22BE 22BF 22C0 22C1 22C2 +22C3 22C4 22C5 22C6 22C7 22C8 22C9 22CA 22CB 22CC 22CD 22CE 22CF 22D0 22D1 +22D2 22D3 22D4 22D5 22D6 22D7 22D8 22D9 22DA 22DB 22DC 22DD 22DE 22DF 22E0 +22E1 22E2 22E3 22E4 22E5 22E6 22E7 22E8 22E9 22EA 22EB 22EC 22ED 22EE 22EF +22F0 22F1 22F2 22F3 22F4 22F4 22F5 22F6 22F7 22F8 22F9 22FA 22FB 22FC 22FC +22FD 22FE 22FF 2300 2301 2302 2303 2304 2305 2306 2307 2308 2309 230A 230B +230C 230D 230E 230F 2310 2311 2312 2313 2314 2315 2316 2317 2318 2319 231A +231B 276C 276D 276E 276F 2770 2771 2772 2773 2774 2775 2776 2777 2778 2779 +277A 277B 277C 277D 277E 277F 2780 2781 2782 2783 2784 2785 2786 2787 2788 +2789 278A 278B 278C 278D 278E 278F 2790 2791 2792 2793 2794 2795 2796 2797 +2798 2799 279A 279B 279C 279D 279E 279F 27A0 27A1 27A2 27A3 27A4 27A5 27A6 +27A7 27A8 27A9 27AA 27AB 27AC 27AD 27AE 27AF 27B0 27B1 27B2 27B3 27B4 27B5 +27B6 27B7 27B8 27B9 27BA 27BB 27BC 27BD 27BE 27BF 27C0 27C1 27C2 27C3 27C4 +27C5 27C6 27C7 27C8 27C9 27CA 27CC 27CE 27CF 27D0 27D1 27D2 27D3 27D4 27D5 +27D6 27D7 27D8 27D9 27DA 27DB 27DC 27DD 27DE 27DF 27E0 27E1 27E2 27E3 27E4 +27E5 27E6 27E7 27E8 27E9 27EA 27EB 27EC 27ED 27EE 27EF 27F0 27F1 27F2 27F3 +27F4 2E0C 2E0D 2E0E 2E0F 2E10 2E11 2E12 2E13 2E14 2E15 2E16 2E17 2E18 2E19 +2E1A 2E1B 2E1C 2E1D 2E1E 2E1F 2E20 2E21 2E22 2E23 2E24 2E25 2E26 2E27 2E28 +2E29 2E2A 2E2B 2E2C 2E2D 2E2E 2E2F 2E30 2E31 3008 3009 300A 300B 300C 300D +300E 300F 3010 3011 3014 3015 3016 3017 3018 3019 301A 301B 301C 301D 301E +301F 306E 4DC0 4DC1 4DC2 4DC3 4DC4 4DC5 4DC6 4DC7 4DC8 4DC9 4DCA 4DCB 4DCC +4DCD 4DCE 4DCF 4DD0 4DD1 4DD2 4DD3 4DD4 4DD5 4DD6 4DD7 4DD8 4DD9 4DDA 4DDB +4DDC 4DDD 4DDE 4DDF 4DE0 4DE1 4DE2 4DE3 4DE4 4DE5 4DE6 
4DE7 4DE8 4DE9 4DEA +4DEB 4DEC 4DED 4DEE 4DEF 4DF0 4DF1 4DF2 4DF3 4DF4 4DF5 4DF6 4DF7 4DF8 4DF9 +4DFA 4DFB 4DFC 4DFD 4DFE 4DFF 4E2D 4E2D FE00 FE10 FE11 FE12 FE13 FE14 FE15 +FE16 FE17 FE17 FE18 FE18 FE19 FE20 FE21 FE22 FE23 FE24 FE25 FE26 FE30 FE31 +FE32 FE33 FE34 FE35 FE36 FE37 FE38 FE39 FE3A FE3A FE3B FE3B FE3C FE3C FE3D +FE3E FE3F FE40 FE41 FE42 FE43 FE44 FE45 FE46 FE47 FE48 FE49 FE4A FE4B FE4C +FE4D FE4E FE4F FE61 FFF9 FFFA FFFB FFFC FFFD 1D000 1D001 1D002 1D003 1D004 +1D005 1D006 1D007 1D008 1D009 1D00A 1D00B 1D00C 1D00D 1D00E 1D00F 1D010 1D011 1D01 +2 1D013 1D014 1D015 1D016 1D017 1D018 1D019 1D01A 1D01B 1D01C 1D01D 1D01E 1D01F 1D020 1D02 +1 1D022 1D023 1D024 1D025 1D026 1D027 1D028 1D029 1D02A 1D02B 1D02C 1D02D 1D02E 1D02F 1D03 +0 1D031 1D032 1D033 1D034 1D035 1D036 1D037 1D038 1D039 1D03A 1D03B 1D03C 1D03D 1D03E 1D03 +F 1D040 1D041 1D042 1D043 1D044 1D045 1D046 1D047 1D048 1D049 1D04A 1D04B 1D04C 1D04D 1D04 +E 1D04F 1D050 1D051 1D052 1D053 1D054 1D055 1D056 1D057 1D058 1D059 1D05A 1D05B 1D05C 1D05 +D 1D05E 1D05F 1D060 1D061 1D062 1D063 1D064 1D065 1D066 1D067 1D068 1D069 1D06A 1D06B 1D06 +C 1D06D 1D06E 1D06F 1D070 1D071 1D072 1D073 1D074 1D075 1D076 1D077 1D078 1D079 1D07A 1D07 +B 1D07C 1D07D 1D07E 1D510 1D511 1D512 1D513 1D514 1D516 1D517 1D518 1D519 1D51A 1D51B 1D51 +C 1D51E 1D51F 1D520 1D521 1D522 1D523 1D524 1D525 1D526 1D527 1D528 1D529 1D52A 1D52B 1D52 +C 1D52D 1D52E 1D52F 1D530 1D531 1D532 1D533 1D534 1D535 1D536 1D537 1D538 1D539 1D53B 1D53 +C 1D53D 1D53E 1D540 1F623 1F624 1F625 1F628 1F629 1F62A 1F62B 1F62D 1F630 1F631 1F632 1F63 +3 1F635 1F636 1F637 1F638 1F639 1F63A 1F63B 1F63C 1F63D 1F63E 1F63F 1F640 1F645 1F646 1F64 +7 1F648 1F649 1F64A 1F64B 1F64C 1F64D 1F64E 1F64F 1F680 1F681 1F682 1F683 1F684 1F685 1F68 +6 1F687 1F688 1F689 1F68A 1F68B 1F68C 1F68D 1F68E 1F68F 1F690 1F691 1F692 1F693 1F694 1F69 +5 1F696 1F697 1F698 1F699 1F69A 1F69B 1F69C 1F69D 1F69E 1F69F 1F6A0 1F6A1 1F6A2 1F6A3 1F6A +4 1F6A5 1F6A6 1F6A7 1F6A8 1F6A9 1F6AA 1F6AB 1F6AC 1F6AD 
1F6AE 1F6AF 1F6B0 1F6B1 1F6B2 1F6B +3 1F6B4 1F6B5 1F6B6 1F6B7 1F6B8 1F6B9 1F6BA 1F6BB 1F6BC 1F6BD 1F6BE 1F6BF 1F6C0 1F6C1 1F6C +2 1F6C3 1F6C4 1F6C5 1F700 1F701 1F702 1F703 1F704 1F705 1F706 1F707 1F708 1F709 1F70A 1F70 +B 1F70C 1F70D 1F70E 1F70F 1F710 1F711 1F712 1F713 1F714 1F715 1F716 1F717 1F718 1F719 1F71 +A 1F71B 1F71C 1F71D 1F71E 1F71F 1F720 1F721 1F722 1F723 1F724 1F725 1F726 1F727 1F728 1F72 +9 1F72A 1F72B 1F72C 1F72D 1F72E 1F72F 1F730 1F731 1F732 1F733 1F734 1F735 1F736 1F737 1F73 +8 1F739 1F73A 1F73B 1F73C 1F73D 1F73E 1F73F 1F740 1F741 1F742 1F743 1F744 1F745 1F746 1F74 +7 1F748 1F749 1F74A 1F74B 1F74C 1F74D 1F74E 1F74F 1F750 1F751 1F752 1F753 1F754 1F755 1F75 +6 1F757 1F758 1F759 1F75A 1F75B 1F75C 1F75D 1F75E 1F75F 1F760 1F761 1F762 1F763 1F764 1F76 +5 1F766 1F767 1F768 1F769 1F76A 1F76B 1F76C 1F76D 1F76E 1F76F 1F770 1F771 1F772 1F773

I'm not really a human, but I play one on earth.
Old Perl Programmer Haiku ................... flash japh

Replies are listed 'Best First'.
Re: Crazy Golf : creating hex ranges from non-consecutive hex data values
by kennethk (Abbot) on Jul 31, 2011 at 15:34 UTC
    Rather than fighting with hex, it is almost always easier to work with integers internally and let Perl worry about formatting on input/output.

    #!/usr/bin/perl
    use warnings;
    use strict;

    my @values = map hex, grep length, map split, <DATA>;
    my @ranges = [$values[0], $values[0]-1];
    $_ == $ranges[-1][1]+1 ? $ranges[-1][1]++ : push @ranges, [$_,$_] for @values;

    print "my \@hex = (\n";
    printf "\t0x%05x .. 0x%05x,\n", @$_ for @ranges;
    print "\t);\n";
    __END__
    00A0 00A1 00A2 00A3 00A4 00A5 00A6 00A7 00A8 00A9 00AA 00AB 00AC 00AD 00AE 00AF
    00B0 00B1 00B2 00B3 00B4 00B5 00B6 00B7 00B8 00B9 00BA 00BB 00BC 00BD 00BE 00BF
    00D7 00E6 00E7 00F0 00F7 00F8 0127 0131 014B 0153 0192 019B

    where I've truncated the DATA section for brevity. The above outputs

    my @hex = (
        0x000a0 .. 0x000bf,
        0x000d7 .. 0x000d7,
        0x000e6 .. 0x000e7,
        0x000f0 .. 0x000f0,
        0x000f7 .. 0x000f8,
        0x00127 .. 0x00127,
        0x00131 .. 0x00131,
        0x0014b .. 0x0014b,
        0x00153 .. 0x00153,
        0x00192 .. 0x00192,
        0x0019b .. 0x0019b,
        );

    If the assumption of being well-ordered fails, inserting a sort {$a<=>$b}, before the maps will fix you. Repeat values are a bit more annoying - they won't result in invalid output, but they will create unnecessary breaks and overlapping ranges.

    The range-generation algorithm is fairly simplistic - keep an AoA of start and end points, and if the next point is one larger increment the last range; otherwise, create a new one.
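
The sorting and duplicate caveats mentioned above can be folded into the same single pass. What follows is a minimal sketch in plain Perl with no modules; the sub name `collapse_ranges` and the sample values are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse a list of integers into [start, end] pairs.
# Sorting first and skipping values that are already covered means unordered
# or repeated input no longer produces extra breaks or overlapping ranges.
sub collapse_ranges {
    my @ranges;
    for my $n ( sort { $a <=> $b } @_ ) {
        if ( @ranges && $n <= $ranges[-1][1] + 1 ) {
            # Duplicate or direct successor: extend the current range if needed.
            $ranges[-1][1] = $n if $n > $ranges[-1][1];
        }
        else {
            push @ranges, [ $n, $n ];    # gap found: start a new range
        }
    }
    return @ranges;
}

# Sample input with a repeat (22F4) and an out-of-order value (00A0).
my @ranges = collapse_ranges( map { hex($_) } qw(22F3 22F4 22F4 22F5 00A1 00A0 00D7) );
printf "0x%05x .. 0x%05x\n", @$_ for @ranges;
```

With the sample input this should print three ranges: 0x000a0 .. 0x000a1, 0x000d7 .. 0x000d7 and 0x022f3 .. 0x022f5.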

Re: Crazy Golf : creating hex ranges from non-consecutive hex data values
by Anonymous Monk on Jul 31, 2011 at 13:36 UTC
Re: Crazy Golf : creating hex ranges from non-consecutive hex data values
by davido (Cardinal) on Aug 01, 2011 at 08:28 UTC

    As you've discovered, it's easier to create a range based on integers than hex strings. But it's also easy to convert your strings of hex digits into numbers. They need not be kept literally as strings of hex digits; they can be stored internally as integers.
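
A short illustration of that conversion, using only the core `hex` and `sprintf` functions. The sample values are the first few from the posted data, and the variable names are placeholders.

```perl
use strict;
use warnings;

# hex() turns a string of hex digits into an ordinary integer...
my ( $low, $high ) = map { hex($_) } qw(00A0 00A3);

# ...so the range operator can count from one to the other numerically.
my @code_points = $low .. $high;

# sprintf converts back to hex notation only when printing.
print join( ' ', map { sprintf '%04X', $_ } @code_points ), "\n";
# prints: 00A0 00A1 00A2 00A3
```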

    Once your strings of hex digits are converted to integers, it becomes easy to use Number::Rangify to detect the ranges. (I'm using a portion of the __DATA__ segment from your Tk/Unicode post as a data set.)

    use strict;
    use warnings;
    use Number::Rangify qw/rangify/;
    use v5.12;

    # Read all of DATA before rangifying, so that a run of consecutive
    # values continuing across a line break stays a single range.
    say "@{[$_->Size]}" for rangify( map hex, map split, <DATA> );
    __DATA__
    00A0 00A1 00A2 00A3 00A4 00A5 00A6 00A7 00A8 00A9 00AA 00AB 00AC 00AD 00AE 00AF
    00B0 00B1 00B2 00B3 00B4 00B5 00B6 00B7 00B8 00B9 00BA 00BB 00BC 00BD 00BE 00BF
    00D7 00E6 00E7 00F0 00F7 00F8 0127 0131 014B 0153 0192 019B
    01C0 01C1 01C2 01C3 0237 0238 0239 023C 0240
    0250 0251 0252 0253 0254 0255

    You can use the above snippet to convert your giant list of hex fields into a smaller list containing one range per row (as base ten integers), in the format of:

    160 191
    215 215
    230 231
    240 240
    247 248
    295 295
    305 305
    331 331

    With shell redirection you can dump it to a file, and use that as the starting point for the __DATA__ segment of your new script. That will reduce your __DATA__ segment from 5000+ individual string representations of hex down to about 120+ base ten integer ranges.

    If you need to expand it again, you can do that easily enough as follows. This is NSFW, as I'll explain below:

    use v5.12;
    use strict;
    use warnings;

    my @cp_ints  = map{ eval join '..', split } <DATA>;
    my @cp_hexes = map sprintf( "%#x", $_ ), @cp_ints;
    __DATA__
    160 191
    215 215
    230 231
    240 240
    247 248
    295 295
    305 305

    Notice that I took a shortcut by using eval EXPR, which is only safe if you're in control of your input DATA.

    I used eval to build ranges without explicitly splitting into variables that could sit on either side of a real range operator. In other words, I could have eliminated the eval with a more conventional construct like this:

    while( <DATA> ) {
        my ( $low, $high ) = split;
        push @cp_ints, $low .. $high;
    }

    But I was in a "play with eval" mood. The latter form is better for a number of reasons, but why not have fun while we're playing?
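
For input that is not fully under your control, the non-eval form can also check each line before handing it to the range operator. This is a sketch only: the sub name `expand_ranges` and the strict "two decimal integers per line" format are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: expand "low high" lines into the full list of integers,
# rejecting anything that is not exactly two decimal integers.
sub expand_ranges {
    my @ints;
    for my $line (@_) {
        my ( $low, $high ) = $line =~ /^\s*(\d+)\s+(\d+)\s*$/
            or die "Malformed range line: '$line'\n";
        die "Range runs backwards: '$line'\n" if $low > $high;
        push @ints, $low .. $high;
    }
    return @ints;
}

my @cp_ints  = expand_ranges( '160 163', '215 215' );
my @cp_hexes = map { sprintf '%#x', $_ } @cp_ints;
print "@cp_hexes\n";    # prints: 0xa0 0xa1 0xa2 0xa3 0xd7
```

Because nothing is ever passed to eval, a line containing arbitrary Perl simply fails the pattern match and dies with an error message instead of being executed.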

    Anyway, with the two snippets you can take your original list of 5000+ items, reduce it to about 123 ranges, and then later reconstitute it back to your original 5000+ items.

    Hope this helps.


    Dave

Re: Crazy Golf : creating hex ranges from non-consecutive hex data values
by zentara (Cardinal) on Jul 31, 2011 at 15:45 UTC
Re: Crazy Golf : creating hex ranges from non-consecutive hex data values
by jwkrahn (Abbot) on Aug 01, 2011 at 04:43 UTC
    my @hexv;
    while (<DATA>) {
        chomp;
        my @vals = split ' ', $_;
        @hexv = (@hexv,@vals)
    }

    @hexv = (@hexv,@vals) Really!?    Haven't you heard of push?

    And it doesn't have to be that complicated:

    my @hexv;
    while (<DATA>) {
        push @hexv, split;
    }

    Or even:

    my @hexv = map split, <DATA>;


    my $endex = scalar @hexv - 1;

    More correctly written as:

    my $endex = $#hexv;


    foreach my $index ( 0 .. $endex ){
        last if $index == $endex;

    Why not just:

    foreach my $index ( 0 .. $endex - 1 ){
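
Putting those suggestions together, the gap-detection loop from the original post might look something like the following. It is a sketch, with a short inline list standing in for the DATA section and a hypothetical `@gaps` array collecting the results.

```perl
use strict;
use warnings;

# Stand-in for `map split, <DATA>`: a few values from the original data.
my @hexv = qw(00BE 00BF 00D7 00E6 00E7);

# Compare each value with its successor; stop one short of the last index.
my @gaps;
foreach my $index ( 0 .. $#hexv - 1 ) {
    my ( $current, $next ) = map { hex($_) } @hexv[ $index, $index + 1 ];
    push @gaps, sprintf( '%05x to %05x', $current, $next )
        if $current + 1 != $next;
}
print "$_ Not Ok\n" for @gaps;
```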
      Thanks for pointing out all my code sloppiness jwkrahn. Would you believe that I purposely left that in there, just to see who would correct me? :-) But thanks for pointing it out, so that anyone coming upon this node in the future, won't pick up bad sloppy practices.

      I do indeed appreciate the pursuit of perfection.



      @hexv = (@hexv,@vals) Really!? Haven't you heard of push?

      Let's see: perfectly legal syntax, perfectly understandable, and it's faster than push

        Faster?

        $ perl -le'
        use Benchmark qw/ cmpthese /;
        my @data = "a" .. "z";
        cmpthese -4, {
            copy => sub { my @array; @array = ( @array, @data ) for 1 .. 10; return @array },
            push => sub { my @array; push @array, @data for 1 .. 10; return @array },
            }
        '
                Rate  copy  push
        copy   908/s    --  -91%
        push  9660/s  964%    --

        Can you prove it?