in reply to hexdump -C

The use of (?{}) doesn't help.

s/\G(?|(\0{16})+(?{'*'})|(.{1,16})(?{...}))/$^R\n/sgr    What you used.
s/\G(\0{16})|\G.{1,16}/($1?"*":...)."\n"/segr            Could be used.
s/\G.{1,16}/($&eq"\0"x16?"*":...)."\n"/segr              What I used.

Combining this change and tybalt89's changes, we get

#!/usr/bin/perl

use v5.36;

use Path::Tiny qw( path );

sub hexdump( $data ) {
   $data =~ s/\G.{1,16}/
      ( $& eq "\0"x16
         ? "*"
         : sprintf "%08X %-50s|%s|",
            $-[0],
            "@{[unpack q{(H2)8a0(H2)8}, $&]}",
            $& =~ y{ -~}{.}cr
      ) . "\n"
   /segr
   . sprintf "%08x\n", length $data
}

print hexdump( path( $0 )->slurp );

Re^2: hexdump -C
by NERDVANA (Priest) on Oct 14, 2025 at 05:15 UTC
    Actually it was solving the problem that the run of \0 lines needs to be replaced by a single '*', not one '*' per every 16 zeroes.

    But, as I went to create an example for you, I realized I misunderstood the '*' behavior. It prints one full line of 00 00 00 ... and then the '*' means "repeat the previous line". I'd only ever seen it replace zeroes with '*' because that's the most likely repetition to appear in data files.

    $ perl -E 'say "\0"x64 ."A"x64' | hexdump -C
    00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    *
    00000040  41 41 41 41 41 41 41 41  41 41 41 41 41 41 41 41  |AAAAAAAAAAAAAAAA|
    *
    00000080  0a                                                |.|
    00000081
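The "repeat the previous line" rule seen in that output can be sketched as a plain loop, which may be easier to follow than a single substitution. This is only an illustration under assumptions: `squeeze_outline` is a hypothetical helper, not code from this thread, and it returns just the offsets and `*` markers, leaving out the hex and ASCII columns.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns one entry per output
# line, either an offset or '*', without the hex and ASCII columns.
sub squeeze_outline {
    my ($data) = @_;
    my @lines;
    my $prev;            # the most recent chunk that was actually printed
    my $starred = 0;     # true once '*' has been emitted for the current run

    for (my $off = 0; $off < length $data; $off += 16) {
        my $chunk = substr $data, $off, 16;

        # A chunk identical to the previously printed one is squeezed:
        # emit a single '*' for the whole run, then stay silent.
        if (defined $prev && $chunk eq $prev) {
            push @lines, '*' unless $starred++;
            next;
        }

        push @lines, sprintf '%08x', $off;
        $prev    = $chunk;
        $starred = 0;
    }

    # The final line is the total length, as in the output above.
    push @lines, sprintf '%08x', length $data;
    return @lines;
}

print "$_\n" for squeeze_outline("\0" x 64 . "A" x 64 . "\n");
```

For the same input as the command above, this prints the same sequence of offsets and `*` lines.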

    Combining everyone's ideas, I now get:

    sub hexdump($data) {
       $data =~ s/\G(.{1,16})(\1+)?/
          sprintf "%08X %-50s|%s|\n%s",
             $-[0],
             "@{[unpack q{(H2)8a0(H2)8},$1]}",
             $1 =~ y{ -~}{.}cr,
             "*\n"x!!$+[2]
       /segr
       . sprintf "%08X", $+[0]
    }
      That is so cool! I modified it slightly so it also runs on TinyPerl 5.8:

      sub hexdump {
        my $s;
        $_[0] =~ s/\G([\0-\xff]{1,16})/
          $s = $1;
          $s =~ y|\x20-\x7e|.|c;
          sprintf("%08X %-50s|%s|\n",
            $-[0], "@{[unpack q{(H2)8a0(H2)8}, $1]}", $s);
        /ge;
        return $_[0];
      }

      I don't understand why I had to use ([\0-\xff]{1,16}) instead of (.{16}). The latter won't capture anything, which is weird. The second thing I don't understand is why this works:

      "@{[unpack q{(H2)8a0(H2)8}, $1]}"

        Maybe because you were missing the /s? (It was also missing in the original code.)
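A small sketch of what /s changes, in case it helps. The sample string is made up for illustration. Without /s, `.` does not match a newline, so a pattern anchored at the start cannot get past the first "\n". That would be consistent with (.{16}) failing on text input while a class like [\0-\xff] works.

```perl
use strict;
use warnings;

# Made-up sample: a newline at offset 3, followed by sixteen more characters.
my $data = "abc\n" . "d" x 16;

# Without /s, "." refuses to match "\n", so the match stops before it.
my ($without_s) = $data =~ /\A(.{1,16})/;

# With /s, "." matches any character, newline included.
my ($with_s) = $data =~ /\A(.{1,16})/s;

# Demanding exactly 16 characters without /s fails outright here, because
# the anchored match cannot cross the newline at offset 3.
my ($exact16) = $data =~ /\A(.{16})/;

printf "without /s: %d chars\n", length $without_s;
printf "with /s:    %d chars\n", length $with_s;
printf "(.{16}) without /s: %s\n", defined $exact16 ? "matched" : "no match";
```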

        a0 returns a zero-length string. [ ] creates an anonymous array and returns a reference to it. @{ ... } dereferences that reference, giving the array. Interpolating an array in a string joins its elements with spaces by default.
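To see those pieces working together, here is a minimal sketch on a full 16-byte chunk. `hex_columns` is a hypothetical name used only for this illustration. The empty string produced by a0 sits between the two groups of eight, so joining with spaces leaves a double space in the middle.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread) wrapping the idiom in question.
sub hex_columns {
    my ($chunk) = @_;
    # (H2)8 : eight bytes, each unpacked as a two-digit hex string
    # a0    : a zero-length string, giving an empty element in the middle
    # [ ... ] builds an anonymous array from the unpacked list, @{ ... }
    # dereferences it, and interpolation joins the elements with $"
    # (a single space by default).
    return "@{[ unpack q{(H2)8a0(H2)8}, $chunk ]}";
}

print hex_columns("ABCDEFGHIJKLMNOP"), "\n";
```

For that 16-byte input the result is two groups of eight hex pairs separated by two spaces.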