harangzsolt33 has asked for the wisdom of the Perl Monks concerning the following question:

I have a question. What does this line do?:

goto &{$_[0]->{state} = \&stateReadLit};

Why is there an & in front, and what the heck does the { } do there? I understand the basic syntax where goto is followed by a label that contains a string made up of A-Z characters, followed by a semicolon. That means jump to that label. But I have never seen this sort of convoluted code before! I understand what goto does. I understand $_[0] and I guess I know what the \ character does there, but it's super weird the way it's written all together.

I picked this line from a larger Perl script (about 24KB total size), which I inserted below. The script is a decompressor written in pure Perl that expands raw data compressed with the zlib deflate algorithm. I want to study how it works, and I am trying to understand it. But some lines in this code are completely opaque to me.

Edit: I'm trying to figure out how this readmore tag works, but it doesn't seem to work for me. Even though I insert it, it still shows the entire code block when I click on my post. Sorry.

#!/usr/bin/perl -w use strict; use warnings; my $COMPRESSED = get_compressed_data(); my $origsize = length($COMPRESSED); my $output = ''; my $Z_STREAM_END = 1; my $Z_OK = 0; my $MAX_WBITS = 16; my ($static_lit_code, $static_dist_code, @lit_base, @dist_base); my @lit_extra = (-1, 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2, 3,3,3,3,4,4,4,4,5,5,5,5,0,-2,-2); my @dist_extra = (0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8, 9,9,10,10,11,11,12,12,13,13,-1,-1); my @alpha_map = (16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15); undef $/; my $bufsize; for $bufsize (length ($COMPRESSED), 4096, 16, 1) { my $prefix = " $bufsize, "; my ($inflater, $status) = inflateInit(-WindowBits => -$MAX_WBITS); my $input_copy = $COMPRESSED . "N"; while (length $input_copy) { my $bit = substr($input_copy, 0, $bufsize); substr($input_copy, 0, $bufsize) = ''; my $outbit; ($outbit, $status) = $inflater->inflate($bit); die "$prefix inflate status '$status'" unless $status == $Z_OK || +$status == $Z_STREAM_END; die "$prefix inflate undefined" unless defined $outbit; $output .= $outbit; if ($status == $Z_OK) { die "$prefix inflate not all input consume +d" if length $bit; } elsif ($status == $Z_STREAM_END) { last; } } } my $newsize = length($output); print $output; print "\n\n", '-' x 70; print "\n\nInflate: $origsize bytes -> $newsize bytes\n\n"; exit; #################################################################### sub total_in { $_[0]->{isize}; } sub total_out { $_[0]->{osize}; } sub _reset_bits_have { my $self = shift; $self->{val} = $self->{have} += 0; } sub inflateInit { my %args = @_; die "Please specify negative window size" unless $args{-WindowBits} && $args{-WindowBits} < 0; my $self = bless { isize=>0, osize=>0, result=>"", huffman=>"", type0length=>"", state=>\&stateReadFinal }; $self->_reset_bits_have; wantarray ? 
($self, $Z_OK) : $self; } sub inflate { $_[0]->{input} = \$_[1]; my ($return, $status); $_[0]->{izize} += length $_[1]; if (&{$_[0]->{state}}) { # Finished, so flush everything $return = length $_[0]->{result}; $status = $Z_STREAM_END; } else { die length ($_[1]) . " input remaining" if length $_[1]; $return = length ($_[0]->{result}) - 0x8000; $return = 0 if $return < 0; $status = $Z_OK; } $_[0]->{izize} -= length $_[1]; $_[0]->{osize} += $return; wantarray ? (substr ($_[0]->{result}, 0, $return, ""), $status) : substr ($_[0]->{result}, 0, $return, ""); } # Public interface ends here # get arg bits (little endian) sub _get_bits { my ($self, $want) = @_; my ($bits_val, $bits_have) = @{$self}{qw(val have)}; while ($want > $bits_have) { # inlined input read my $byte = substr ${$_[0]->{input}}, 0, 1, ""; if (!length $byte) { @{$self}{qw(val have)} = ($bits_val, $bits_have); return; } $bits_val |= ord($byte) << $bits_have; $bits_have += 8; } my $result = $bits_val & (1 << $want)-1; $bits_val >>= $want; $bits_have -= $want; @{$self}{qw(val have)} = ($bits_val, $bits_have); return $result; } # Get one huffman code sub _get_huffman { my ($self, $code) = @_; $code = $self->{$code}; my ($bits_val, $bits_have, $str) = @{$self}{qw(val have huffman)}; do { if (--$bits_have < 0) { # inlined input read my $byte = substr ${$_[0]->{input}}, 0, 1, ""; if (!length $byte) { # bits_have is -1, but really should be zero, so fix in save @{$self}{qw(val have huffman)} = ($bits_val, 0, $str); return; } $bits_val = ord $byte; $bits_have = 7; } $str .= $bits_val & 1; $bits_val >>= 1; } until exists $code->{$str}; defined($code->{$str}) || die "Bad code $str"; @{$self}{qw(val have huffman)} = ($bits_val, $bits_have, ""); return $code->{$str}; } # construct huffman code sub make_huffman { my $counts = shift; my (%code, @counts); push @{$counts[$counts->[$_]]}, $_ for 0..$#$counts; my $value = 0; my $bits = -1; for (@counts) { $value *= 2; next unless ++$bits && $_; # Ton used 
sprintf"%0${bits}b", $value; $code{reverse unpack "b$bits", pack "V", $value++} = $_ for @$_; } # Close the code to avoid infinite loops (and out of memory) $code{reverse unpack "b$bits", pack "V", $value++} = undef for $value .. (1 << $bits)-1; @code{0, 1} = () unless %code; return \%code; } sub prepare_tables { my $length = 3; for (@lit_extra) { push @lit_base, $length; $length += 1 << $_ if $_ >= 0; } # Exceptional case splice(@lit_base, -3, 3, 258); my $dist = 1; for (@dist_extra) { push @dist_base, $dist; $dist += 1 << $_ if $_ >= 0; } splice(@dist_base, -2, 2); } sub stateReadFinal { my $bit = _get_bits($_[0], 1); if (!defined $bit) { # STALL return; } $_[0]->{final} = $bit; goto &{$_[0]->{state} = \&stateReadType}; } sub stateReadType { my $type = _get_bits($_[0], 2); if (!defined $type) { # STALL return; } $_[0]->{type} = $type; if ($type) { prepare_tables() unless @lit_base; if ($type == 1) { $_[0]->{lit} = $static_lit_code ||= make_huffman([(8)x144,(9)x112, (7)x24, (8)x8]); $_[0]->{dist} = $static_dist_code ||= make_huffman([(5)x32]); # This is the main inflation loop. goto &{$_[0]->{state} = \&stateReadLit}; } elsif ($type == 2) { goto &{$_[0]->{state} = \&stateReadHLit}; } else { die "deflate subtype $type not supported\n"; } } goto &{$_[0]->{state} = \&stateReadUncompressedLen}; } sub stateReadUncompressedLen { # Not compressed; $_[0]->_reset_bits_have; # inlined input read $_[0]->{type0length} .= substr ${$_[0]->{input}}, 0, 4 - length $_[0]->{type0length}, + ""; if (length $_[0]->{type0length} < 4) { # STALL return; } my ($len, $nlen) = unpack("vv", $_[0]->{type0length}); $_[0]->{type0length} = ""; $len == (~$nlen & 0xffff) || die "$len is not the 1-complement of $nlen"; $_[0]->{type0left} = $len; goto &{$_[0]->{state} = \&stateReadUncompressed}; } sub stateReadUncompressed { # inlined input read my $got = substr ${$_[0]->{input}}, 0, $_[0]->{type0left}, ""; $_[0]->{result} .= $got; if ($_[0]->{type0left} -= length $got) { # Still need more. 
# STALL return; } if ($_[0]->{final}) { # Finished. return 1; } # Begin the next block goto &{$_[0]->{state} = \&stateReadFinal}; } sub stateReadHLit { my $hlit = _get_bits($_[0], 5); if (!defined $hlit) { # STALL return; } $_[0]->{hlit} = $hlit + 257; goto &{$_[0]->{state} = \&stateReadHDist}; } sub stateReadHDist { my $hdist = _get_bits($_[0], 5); if (!defined $hdist) { # STALL return; } $_[0]->{hdist} = $hdist + 1; goto &{$_[0]->{state} = \&stateReadHCLen}; } sub stateReadHCLen { my $hclen = _get_bits($_[0], 4); if (!defined $hclen) { # STALL return; } $_[0]->{alphaleft} = $_[0]->{hclen} = $hclen + 4; # Determine the code length huffman code $_[0]->{alpha_raw} = [(0) x @alpha_map]; goto &{$_[0]->{state} = \&stateReadAlphaCode}; } sub stateReadAlphaCode { my $alpha_code = $_[0]->{alpha_raw}; while ($_[0]->{alphaleft}) { my $code = _get_bits($_[0], 3); if (!defined $code) { # STALL return; } # my $where = $_[0]->{hclen} - $_[0]->{alphaleft}; $alpha_code->[$alpha_map[$_[0]->{hclen} - $_[0]->{alphaleft}--]] + = $code; } $_[0]->{alpha} = make_huffman($alpha_code); delete $_[0]->{alpha_raw}; # Get lit/length and distance tables $_[0]->{code_len} = []; goto &{$_[0]->{state} = \&stateBuildAlphaCode}; } sub stateBuildAlphaCode { my $code_len = $_[0]->{code_len}; while (@$code_len < $_[0]->{hlit}+$_[0]->{hdist}) { my $alpha = _get_huffman($_[0], 'alpha'); if (!defined $alpha) { # STALL return; } if ($alpha < 16) { push @$code_len, $alpha; } elsif ($alpha == 16) { goto &{$_[0]->{state} = \&stateReadAlphaCode16}; } elsif ($alpha == 17) { goto &{$_[0]->{state} = \&stateReadAlphaCode17}; } else { goto &{$_[0]->{state} = \&stateReadAlphaCodeOther}; } } @$code_len == $_[0]->{hlit}+$_[0]->{hdist} || die "too many codes" +; my @lit_len = splice(@$code_len, 0, $_[0]->{hlit}); $_[0]->{lit} = make_huffman(\@lit_len); $_[0]->{dist} = make_huffman($code_len); delete $_[0]->{code_len}; goto &{$_[0]->{state} = \&stateReadLit}; } sub stateReadAlphaCode16 { my $code_len = 
$_[0]->{code_len}; my $bits = _get_bits($_[0], 2); if (!defined $bits) { # STALL return; } push @$code_len, ($code_len->[-1]) x (3+$bits); goto &{$_[0]->{state} = \&stateBuildAlphaCode}; } sub stateReadAlphaCode17 { my $code_len = $_[0]->{code_len}; my $bits = _get_bits($_[0], 3); if (!defined $bits) { # STALL return; } push @$code_len, (0) x (3+$bits); goto &{$_[0]->{state} = \&stateBuildAlphaCode}; } sub stateReadAlphaCodeOther { my $code_len = $_[0]->{code_len}; my $bits = _get_bits($_[0], 7); if (!defined $bits) { # STALL return; } push @$code_len, (0) x (11+$bits); goto &{$_[0]->{state} = \&stateBuildAlphaCode}; } sub stateReadLit { while (1) { my $lit = _get_huffman($_[0], 'lit'); if (!defined $lit) { # STALL return; } if ($lit >= 256) { if ($lit_extra[$lit -= 256] < 0) { die "Invalid literal code" if $lit; if ($_[0]->{final}) { # Finished. return 1; } # Begin the next block goto &{$_[0]->{state} = \&stateReadFinal}; } $_[0]->{litcode} = $lit; # BREAK goto &{$_[0]->{state} = \&stateGetLength}; } $_[0]->{result} .= chr $lit; # Back to the main inflation loop # goto &stateReadLit; # ie loop } } sub stateGetLength { my $lit = $_[0]->{litcode}; my $bits = _get_bits($_[0], $lit_extra[$lit]); if (!defined $bits) { # STALL return; } $_[0]->{length} = $lit_base[$lit] + ($lit_extra[$lit] && $bits); goto &{$_[0]->{state} = \&stateGetDCode}; } sub stateGetDCode { my $d = _get_huffman($_[0], 'dist'); if (!defined $d) { # STALL return; } $_[0]->{dcode} = $d; goto &{$_[0]->{state} = \&stateGetDistDecompress}; } sub stateGetDistDecompress { my $d = $_[0]->{dcode}; die "Invalid distance code" if $d >= 30; my $bits = _get_bits($_[0], $dist_extra[$d]); if (!defined $bits) { # STALL return; } my $dist = $dist_base[$d] + ($dist_extra[$d] && $bits); # Go for it my $length = $_[0]->{length}; if ($dist >= $length) { my $section = substr ($_[0]->{result}, -$dist, $length); $_[0]->{result} .= $section; } else { my $remaining = $length; while ($remaining) { my $take = $dist >= 
$remaining ? $remaining : $dist; $_[0]->{result} .= substr($_[0]->{result}, -$dist, $take); $remaining -= $take; } } # Back to the main inflation loop goto &{$_[0]->{state} = \&stateReadLit}; } #################################################################### # This is a test data sub get_compressed_data { return "\xAD\x58\x5D\x73\x1B\xB9\x11\x7C\x26\x7E\x05\x8A\x2F\x67\x55\x51". "\xAC\x5C\xEE\x72\x77\xC9\x3D\x31\x92\x7C\x62\x45\x96\x58\x94\x74". "\x17\x3F\x82\xBB\x20\x09\x6B\x17\xD8\x00\x58\xD2\xCC\xAF\x4F\xCF". "\x60\xBF\x48\x91\xAE\xA4\x2A\xAE\xB2\xBD\xDC\x05\x06\x83\x9E\x9E". "\x9E\x01\x84\x10\x62\x34\x1A\xC9\x97\xAD\x96\xE3\x99\x8F\x26\x44". "\x93\xC9\x07\x93\x69\x1B\xF4\x98\xBF\x8D\x16\x5E\xAB\x72\x55\x68". "\x21\x68\x94\xB1\x51\xDB\x28\xDD\x5A\xC6\xAD\x09\x32\x77\x59\x5D". "\xD2\x0B\x3C\x47\x27\x43\x54\x51\xE3\x8B\x96\x99\xB3\xB9\x89\xC6". "\xD9\x20\x6B\x9B\x6B\x2F\xF7\x5B\x93\x6D\xA5\x12\x0B\x95\xBD\xA9". "\x8D\x96\xA5\x3A\xC8\x15\x8D\xAB\x8C\xCE\x27\x32\xD4\xF8\x1A\xB7". "\x2A\xF2\xEC\x1B\x57\x1D\xBC\xD9\x6C\xA3\xBC\x77\x05\xCD\x2E\x15". "\x16\xC6\xDF\x20\x83\x2B\xB5\x08\x1A\x1E\x29\x9B\x69\x72\x44\xB5". "\x7E\x63\xCD\xE8\x5D\x21\xDD\x0E\x33\xC8\x4C\xAE\x77\xBA\x70\x55". "\xD9\x79\xAC\x65\x95\x96\x9F\x08\xF8\x53\x68\xB9\x31\x3B\x63\x37". "\xFC\xA5\x0E\xDA\x87\x93\x61\xFC\x9C\x1C\xC1\xEE\x30\x42\x2A\x9B". "\xCB\x1C\xCB\x79\xB3\xAA\xA3\x16\xF4\xBD\xDD\x91\xB1\x52\xC9\xD2". "\x79\x7D\xED\xFC\x75\xA1\x43\x90\x59\x1D\xA2\x2B\x95\x3F\xC8\xB5". "\x0A\x5B\x80\x31\x91\x55\x51\x87\x63\xAB\xA5\x7A\xD3\x02\x18\x07". "\x67\x15\x60\x86\x85\xDC\xAC\x4D\xA6\x18\xBC\xA9\x10\xB7\x7A\x6D". "\x6C\x82\xF2\x6F\x88\xC8\xB8\x59\x6E\x2C\xBD\x5E\x93\xC7\x30\x91". "\x00\x2F\x0A\x9D\xD1\x30\xDA\xC3\x1A\x9B\x0B\x03\x47\x73\xB9\x3A". "\xD0\x30\x31\x3A\x85\x76\x92\xB6\xA4\xBD\xD9\x61\xC9\x9D\x6E\x20". "\x40\x20\xCE\x58\x14\xA3\x0C\x9E\x92\xB9\xB8\xF5\xAE\xDE\x20\x64". "\xFA\x6B\xAC\x55\x71\xE4\xF5\x94\xDC\x7C\x8E\xB0\xAB\x7C\x2E\x7F". 
"\x87\x93\x78\x39\xF4\x97\x83\xAD\x7A\xE0\xD6\xD2\x44\xB9\x55\x41". "\x5A\x17\x41\x0A\x6D\xC5\x28\xD9\x23\x66\x38\xCF\x9F\xE8\xB5\x6C". "\xDF\x32\xD6\x59\xE6\x7C\xCE\x24\xD8\x9B\xB8\x65\x10\xF6\x26\x6C". "\xC9\xCB\x26\x8A\xEF\x68\x04\x3B\xA1\xD2\x59\xB2\xB1\x02\x37\xF6". "\xEC\xEB\xE9\xB8\x31\xD1\x79\xBF\x75\x9A\x88\x84\x47\xAB\xCA\xB4". "\x66\x02\xBA\x1D\xEC\x3C\xF0\x68\x7F\x05\xB9\x76\x7E\xC8\x1D\xB6". "\xFC\xD9\xD5\x6C\xEC\xE0\xEA\x09\xED\x13\xFF\x7F\xE7\x89\x56\xC6". "\xBE\x11\xF1\xD4\xCA\xD5\x91\x4D\xD2\x2F\x18\xE8\x42\x86\xDF\x62". "\xC4\x39\xB6\x18\xD8\x5B\xF6\x3C\x69\x27\xAD\xB5\x6E\xFC\x45\x60". "\xC8\x61\xAC\x21\x33\x65\xE5\x17\xB0\xCF\xAC\x0F\xD2\xD9\x14\xF9". "\x95\x0A\x86\xA3\x8B\xCD\x18\x85\xF9\x21\x4E\x64\x5E\x57\x45\x13". "\x36\x99\x6D\x95\xDF\xE8\x30\x91\xD1\x94\x9C\x59\x95\x76\x55\x41". "\xC4\xDE\xB9\x62\x87\x58\x88\x11\x71\x25\x38\x98\x9C\x4A\xF9\x01". "\x9B\x03\xE2\x45\xD1\x84\x0D\x01\xFE\x57\x6D\x3C\x91\xC3\x75\x8B". "\x9B\xD8\x10\xF4\x1C\xF3\xB0\x4D\x98\x2A\x0E\x3D\x87\xCB\x8A\x77". "\x4E\x4F\x65\x0D\xD6\x1F\x24\x88\x58\x90\x5B\x58\x3B\x50\x7E\x29". "\xFF\xA6\x63\x22\x68\x89\x35\xB0\xAE\x4A\xB0\x03\x87\xE9\x15\x61". "\xF4\xD1\x6B\x0D\x9B\xB3\x9D\x32\x05\x21\x35\xC6\x86\x95\x0D\x69". "\x8E\x75\x34\x90\xF0\x4A\xBB\xCD\xBB\xB0\x99\xA8\x4B\x31\x32\x31". "\xE8\x62\x0D\x08\xB6\x89\xDE\x5B\xED\x3B\xAD\xC2\xC4\xD0\x81\x41". "\x74\xD8\x02\x8E\xA2\xD5\x0F\x9A\x3F\x15\xA3\x79\x94\xAA\x00\x44". "\x83\x35\x3D\x18\x07\x95\xB3\xB1\x93\x17\x1A\xCB\x56\x81\x56\x97". "\xA3\x78\x2B\x46\x49\x2C\x69\x4C\x00\xE9\x86\x32\x8A\x77\x34\x3E". "\xD3\x86\x57\x8F\xE0\xC3\xF7\x53\x49\x31\x20\x43\xA4\x22\x9C\xC9". "\x50\x34\x3C\xEC\xF1\x0A\x5C\x58\x21\xB0\x65\xD2\xD8\x6E\xED\xE0". "\x6A\x8F\x9C\xC1\xB6\xCB\xE6\x95\x38\xCD\xD5\x4E\xDE\xDB\x0C\xA5". "\x04\x23\xA2\x7A\x4D\xCE\xB2\x26\x40\xCA\xBC\xDB\x99\x9C\xB5\x00". "\x9B\x04\xEB\x44\x4B\x26\x38\x00\x56\x34\xEB\x39\xC4\xDC\x58\x48". "\x44\x9F\x38\xE0\x0B\x4A\x4C\x60\x7F\x55\x08\x2E\x33\xAC\x29\x80". 
"\x22\x2B\x14\xB8\xE7\x49\xF7\xFE\xDC\x6F\x4E\x55\x15\x02\xBA\xAA". "\x41\x76\xF3\x95\x08\x5A\x39\x1F\xD5\xCA\x14\x44\x10\x7E\xC5\xA6". "\x1C\x45\xEB\x58\x3F\x05\x0B\x1B\x45\xD9\xBB\x92\xDD\x59\xD4\x2B". "\xF8\x28\x6F\x1D\x95\x14\xCA\xB8\xEE\xCB\x29\x3F\x41\xF1\x59\x8B". "\x80\x18\x2A\x4F\x23\x5F\x84\x71\xD8\xD2\x4E\x41\x75\xFC\xCB\xD5". "\xCC\x06\x40\xC2\x39\x00\x8B\xA7\xB8\x62\x57\x3F\xF4\xBB\x62\x77". "\xA1\x58\x8D\xE4\x1F\x08\x42\xCF\x28\xBD\xC3\x9F\xE4\xCE\x1E\x68". "\xC5\x1E\x76\xD1\xC2\x8E\xAF\x28\x5D\xA0\x1D\x7D\x2A\x8D\xA5\x6A". "\x97\x10\xA6\x79\x5A\xC1\x59\x70\xDD\x32\xD7\xA9\xE6\x51\x91\x26". "\xD2\x6E\xDD\x9E\x60\x43\x25\x84\xB2\xB2\x68\x34\xA3\xD8\x30\x0D". "\x4D\xB5\xE1\x5D\x9C\x51\xF5\x39\x2D\xA1\x45\x51\x3E\x3D\xDE\xB5". "\x34\x5A\xA3\x60\xB8\x3D\x4C\xA3\x4E\x49\xFC\x51\x57\x28\x77\x0A". "\x6E\xF0\xC6\x8E\xE2\xD2\x8A\xE9\xBB\x68\x0C\x40\x21\x46\xE3\x57". "\xC9\xB6\x4E\x93\xBA\xE9\x18\xA8\x30\x1C\x40\x87\xC0\x3B\x0A\xCA". "\xE4\x27\xEB\x40\x59\x5E\x83\xB6\x9A\xD5\x9A\x9D\x02\x24\x90\xA9". "\x9D\x2A\x08\x27\x92\xC2\xBA\xE4\x22\x43\xAE\xB6\xB9\x7C\x6C\xC3". "\x71\x61\x57\x5F\x30\x48\xF9\x6C\x0B\x3E\xB1\xA5\x80\x2C\xEE\xBC". "\xA8\x6B\xAC\x31\xAD\xEB\x29\xFE\x63\x73\x70\x4B\x35\x70\x9C\xAF". "\x45\xF0\xCC\xD8\xAC\xA8\xF3\x64\xED\x32\x46\xE7\xB3\x53\x0F\xEA". "\x02\xCD\x5F\x5D\x71\x7F\xD2\x7B\x8F\x80\xB5\xF4\x61\x8D\xA5\x1C". "\x36\xB6\x25\x99\x47\x0A\x25\xD5\x27\xD0\xFD\x46\x59\xF3\xEF\xB6". "\x78\x93\xB9\xEC\x0A\xC9\x4E\x75\x8F\x89\x67\x9D\xBD\x0E\xAD\x1B". "\xFA\xAB\xCE\xEA\x48\x41\xA0\x6E\x8C\x57\xA4\x81\xD4\x0C\x72\x25". "\x40\x12\xAC\x11\xD4\xC8\x76\xB8\x34\x9F\x9B\x3A\x69\x3A\x42\x16". "\x71\x56\x4C\xE4\x4F\x4B\xB5\x23\xE2\xA5\xB0\xC9\xA0\x2B\xE5\x49". "\x5B\x4A\x65\xA9\xE3\xA8\x68\x67\x24\xDE\xCC\xF0\x0B\x1E\x36\x9D". "\x0C\xA8\xEA\x8B\x03\x1B\x6A\x3B\xD6\xC0\xFC\x47\x79\x02\x56\xDC". "\x97\x74\x3A\x70\x26\x6B\x79\xE2\x55\xA2\x64\x52\x99\xBE\x4E\x13". "\x3B\xBC\xA7\xCC\x49\x66\xBB\x66\xE4\x9D\x9E\x08\xF1\x63\x9F\xFD". 
"\x03\xDD\xE7\xA6\xC1\xBB\x8D\x57\x65\x38\x97\xFA\x6E\xF5\x05\x7D". "\x18\x70\xCD\x49\x4D\xC5\x60\x77\x24\xE2\x67\x84\xF8\x42\x82\xCA". "\x33\x09\x3A\x70\x43\x5D\x64\xDA\x30\xE2\x14\x98\xC2\xAC\x3C\x77". "\xB5\xD4\x13\x4E\xD8\x54\x74\x1B\xCD\xC0\xF0\xF6\xA1\x48\xD1\xD7". "\x59\xA2\xF1\x87\x86\xC7\xC3\xB8\x51\xD8\xBA\x2C\xBC\xA2\x0C\xDB". "\x53\xA5\x6D\x4C\xC9\x8D\x8E\xDF\x0A\x05\xB8\x4E\xDD\x5F\x59\x11". "\x37\xB9\xD1\x1F\x46\xA3\x0B\x40\x09\x62\x40\x0E\xAF\xD1\xB2\xE6". "\x0C\x57\x53\xFA\xDC\x3A\xAD\x33\xE8\xDD\x79\xCE\xFB\x14\xEC\x93". "\x81\xCB\xEA\xC5\x34\x38\xFA\xC0\xC9\x90\x18\x7C\x8E\x78\x8D\xC3". "\x44\x3A\x6D\x33\xEE\x28\x86\xD0\x00\xAF\x63\x70\x26\x1D\xB6\x7D". "\x42\x1D\xE1\xDB\x82\xF7\x5F\x00\xF7\xFF\xE0\xF0\x5F\x7A\x0E\xA7". "\x16\x0A\xC4\xF1\x67\xDB\x52\xCE\x4E\x0A\xD1\xD1\x42\x0D\xC1\xDB". "\x83\x20\xAA\xEC\xA9\x39\xCC\xA0\xD9\xA9\x24\x39\x17\x92\xA1\x50". "\x57\x54\xF6\xBF\x31\x9F\xC5\xA7\x75\xA9\x5D\xFF\x38\x99\xB8\xBB". "\xC3\x94\x7B\xB7\xA7\x5E\x79\x22\x0E\xE7\xD2\xF1\xA4\xF2\x6E\x36". "\x5E\x6F\x48\x79\x18\x96\x84\xDD\x07\xD4\x9C\x60\x56\x88\x2D\xB5". "\xAA\xDA\xA3\x85\x29\xAE\xFA\x2C\x46\x45\x80\x5A\xB1\xB3\x2A\xF5". "\xAF\x83\x29\x72\x38\x25\xB8\x75\xDC\x2B\x50\xFF\x08\xA4\xB3\x29". "\x4D\xFB\x53\x39\xDC\x8E\x26\x9C\xB8\x49\xBD\xB1\xC0\xA4\x1C\xA4". "\xA0\x45\x99\xCA\x6E\x6F\x07\xE8\xE0\xC8\xCC\xF6\xFA\x59\xDF\x05". "\x3E\xCE\xFB\xCA\xEB\xD8\x64\xAE\xB1\x82\x0A\x64\xAF\x30\x8D\x2D". "\xB0\x12\x05\xAD\x48\xA7\x96\xAB\x5F\x93\x99\xD4\xFC\x34\x6D\x0F". "\xF8\xA8\x73\x99\x7A\x74\xCA\xE4\xB6\xBF\x6C\xD1\x3B\xD3\x32\xB6". "\x2D\x7F\x81\xE5\x2F\x69\x8F\x18\x7A\x68\xB8\xE0\xF0\x46\x60\x04". "\x5C\xFC\x69\xCA\x57\x17\x21\xF3\xA6\x8A\x67\xB4\x89\x49\x53\x98". "\xE4\x98\xB1\x38\x5C\x50\x92\x50\xB1\x67\xA8\xF8\xBD\x40\x63\x4B". "\x1F\x3A\xFD\xBF\x28\xC5\x6D\x08\x6A\x3A\xD4\xA3\xBF\xC2\xEE\xE9". "\x64\x5F\x14\xA2\xEF\xDA\x07\xA7\xC3\xE3\xD9\xE9\xB4\x43\x67\x4E". "\x6A\x08\x5C\x77\xBC\xDC\xA0\x39\xA1\xB2\x46\xFD\x9C\x2E\x93\x6C". 
"\x34\xE7\x8D\x80\x94\x1B\x70\xA5\x38\x1C\x7D\xED\x68\x99\xB7\xE9". "\x7A\x94\x14\xF3\x75\xEA\x4D\x5A\x6C\xB0\xE9\x63\x68\xC0\xB9\x6F". "\xDA\x90\x3B\x1C\x12\xD3\x81\xE1\x9A\xB6\x8A\x31\x63\xEC\xB3\x2E". "\xAB\x31\x19\xC3\x33\xD1\x84\x0E\x57\x38\x1B\xE4\x8C\x56\x42\x95". "\xCF\xB5\x62\x85\x9E\x1F\x6B\x0D\xA8\x64\x4A\x86\x21\x52\xBB\x79". "\xAA\x08\xA9\x8D\xB2\x69\x4C\x22\x96\xB0\xDA\x70\xAA\xBD\x27\xD8". "\x39\x3D\xE9\xA2\x64\x49\x2B\x98\x99\x38\x4F\x51\x74\x64\x1F\x9D". "\xC1\xF9\x85\x1D\x5E\xA0\xA3\x40\xAC\xAB\x6D\x90\x3F\x30\xB6\x3F". "\x9E\xAF\xA6\x82\xE2\xEE\x35\x68\x18\x48\xC1\x5B\x6F\x4F\x37\x97". "\xBC\xBB\x74\x9A\x12\x7D\xC3\xF6\xF3\x54\xDE\xC0\xCA\xCA\x3B\x3A". "\xEE\x36\x9A\xCF\x05\xCD\xAB\x46\x20\x2A\x43\x88\x0F\xC7\x50\x27". "\xC0\x65\xA0\x80\x50\xD7\x54\x2A\xAE\x7A\x7E\xAF\xF8\xF8\xD0\x64". "\x80\x7D\xE3\x83\x0A\x9F\xAA\x4F\x5A\x09\x9F\x5A\x4F\xA1\xCB\xBA". "\x50\xDC\xBF\xF6\x0B\xD0\xE4\x9D\xF2\x26\xD5\xB4\xA6\xF6\xB7\x8B". "\xC9\x9C\xEE\xA1\xDA\x8B\xA4\x01\x4D\x12\xD6\xCD\x15\xC0\xE0\x00". "\xD4\xEA\xDF\xFB\x24\x50\x7C\x03\xA2\xC5\xA0\x0F\xC7\xB8\x94\x9E". "\x9C\xA7\x7D\x5C\xE4\x4F\x47\x01\x01\xFC\x47\x1E\x37\x91\x49\x07". "\x97\x63\x77\xFB\x23\x53\x0A\xE4\xDE\xD5\x94\x4D\x8A\x9A\xE4\x74". "\x29\xB1\xC6\x51\x82\xFD\x40\x0A\x20\xB0\x1C\xAA\x08\x82\xF4\x37". "\x39\xAD\x31\x44\xEC\x97\xA9\x9C\xF5\x4A\x76\xF6\x80\x2C\xD5\x20". "\x59\x8F\x39\x8A\x91\xAA\x80\x2B\x41\x54\xDA\x97\x26\x52\xC2\xBD". "\xD7\x42\x72\xED\x5D\xFB\x17\x3A\xBD\xFB\x35\x8D\x34\x61\x92\x8E". "\x6C\xD6\xF1\x4D\x27\x30\x83\xBD\xB2\xE2\x3B\xD8\x52\xE5\xBA\xBD". "\x54\x3C\xAB\xF5\x6B\x45\x2D\xC7\xCE\x50\x21\xD2\xA2\xB9\x77\xD1". "\x08\x3C\x5D\x7F\xB6\x31\xBF\xB0\x0D\xA8\xCA\x33\x31\x9F\xFC\x1C". "\x46\x5D\xFC\x0F\xE9\x09\x28\xFF\x9A\x44\x9B\x0F\x17\x97\x2E\xEA". "\xDA\x72\xBE\x62\x54\xF8\x46\x09\x4E\x3A\x1F\x74\xA3\xDE\xA5\x8B". "\xBA\x2D\x78\x41\x9E\x9C\xF2\xB9\x4E\xA4\xB2\xDA\x5D\x5D\x34\xF7". "\x7E\x19\x26\x1B\x98\xD8\x7B\x8A\x02\x0A\x2D\xC5\x23\x34\x2D\xD2". 
"\xF7\x7F\x82\x6B\xF7\xF3\x67\xB9\x98\xDD\xFC\x63\xF6\xDB\x9D\xA4". "\xC7\xE5\xD3\xEF\xF3\xDB\xBB\x5B\x39\x9E\x3D\xE3\xF7\x58\xCE\x1E". "\x6F\xE5\x1F\xF3\x97\xFB\xA7\xD7\x17\x3C\x7F\x96\x77\xFF\x5C\x2C". "\xEF\x9E\x9F\xE5\xD3\x52\xCC\x3F\x2D\x1E\xE6\x18\xFA\xC7\x6C\xB9". "\x9C\x3D\xBE\xCC\xEF\x9E\x27\x72\xFE\x78\xF3\xF0\x7A\x3B\x7F\xFC". "\x6D\xD2\xCD\x7A\x98\x7F\x9A\xBF\xCC\x5E\xE6\x4F\x8F\x13\x2C\x87". "\x55\xD2\x34\xD1\x4F\x93\x4F\x1F\xE5\xA7\xBB\xE5\xCD\x3D\xFD\xFC". "\xFB\xFC\x61\xFE\xF2\x99\xD7\xFD\x38\x7F\x79\xA4\xB5\x3E\x3E\x2D". "\xE9\x6E\x62\xB6\x7C\x99\xDF\xBC\x3E\xCC\x96\x72\xF1\xBA\x5C\x3C". "\x3D\xDF\x4D\xD3\x95\x3D\xC1\x7B\x87\x93\xFD\x7F\x00"; } #################################################################### __END__ Compress::Zlib::Perl Partial Pure perl implementation of Compress::Zlib SYNOPSIS use Compress::Zlib::Perl; ($i, $status) = inflateInit(-WindowBits => -MAX_WBITS); ($out, $status) = $i->inflate($buffer); DESCRIPTION This a pure perl implementation of Compress::Zlib's inflate API. Inflating deflated data Currently the only thing Compress::Zlib::Perl can do is inflate compre +ssed data. A constructor and 3 methods from Compress::Zlib's interface are replicated: inflateInit -WindowBits => -MAX_WBITS Argument list specifies options. Expects that the option -WindowBits i +s set to a negative value. In scalar context returns an C<inflater> object; +in list context returns this object and a status (usually C<Z_OK>) inflate INPUT Inflates this section of deflate compressed data stream. In scalar con +text returns some inflated data; in list context returns this data and an o +utput status. The status is C<Z_OK> if the input stream is not yet finished, C<Z_STREAM_END> if all the input data is consumed and this output is t +he final output. inflate modifies the input parameter; at the end of the compressed str +eam any data beyond its end remains in I<INPUT>. Before the end of stream +all input data is consumed during the C<inflate> call. 
This implementation of C<inflate> may not be as prompt at returning data as Compress::Zlib's; this implementation currently buffers the last 32768 bytes of output data until the end of the input stream, rather than attempting to return as much data as possible during inflation.

total_in
Returns the total input (compressed) data so far

total_out
Returns the total output (uncompressed) data so far

Ton Hospel wrote a pure perl gunzip program. Nicholas Clark, (nick@talking.bollo.cx) turned it into a state machine and reworked the decompression core to fit Compress::Zlib's interface.

COPYRIGHT AND LICENSE
Copyright 2004 by Ton Hospel, Nicholas Clark

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Re: Weird syntax. What does this goto statement do?
by Athanasius (Archbishop) on Dec 30, 2023 at 09:19 UTC

    Hello harangzsolt33,

    From goto:

    The goto &NAME form is quite different from the other forms of goto. In fact, it isn't a goto in the normal sense at all, and doesn't have the stigma associated with other gotos. Instead, it exits the current subroutine (losing any changes set by local) and immediately calls in its place the named subroutine using the current value of @_. This is used by AUTOLOAD subroutines that wish to load another subroutine and then pretend that the other subroutine had been called in the first place (except that any modifications to @_ in the current subroutine are propagated to the other subroutine.) After the goto, not even caller will be able to tell that this routine was called first.

    NAME needn't be the name of a subroutine; it can be a scalar variable containing a code reference or a block that evaluates to a code reference.
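A minimal sketch of the behaviour quoted above, using two hypothetical subs (`wrapper` and `target`, not taken from the posted script). It assumes nothing beyond core Perl: `target` records what `caller` reports and which arguments it received after `wrapper` hands over with `goto &target`.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub target {
    my @args = @_;
    # caller(0) describes the current frame; element 3 is the sub name.
    my $subname = (caller(0))[3];
    # caller(1) would describe an enclosing sub frame, if there were one.
    my $parent  = (caller(1))[3];
    return { args => \@args, sub => $subname, parent => $parent };
}

sub wrapper {
    push @_, 'added by wrapper';   # changes to @_ are passed along
    goto &target;                  # wrapper's frame is replaced by target's
}

my $seen = wrapper('original');

print "args:   @{ $seen->{args} }\n";
print "sub:    $seen->{sub}\n";
print "parent: ", (defined $seen->{parent} ? $seen->{parent} : '(none)'), "\n";
```

With an ordinary `return target(@_);` in `wrapper`, the `parent` entry would name `wrapper`; with `goto &target` there should be no such frame left to report.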

    I’ve seen this in the documentation, but never had any need to use it.

    Hope that helps,

    Athanasius <°(((><contra mundum Just another Perl hacker,

      > but never had any need to use it.

      In most cases it's used in combination with AUTOLOAD because after delegating the call you don't want to return to the AUTOLOAD routine, hence a call frame must be skipped.
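As a rough illustration of the AUTOLOAD case, here is a sketch with a hypothetical package `Local::Lazy` whose `get_*` functions are generated on first use. The package name, the `get_` naming rule and the returned strings are all assumptions made up for the example.

```perl
use strict;
use warnings;

package Local::Lazy;   # hypothetical package name

our $AUTOLOAD;

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # never autoload destructors

    die "No such function: $name" unless $name =~ /^get_(\w+)$/;
    my $field = $1;

    # Build the real sub once and install it, so AUTOLOAD is only
    # reached the first time this name is called.
    my $code = sub { return "value of $field" };
    {
        no strict 'refs';
        *{"Local::Lazy::$name"} = $code;
    }

    # Hand over to the new sub with the current @_; AUTOLOAD's own
    # frame does not stay on the call stack.
    goto &$code;
}

package main;

print Local::Lazy::get_colour(), "\n";   # first call goes through AUTOLOAD
print Local::Lazy::get_colour(), "\n";   # later calls hit the installed sub
```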

      Some use it to implement continuations or coroutines, but it's not very performant compared to languages like LISP.

      Similarly, you can implement a case/switch mechanism using it with named subroutines instead of labels, but without much of a speed gain compared to a classical dispatch table.

      Edit

      The author in question is implementing a state machine via dispatch tables and doesn't want the call stack to fill up with every state switch.

      He could also have used classic goto LABEL;, but (obviously) using subs gives him more flexibility in maintenance than spaghetti code would.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      see Wikisyntax for the Monastery

        Some years ago, i used the goto when i needed a throw method with a stack trace that starts where the method was called. I used Exception::Class exceptions, and the method looked something like this:

        sub throw {
            my $self = shift;
            my %args = @_;
            @_ = ('Local::Exception',
                  # ... more params taken from %args and from the object ...
                 );
            goto &Exception::Class::Base::throw;
        }

        You said "coroutines", but I think you meant "tail call elimination"?

        I used it in Karel to implement WHILE and REPEAT (basically, if the loop should continue, the step that evaluated the condition is directly followed by the step that runs the first command in the loop body).

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      Thank you! That's amazing..
Re: Weird syntax. What does this goto statement do?
by eyepopslikeamosquito (Archbishop) on Dec 30, 2023 at 09:40 UTC

      There seems to be a common theme of golf from eyepopslikeamosquito

      How about some scores...

      use List::Util qw(shuffle);
      @n=qw(Andrew Bob Charlie David);sub h{my$n=pop;$n=~s/([aeiou])/lc$1/eg;$n}print"$_\t"for shuffle map{h($_)}@n;print"\n",map{$_."\t".int(rand(10)+70)."\t".int(rand(10)+70)."\t".int(rand(10)+70)."\t".int(rand(10)+70)."\n"}@n;

      Wow, interesting!

      No, the only reason I included the entire block of code is because originally, it was longer and it did not run at first (on TinyPerl 5.8), so I made some changes to it, and now it works on TinyPerl 5.8. I tried to cut it down to minimum size, so only the inflater is included, nothing extra. In addition, I inserted a block of sample compressed code so it runs. I wanted to make sure I include something that runs without errors, not just a line of code without any context. Btw I tried to use the readmore tag, and it doesn't seem to work. I don't know why.

        The readmore tag works when the node is embedded in another node (e.g. in the Monastery Gates or as a reply to another node). When viewing the node itself, the tag is always expanded. I configured a different CSS style for it so I can detect it even when expanded.

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Weird syntax. What does this goto statement do?
by ikegami (Patriarch) on Dec 30, 2023 at 14:07 UTC

    &{ ... } is a dereference.

    goto &... is a sub call identical to &..., except that the current stack frame is removed from the call stack. This makes the caller of the current sub the caller of the new sub. In this case, I suspect it's used to prevent the call stack from endlessly growing.
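To make the stack-growth point concrete, here is a small sketch that records the call-stack depth at each hop, once with ordinary recursive calls and once with `goto &sub`. The helper `stack_depth` and the subs `hop_call` and `hop_goto` are hypothetical names used only for this comparison.

```perl
use strict;
use warnings;

# Count how many sub frames are currently on the call stack.
sub stack_depth {
    my $depth = 0;
    $depth++ while caller($depth);
    return $depth;
}

my @depths;   # filled in by the subs below

# Ordinary call: each hop adds a frame.
sub hop_call {
    my ($left) = @_;
    push @depths, stack_depth();
    return if $left <= 0;
    return hop_call($left - 1);
}

# goto &sub: each hop replaces the current frame.
sub hop_goto {
    my ($left) = @_;
    push @depths, stack_depth();
    return if $left <= 0;
    @_ = ($left - 1);          # the target sees whatever is in @_ now
    goto &hop_goto;
}

hop_call(3);
my @with_call = @depths;

@depths = ();
hop_goto(3);
my @with_goto = @depths;

print "plain calls: @with_call\n";
print "goto &sub:   @with_goto\n";
```

The first list should climb by one per hop while the second stays flat, which is the property a long-running state machine relies on.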

      Wouldn't it be easier to just write: goto &stateReadLit; ? There appears to be a sub by the name stateReadLit, so if he wants to jump there, why not just goto and then insert the sub name? But there's also an equal sign in this line and an arrow -> and a bareword "state". I don't even know why this is legal. Nowhere in the script is "state" defined or declared. I mean if it was a scalar variable, I would expect that there would be a my $state = 0; somewhere, but no. The word "state" is just a word that pops up all of a sudden, and I don't know why Perl recognizes a random word that's not a keyword. I mean I looked in Perl functions list, and "state" is not a builtin Perl function.
        I mean I looked in Perl functions list, and "state" is not a builtin Perl function.

        Yes, it is.


        🦛
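One detail the exchange above leaves implicit: in `$_[0]->{state}` the word `state` sits inside the braces of a hash subscript, where Perl treats a simple identifier as a string key, so no declaration is needed for it. A short sketch with a hypothetical hash and a stand-in `stateReadLit` (not the real sub from the script):

```perl
use strict;
use warnings;

sub stateReadLit { return 'in stateReadLit' }   # stand-in, not the real sub

my $obj = {};                                   # hypothetical object hash

# A simple identifier inside the { } of a hash subscript is auto-quoted,
# so both lines below address the same key, the string 'state'.
$obj->{state} = \&stateReadLit;
my $same_ref  = $obj->{'state'};

print join(',', keys %$obj), "\n";
print $same_ref->(), "\n";
```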

        Wouldn't it be easier to just write: goto &stateReadLit;?

        All you did was remove the assignment to $_[0]->{state} (then collapsed a remaining reference-dereference). Why do you think removing the assignment is ok?
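A sketch of why the assignment matters, assuming (as `inflate` in the posted script appears to do with `&{$_[0]->{state}}`) that the driver re-enters through the stored code ref each time a new chunk of input arrives. The states `state_tag` and `state_body`, the driver `feed`, and the one-character "tag" format are all invented for the example.

```perl
use strict;
use warnings;

# Hypothetical format: the first character of the stream is a tag,
# everything after it is body text.
sub state_tag {
    my $self = $_[0];
    my $char = substr(${ $self->{input} }, 0, 1, '');
    return if !length $char;                    # STALL: no input yet
    $self->{tag} = $char;
    goto &{ $self->{state} = \&state_body };    # record the new state, then jump
}

sub state_body {
    my $self = $_[0];
    $self->{body} .= ${ $self->{input} };       # consume the whole chunk
    ${ $self->{input} } = '';
    return;                                     # STALL until the next chunk
}

# Driver: resumes at whatever state the previous call recorded.
sub feed {
    my ($self, $chunk) = @_;
    $self->{input} = \$chunk;
    $self->{state}->($self);
    return;
}

my $reader = { tag => '', body => '', state => \&state_tag };
feed($reader, $_) for '', 'Tab', 'cd';

print "tag=$reader->{tag} body=$reader->{body}\n";
```

If the jump were written as a plain `goto &state_body`, `$reader->{state}` would still point at `state_tag`, so the last chunk would lose its first character to the tag instead of being appended to the body.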

Re: Weird syntax. What does this goto statement do? (Short answer)
by LanX (Saint) on Dec 30, 2023 at 18:44 UTC
    > goto &{$_[0]->{state} = \&stateReadLit};

    Short answer: the author is implementing a state machine.

    • &{...} is dereferencing a code-ref
    • The subs are the actions per state.
    • The goto &NAME syntax guarantees that the call stack isn't needlessly filled up with every state change. (See caller.)
    • The current state is kept as a code-ref in an attribute of the object (a blessed hash) passed as the first argument.

    The "weird" code ...

    • goto &{$_[0]->{state} = \&stateReadLit};

    ... could be translated to a more explicit version:

    my ($self,...) = @_;
    ...
    my $c_next = $self->{state} = \&stateReadLit; # keep track
    goto &{$c_next};

    or even

    my ($self,...) = @_;
    ...
    $self->{state} = \&stateReadLit;
    goto &stateReadLit;
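The equivalence can be checked with a runnable sketch. It relies on the fact that an assignment is an expression whose result is the value just stored, so `&{ ... }` dereferences the code ref that was assigned. The subs `compact` and `explicit`, the `name` key and the stand-in `stateReadLit` are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the real state sub; reports who called it and with what.
sub stateReadLit {
    my ($self, @rest) = @_;
    return "$self->{name}: @rest";
}

sub compact {
    # The assignment runs first; its result is the code ref that &{...} dereferences.
    goto &{ $_[0]->{state} = \&stateReadLit };
}

sub explicit {
    my $self = $_[0];          # copy only, so @_ is left untouched for the goto
    my $next = \&stateReadLit;
    $self->{state} = $next;    # keep track of the current state
    goto &$next;
}

my $one = { name => 'one' };
my $two = { name => 'two' };

print compact($one, 'data'), "\n";
print explicit($two, 'data'), "\n";
```

Both objects should end up with `state` pointing at the same sub, and both calls should reach `stateReadLit` with the original argument list.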

    update

    Personally, I have some doubts about this design decision: it doesn't seem like the code ref in $obj->{state} is ever used for anything more than a Boolean check.

    If I were storing a state, I'd use a readable name, not a ref.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

Re: Weird syntax. What does this goto statement do?
by ikegami (Patriarch) on Jan 03, 2024 at 18:55 UTC

    A simple state machine (with information missing):

       INITIAL_STATE   SOME_STATE     FINAL_STATE
             +-----+        +-----+        +-----+
    start--->|     |---+--->|     |------->|     |--->accept
             +-----+   |    +-----+        +-----+
                       |       |
                       +-------+

    We can imagine a state machine as a bunch of gotos.

    INITIAL_STATE: {
        ...
        goto SOME_STATE;
    }

    SOME_STATE: {
        ...
        if ( ... ) {
            goto SOME_STATE;
        } else {
            goto FINAL_STATE;
        }
    }

    FINAL_STATE: {
        ...
    }
    We could replace those gotos with sub calls.
    sub initial_state {
        ...
        return some_state();
    }

    sub some_state {
        ...
        if ( ... ) {
            return some_state();
        } else {
            return final_state();
        }
    }

    sub final_state {
        ...
        return;
    }

    initial_state();

    The stack will keep growing and growing unless we perform tail call elimination, which can be done using goto &SUB

    sub initial_state {
        ...
        goto &some_state;
    }

    sub some_state {
        ...
        if ( ... ) {
            goto &some_state;
        } else {
            goto &final_state;
        }
    }

    sub final_state {
        ...
        return;
    }

    initial_state();

    This is the approach used by the code in question. goto &sub is slow, though. (Slower than a sub call.) I think we could use a loop to speed things up.

    sub initial_state {
        ...
        return \&some_state;
    }

    sub some_state {
        ...
        if ( ... ) {
            return \&some_state;
        } else {
            return \&final_state;
        }
    }

    sub final_state {
        ...
        return undef;
    }

    my $state = \&initial_state;
    while ( $state ) {
        $state = $state->();
    }

    The code in question appears to have some code in place to support this. It "returns" the new handler sub (by assigning it to $_[0]->{state}), but it never ends up using it.

      I think tail call elimination aka optimization is wrong here, because it's not optimized in Perl (you noted it's slow)

      One can call it tail-call (without elimination)

      TCE in languages like LISP optimizes calls into jumps and can replace the need for loop constructs.

      > by assigning it to $_[0]->{state}), but it never ends up using it.

      It does, but only once as Boolean test.

      > goto &sub is slower

      Really? I expected the same...

      But I kind of remember that the implementation is awkwardly cleaning the frame right after it was created...

      > I think we could use a loop

      I had a similar idea, but I would return names not references, and store the current state inside the loop. This would improve readability a lot and make it easy to trace/log the execution.

      Will add tested code tomorrow

        ... To be continued ...

      update
      use v5.14;
      use warnings;

      my ($state,$last);

      sub initial_state {
          say "*** Initializing";
          return "some_state";
      }

      sub some_state {
          my $in = shift;
          if ( not $in ) {
              return "some_state";
          } else {
              return "final_state";
          }
      }

      sub final_state {
          say "*** Finalizing";
          return undef;
      }

      $state = "initial_state";
      my @input = (0,0,0,1);
      my $log = 1;

      while ( $state ) {
          $last = $state;
          my $in = shift @input;
          no strict 'refs';
          $state = $state->($in);
          say "$last \t--($in)-> \t$state" if $log && $state;
      }
      -->
      *** Initializing
      initial_state   --(0)->   some_state
      some_state      --(0)->   some_state
      some_state      --(0)->   some_state
      some_state      --(1)->   final_state
      *** Finalizing

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      see Wikisyntax for the Monastery

        I think tail call elimination aka optimization is wrong here, because it's not optimized in Perl (you noted it's slow)

        A slow tail call elimination is still a tail call elimination. While it may be used as a speed optimization in other languages, it has another benefit: Not generating a new stack frame. This optimization of memory usage could even be considered the main reason for using it (allowing recursion to be used to create loops without extra memory usage). This is the reason for using tail call elimination that's relevant here, as I mentioned.

        I would return names not references

        Also called symbolic references. So yeah, that works. One doesn't even need to turn off strict if you call them as methods ($self->$method()). I was keeping size and complexity minimal.
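A sketch of the method-call variant mentioned here, in which each state returns the name of the next state and the loop dispatches with `$self->$state(...)`, so `no strict 'refs'` is not needed. The class `Local::Machine`, its `log` attribute and the input values are hypothetical.

```perl
use strict;
use warnings;

package Local::Machine;   # hypothetical class

sub new { my $class = shift; return bless { log => [] }, $class }

sub initial_state {
    my ($self) = @_;
    push @{ $self->{log} }, 'initial';
    return 'some_state';
}

sub some_state {
    my ($self, $in) = @_;
    push @{ $self->{log} }, 'some';
    return $in ? 'final_state' : 'some_state';
}

sub final_state {
    my ($self) = @_;
    push @{ $self->{log} }, 'final';
    return;                               # no next state: the loop stops
}

sub run {
    my ($self, @input) = @_;
    my $state = 'initial_state';
    while (defined $state) {
        my $in = shift @input;
        $state = $self->$state($in);      # method lookup by name, strict-safe
    }
    return $self->{log};
}

package main;

my $log = Local::Machine->new->run(0, 0, 1);
print "@$log\n";
```

Because the current state is a plain string, logging or tracing it inside the loop needs nothing more than a print statement.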

        It does, but only once as Boolean test.

        Ah, I missed that. Still, the gist of what I said is accurate.