http://qs1969.pair.com?node_id=1189700


in reply to Re^2: Regular Expression Test
in thread Regular Expression Test

Here's the same code with one debug print added to show what's going on at the beginning of each time around the "while 1" loop.

#!/usr/bin/perl # http://perlmonks.org/?node_id=1189605 use strict; use warnings; $| = 1; sub err { die "ERROR: ", s/\G/<@_>/r, "\n" } sub crossmap { my ($left, $right) = @_; [ map { my $prefix = $_ ; map $prefix.$_, @$right } @$left ]; } sub expr { my $answer = [ '' ]; print("\ndebug ", s/\G/<<you are here with [@$answer]>>/r, " end " +), $answer = /\G\|/gc ? [ @$answer, @{ expr() } ] : /\G\d+/gc ? crossmap $answer, [ $& ] : /\G\[\d+\]/gc ? crossmap $answer, [ $& =~ /\d/g ] : /\G\(/gc ? ( crossmap($answer, expr()), /\G\)/gc || err "missing ')'")[0] : return $answer while 1; } while(<DATA>) { chomp; print "$_ => "; my $answer = expr(); /\G\z/gc or err "incomplete parse"; local $" = ','; print "@$answer\n"; } __DATA__ (65|70) (3[678]|4[1678]) 5[45] (6[4569]|7[01])

It prints the following ( showing both where the current position in the input is (the \G) and the contents of the variable $answer at the time).:

(65|70) => debug <<you are here with []>>(65|70) end debug (<<you are here with []>>65|70) end debug (65<<you are here with [65]>>|70) end debug (65|<<you are here with []>>70) end debug (65|70<<you are here with [70]>>) end debug (65|70<<you are here with [65 70]>>) end debug (65|70)<<you are here with [65 70]>> end 65,70 (3[678]|4[1678]) => debug <<you are here with []>>(3[678]|4[1678]) end debug (<<you are here with []>>3[678]|4[1678]) end debug (3<<you are here with [3]>>[678]|4[1678]) end debug (3[678]<<you are here with [36 37 38]>>|4[1678]) end debug (3[678]|<<you are here with []>>4[1678]) end debug (3[678]|4<<you are here with [4]>>[1678]) end debug (3[678]|4[1678]<<you are here with [41 46 47 48]>>) end debug (3[678]|4[1678]<<you are here with [36 37 38 41 46 47 48]>>) end + debug (3[678]|4[1678])<<you are here with [36 37 38 41 46 47 48]>> end + 36,37,38,41,46,47,48 5[45] => debug <<you are here with []>>5[45] end debug 5<<you are here with [5]>>[45] end debug 5[45]<<you are here with [54 55]>> end 54,55 (6[4569]|7[01]) => debug <<you are here with []>>(6[4569]|7[01]) end debug (<<you are here with []>>6[4569]|7[01]) end debug (6<<you are here with [6]>>[4569]|7[01]) end debug (6[4569]<<you are here with [64 65 66 69]>>|7[01]) end debug (6[4569]|<<you are here with []>>7[01]) end debug (6[4569]|7<<you are here with [7]>>[01]) end debug (6[4569]|7[01]<<you are here with [70 71]>>) end debug (6[4569]|7[01]<<you are here with [64 65 66 69 70 71]>>) end debug (6[4569]|7[01])<<you are here with [64 65 66 69 70 71]>> end 64 +,65,66,69,70,71

I hope this is helpful. At each debug output you can see what the /\G.../gc has stepped over and what the new value of $answer is.

My problem with explaining this type of parser is that I have been working with parsers like this for well over a year, and it all comes as second nature to me. If you have more specific questions I'll be willing to take a shot at answering them.