This was a program I wrote for an advanced Perl class. It allows you to see what a regex is going to do on a given line of data. At present it doesn't handle multiple lines of data.

p.s. This is my first CUFP post so go easy... : )

#!/usr/bin/perl use strict; use Tk; use constant WIDTH=>80; =pod =head1 NAME PREP - Practice Regular Expression in Perl! =head2 Version: 1.0 =head2 Date: 4/25/02 =head2 Author: Dave Turner dave@wrightpopcorn.com =head1 SYNOPSYS =head2 This program allows a user who is unfamiliar with regular expre +ssions to test the various regexes and options on a string to see what th +e results are. The regex is entered by the user, and the options are sele +cted via the checkboxes. The m, s, and tr have been implemented. The user is then shown the result of their regex on their input +. If the user has a regex of the flavor m/(www)\.myhost\.com/, the progr +am will list the parenthetical matches in order after any other match h +as been shown. Note: Input is limited to a single line. This program does not +handle variables in the input, nor does it work on multiple lines. Als +o, the 'o' option is not used as this is done as a single line to test not + an entire text file. =head1 KNOWN BUGS =head2 At present, when doing a tr substitution, the checkboxes are no +t being disabled. This causes an error message about a bareword on comp +letion of the program. =cut use constant COLOR => '#FFEDB4'; use constant COLOR2 => '#3396DF'; my ($top, $regex, $user_text, $match_results, $font, $tr); my $options=' '; my $i_status=' '; my $cg_status=' '; my $g_status=' '; my $m_status=' '; my $x_status=' '; my $s_status=' '; my $flag=0; $top=MainWindow->new(); $top->title("PREP - Practice Regular Expressions in Perl!"); $top->minsize( 300, 400 ); $top->configure(-background=>COLOR); &fontsize(12, 'arial', 'bold'); &fontsize(14, 'arial', 'bold'); &fontsize(18, 'arial', 'bold'); my $title=>$top->Label( -text=>"\nPractice Regular Expressions in Perl\ +n", -bg=>COLOR, -font=>'code18', )->pack(); my $regtxt=$top->Label( -text=>'Regular Expression to test', -bg=>COLOR, -font=>'code14', )->pack(); my $reg=$top->Entry(width=>30, -font=>12, -textvariable=>\$regex, )->pack(); my $textlbl=$top->Label(-text=>"Enter a string to parse", -bg=>COLOR, -font=>'code14', )->pack(); my $text=$top->Entry(width=>30, -font=>12, -textvariable=>\$user_text, )->pack(-side=>'top'); my $result=$top->Label(-text=>"Results of regex", -bg=>COLOR, -font=>'code14', )->pack(); my $restxt=$top->Label(width=>WIDTH, -textvariable=>\$match_results, -bg=>'white', -relief=>'groove', -font=>14, -height=>10, -justify=>'left', )->pack(-side=>'top'); my $leave=$top->Button(-text=>'Exit', -command =>[$top=>'destroy'], -font=>'code14', )->pack(-side=>'bottom', -anchor=>'se', -ipadx=>10, -expand =>16); my $frame1=$top->Frame; my $frame2=$top->Frame; my $frame3=$top->Frame; $frame1->pack(); $frame2->pack(); $frame3->pack(); my $test=$top->Button(-text=>'See what it does!', -command =>\&do_it, -font=>'code14', )->pack( -ipadx=>10, -expand =>16); my $i=$frame1->Checkbutton( -bg=>COLOR, -text=>'Ignore Case (i)', -command=>\&global, -activebackground=>COLOR, -onvalue=>'i', -offvalue=>' ', -variable=>\$i_status, -font=>'code12', -justify=>'left', -takefocus=>$tr, )->pack( -side=>'left', ); my $g=$frame1->Checkbutton( -bg=>COLOR, -text=>'Match globally (g)', -command=>\&global, -onvalue=>'g', -offvalue=>' ', -activebackground=>COLOR, -variable=> \$g_status, -font=>'code12', -justify=>'left', )->pack( -side=>'left', ); my $s=$frame1->Checkbutton( -bg=>COLOR, -text=>'Let . match newline (s)', -command=>\&global, -onvalue=>'s', -offvalue=>' ', -activebackground=>COLOR, -variable=>\$s_status, -font=>'code12', -justify=>'left', )->pack( -side=>'left', ); my $x=$frame2->Checkbutton( -bg=>COLOR, -text=>'Ignore whitespace (x)', -command=>\&global, -onvalue=>'x', -offvalue=>' ', -activebackground=>COLOR, -variable=>\$x_status, -font=>'code12', -justify=>'left', )->pack( -side=>'left', ); my $cg=$frame3->Checkbutton( -bg=>COLOR, -text=>"Continue search after failed /g match +(/cg)", -command=>\&global, -onvalue=>'cg', -offvalue=>' ', -activebackground=>COLOR, -variable=>\$cg_status, -font=>'code12', -justify=>'left', )->pack( -side=>'left', ); my $m=$frame2->Checkbutton( -bg=>COLOR, -text=>'Let ^ and $ match next to \n (m)', -command=>\&global, -onvalue=>'m', -offvalue=>' ', -activebackground=>COLOR, -variable=>\$m_status, -font=>'code12', -justify=>'left', )->pack( -side=>'left', ); MainLoop(); sub global{ =head2 We need to check here if we have an m, s or tr substitution + as the number of // is different. With m we're removing everything + after the second / while the s and tr we're taking everything off aft +er the third /. =cut if (substr($regex,1,0) eq 'm'){$regex =~ s/(\/.+\/).+/$1/} else {$regex =~ s/(\/.+\/.+\/).+/$1/}; $options =''; $options = $i_status.$m_status.$x_status. $cg_status.$s_status.$g_status; $options =~ s/ //g; $regex .= $options; } sub do_it{ my @matches=(); =head2 Since we need to run our user's input through the regex here w +e're concatenating the options. The options have already had the wh +ite spaces removed before we concatenate them. =cut $match_results = $user_text; =head2 Here we use the reference to the text that we want to run the +regex on otherwise it will just evaluate both strings as strings and no +t change the value of the variable we're looking at. =cut if (substr($regex,0,1) eq 's'){ eval "\$match_results =~ $regex"; }; =head2 In evaluating the regex for 'm' and 'tr' we want the value of $match_results to receive the changed value of $user_text, so +we're taking the reference to $u\n";er_text. =cut if (substr($regex,0,1) eq 'm'){ $match_results="m// Matched: "; =head2 If we've had a regular expression with grouping, we need to sho +w the matches we made there too. Unless we count the number of matchi +ng parens, which could get odd if there was a match looking for '( +' and ')', we run a simple counter and check if the match variable exists. + If it does, we print it out. If no parenthesis are used, $match_resul +ts holds our match. =cut {eval "\@matches = (\$user_text =~ $regex); \$match_results .= +\$&;"; $match_results .= "\n"; $match_results .= " Variables:\n"if defined($matches[$i]) +; for (my $i = 1; $i <= @matches; $i++) { my $found = $matches[$i - 1]; $match_results .= " \$$i:$found \n" if defined($matc +hes[$i]); } } # end eval block } # end if substr eq 'm' eval "\$match_results =~ $regex" if substr($regex,0,2) eq 'tr'; &chk_res($match_results); } sub fontsize{ my ($size, $family, $type)=@_; $font=$top->fontCreate( 'code'.$size, -family=>$family, -weight=>$type, -size=>$size, ); } sub chk_res{ my $text = shift; my $len = length($text); my $newtext=''; if ($len>80){ for my $i (0.. $len){ $i%79==0?$newtext .=substr($text,0, $i).'\n' :$newtext .= substr($text,0,$i); print "New Text is:$newtext\n"; } } else{ return $text}; return $newtext; }

Some people fall from grace. I prefer a running start...

Replies are listed 'Best First'.
Re: Regex tutor in Perl/Tk
by educated_foo (Vicar) on Jul 01, 2002 at 02:04 UTC
    The idea is definitely cool, but there are a couple bugs in the execution:
    • You set the colors for some elements, but not for all. For people whose color defaults are light-on-dark, it's impossible to read. Luckily, Tk handles standard -fg and -bg options, so this is quickly solved. However, you might want to either completely leave the colors alone (easy), specify them all (medium), or choose your colors to work with the defaults (hard).
    • It's great for seeing what substitutions will do, but doesn't seem to work for regexes. With a string of "abbaaba" and a regex "a+", I get "abbaaba".
    Also, it would be much cooler if you could show what the regex did element by element. For example, you could turn on /x by default, split the RE on whitespace, and show both what each prefix matches, and what each matches with the rest of the RE as a lookahead assertion (untested):
    local $_ = $string_to_match; my @reparts = split ' ', $re; my $pre = ''; while (@reparts) { $pre .= shift @reparts; print "$pre w/ lookahead: ", /$pre (?=@reparts)/x, "\n"; print "Prefix $pre: ", /$pre/x, "\n"; } print "Shebang entier: ", /$pre/x, "\n";
    /s
      Thanks for the feedback!

      I thought that it was at least stable. I'll have to go back and have a look as soon as I finish my current project.

      As far as showing the execution, I believe that BooRadley(?) is either in process or has done something like that. Mine is just a simple tool to check a regex for beginners to test their regexes to see if they're doing what they expect them to do so as to help speed up their regex debugging.

      Actually it's helped me out a few times when I was trying to construct a regex that I wasn't sure of, so hopefully someone else can make use of it too. : )

      Some people fall from grace. I prefer a running start...

(ichi) Re: Regex tutor in Perl/Tk
by ichimunki (Priest) on Jul 01, 2002 at 15:40 UTC
    Just a suggestion (not a criticism!), when you have a lot of Tk widgets that share properties, like you do here with font, justify, offvalues, etc, you might consider simply making an array of those properties and stuffing that into the property list for each widget instead.
    my @props = ( '-color' => COLOR, '-justify' => 'left', '-etc' => 'etc' ); my $button = $root->Button( -text => 'Sample', @props )->pack();
    This keeps your code more concise and makes the differences between the widgets easier to see.
      Thanks for that!

      Every time I think I know Perl... : )

      Some people fall from grace. I prefer a running start...

Re: Regex tutor in Perl/Tk
by Juerd (Abbot) on Jul 01, 2002 at 20:43 UTC

    Your pod is wrong. =head2 is for headers, not content paragraphs, indented code is whitespace, you need a lot more blank lines. Use podselect and podchecker to find out all the problems. Read perlpod for help.

    Pod isn't just how we write our documentation so it's in a standardized format, it's there to allow easy parsing and conversion to other text formats (html, manual pages, plain text).

    =head1 NAME Module::Name - Description =head1 SYNOPSIS Only code is indented What is indented, is rendered as code =head1 DESCRIPTION See how there are blank lines around the =foos. =head2 Some sub-section of DESCRIPTION Strangely, head2s are often not in ALLCAPS. Note how the text body itself is not part of the =head2. These words are not newline terminated in the end result. This is a new paragraph, because of the blank line. =head1 AUTHOR Don't forget to say <who@you.are> =cut
    And when documenting a specific part of the code, add the sub name or some other pointer in the documentation, as code context is lost when pod is converted to the end result.