in reply to Help with Variable Scope

Within the subroutine lex, @output refers to the local lexical declared with my. In argout as Fletch noted it refers to the package variable.

The use of $buffer also refers to the package variable. If you were to use strict you would get error messages like:

Global symbol "$buffer" requires explicit package name at ... Global symbol "@output" requires explicit package name at ... Global symbol "%states" requires explicit package name at ...
To get rid of these messages simply declare the variables at the beginning of your source file with either:
my ($buffer, @output, %states);
or
our ($buffer, @output, %states);
if they need to be visible from outside the current package.
Knowing this would be one thing, but what I really want is that the $buffer, the @input, the @output, and possibly even the $c would be data inside of a "Dfa" object,

To implement a more object-oriented approach like you might be inclined to do in Java, one usually would proceed like this:

package DFA; sub new { my $dfa = {}; bless $dfa, shift; # associates $dfa with the current package $dfa->{buffer} = ''; $dfa->{output} = []; # note: output is now an array reference $dfa->{states} = {}; # note: states is now a hash ref $dfa; # return the new object } ...

And here is an example of how a subroutine like argout would be implemented as method of the DFA class:

sub argout { my $dfa = shift; # $dfa here is equivalent to 'this' in Java my $temp = { type => 'ARG', value => $dfa->{buffer} }; $dfa->{buffer} = ''; push(@{$dfa->{output}}, $temp); } # example of constructing a DFA object and calling argout: my $dfa = DFA->new(); $dfa->argout();
Also note how references to the variables $buffer and @output have changed.

This is just a very cursory overview of how to create a class in perl. For more details, see perldoc perlobj.

Replies are listed 'Best First'.
Re^2: Help with Variable Scope
by tford (Beadle) on May 04, 2008 at 21:17 UTC
    Thanks, pc that helped a lot. In case anybody's wondering what I'm up to, the eventual goal is to write a DFA builder.

    Given a list of variable names and recognized functions (at run time), the DFA that it builds should be able to tokenize mathematical expressions that have been written using implied multiplication. for example, if you have variables called 'si' and 'n', and a function called 'sin(', the expression sin(sin) should be tokenized as if the user had entered sin(si*n). The list of tokens produced might look like the following.

    ARG,sin( ARG,si ARG,n

    I imagine that there is also probably some way to do this using regular expressions, and it would probably involve using look-ahead characters. I think that I might be able to do it faster with recursion, however, and of course the main reason I'm going back to finite state machines is that I'm fascinated with finite state machines! After a lot of consideration, I decided that the DFA object should really only have it's state information as data.

    My original idea was to include the buffer and all that other stuff inside the object itself, and then to have different member functions that knew what to do depending on the current state the machine was in. This was a nice idea, but I'm trying to go for speed here, and when you really think about it, passing a bunch of object references around, and then using the $self->{buffer}, $self->{output} notation all the time is bound to have added overhead. So basically, since there really is only two things the machine can do (collect an input character in the buffer, or tokenize what's already in the buffer), I "hard-coded" those two actions into the main loop of the lex function.

    If anyone's interested, here is some toy code that can only recognize the tokens 'si' and 'sin('

    use strict; use warnings; package DFA; my $dfa = { start =>{nextstate =>{s =>'var1_1orfun1_1'}, entrytoken=>'OTHER'}, var1_1orfun1_1=>{nextstate =>{i =>'var1_2orfun1_2'}}, var1_2orfun1_2=>{exittoken =>{n =>'', default=>'ARG'}, nextstate =>{n =>'fun1_3', s =>'var1_1orfun1_1'}}, fun1_3 =>{nextstate =>{'(' =>'fun1_4'}}, fun1_4 =>{exittoken =>{default=>'ARG'}, nextstate =>{s =>'var1_1orfun1_1'}} }; bless($dfa,"DFA"); sub lex { my $self = shift; my @input = split('',shift); my @output = (); my $buffer = ''; my $currentstate = $self->{start}; for my $c (@input) { #exit action if( my $a = $currentstate->{exittoken} ) { if( my $tt = defined($a->{$c})?$a->{$c}:$a->{default} ) { push(@output,{type=>$tt,value=>$buffer}); $buffer = ''; } }#end exit action if #state transition my $s = $currentstate->{nextstate}->{$c} || 'start'; $currentstate = $self->{$s}; $buffer = $buffer.$c; #entry action if( my $tt = $currentstate->{entrytoken} ) { push(@output,{type=>$tt,value=>$buffer}); $buffer = ''; }#end entry action if }#end for loop #eof exit action if($buffer) { my $a = $currentstate->{exittoken}; my $tt = ($a)?$a->{default}:'OTHER'; push(@output,{type=>$tt,value=>$buffer}); } return @output; }#end function lex my $inputstring = 'sin('; print("an input of $inputstring produced the following output.\n"); for my $tok ($dfa->lex($inputstring)) { print("$tok->{type},$tok->{value}\n"); } $inputstring = 'sisin(si'; print("an input of $inputstring produced the following output.\n"); for my $tok ($dfa->lex($inputstring)) { print("$tok->{type},$tok->{value}\n"); } $inputstring = 'sin(sisin('; print("an input of $inputstring produced the following output.\n"); for my $tok ($dfa->lex($inputstring)) { print("$tok->{type},$tok->{value}\n"); }

    Output:

    an input of sin( produced the following output. ARG,sin( an input of sisin(si produced the following output. ARG,si ARG,sin( ARG,si an input of sin(sisin( produced the following output. ARG,sin( ARG,si ARG,sin(

    I used the logical or (as suggested here) to help deal with default cases and I also tried to use the // operator (as suggested here) to rewrite the expression  my $tt = defined($a->{$c})?$a->{$c}:$a->{default}, but the version of Perl I'm using is too early.

    Thanks again for everyone's help! I feel like I'm making some progress now.