how to empty the built in variable

jesuashok has asked for the wisdom of the Perl Monks concerning the following question:

Re: how to empty the built in variable by GrandFather (Saint) on Feb 22, 2007 at 04:22 UTC
Change the regex line to: `next unless $line =~ /\s+<\w+\/?>(.*)<\/\w+>/;` If the regex fails then `$1` is not altered (as you noticed). Why did you comment out `use strict;` btw? Your code is fine with strictures enabled. DWIM is Perl's answer to Gödel
Re: how to empty the built in variable by bobf (Monsignor) on Feb 22, 2007 at 04:23 UTC
You are not checking whether the regex matched before you use the value of `$1`. Changing the guts of your code to the following may give you the desired result: `if( $line =~ /\s+<\w+\/?>(.*)<\/\w+>/ ) { $line = $1; print "$line\n"; } else { print "no match\n"; }` Output: `First_Table`, then `no match`. Finally, why did you comment out `use strict`, and if you're parsing what appears to be XML or HTML, why not use a parser?
Re: how to empty the built_in variable by blazar (Canon) on Feb 22, 2007 at 09:24 UTC
`my $line; while ( <DATA> ) { chomp; $line = $_; $line =~ /\s+<\w+\/?>(.*)<\/\w+>/; $line = $1; print "$line\n"; }` In addition to the other comments you got, you should as usual declare your lexical variables in the innermost scope possible; in this case: `while ( <DATA> ) { chomp; my $line = $_; # ...` But... all in all it's strange that you use the implicit `$_` only to assign it to `$line`. You either want `while ( <DATA> ) { chomp; /\s+<\w+\/?>(.*)<\/\w+>/ or next; print $1, "\n"; }` or `while ( my $line=<DATA> ) { chomp $line; $line =~ /\s+<\w+\/?>(.*)<\/\w+>/ or next; print $1, "\n"; }`
Re: how to empty the built_in variable by davorg (Chancellor) on Feb 22, 2007 at 09:37 UTC
"In the above code the second line should not have any value in $1" Actually, that's not true. The behaviour that you are seeing is documented in perlre: "NOTE: failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match." -- <http://dave.org.uk> "The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg
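To make the perlre note concrete, here is a minimal sketch using the regex from this thread. The helper names `unguarded_captures` and `capture_or_undef` are made up for illustration and the two input lines are assumed sample data, not the original poster's file. The first helper reads `$1` without checking the match; the second takes the capture from a list-context match, which yields nothing on failure.

```perl
use strict;
use warnings;

# Hypothetical helper: reads $1 after every match attempt without
# checking whether the match succeeded (the pattern from the question).
sub unguarded_captures {
    my @lines = @_;
    my @seen;
    for my $line (@lines) {
        $line =~ /\s+<\w+\/?>(.*)<\/\w+>/;   # result ignored on purpose
        push @seen, $1;                      # may be left over from an earlier line
    }
    return @seen;
}

# Hypothetical helper: a list-context match returns the captures on
# success and an empty list on failure, so $text is undef for a
# non-matching line instead of a leftover $1.
sub capture_or_undef {
    my ($line) = @_;
    my ($text) = $line =~ /\s+<\w+\/?>(.*)<\/\w+>/;
    return $text;
}

# Assumed sample input: one filled element, one empty element.
my @input = ( "  <Table>First_Table</Table>", "  <Table/>" );
for my $text ( map { capture_or_undef($_) } @input ) {
    print defined $text ? "$text\n" : "no match\n";
}
```

With the unguarded helper, the entry for the second line repeats the capture from the first line, which is the behaviour described in the question. Assigning the captures in list context is just one way to avoid reading `$1` at all; the `next unless` and `if`/`else` guards shown in the other replies work equally well.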
Re: how to empty the built_in variable by holli (Abbot) on Feb 22, 2007 at 17:06 UTC
Like Moron said, parsing XML with regexes is a pita and error-prone. The code below uses XML::XPath to do the same job yours does. And besides being safer it's also more readable (and more perlish :-). Note: I added a root element to the data so it becomes valid XML. `use strict; use warnings; use XML::XPath; my $xp = XML::XPath->new(ioref => DATA); #for parsing files: #my $xp = XML::XPath->new(filename => 'test.xml'); print map { $_->string_value, "\n" } grep { $_->string_value } $xp->find('/Root/Table')->get_nodelist; __DATA__ <Root> <Table>First_Table</Table> <Table/> <Table>Second_Table</Table> </Root>` Outputs: `First_Table Second_Table` holli, /regexed monk/*
Re: how to empty the built_in variable by Moron (Curate) on Feb 22, 2007 at 13:23 UTC
I suspect you commented out the `use strict` because you were getting undefined data errors -- in your code `$1` is undefined whenever the regexp fails to match. Functionally it looks like you want two loops rather than one. The outer loop should poll the lines of input and the inner loop should exhaustively parse the line. In addition it is normal and advisable to use one regexp per lexical element in a parser, which changes everything of course, so you might as well use something like XML::Twig instead :) -M Free your mind
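A rough sketch of the two-loop shape described above, in core Perl only. The helper name `element_texts` and the pattern are assumptions for illustration: the pattern handles only flat `<tag>text</tag>` pairs without attributes or nesting, so it is not a substitute for a real parser such as XML::Twig or XML::XPath.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text of every non-empty
# <tag>text</tag> pair found in the given lines.
sub element_texts {
    my @lines = @_;
    my @texts;
    for my $line (@lines) {                            # outer loop: one input line at a time
        # inner loop: /g in scalar context resumes after the previous
        # match, so every element on the line is visited in turn
        while ( $line =~ /<(\w+)>([^<]*)<\/\1>/g ) {
            push @texts, $2 if length $2;              # skip elements with empty text
        }
    }
    return @texts;
}

# Assumed sample input: several elements on one line, then an empty element.
print "$_\n" for element_texts(
    "<Table>First_Table</Table><Table/><Table>Second_Table</Table>",
    "  <Table/>",
);
```

Because `$1` and `$2` are only read inside the `while` body, they are only ever used after a successful match, which sidesteps the stale-capture problem from the original question.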