EchoAngel has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!! I have this code:
foreach $line (@DataFromBatchFile) {
    chomp($line);
    print "B |$line|\n";
    #$HASHDATAFROMBATCHFILE{$counter};
    if ($line =~ m/-(.*) ["]*([\)\(A-Za-z0-9,\.\*]*)\s*["]*\s*/) {
        print "$counter $1 $2\n";
        $HASHDATAFROMBATCHFILE{$counter}{$1} = $2;
    }
    if ($line =~ m/\bEND\b/) {
        $counter ++;
    }
}
For some reason, it doesn't always work. For example, here is my output:
B | -libs "source*.lib" |
1 libs "source*.lib" 
B | -1dlu |
1 1dlu 
B | -2dlu |
1 2dlu 
B | -3dlu "(1,1,3)"|
1 3dlu (1,1,3)
B | -filter ASDF|
1 filter ASDF
B |END|
What's wrong with the quotes extraction, since I don't want them there?

Re: Text Extraction Problems
by ikegami (Patriarch) on Oct 06, 2004 at 17:39 UTC

    You'll find this more reliable:

    sub dequote {
        local *_ = \$_[0];      # alias $_ to the caller's variable
        s/^"(.*)"$/$1/;         # strip the surrounding quotes
        s/\\(.)/$1/g;           # unescape backslash-escaped characters
    }

    if ($line =~ m/
        -([^\s"\\]+)                # A dashed identifier
        (?:                         # optionally followed by
            \s+                     #    whitespace
            (                       #    and either
                "(?:[^"\\]|\\.)*"   #       a quoted string
            |                       #       or
                [^\s"\\]+           #       a bare identifier.
            )
        )?
    /x) {
        my ($opt, $arg) = ($1, $2);
        dequote($arg);
        $DATAFROMBATCHFILE[$counter]{$opt} = $arg;
    }
    elsif ($line =~ m/\bEND\b/) {
        $counter ++;
    }

    You're using a hash as an array again. Are you used to programming in PHP? I changed it to use an array (meaning I changed {} to []). If the keys of a hash are consecutive numbers, chances are you should be using an array.
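
The array-of-hashes layout suggested above can be sketched as follows. This is only an illustration under stated assumptions: the helper name collect_records and the sample input lines are hypothetical, and the simplified regex is not the one from the reply.

```perl
use strict;
use warnings;

# collect_records is a hypothetical helper name, not from the thread.
# It groups "-option value" lines into one hash per batch; END closes a batch.
sub collect_records {
    my @lines = @_;
    my @records;        # array of hash references, indexed by batch number
    my $counter = 0;    # index of the batch currently being filled
    for my $line (@lines) {
        if ($line =~ /^\s*-(\S+)\s+(\S+)/) {
            $records[$counter]{$1} = $2;    # autovivifies the hash at this index
        }
        elsif ($line =~ /\bEND\b/) {
            $counter++;
        }
    }
    return @records;
}

# Made-up sample input: two batches, each terminated by END.
my @records = collect_records(
    '-libs a.lib', '-filter ASDF', 'END',
    '-libs b.lib', 'END',
);

# An array keeps the batches in order and knows how many there are.
for my $i (0 .. $#records) {
    my %batch = %{ $records[$i] || {} };
    for my $opt (sort keys %batch) {
        print "$i $opt $batch{$opt}\n";
    }
}
```

With an array, iterating in batch order needs no sorting of numeric keys, which is the usual reason to prefer it over a hash keyed by a counter.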

Re: Text Extraction Problems
by Sandy (Curate) on Oct 06, 2004 at 17:27 UTC

    #1

    Please comment your code. It would then be easier to see what it is supposed to do (as opposed to what it actually does).

    I'm not sure what it is you are doing with HASHDATAFROMBATCHFILE, but I will ignore that for now.

    #2

    "What's wrong with the quotes extraction, since I don't want them there?"

    My output is different from yours, and the code appears to work as far as that question is concerned.

    My source code (same as yours, except that it reads its input from __DATA__ and drops the unused hash):

    #!/usr/bin/perl -w
    use strict;

    my $counter;
    while (my $line = <DATA>) {
        chomp($line);
        print "B |$line|\n";
        if ($line =~ m/-(.*) ["]*([\)\(A-Za-z0-9,\.\*]*)\s*["]*\s*/) {
            print "$counter $1 $2\n";
        }
        if ($line =~ m/\bEND\b/) {
            $counter ++;
        }
    }
    __DATA__
     -libs "source*.lib"
     -1dlu 
     -2dlu 
     -3dlu "(1,1,3)"
     -filter ASDF
    END
    My output (using ActiveState Perl 5.6.1 on WinNT):
    B | -libs "source*.lib"|
     libs source*.lib
    B | -1dlu |
     1dlu 
    B | -2dlu |
     2dlu 
    B | -3dlu "(1,1,3)"|
     3dlu (1,1,3)
    B | -filter ASDF|
     filter ASDF
    B |END|
    Sandy
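
One possible explanation for the two different outputs, assuming the original batch file had whitespace after the closing quote (the `|` markers in the original poster's output suggest it did): the greedy `.*` in `-(.*) ` gives back only as much as it must, so it stops at the last space on the line. With a trailing space, that swallows the quoted argument into `$1`. A minimal sketch using the regex from the original post unchanged; the helper name extract is hypothetical.

```perl
use strict;
use warnings;

# extract is a hypothetical helper wrapping the regex from the original post.
sub extract {
    my ($line) = @_;
    if ($line =~ m/-(.*) ["]*([\)\(A-Za-z0-9,\.\*]*)\s*["]*\s*/) {
        return ($1, $2);
    }
    return;
}

# Same option line, first without and then with a trailing space.
for my $line (' -libs "source*.lib"', ' -libs "source*.lib" ') {
    my ($opt, $arg) = extract($line);
    print "|$line| => [$opt][$arg]\n";
}
```

Without the trailing space this prints `[libs][source*.lib]`; with it, `[libs "source*.lib"][]`, which matches the symptom in the question.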
Re: Text Extraction Problems
by Roger (Parson) on Oct 07, 2004 at 13:38 UTC
    Why make it so complicated? You could use a much simpler regular expression and then remove the quotes afterwards.

    #!/usr/bin/perl -w
    use strict;

    while (<DATA>) {
        chomp;
        print "$_\n";
        if (/-(\S+)(\s*(.*))?/x) {
            my ($p1, $p2) = ($1, $3);
            if (defined $p2) {
                $p2 =~ s/^"//;
                $p2 =~ s/"$//;
            }
            print "[$p1][$p2]\n";
        }
    }
    __DATA__
    -libs "source*.lib source*.lib"
    -1dlu
    -2dlu
    -3dlu "(1,1,3)"
    -filter ASDF
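
The same idea can be wrapped in a small function. This is a sketch, not a drop-in replacement: the name parse_option is hypothetical, and trimming trailing whitespace before matching is an assumption that the thread does not state.

```perl
use strict;
use warnings;

# parse_option is a hypothetical name. It returns (name, value) for a
# "-name value" line, or an empty list if the line has no dashed option.
sub parse_option {
    my ($line) = @_;
    $line =~ s/\s+$//;                      # assumption: trailing whitespace is noise
    return unless $line =~ /-(\S+)\s*(.*)/;
    my ($name, $value) = ($1, $2);
    $value =~ s/^"//;                       # drop a leading quote, if any
    $value =~ s/"$//;                       # drop a trailing quote, if any
    return ($name, $value);
}

for my $line ('-libs "source*.lib" ', '-1dlu', '-3dlu "(1,1,3)"') {
    my ($name, $value) = parse_option($line);
    print "[$name][$value]\n";
}
```

Trimming first means the `s/"$//` substitution still finds the closing quote when the input line ends in spaces.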