in reply to Re: regular expressions. help
in thread regular expressions. help

Hmmm... That looks like it should be correct, yes (like I said, I'm no good at regular expressions, but it looks right to me), but it's still not working. Let me just post the whole stupid program to give you an idea of what I am trying to do:
#!/usr/local/bin/perl

$fl = '-?\d+\.\d+';

$evalme = q[
while(<>) {
    s/^ //;
    if(^HISTOGRAM OF\s*\*\s*(\w+)$/) {
        printf ("In loop.\n"); #just here for testing purposes.
        write if $header;
        undef($cache);
        $header=$1;
        $varnum=$2;
    }
    if($header) {
];

eval <<EOM;
$evalme
        (\$meanH, \$usersH) = (\$1, \$2) if /^GROUP\\s+(\\S+)\\s+(\\S+)/;
        (\$mean, \$users) = (\$1, \$2) if /^MEAN\\s+(${fl})\\s+(${fl})/;
        \$levene = \$1 if /\\s+VARIABILITY\\s+${fl}\\s+(${fl})/;
        \$pooled = \$1 if /\\s+POOLED T\\s+${fl}\\s+(${fl})/;
        \$separate = \$1 if /\\s+SEPARATE T\\s+${fl}\\s+(${fl})/;
        \$mann = \$1 if /\\s+MANN-WHIT.\\s+${fl}\\s+(${fl})/;
    }
}
EOM

write STDOUT;

format STDOUT_TOP =
| @|||| | @|||| | Levene-P | Pooled-P | Mann-P | Separate
$meanH, $usersH
----------+----------+----------+----------+----------+----------+----------
.
format STDOUT =
@<<<<<<<< | @##.#### | @##.#### | @##.#### | @##.#### | @##.#### | @##.####
$header, $mean, $users, $levene, $pooled, $mann, $separate
----------+----------+----------+----------+----------+----------+----------
.
It reads through the input file until it finds HISTOGRAM OF and then begins pulling out the data as per above. Does any of it work? Well, I don't know, I still can't get this one stupid thing to work.

Re^3: regular expressions. help
by shemp (Deacon) on Jun 29, 2004 at 20:22 UTC
    The (\w+)$ is killing you again. You match 'HISTOGRAM OF', whitespace, asterisk, whitespace, but the rest of your string is not all \w (word chars), and since you added the '$' to match until the end, the \w+ fails to match when it hits whitespace again.

    I cannot stress enough to regex learners that whitespace NEEDS to be treated like all other characters.
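
    To make the point above concrete, here is a minimal sketch of the anchoring problem. The sample input line is hypothetical (the actual file is not shown in this thread); all that matters is that something follows the first word after the asterisk. The variable names $line, $anchored and $unanchored are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical input line; the real file's layout is not shown in the thread.
my $line = "HISTOGRAM OF * height   GROUPED BY sex";

# Pattern from the post: (\w+) must reach the end of the line because of '$'.
my $anchored = $line =~ /^HISTOGRAM OF\s*\*\s*(\w+)$/ ? $1 : undef;

# Same pattern without the trailing '$': (\w+) simply stops at the first
# character that is not a word character (here, whitespace).
my $unanchored = $line =~ /^HISTOGRAM OF\s*\*\s*(\w+)/ ? $1 : undef;

print "anchored:   ", (defined $anchored   ? $anchored   : "no match"), "\n";
print "unanchored: ", (defined $unanchored ? $unanchored : "no match"), "\n";
```

    On a line like this one, the anchored pattern cannot match, because \w+ hits whitespace before the end of the string, while the unanchored one captures only the first word. Whether that is the right fix depends on what the real lines look like.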
      Okay. So I could theoretically get rid of the $ so that it doesn't match until the end, then? Or what would be the best way to get around this?

      Thanks for your help, by the way. This is leaving me more than a bit frazzled.