in reply to Regex keep matching the last possible match (but should get all)
In other words, after meeting the first <TD ALIGN=LEFT, .+ will match everything up to the last extinfo.cgi in your long string.
To see what I mean, wrap the first .+ in capturing parentheses and print $1.
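Something along these lines should show it. Note that the sample string is made up for illustration (I don't have your real input, and the host names alpha and beta are hypothetical), so treat it as a sketch rather than your actual data:

```perl
use strict;
use warnings;

# Hypothetical input: two table cells on one line, each linking to extinfo.cgi.
my $html = '<TD ALIGN=LEFT><A HREF="extinfo.cgi?host=alpha">alpha</A></TD>'
         . '<TD ALIGN=LEFT><A HREF="extinfo.cgi?host=beta">beta</A></TD>';

my $captured = '';
if ( $html =~ /<TD ALIGN=LEFT(.+)extinfo\.cgi/ ) {
    # The greedy .+ first runs to the end of the string, then backtracks
    # only as far as it must, which is the LAST extinfo.cgi.
    $captured = $1;
    print "$captured\n";
}
```

With this input, $1 swallows the whole first cell and the start of the second one, which is why only the last link seems to match.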
.+ (and its even more treacherous brother .*) will quickly escape your control if you are not careful. A useful technique to control what gets matched is to say which character(s) you don't want: [^>]+ means "match anything except the '>' character", or in other words, everything up to the end of the current HTML tag. That keeps the regex quantifiers from running away.
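A minimal sketch of the negated character class at work, again on a made-up sample string (the host= parameter and the names alpha and beta are my assumptions, not taken from your page):

```perl
use strict;
use warnings;

# Hypothetical input: two table cells on one line, each linking to extinfo.cgi.
my $html = '<TD ALIGN=LEFT><A HREF="extinfo.cgi?host=alpha">alpha</A></TD>'
         . '<TD ALIGN=LEFT><A HREF="extinfo.cgi?host=beta">beta</A></TD>';

# [^>]* cannot cross a '>', so it stays inside the current <TD ...> tag.
# [^"]+ cannot cross a '"', so the capture stops at the end of the attribute value.
# With /g in list context, every capture is returned, one per cell.
my @hosts = $html =~ /<TD ALIGN=LEFT[^>]*><A HREF="extinfo\.cgi\?host=([^"]+)"/g;

print "$_\n" for @hosts;
```

Because neither quantifier can run past its delimiter, each match is confined to one cell, and the /g modifier then picks up all of them instead of just one.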
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
My blog: Imperial Deltronics
Re^2: Regex keep matching the last possible match (but should get all)
by Anonymous Monk on May 18, 2015 at 12:48 UTC
by Corion (Patriarch) on May 18, 2015 at 12:52 UTC
by Anonymous Monk on May 18, 2015 at 12:53 UTC