Re: Parsing out URLs with regex

Re: Re: Parsing out URLs with regex by halley (Prior) on May 14, 2003 at 19:41 UTC
Agreed. When working on code for heavy use, don't reinvent the wheel. For learning purposes, though, you don't want `(\w+)`, which matches the maximum number of consecutive word characters; you want `(.*?)`, which matches the minimum number of characters followed by the closing quote. -- `[ e d @ h a l l e y . c c ]`
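To make the word-character versus non-greedy distinction concrete, here is a minimal sketch. The `$tag` string and its URL are made up for illustration and do not come from the original question.

```perl
use strict;
use warnings;

# Hypothetical input: an href whose value contains non-word characters.
my $tag = '<a href="http://example.com/a-b.html">link</a>';

# \w+ matches only letters, digits and underscore, so it stops at ':' and
# the closing quote required by the pattern is never reached.
my ($word) = $tag =~ /href="(\w+)"/;

# .*? matches as few characters as possible before the next double quote.
my ($lazy) = $tag =~ /href="(.*?)"/;

print defined $word ? "word: $word\n" : "word: no match\n";
print "lazy: $lazy\n";
```

With this input the `\w+` version should fail to match at all, while the non-greedy version captures the whole attribute value.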
Re^3: Parsing out URLs with regex (diedotstar) by tye (Sage) on May 14, 2003 at 19:46 UTC
Actually, this is a good example of when `.*?` is not the best choice. `[^"]*` is a much better idea. You don't want to run into this problem: `$page= '<a href="foo">...' . '<a href="bar" title="baz"><b>Click Here'; $page =~ /<a href="(.*?)" title="(.*?)"><b>Click Here/i;` where $1 will contain `'foo">...<a href="bar'`. - tye
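A runnable side-by-side of the two patterns on tye's sample string may help. This is only a sketch: the variable names `$lazy_href` and `$class_href` are made up for the comparison.

```perl
use strict;
use warnings;

# Sample input from the post above: two anchors, only the second has a title.
my $page = '<a href="foo">...'
         . '<a href="bar" title="baz"><b>Click Here';

# Non-greedy dot: the match starts at the first <a href=" and (.*?) keeps
# expanding past the closing quote until the rest of the pattern can match.
my ($lazy_href) = $page =~ /<a href="(.*?)" title="(.*?)"><b>Click Here/i;

# Negated character class: [^"]* can never cross a double quote, so the
# attempt at the first anchor fails and the engine moves on to the second.
my ($class_href) = $page =~ /<a href="([^"]*)" title="([^"]*)"><b>Click Here/i;

print "lazy:  $lazy_href\n";   # lazy:  foo">...<a href="bar
print "class: $class_href\n";  # class: bar
```

Non-greedy only means "as short as possible from where the match started"; it does not stop the capture from spanning a quote. The negated class makes that restriction explicit.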
Re: Re^3: Parsing out URLs with regex (diedotstar) by halley (Prior) on May 14, 2003 at 19:51 UTC
Whups, didn't see the mandatory title="" in the match. Jumped the gun. -- `[ e d @ h a l l e y . c c ]`
Re^5: Parsing out URLs with regex (diedotstar) by tye (Sage) on May 14, 2003 at 19:56 UTC