in reply to Re: Re: extracting web links
in thread extracting web links

Probably more, but what about links like

http://user:passwd@site

That becomes especially interesting when you consider that the user or passwd could contain a space (or a quote?), that a URL can contain a comment like "a > b ?", that quotes aren't mandatory, and finally that a regex can be a memory/CPU hog. If you still feel like using regexes, test things like
perl -Mre=debug -e '" "=~/href\s*=\s*"*([^"\s]+)"*\s*>/gi'
(put your URL between the first pair of double quotes, where the single space is now)
and you'll have a better idea of what the regex engine does for you. After that, maybe your regex will be the solution ... but I doubt it.
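
To see the failure modes without reading the debug trace, here is a minimal sketch that runs the same pattern over a few inputs. The sample tags and the helper name extract_href are made up for illustration; only the regex comes from the one-liner above.

```perl
use strict;
use warnings;

# The pattern from the one-liner above, unchanged (minus /g, one match per input).
my $re = qr/href\s*=\s*"*([^"\s]+)"*\s*>/i;

# Hypothetical helper: returns the captured link, or undef when nothing matches.
sub extract_href {
    my ($html) = @_;
    return $html =~ $re ? $1 : undef;
}

# Hypothetical sample inputs, one per problem mentioned above.
my @samples = (
    '<a href="http://example.com/">',                # the easy case
    '<a href="http://user:my pass@example.com/">',   # space in the passwd
    "<a href='http://example.com/'>",                # single quotes
    '<a href=http://example.com/ title="a > b ?">',  # no quotes, ">" in another attribute
);

for my $html (@samples) {
    my $link = extract_href($html);
    print defined $link ? "captured: $link\n" : "no match: $html\n";
}
```

Only the first input gives the expected link. The second and fourth do not match at all, and the third matches but keeps the single quotes inside the captured text.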
--
dominix

Re: Re: Re: Re: extracting web links
by drake50 (Pilgrim) on Dec 27, 2003 at 22:55 UTC
    These are some of the things I was worried about with regex. I also figured that a lot of time and energy has gone into the Parser module and that many people have already reviewed it and given it their blessing. I just wanted to make sure that I was heading in the right direction.
    Thanks for the input!