in reply to REGEX for url
my $line = '<td scope="row"><a href="/Archives/edgar/data/1050122/000092735601000365/0000927356-01-000365-0009.txt">0009.txt</a></td>';
$line =~ s/.*a href="(.*)".*/$1/;
print $line;
Re^2: REGEX for url
by wrkrbeee (Scribe) on Apr 25, 2016 at 20:52 UTC
Thank you for your help! That expression does not seem to bind to anything for me; perhaps I'm doing something else wrong? Below is a small portion of the code. Thanks again!
by james28909 (Deacon) on Apr 25, 2016 at 20:57 UTC
Output:
EDIT: It seems that $/ = "</html>"; changes the input record separator in such a way that it completely breaks the simple regex. Do you have any links to documentation on $/ = "</html>";?
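For reference, $/ is documented in perlvar (perldoc perlvar) as the input record separator. The sketch below is one guess at why the substitution appears to fail once $/ is set to "</html>". Each read then returns a whole multi-line block. Without the /s modifier, "." does not match a newline, so s/// rewrites only the first matching line and leaves the rest of the record untouched. The sample data and the names $data, @after_subst and @links are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical input: two small <html>...</html> blocks standing in for the real file.
my $data = <<'END';
<html>
<td><a href="/Archives/a.txt">a.txt</a></td>
<td><a href="/Archives/b.txt">b.txt</a></td>
</html>
<html>
<td><a href="/Archives/c.txt">c.txt</a></td>
</html>
END

# Read from an in-memory filehandle so the example needs no external file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my (@after_subst, @links);
{
    # Each read now returns everything up to and including "</html>".
    local $/ = "</html>";
    while (my $record = <$fh>) {
        # Without /s, "." does not match "\n", so s/// rewrites only the
        # first line that matches and leaves the rest of the record in place.
        (my $copy = $record) =~ s/.*a href="(.*)".*/$1/;
        push @after_subst, $copy;

        # A global match in list context returns every captured href instead.
        push @links, $record =~ /<a\s+href="([^"]*)"/g;
    }
}
close $fh;

print "First record after s///:\n$after_subst[0]\n\n";
print "link: $_\n" for @links;
```

Matching with /g in list context leaves the record unmodified and collects all the links in one pass, which may be closer to what is wanted here than a substitution. The character class [^"]* also stops the capture at the closing quote instead of relying on a greedy .* backtracking to it.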
by wrkrbeee (Scribe) on Apr 25, 2016 at 21:28 UTC
Not sure if this helps, but the full text block, from <html> through </html>, appears below. I'm just using $/ as a way to indicate the end of a record. I apologize for wasting your time.
by Marshall (Canon) on Apr 25, 2016 at 22:24 UTC
by wrkrbeee (Scribe) on Apr 25, 2016 at 21:09 UTC
by NetWallah (Canon) on Apr 25, 2016 at 21:19 UTC
by ExReg (Priest) on Apr 25, 2016 at 22:07 UTC