in reply to Re^5: regexp over multiple lines
in thread regexp over multiple lines
Well, I've been able to slurp the files and read the data into arrays and it makes the coding much faster and easier. I no longer have to use anchor points and offset values and other messy stuff. I'm not getting the exact results I want yet but I'm getting there.
I have another question:
Is it possible to limit the scope of the regexp I'm using (i.e. limit the search to an area I define with another regexp that marks out the block of text to search)? For example:
<p id=paragraph_1>
<a href="http://www.link1.com">Link1</a>
<a href="http://www.link2.com">Link2</a>
<a href="http://www.link3.com">Link3</a>
</p>
<p id=paragraph_2>
<a href="http://www.link4.com">Link4</a>
<a href="http://www.link5.com">Link5</a>
<a href="http://www.link6.com">Link6</a>
</p>
If I use a regexp to parse the names of the above HTML links, I'm going to get all of them. What if I only want the ones within the paragraph_2 tags? How would I do that?
Here's a dummy example of the code I already have:
# local sets $/ to undef, so <WEB_DATA> reads the whole file at once;
# when the scope exits, $/ reverts to its previous value (most likely "\n")
local($/, *WEB_DATA);
open(WEB_DATA, "<$myFilename.tmp") or die "Cannot open $myFilename.tmp: $!";
my $myData = <WEB_DATA>;
close(WEB_DATA);
my @linkName = $myData =~ m/regexp for linkName/g;
I'm not sure I've described it correctly, but what I want is to use a regexp like /<p id=paragraph_1>.+?<\/p>/ to define where I want to look, and another regexp to define what data I want to parse within this block. I hope that makes sense.
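To make the idea concrete, here is a minimal sketch of the two-step approach, not a definitive solution. It assumes the sample HTML above stands in for the slurped $myData, that the paragraph_2 block is the one wanted, and that the block contains no nested <p> tags. The link-name pattern is only a placeholder for whatever regexp is really in use, and $block is a name made up for this example.

```perl
use strict;
use warnings;

# Stand-in for the slurped file contents (the sample HTML from the question).
my $myData = <<'HTML';
<p id=paragraph_1>
<a href="http://www.link1.com">Link1</a>
<a href="http://www.link2.com">Link2</a>
<a href="http://www.link3.com">Link3</a>
</p>
<p id=paragraph_2>
<a href="http://www.link4.com">Link4</a>
<a href="http://www.link5.com">Link5</a>
<a href="http://www.link6.com">Link6</a>
</p>
HTML

# Step 1: capture only the text between the paragraph_2 tags.
# The /s modifier lets . match newlines, and .*? stops at the first </p>.
my ($block) = $myData =~ m{<p id=paragraph_2>(.*?)</p>}s;

# Step 2: run the link-name regexp against that block only.
# (Placeholder pattern; an empty list is returned if step 1 found nothing.)
my @linkName = defined($block)
    ? $block =~ m{<a\s[^>]*>(.*?)</a>}gs
    : ();

print "$_\n" for @linkName;
```

With the sample data this should print Link4, Link5 and Link6, one per line. The first match narrows the search area and the second only ever sees that area, so the link regexp itself needs no anchors or offsets. Regexps against HTML tend to be brittle if the markup varies, so this is only reasonable when the input format is well known.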
Re^7: regexp over multiple lines
by ww (Archbishop) on Aug 04, 2011 at 18:21 UTC
by liverpaul (Acolyte) on Aug 05, 2011 at 10:24 UTC