Re: Regex/Pattern Matching

I don't think you'll be able to come up with a general way to do this. You'll have to repeat the work you did for www.ntu.edu.sg/sce/staffacad.asp for each site, creating a sub for each site that extracts and returns all the information.
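One way to keep the per-site subs manageable is a dispatch table keyed on the site. This is only a rough sketch: the sub name, the hash name, and the pattern inside extract_ntu_sce are made up for illustration, since I don't know what the markup on that page actually looks like.

```perl
use strict;
use warnings;

# Hypothetical extractor for one site. The pattern is a placeholder;
# the real page markup will need its own matching logic.
sub extract_ntu_sce {
    my ($html) = @_;
    my @staff;
    while ($html =~ m{<td class="name">([^<]+)</td>}g) {
        push @staff, { name => $1 };
    }
    return @staff;
}

# One entry per site; add a new sub and a new key for each site you support.
my %extractor_for = (
    'www.ntu.edu.sg/sce/staffacad.asp' => \&extract_ntu_sce,
);

# Look up the extractor for a site and run it on the fetched HTML.
sub extract_staff {
    my ($site, $html) = @_;
    my $extractor = $extractor_for{$site}
        or die "No extractor written for $site\n";
    return $extractor->($html);
}
```

The calling code then only needs extract_staff($site, $html), and supporting a new site means writing one more sub and adding one more hash entry.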

If the trouble with extracting the publication text is stripping the HTML out of it, look at HTML::Parser.
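For what it's worth, here is a minimal sketch of stripping tags with HTML::Parser's version 3 API. The sub name strip_html and the whitespace cleanup are my own choices, not anything from your code, and it assumes the fragment contains no script or style blocks (their contents would come through as text too).

```perl
use strict;
use warnings;
use HTML::Parser ();

# Return the text content of an HTML fragment with all tags removed.
sub strip_html {
    my ($html) = @_;
    my $text = '';

    # The text handler receives each run of text with entities decoded ('dtext').
    my $parser = HTML::Parser->new(
        api_version => 3,
        text_h      => [ sub { $text .= shift }, 'dtext' ],
    );
    $parser->parse($html);
    $parser->eof;

    # Collapse runs of whitespace and trim the ends.
    $text =~ s/\s+/ /g;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}
```

You would call it as strip_html($publication_html) on whatever chunk your per-site sub has already cut out of the page.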

Best of luck to you.
