I want to write a Perl script that does two things:
1. Recursively parse a set of HTML pages spread across different directories and extract all <A HREF=.....> </A> tags into separate text files while maintaining the directory structure. Each text file, one per HTML page, has to reside in the same directory as its page. (Note that the tags/links point to HTML pages.)
2. Use the extracted tags from the text files to download the linked HTML pages and save them to the respective directories.
I have managed to extract the tags, but only from a single file and into a single file. The directory and file structure needs to be maintained: ideally I would point the script at a top-level directory and it would do the link extraction and downloading recursively.
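Not a drop-in answer, but here is a minimal sketch of one way to do both steps in a single pass, using File::Find for the recursion and LWP::Simple for the downloads. The `*.links.txt` naming and the regex-based HREF extraction are my own choices for illustration (HTML::LinkExtor would be the more robust parser for real-world pages):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Recursively visit every .htm/.html file under $root, write a sibling
# "<page>.links.txt" listing its <A HREF> targets, and fetch any
# absolute HTTP links into the same directory.
sub mirror_tree {
    my ($root) = @_;
    find( \&process_page, $root );
}

sub process_page {
    return unless /\.html?$/i;    # File::Find sets $_ to the bare filename

    open my $in, '<', $_
        or do { warn "Can't read $File::Find::name: $!\n"; return };
    my $html = do { local $/; <$in> };    # slurp the whole page
    close $in;

    # Quick-and-dirty HREF extraction; HTML::LinkExtor is more robust.
    my @links;
    push @links, $1 while $html =~ /<a\b[^>]*\bhref\s*=\s*["']?([^"'\s>]+)/gi;

    # File::Find chdir()s into each directory, so a relative filename
    # lands right next to the page it came from.
    my $listfile = "$_.links.txt";
    open my $out, '>', $listfile or die "Can't write $listfile: $!";
    print $out "$_\n" for @links;
    close $out;

    # Download each absolute HTTP link into the same directory.
    for my $url (@links) {
        next unless $url =~ m{^https?://}i;    # skip relative links
        require LWP::Simple;                   # loaded only if needed
        my ($path) = split /[?#]/, $url;       # drop query/fragment
        my ($name) = $path =~ m{([^/]+)$};     # last path segment
        $name = 'index.html' unless defined $name && length $name;
        my $rc = LWP::Simple::getstore( $url, $name );
        warn "Failed to fetch $url (HTTP $rc)\n"
            unless LWP::Simple::is_success($rc);
    }
}

mirror_tree( shift @ARGV ) if @ARGV;
```

Relative links are listed but skipped for downloading here; resolving them against each page's location (e.g. with URI) would be the next refinement.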
Thanks in advance for any tips or previously used code snippets.
Edited 2002-06-20 by mirod: changed title (was: Recursive HTML Tax Extraction) and added formatting tags
In reply to Recursive HTML Tag Extraction by khanan