in reply to Removing selective tags and content between
To remove the <head> section along with everything inside it:
$webpage =~ s/<head>.*?<\/head>//sgi;
To get rid of the other tags you mentioned, try:
$webpage =~ s/<html>|<\/html>|<body>|<\/body>//sgi;
Keep in mind that this is a very narrow approach and will miss certain things like <body bgcolor="#FFF000">. A modification to the regex will fix this though:
$webpage =~ s/<body[^>]*>//sgi;
Note the [^>]* rather than .+: with the /s modifier a greedy .+ would match through to the last > in the page. There may be many other anomalies that you may have to take into consideration as well. One thing you can count on: you can't count on two people to format a line the same way.
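Putting the substitutions together, here is a rough sketch of how they might be combined so that attributes and mixed case are tolerated. The helper name strip_wrappers and the sample page are made up for illustration and are not from the original question; treat this as one possible approach under those assumptions, not a complete HTML cleaner.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): strips the <head> section and
# the <html>/<body> wrapper tags, tolerating attributes and mixed case.
sub strip_wrappers {
    my ($webpage) = @_;

    # Drop <head ...> through </head>, contents included. The non-greedy
    # .*? stops at the first </head>; /s lets . cross newlines.
    $webpage =~ s/<head\b[^>]*>.*?<\/head\s*>//sgi;

    # Drop opening and closing <html> and <body> tags, with or without
    # attributes. [^>]* cannot run past the end of the tag.
    $webpage =~ s/<\/?(?:html|body)\b[^>]*>//gi;

    return $webpage;
}

# Made-up sample input for illustration only.
my $sample = <<'HTML';
<HTML>
<head>
<title>Test</title>
</head>
<body bgcolor="#FFF000">
<p>Hello, <b>world</b>!</p>
</body>
</html>
HTML

print strip_wrappers($sample);
```

The paragraph markup survives while the wrapper tags and the whole head section are gone. This still assumes no attribute value contains a literal >; if the pages are less predictable than that, a real parser such as HTML::Parser from CPAN is probably the safer route.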
Re: Re: Removing selective tags and content between
by diamich (Initiate) on Oct 15, 2003 at 15:17 UTC
by ChrisR (Hermit) on Oct 15, 2003 at 15:29 UTC
by diamich (Initiate) on Oct 15, 2003 at 16:10 UTC
by diamich (Initiate) on Oct 15, 2003 at 18:06 UTC