In the example you give, a regular expression will probably do what you want, because it is very unlikely that a document will contain two TITLE elements. In a slightly different case, though, the same regular expression can fail. Suppose we were looking for certain text in a CAPTION element: if that text occurs between two CAPTION elements but not within either of them, the pattern can match from the opening tag of the first to the closing tag of the second and report a false hit. You can work around that with a much more complicated regular expression, but it's hairy, and it will still fail if the element can be nested within itself, either directly or indirectly.

In such cases, you really need a module that parses the markup and hands you a document tree. HTML::TreeBuilder and XML::Twig make this sort of thing easy for HTML and XML respectively, and there are various alternatives to them as well. I don't know as much about SGML modules, since I've never worked much with SGML (except for legacy versions of HTML that were SGML-based), but you might check CPAN.
Of course, if the example you gave is really all you want to do, then you may not need a parser, since the regex will probably be good enough.
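To make the CAPTION problem concrete, here is a minimal sketch using only core Perl. The sample document, the search word, and the sub names naive_match and guarded_match are all made up for illustration, and the "naive" pattern is only my guess at the general shape of regex under discussion, not the one from your post. The guarded version is one form of the hairier workaround; it still breaks down if CAPTION can be nested inside CAPTION.

```perl
use strict;
use warnings;

# Naive check: with /s, "." also matches newlines, so ".*" is free to run
# past the first </CAPTION> and pick up text that is outside the element.
sub naive_match {
    my ($doc, $word) = @_;
    return $doc =~ m{<CAPTION>.*\Q$word\E.*</CAPTION>}s ? 1 : 0;
}

# Guarded check: each "." is only allowed where a closing tag does not
# start, so the match cannot cross out of the element it began in.
sub guarded_match {
    my ($doc, $word) = @_;
    return $doc =~ m{<CAPTION>(?:(?!</CAPTION>).)*\Q$word\E(?:(?!</CAPTION>).)*</CAPTION>}s ? 1 : 0;
}

# Hypothetical input: "budget" appears between two CAPTION elements,
# but not inside either of them.
my $between = <<'END';
<CAPTION>Figure 1</CAPTION>
<P>The budget is discussed here.</P>
<CAPTION>Figure 2</CAPTION>
END

# Hypothetical input where the word really is inside a CAPTION.
my $inside = "<CAPTION>The budget chart</CAPTION>\n";

print "naive, between:   ", naive_match($between, 'budget'),   "\n";  # 1 (false positive)
print "guarded, between: ", guarded_match($between, 'budget'), "\n";  # 0
print "guarded, inside:  ", guarded_match($inside, 'budget'),  "\n";  # 1
```

The guarded pattern is already hard to read, and it only handles this one failure mode, which is why a real parser is the better tool once you get past the simple case.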
In reply to Re: //s modifier by jonadab, in thread //s modifier by kettle