in reply to Pair Tag missing

Something like the following would probably work for you. Just handle the error any way you need to for your implementation.
    #!/usr/bin/perl -w
    use strict;

    # Slurp the whole file into one string.
    open(my $fh, "<", 'htmlFile.html') || die "Error reading file: ($!)\n";
    my $parseData = do { local $/; <$fh> };
    close($fh);

    # Only these tags are checked. Tags with attributes or in a
    # different case are not matched.
    my @tags = qw(TITLE AUTHOR H1 H2 P IT);

    for my $tag (@tags) {
        # Start both counters at 0 so the comparison never sees undef.
        my ($tagOpen, $tagClosed) = (0, 0);
        $tagOpen++   while ($parseData =~ /<$tag>/g);
        $tagClosed++ while ($parseData =~ /<\/$tag>/g);
        error($tag, $tagOpen, $tagClosed) unless ($tagOpen == $tagClosed);
    }

    sub error {
        my ($tag, $open, $closed) = @_;
        print "Error found in tag: <$tag> (Open: $open -- Closed: $closed)\n";
        return;
    }

Re^2: Pair Tag missing
by graff (Chancellor) on May 31, 2005 at 04:11 UTC
    Apart from the problem of having to specify all the possible tag names up front in an array, this approach will fail to pick up on certain problems that are quite common, such as:
    <foo><bar> blah blah </foo></bar>
    In fact, by this approach, a file where open and close tags are positioned randomly will pass just fine, so long as the number of close tags matches the number of open tags for each tag name.
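    For what it's worth, here is a minimal sketch of a check that would catch the mis-nesting case, using a stack of open tags instead of per-tag counts. The check_nesting name is made up for illustration, and the sketch assumes simple markup only: no comments, no self-closing or void tags, and tag names compared case-insensitively.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a list of
# problem descriptions for tags that are unclosed or closed out of order.
# Assumes simple tags only: no comments, no void/self-closing tags.
sub check_nesting {
    my ($text) = @_;
    my (@stack, @problems);

    # Walk every tag in document order; attributes after the name are ignored.
    while ($text =~ m{<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>}g) {
        my ($closing, $name) = ($1, lc $2);
        if (!$closing) {
            push @stack, $name;          # remember the open tag
        }
        elsif (@stack && $stack[-1] eq $name) {
            pop @stack;                  # properly nested close
        }
        else {
            # Close tag does not match the most recent open tag.
            my $expected = @stack ? "</$stack[-1]>" : "no close tag";
            push @problems, "found </$name>, expected $expected";
        }
    }

    # Anything still on the stack was never closed.
    push @problems, "unclosed <$_>" for reverse @stack;
    return @problems;
}

print "$_\n" for check_nesting('<foo><bar> blah blah </foo></bar>');
```

    On the example above this reports the </foo> that arrives while <bar> is still open, and then <foo> as unclosed. It needs no list of tag names up front, but it is only as good as the regex that finds the tags.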
      > Apart from the problem of having to specify all the possible tag names up front in an array, this approach
      > will fail to pick up on certain problems that are quite common, such as:
      >
      > <foo><bar> blah blah </foo></bar>

      True, but again, all he said he was looking for was a check that every open tag has a close tag, not that the tags are correctly ordered. He also stated that he was only searching for a few tags and NOT the entire range of possible HTML tags, so the array approach would be sufficient for his implementation, though it would be a bad idea for a large-scale parser.

      All the HTML parser modules I am aware of would not do what he is looking for, since they basically parse the text inside the tags and do nothing with the actual tags themselves except delimit on them.