| Category: | utilities (XML?) |
| Author/Contact Info | |
| Description: | wc_xml just counts the words in an XML file, excluding all mark-up (and attribute values). You will need pyx installed (either from XML::PYX or the Python or Java version, it really doesn't matter). Adding a character count so it behaves more like the Unix wc utility is left as an exercise for the reader. |
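#!/bin/perl -w
use strict;

my $nbw=0;                                 # running word count across all files
foreach my $file (@ARGV)
  { # list-form pipe open, so the file name is not interpreted by a shell
    open( my $xml, '-|', 'pyx', $file) or die "cannot run pyx on $file: $!";
    while( <$xml>)
      { next unless s/^-//;                # skip markup, drop the PYX text marker
        s/\\n/ /g;                         # escaped line returns only separate words
        my @words= split;                  # get the words
        $nbw+= @words;                     # add the number of words in the line
      }
    close $xml or warn "pyx reported a problem with $file\n";
  }
print $nbw, " words\n";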
|
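For the exercise mentioned in the description, here is one possible sketch of a version that also counts characters. It is only an illustration: `count_pyx_line` is a hypothetical helper that is not part of the original script, and it assumes that pyx writes text on lines starting with `-` and escapes a line return as a literal `\n`. The count is in bytes, strictly speaking, since the pipe is read without any decoding.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): takes one line of PYX
# output and returns (words, characters); markup lines give (0, 0).
sub count_pyx_line
  { my( $line)= @_;
    chomp $line;
    return (0, 0) unless $line=~ s/^-//;   # only '-' lines carry text
    $line=~ s/\\n/\n/g;                    # assumption: a line return is escaped as \n
    my @words= split ' ', $line;           # split on any run of whitespace
    return ( scalar @words, length $line);
  }

my( $nbw, $nbc)= (0, 0);
foreach my $file (@ARGV)
  { open( my $xml, '-|', 'pyx', $file) or die "cannot run pyx on $file: $!";
    while( my $line= <$xml>)
      { my( $w, $c)= count_pyx_line( $line);
        $nbw+= $w;
        $nbc+= $c;
      }
    close $xml;
  }
print "$nbw words, $nbc characters\n";
```

Keeping the per-line counting in its own sub means it can be checked by hand on a few PYX lines without having pyx installed.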