If you are the anonymonk who posted the reply above about Searching XML files: it would be prudent to use a proper XML parsing module if you are going to search for content in XML files.

If you are really familiar with and confident about how your XML files are created, and if the XML markup is simple, then sure, you can tailor a regex solution for your data, and it might be faster than using a parsing module. But using a parser is not so very complicated (and not so very slow, either).
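To make the trade-off concrete, here is a minimal sketch (the sample markup and the pattern are made up for illustration) that counts matches two ways: a regex over the raw markup, and a Char handler that only sees character data. It assumes XML::Parser is installed.

```perl
use strict;
use warnings;
use XML::Parser;

# Made-up sample: "error" occurs in a tag name, in an attribute value,
# and once in the actual text content.
my $xml = '<log><error code="error-17">disk full</error>'
        . '<note>no error here</note></log>';
my $pattern = qr/error/;

# Naive approach: count every match anywhere in the raw markup.
my $raw_hits = () = $xml =~ /$pattern/g;

# Parser approach: the Char handler is only called for character data,
# so tag names and attribute values are never tested.
my $text_hits = 0;
my $parser = XML::Parser->new(
    Handlers => { Char => sub { $text_hits++ if $_[1] =~ $pattern } },
);
$parser->parse($xml);

print "raw markup matches:   $raw_hits\n";
print "text content matches: $text_hits\n";
```

The raw regex also hits the tag names and the attribute value, which is the sort of false positive you would have to design around by hand in a regex-only solution.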

Here's a demonstration that ought to do what you want in terms of searching for content in xml files; it includes the good suggestions from the previous replies, and adds a few other tweaks as well. Note that we'll filter out all the irrelevant file names during the readdir phase:

#!/usr/bin/perl

use strict;
use warnings;
use XML::Parser;

my ( $path, $pattern ) = @ARGV;
die "Usage: $0  path pattern\n lists files in path that contain pattern\n"
    unless ( length($path) and -d $path and $pattern =~ /\S/ );

my $found_files = process_files( $path, $pattern );
print "the following files in $path contain '$pattern'\n",
    join( "\n", @$found_files ), "\n";

sub process_files
{
    my ( $path, $pattern ) = @_;

    my @found = ();
    my $ignore = qr/\.(?:zip|lfa|txt) | UASTG |
                    defines | sccpch | sms81154 | sms97767
                   /x;

    opendir( my $dh, $path ) or die "opendir failed on $path: $!";

    # skip non-files and ignored names while reading the directory
    for my $file ( grep { -f "$path/$_" and !/$ignore/ } readdir $dh ) {
        my $nfound = read_file( $path, $file, $pattern );
        push @found, "$path/$file: $nfound" if ( $nfound );
    }
    closedir $dh;

    return \@found;
}

sub read_file
{
    my ( $path, $file, $pattern ) = @_;
    my $nfnd = 0;
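    if ( open my $fh, '<', "$path/$file" ) {
        # count the chunks of character data that match the pattern
        my $xml = XML::Parser->new(
            Handlers => { Char => sub { $nfnd++ if $_[1] =~ /$pattern/ } } );
        # parse() dies on malformed XML; warn and move on to the next file
        eval { $xml->parse( $fh ); 1 }
            or warn "parse failed on $path/$file: $@";
    }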
    else {
        warn "open failed on $path/$file: $!\n";
    }
    return $nfnd;
}

Lots of monks like to recommend other XML modules that are more elaborate or "sophisticated" than the basic XML::Parser, but for your particular case (if I understand it right), this one is a pretty good match.
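One caveat with a Char handler: XML::Parser may deliver a single run of text in several calls (for instance around entity references), so a pattern that spans two chunks can be missed. If that matters for your data, a possible variation is to buffer the text per element and test it when the element closes. This is only a sketch; the sub name count_matching_elements and the sample string are my own inventions, not part of the script above.

```perl
use strict;
use warnings;
use XML::Parser;

# Count elements whose own (direct) text matches $pattern.
# Char chunks are buffered per open element and tested at the End event.
sub count_matching_elements {
    my ( $xml, $pattern ) = @_;
    my $count = 0;
    my @buffers;    # one text buffer per currently open element

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub { push @buffers, '' },
            Char  => sub { $buffers[-1] .= $_[1] if @buffers },
            End   => sub {
                my $text = pop @buffers;
                $count++ if $text =~ /$pattern/;
            },
        },
    );
    $parser->parse($xml);
    return $count;
}

# Made-up sample: the &amp; entity can split the text into several Char calls.
my $sample = '<doc><p>AT&amp;T billing</p><p>other text</p></doc>';
print count_matching_elements( $sample, qr/AT&T billing/ ), "\n";
```

Note that this counts matching elements rather than matching chunks, so the numbers are not directly comparable with what read_file reports.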

In reply to Re: Help with a faster loop by graff
in thread Help with a faster loop by gzayzay
