Dear Monks,

I have a set of search strings, e.g. ALPHA, BETA, GAMMA, and each search string has its own set of exceptions, e.g. for ALPHA: "_ALPHA", "#ALPHA", "XL5 ALPHA". Note also that a search string may occur anywhere within one of its exceptions, not just at the right-hand end as in this example.

I have the top directory of a tree containing years of source code and want to scan the lot, producing a report in which each line has the form:

<file>|<line no.>|<text>

where the text on that line of that file meets the above search criteria (i.e. it contains a search string somewhere other than inside one of that string's exceptions).

Without the exceptions requirement it was easy enough to recurse through the tree, opening files and matching patterns against their contents; having to search each line several times, once per search string, is straightforward enough.

But I seem to have a problem arranging the logic when it comes to eliminating the exceptions. For example:

"I AM ALPHA AND #ALPHA"

This line contains both an exception (#ALPHA) and a non-exception (the bare ALPHA), so it has to count as a successful ALPHA match.
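
One way to picture the required logic, as a minimal sketch rather than a finished solution: blank out every exception first, then look for the bare search string in whatever is left. The sub name line_matches and the "\0" placeholder are made up for illustration, and the sketch assumes the source lines never contain a NUL character.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $line contains $srch somewhere outside
# every string in @exceptions.
sub line_matches {
    my ( $line, $srch, @exceptions ) = @_;

    # Longest exception first, so a longer exception wins over a shorter
    # one that starts at the same position.
    my $alt = join '|', map { quotemeta($_) }
              sort { length($b) <=> length($a) } @exceptions;

    # Replace each exception with a NUL so the text on either side cannot
    # join up and form a false match (assumes no NUL in the input).
    $line =~ s/$alt/\0/g if length $alt;

    return index( $line, $srch ) >= 0 ? 1 : 0;
}

my @exc = ( '_ALPHA', '#ALPHA', 'XL5 ALPHA' );
for my $line ( 'I AM ALPHA AND #ALPHA', 'ONLY #ALPHA HERE' ) {
    print "$line => ", ( line_matches( $line, 'ALPHA', @exc ) ? 'match' : 'no match' ), "\n";
}
```

With these inputs the first line still counts as a match because the bare ALPHA survives the masking, while the second line does not.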

Any ideas from pattern-matching gurus would be more than welcome!

Best wishes,

the Moron

Update: here is the code so far:

#!/usr/bin/perl

my $start = '/home/moron/ac5';

Init();
Traverse( $start );
close (my $sfh=$_{ FH }{ SUMMARY });

sub Traverse {
    my $dir = shift;
    opendir my $dh, $dir or die "$! while opening $dir";
    for my $file ( grep !/^\./, readdir $dh ) {
        my $full = "$dir/$file";
        if ( -d $full ) {
            Traverse( $full );
        }
        else {
            Process( $full );
        }
    }
    closedir $dh or die "$! while closing $dir";
}

sub Process {
    my $file = shift;
    my $itref = $_{ IT };
    my $found = 0;
    open my $fh, "<$file" or die "$! while opening $file";
    while( <$fh> ) {
        KEY: for my $srch ( keys %$itref ) {
            if ( SmartSearch( $srch ) ) {
                print join( '|', ( $file, $., $_ ) ) . "\n";
                $found++;
                last KEY;
            }
        }
    }
    close $fh or die "$! when closing $file";
    if ( $found ) {
        my $sfh = $_{ FH }{ SUMMARY };
        print $sfh "$file|$found\n";
    }
}

sub SmartSearch {
    my $srch = shift;
    # and this is where I am stuck, except for some horrible ideas of
    # iterating through the combinations of cases such as exception in
    # front, exception behind, exception alone on the line
}

sub Init {
    $_{ IT }{ ALPHA } = 1;
    $_{ IT }{ BETA } = 1;
    $_{ IT }{ GAMMA } = 1;
    open my $sfh, ">SummaryFile.dat";
    $_{ FH }{ SUMMARY } = $sfh;
    $_{ EXC }{ ALPHA }{ '_ALPHA' } = 1;
    $_{ EXC }{ ALPHA }{ '#ALPHA' } = 1;
    $_{ EXC }{ ALPHA }{ 'CAW5 ALPHA' } = 1;
}
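
One possible way to fill in the SmartSearch stub, offered as a hedged sketch and not as the definitive answer: build one pattern per search string in which the exceptions are tried first and, when one matches, the backtracking verbs (*SKIP)(*FAIL) (available from Perl 5.10) discard it and resume scanning after it. The lexical %exc and %re hashes and the optional second argument are assumptions made for this sketch; %exc simply mirrors the EXC data set up in Init().

```perl
use strict;
use warnings;

# Same shape as the EXC data in Init(), held in a lexical for this sketch.
my %exc = (
    ALPHA => { '_ALPHA' => 1, '#ALPHA' => 1, 'CAW5 ALPHA' => 1 },
    BETA  => {},
    GAMMA => {},
);

my %re;    # cache: search string => compiled pattern

sub SmartSearch {
    my ( $srch, $line ) = @_;
    $line = $_ unless defined $line;    # fall back to $_, as the loop in Process() expects

    $re{$srch} ||= do {
        # Longest exception first, so the longest one is preferred at any position.
        my $alt  = join '|', map { quotemeta($_) }
                   sort { length($b) <=> length($a) } keys %{ $exc{$srch} || {} };
        my $want = quotemeta $srch;

        # An exception that matches is consumed and then rejected; scanning
        # resumes after it, so only occurrences outside exceptions can succeed.
        length $alt ? qr/(?:$alt)(*SKIP)(*FAIL)|$want/ : qr/$want/;
    };

    return $line =~ $re{$srch} ? 1 : 0;
}

for my $line ( 'I AM ALPHA AND #ALPHA', 'ONLY #ALPHA AND CAW5 ALPHA' ) {
    print "$line => ", ( SmartSearch( 'ALPHA', $line ) ? 'hit' : 'miss' ), "\n";
}
```

Compiling each pattern once and caching it keeps the per-line cost low when scanning a large tree, and it avoids enumerating the "exception in front, exception behind" cases by hand.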
Free your mind

In reply to Pattern matching when there are exception strings by Moron
