
I think you can significantly improve performance by separating the logic that skips lines from the logic that counts matches. The next step is to rework the counting code using Perl's built-in features. Finally, you can apply this approach in your script with your own data; it would be very interesting to know the results. Try the following code. Please note that it is not complete: it only shows the concept described above, and some mandatory parts are omitted to emphasize the key points. You need to add them in the final version before starting your tests. I have commented the code enough to make clear what happens at each step.

# initialize the array of patterns
# the same code as you use in your script, just complete the line
my @pat_array = ...;

# this is a new hash variable used for counting matches
# it completely replaces your counting approach
my %match_count;

# skip the first lines
# simply read them and do nothing with them
<LOG_READ> for ( 1..$InStartLineNumber );

# normal work
# read line by line the rest of the file and do something
while ( <LOG_READ> ) {

    # strip the trailing newline and store the line in a variable explicitly
    chomp;
    my $line = $_;

    # walk through the list of patterns
    # test the line for matching each pattern
    # and count every successful match in the hash
    $line =~ m/\Q$_\E/ and $match_count{$_}++ for @pat_array;

}

# The rest of the code that works with @pat_array and %match_count
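
For anyone who wants to try the idea before wiring it into the real script, here is a minimal self-contained sketch of the same skip-then-count concept. The sample log text, the three patterns and the skip count of 2 are made up for illustration only, and the in-memory filehandle stands in for LOG_READ; none of this comes from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the real log file and pattern list.
my $log_text = join "\n",
    'header line 1',
    'header line 2',
    'ERROR disk full',
    'WARN low memory',
    'ERROR disk full again',
    'INFO all good';
my @pat_array         = ( 'ERROR', 'WARN', 'FATAL' );
my $InStartLineNumber = 2;    # assumed number of leading lines to skip

# Open the string as an in-memory file so the sketch runs without a real log.
open my $log_fh, '<', \$log_text or die "Cannot open in-memory log: $!";

# Skip the first lines without testing them against any pattern.
for ( 1 .. $InStartLineNumber ) {
    defined( my $skipped = <$log_fh> ) or last;
}

# Count matches per pattern in the rest of the file.
my %match_count;
while ( my $line = <$log_fh> ) {
    chomp $line;
    for my $pat (@pat_array) {
        # \Q...\E treats the pattern as a literal string, as in the code above.
        $match_count{$pat}++ if $line =~ /\Q$pat\E/;
    }
}
close $log_fh;

# Report the counters; a pattern that never matched is shown as 0.
printf "%s: %d\n", $_, $match_count{$_} // 0 for @pat_array;
```

With this sample data the script prints 2 for ERROR, 1 for WARN and 0 for FATAL. The two header lines are never tested against the patterns, which is the whole point of keeping the skipping out of the counting loop.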

In reply to Re: Multiple patterns match in a big file and track the counts of each pattern matched by siberia-man
in thread Multiple patterns match in a big file and track the counts of each pattern matched by ansh007
