in reply to Re: Nested greps w/ Perl
in thread Nested greps w/ Perl

Thank you, Ken. I limited the search terms to 500, and the program you wrote, while it works flawlessly, has been running for over an hour and is still going. I don't know when it will finish.

Interestingly enough, this command processes all the search terms and the complete file in 3 minutes 24 seconds:

time grep -P 'Z' file_to_search | awk '{print $1}' | sort | uniq --count > uniq.count

Re^3: Nested greps w/ Perl
by kcott (Archbishop) on Dec 20, 2016 at 22:23 UTC

    You updated your OP since I posted my solution.

    I suspect you don't need that inner (for) loop at all.

    You really need to provide us with a representative sample of your input. You originally posted a search for 100008020, now you seem to be saying that they're not numbers at all but first names. And, if they are indeed names, are there any called Zoë, Zachary, etc.?

    — Ken

Re^3: Nested greps w/ Perl
by hippo (Archbishop) on Dec 21, 2016 at 09:28 UTC
    grep -P 'Z' file_to_search | awk '{print $1}' | sort | uniq --count > uniq.count

    One perlish equivalent is

    perl -ae '$s{$F[0]}++ if $F[1] eq "Z"; END {print "$_ $s{$_}\n" for keys %s}' file_to_search > uniq.count

    See how the timings compare on your platform.


    Edit: (TIMTOWTDI) s/The/One/
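
Before comparing timings, it may be worth checking that the two commands count the same thing on your data. The sketch below runs both on a tiny made-up sample: the file contents, the function names `count_grep` and `count_perl`, and the "key, then flag" column layout are all assumptions for illustration, not taken from the real input. Plain `grep` stands in for `grep -P` because the pattern is a literal, `-n` is spelled out because `-a` only implies it from Perl 5.20 on, and both outputs are sorted and put in "key count" form so they can be compared directly.

```shell
#!/bin/sh
# Hypothetical sample: column 1 is the key, column 2 is a flag.
sample=$(mktemp)
printf '%s\n' 'alice Z' 'bob Z' 'alice Z' 'carol Y' 'Zed Y' > "$sample"

# Pipeline from the thread: keeps any line with a "Z" anywhere on it.
# The final awk swaps uniq's "count key" into "key count".
count_grep() {
    grep 'Z' "$1" | awk '{print $1}' | LC_ALL=C sort | uniq -c | awk '{print $2, $1}'
}

# Perl version: counts a key only when the second field is exactly "Z".
# "sort keys" is added here so the output order is stable.
count_perl() {
    perl -ane '$s{$F[0]}++ if $F[1] eq "Z"; END {print "$_ $s{$_}\n" for sort keys %s}' "$1"
}

grep_out=$(count_grep "$sample")
perl_out=$(count_perl "$sample")
rm -f "$sample"

printf 'grep pipeline:\n%s\n\nperl:\n%s\n' "$grep_out" "$perl_out"
```

On this sample the grep pipeline also counts `Zed`, because the "Z" is in the first column, while the Perl version skips it. If the real data can have a "Z" outside the second field, the two counts will differ, so the timings would not be comparing like with like.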