PerlMonks  


MCE::Loop might also be an option:

    #!/usr/bin/env perl

    use warnings;
    use strict;
    use MCE::Loop;
    use feature qw(say);

    my $cpus = MCE::Util->get_ncpu() || 4;

    MCE::Loop::init { max_workers => $cpus, };

    my %barcode_hash = (
        1 => [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_CCCC', 0 ],
        2 => [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_AAAA', 0 ],
        3 => [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_BBBB', 0 ],
        4 => [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_AAAA', 0 ],
    );

    my $barcode_pair_35 = 'TTTT_AAAA';

    mce_loop {
        my ( $mce, $chunk_ref, $chunk_id ) = @_;
        for (@$chunk_ref) {
            if ( $barcode_hash{$_}[2] eq $barcode_pair_35 ) {
                say qq(Found $barcode_hash{$_}[2] at $_);
            }
        }
    }
    keys %barcode_hash;

    __END__
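The script above prints its matches from inside the workers. If the matching keys are needed back in the parent process, one option is to send them with MCE->gather and collect the return value of mce_loop. This is only a minimal sketch: the fixed worker count, the small chunk_size, the shortened sequences and the names $wanted and @found are assumptions made for illustration, not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use MCE::Loop;

# Assumed settings for a tiny demo; tune both for real data.
MCE::Loop::init { max_workers => 4, chunk_size => 2 };

# Shortened stand-in data (hypothetical), same layout as in the post.
my %barcode_hash = (
    1 => [ 'AGCT', 'GAGA', 'TTTT_CCCC', 0 ],
    2 => [ 'AGCT', 'GAGA', 'TTTT_AAAA', 0 ],
    3 => [ 'AGCT', 'GAGA', 'TTTT_BBBB', 0 ],
    4 => [ 'AGCT', 'GAGA', 'TTTT_AAAA', 0 ],
);

my $wanted = 'TTTT_AAAA';

# Each worker filters its chunk of keys and sends the hits to the parent.
# In list context mce_loop returns everything that was gathered.
my @found = mce_loop {
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    my @hits = grep { $barcode_hash{$_}[2] eq $wanted } @$chunk_ref;
    MCE->gather(@hits) if @hits;
}
keys %barcode_hash;

# Shut the worker pool down explicitly.
MCE::Loop::finish;

# Chunks finish in no particular order, so sort before reporting.
print "Found $wanted at $_\n" for sort { $a <=> $b } @found;
```

Gathering keeps the output in one place and avoids interleaved prints from several workers, at the cost of sending the hits back to the parent.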

Update:

Ok, some benchmarking.

Playing around with $size (the chunk_size passed to MCE::Loop::init) might be worth the effort. Your mileage may vary. I hope I quoted haukex right and jumped to the right conclusions.

    #!/usr/bin/env perl

    use MCE::Loop;
    use Benchmark qw ( :hireswallclock cmpthese timethese );
    use strict;
    use warnings;
    use feature qw(say);

    my $size = 10000;

    say $size;

    my $cpus = MCE::Util->get_ncpu() || 4;

    MCE::Loop::init { max_workers => $cpus, chunk_size => $size };

    # 99998 non-matching entries that all share one array reference
    my $data = [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_CCCC', 0 ];

    our %barcode_hash = map { $_ => $data } 1 .. 99998;

    # the two entries the search is supposed to find
    $barcode_hash{99999} =
      [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_AAAA', 0 ];
    $barcode_hash{100000} =
      [ 'AGCTCGTTGTTCGATCCA', 'GAGAGATAGATGATAGTG', 'TTTT_AAAA', 0 ];

    our $barcode_pair_35 = 'TTTT_AAAA';

    my $results = timethese(
        -10,
        {
            'karl'   => 'karl',
            'haukex' => 'haukex',
        }
    );

    cmpthese($results);

    # serial loop over the sorted keys
    sub haukex {
        our %barcode_hash;
        our $barcode_pair_35;
        for my $key ( sort keys %barcode_hash ) {
            1 if $barcode_hash{$key}[2] eq $barcode_pair_35;
        }
    }

    # parallel loop, keys handed to the workers in chunks of $size
    sub karl {
        our %barcode_hash;
        our $barcode_pair_35;
        mce_loop {
            my ( $mce, $chunk_ref, $chunk_id ) = @_;
            for (@$chunk_ref) {
                1 if ( $barcode_hash{$_}[2] eq $barcode_pair_35 );
            }
        }
        keys %barcode_hash;
    }

    __END__

             Rate haukex   karl
    haukex 6.74/s     --   -47%
    karl   12.8/s    90%     --

Update 2: Shit! If I omit the sort from haukex's loop, I lose...
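To see how much of the serial loop's time goes into the sort alone, the two serial variants can be compared directly using only core modules. This is a sketch under assumptions: the sub names count_sorted and count_unsorted, the shortened sequences and the short benchmark duration are made up for illustration, and the rates will differ from machine to machine.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(:hireswallclock cmpthese);

# Hypothetical test data: 100_000 numeric keys, two of which match.
my $target = 'TTTT_AAAA';
my %barcode_hash =
  map { $_ => [ 'AGCT', 'GAGA', 'TTTT_CCCC', 0 ] } 1 .. 99_998;
$barcode_hash{$_} = [ 'AGCT', 'GAGA', $target, 0 ] for 99_999, 100_000;

# Count matches, visiting the keys in sorted order.
sub count_sorted {
    my $hits = 0;
    for my $key ( sort keys %barcode_hash ) {
        $hits++ if $barcode_hash{$key}[2] eq $target;
    }
    return $hits;
}

# Count matches, visiting the keys in whatever order the hash yields.
sub count_unsorted {
    my $hits = 0;
    for my $key ( keys %barcode_hash ) {
        $hits++ if $barcode_hash{$key}[2] eq $target;
    }
    return $hits;
}

# Run each variant for at least 1 CPU second and compare the rates.
cmpthese(
    -1,
    {
        sorted   => \&count_sorted,
        unsorted => \&count_unsorted,
    }
);
```

Both subs return the same count, so any difference in the reported rates comes from the sort.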

Update 3: Slightly different picture with 1,000,000 keys and the key range (1..$max) calculated before benchmarking:

    my $size = 10000;

    my $cpus = MCE::Util->get_ncpu() || 4;

    MCE::Loop::init { max_workers => $cpus, chunk_size => $size };

    our $max = scalar keys %barcode_hash;

    sub haukex {
        our %barcode_hash;
        our $barcode_pair_35;
        our $max;
        for ( 1 .. $max ) {
            1 if $barcode_hash{$_}[2] eq $barcode_pair_35;
        }
    }

    sub karl {
        our %barcode_hash;
        our $barcode_pair_35;
        our $max;
        mce_loop {
            my ( $mce, $chunk_ref, $chunk_id ) = @_;
            for (@$chunk_ref) {
                1 if ( $barcode_hash{$_}[2] eq $barcode_pair_35 );
            }
        }
        1 .. $max;
    }

             Rate haukex   karl
    haukex 2.29/s     --   -38%
    karl   3.67/s    60%     --

Update 4: It's worth installing Sereal::Decoder.

Regards, Karl



In reply to Re: search for particular elements of hash with multiple values by karlgoethebier
in thread search for particular elements of hash with multiple values by pmpmmpmp
