Reading through the various bits of information posted here, it seems your current process is:

  1. Capture data in 3 raw files
  2. Convert those to 3 CSV files
  3. Extract data from those into 3 hashes
  4. Combine keys with 3 values into 1 new hash

That seems like a lot of unnecessary work. Do you have some additional use for the intermediary CSV files or the intermediary hashes?

I don't know what your initial raw data looks like. I dummied up 3 files (pm_1227048_raw_1, pm_1227048_raw_2 & pm_1227048_raw_3) from the data posted in "Re^2: multiple hash compare, find, create". Each contains an exact copy of what's posted there; for example,

    $ cat pm_1227048_raw_1 | head -3
    a6fbb013-b75f-4dd7-9d1a-24f566020042 => 92.1.3
    b6a4c433-72a5-4e1a-b378-4a6b72531ded => 92.1.3
    P0118760075 => 92.1.3

Unfortunately, while some keys had 2 associated values, none had 3 values. I've added an extra step, in the script below, to show the technique. If your real data has keys with 3 values, you can dispense with that extra step. The comments should make this clear.

Here's the example script to show the technique. Each raw data file is parsed once to create one hash. When all data has been collected, one delete statement removes the unwanted key-value pairs.

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use autodie;

    use Data::Dump;

    my @files = qw{
        pm_1227048_raw_1
        pm_1227048_raw_2
        pm_1227048_raw_3
    };

    my %trio;

    for my $file (@files) {
        open my $fh, '<', $file;
        while (<$fh>) {
            chomp;
            my ($k, $v) = split / => /;
            push @{$trio{$k}}, $v;
        }
    }

    # This step for demonstration purposes only
    print "All data:\n";
    dd \%trio;

    # Extra step due to poor input
    print "Data with 2 or more values:\n";
    delete @trio{grep @{$trio{$_}} < 2, keys %trio};
    dd \%trio;

    # Only this step required with better input
    print "Data with 3 or more values:\n";
    delete @trio{grep @{$trio{$_}} < 3, keys %trio};
    dd \%trio;

Output:

    All data:
    {
      "06bbe788-e57d-4eda-98ea-74d8a45a0e56" => ["3.2p10s1"],
      "08141110-f817-4c16-bf7b-8d0e6696a95b" => ["91.1.2"],
      "205dae51-ea2e-4db9-ace1-315b940686e6" => ["91.1.2", 8299600120059407],
      "29568879-fcca-4dc6-86be-3c8c86ef26db" => [8497101420498122],
      "2e530dc0-a164-4c06-ae18-332eb6778ebd" => ["3.2p10s1"],
      "37d6871a-3abc-44ee-819a-eea33440b0a4" => ["3.1p7s7"],
      "55ccc30e-3566-4a00-b219-4b084487384c" => ["92.1.3", 8498350112590600],
      "5f356d12-0213-4d5f-8fe7-08fe1d2a35d9" => ["3.2p10s2"],
      "64bc2611-38a6-4a59-8d80-4f49b7a76f69" => ["92.1.3"],
      "6ad8af7c-c56b-480f-bed6-9591b80cf634" => ["3.0p9s1"],
      "71be5e75-edad-4889-9261-e1ffa89e393f" => ["92.1.3", 8773103940365745],
      "97891097-70d7-4273-b1ae-3b88b460d591" => ["3.2p8s3"],
      "9d986ace-2504-4595-bdbb-1899812e9d54" => ["91.1.2", 8773103910046051],
      "a6fbb013-b75f-4dd7-9d1a-24f566020042" => ["92.1.3", 8499101090018240],
      "a915d30a-541c-4f5e-9b2f-297352f7e19c" => ["92.1.3", 8777703187635225],
      "b2c6e317-2072-4e3a-9278-5f76af49221a" => [8499102590027251],
      "b4206f77-25e9-4ccd-b434-2237360f1f8c" => ["3.1p10s1"],
      "b6a4c433-72a5-4e1a-b378-4a6b72531ded" => ["92.1.3"],
      "c6e0b7c8-4999-4e83-b7d9-c28a62613614" => ["92.1.3", 8495741441414558],
      "c8b2958f-7777-45e2-929a-adbe41f5055f" => ["3.2p10s1"],
      "dff7f963-ec15-440a-9150-b61f55afe8a4" => ["3.2p9s1"],
      "e173efe4-76f8-47fa-9923-500a3fe9715d" => ["92.1.0", 8773102120218526],
      "P0107577526" => ["3.2p3s1"],
      "P0112055731" => ["3.2p10s1"],
      "P0116761501" => ["3.2p10s1"],
      "P0118760075" => ["92.1.3", 8495840020455261],
      "P0127439637" => ["92.1.3", 8155600386311784],
      "P0127646016" => ["3.2p10s1"],
      "P0128132579" => ["3.2p10s1"],
      "P0128193326" => [8993110670064343],
      "P0130482072" => ["92.1.3", 8499100024861022],
    }
    Data with 2 or more values:
    {
      "205dae51-ea2e-4db9-ace1-315b940686e6" => ["91.1.2", 8299600120059407],
      "55ccc30e-3566-4a00-b219-4b084487384c" => ["92.1.3", 8498350112590600],
      "71be5e75-edad-4889-9261-e1ffa89e393f" => ["92.1.3", 8773103940365745],
      "9d986ace-2504-4595-bdbb-1899812e9d54" => ["91.1.2", 8773103910046051],
      "a6fbb013-b75f-4dd7-9d1a-24f566020042" => ["92.1.3", 8499101090018240],
      "a915d30a-541c-4f5e-9b2f-297352f7e19c" => ["92.1.3", 8777703187635225],
      "c6e0b7c8-4999-4e83-b7d9-c28a62613614" => ["92.1.3", 8495741441414558],
      "e173efe4-76f8-47fa-9923-500a3fe9715d" => ["92.1.0", 8773102120218526],
      "P0118760075" => ["92.1.3", 8495840020455261],
      "P0127439637" => ["92.1.3", 8155600386311784],
      "P0130482072" => ["92.1.3", 8499100024861022],
    }
    Data with 3 or more values:
    {}

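Because the posted sample data leaves the final hash empty, here is a minimal sketch of the same technique run against input where one key does have 3 values. The keys (key_a, key_b, key_c) and their values are made up for illustration, and the records are held in memory rather than read from files; both are assumptions, not your real data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical records standing in for the three raw files.
# Keys and values are invented purely for this demonstration.
my @sources = (
    [ 'key_a => 1.0', 'key_b => 1.0', 'key_c => 1.0' ],
    [ 'key_a => 2.0', 'key_b => 2.0' ],
    [ 'key_a => 3.0', 'key_c => 3.0' ],
);

my %trio;

# Collect every value seen for each key, one source at a time
for my $source (@sources) {
    for my $line (@$source) {
        my ($k, $v) = split / => /, $line, 2;
        push @{ $trio{$k} }, $v;
    }
}

# Hash-slice delete: drop every key that has fewer than 3 values
delete @trio{ grep { @{ $trio{$_} } < 3 } keys %trio };

# Show what is left
for my $k (sort keys %trio) {
    print "$k => ", join(', ', @{ $trio{$k} }), "\n";
}
```

Only key_a appears in all three sources, so it is the only key left after the delete.
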
— Ken


In reply to Re: multiple hash compare, find, create by kcott
in thread multiple hash compare, find, create by Anonymous Monk
