in reply to multiple hash compare, find, create
Reading through the various bits of information posted here, it seems your current process is:
That would seem to be a lot of unnecessary work. Do you have some additional use for the intermediary CSV files? Do you have some additional use for the intermediary hashes?
I don't know what your initial raw data looks like. I dummied up 3 files (pm_1227048_raw_1, pm_1227048_raw_2 & pm_1227048_raw_3) from the data posted in "Re^2: multiple hash compare, find, create". Each contains an exact copy of what's posted there; for example,
$ cat pm_1227048_raw_1 | head -3
a6fbb013-b75f-4dd7-9d1a-24f566020042 => 92.1.3
b6a4c433-72a5-4e1a-b378-4a6b72531ded => 92.1.3
P0118760075 => 92.1.3
Unfortunately, while some keys had 2 associated values, none had 3 values. I've added an extra step, in the script below, to show the technique. If your real data has keys with 3 values, you can dispense with that extra step. The comments should make this clear.
Here's the example script to show the technique. Each raw data file is parsed once to create one hash. When all data has been collected, one delete statement removes the unwanted key-value pairs.
#!/usr/bin/env perl

use strict;
use warnings;
use autodie;

use Data::Dump;

my @files = qw{
    pm_1227048_raw_1
    pm_1227048_raw_2
    pm_1227048_raw_3
};

my %trio;

for my $file (@files) {
    open my $fh, '<', $file;

    while (<$fh>) {
        chomp;
        my ($k, $v) = split / => /;
        push @{$trio{$k}}, $v;
    }
}

# This step for demonstration purposes only
print "All data:\n";
dd \%trio;

# Extra step due to poor input
print "Data with 2 or more values:\n";
delete @trio{grep @{$trio{$_}} < 2, keys %trio};
dd \%trio;

# Only this step required with better input
print "Data with 3 or more values:\n";
delete @trio{grep @{$trio{$_}} < 3, keys %trio};
dd \%trio;
Output:
All data:
{
  "06bbe788-e57d-4eda-98ea-74d8a45a0e56" => ["3.2p10s1"],
  "08141110-f817-4c16-bf7b-8d0e6696a95b" => ["91.1.2"],
  "205dae51-ea2e-4db9-ace1-315b940686e6" => ["91.1.2", 8299600120059407],
  "29568879-fcca-4dc6-86be-3c8c86ef26db" => [8497101420498122],
  "2e530dc0-a164-4c06-ae18-332eb6778ebd" => ["3.2p10s1"],
  "37d6871a-3abc-44ee-819a-eea33440b0a4" => ["3.1p7s7"],
  "55ccc30e-3566-4a00-b219-4b084487384c" => ["92.1.3", 8498350112590600],
  "5f356d12-0213-4d5f-8fe7-08fe1d2a35d9" => ["3.2p10s2"],
  "64bc2611-38a6-4a59-8d80-4f49b7a76f69" => ["92.1.3"],
  "6ad8af7c-c56b-480f-bed6-9591b80cf634" => ["3.0p9s1"],
  "71be5e75-edad-4889-9261-e1ffa89e393f" => ["92.1.3", 8773103940365745],
  "97891097-70d7-4273-b1ae-3b88b460d591" => ["3.2p8s3"],
  "9d986ace-2504-4595-bdbb-1899812e9d54" => ["91.1.2", 8773103910046051],
  "a6fbb013-b75f-4dd7-9d1a-24f566020042" => ["92.1.3", 8499101090018240],
  "a915d30a-541c-4f5e-9b2f-297352f7e19c" => ["92.1.3", 8777703187635225],
  "b2c6e317-2072-4e3a-9278-5f76af49221a" => [8499102590027251],
  "b4206f77-25e9-4ccd-b434-2237360f1f8c" => ["3.1p10s1"],
  "b6a4c433-72a5-4e1a-b378-4a6b72531ded" => ["92.1.3"],
  "c6e0b7c8-4999-4e83-b7d9-c28a62613614" => ["92.1.3", 8495741441414558],
  "c8b2958f-7777-45e2-929a-adbe41f5055f" => ["3.2p10s1"],
  "dff7f963-ec15-440a-9150-b61f55afe8a4" => ["3.2p9s1"],
  "e173efe4-76f8-47fa-9923-500a3fe9715d" => ["92.1.0", 8773102120218526],
  "P0107577526" => ["3.2p3s1"],
  "P0112055731" => ["3.2p10s1"],
  "P0116761501" => ["3.2p10s1"],
  "P0118760075" => ["92.1.3", 8495840020455261],
  "P0127439637" => ["92.1.3", 8155600386311784],
  "P0127646016" => ["3.2p10s1"],
  "P0128132579" => ["3.2p10s1"],
  "P0128193326" => [8993110670064343],
  "P0130482072" => ["92.1.3", 8499100024861022],
}
Data with 2 or more values:
{
  "205dae51-ea2e-4db9-ace1-315b940686e6" => ["91.1.2", 8299600120059407],
  "55ccc30e-3566-4a00-b219-4b084487384c" => ["92.1.3", 8498350112590600],
  "71be5e75-edad-4889-9261-e1ffa89e393f" => ["92.1.3", 8773103940365745],
  "9d986ace-2504-4595-bdbb-1899812e9d54" => ["91.1.2", 8773103910046051],
  "a6fbb013-b75f-4dd7-9d1a-24f566020042" => ["92.1.3", 8499101090018240],
  "a915d30a-541c-4f5e-9b2f-297352f7e19c" => ["92.1.3", 8777703187635225],
  "c6e0b7c8-4999-4e83-b7d9-c28a62613614" => ["92.1.3", 8495741441414558],
  "e173efe4-76f8-47fa-9923-500a3fe9715d" => ["92.1.0", 8773102120218526],
  "P0118760075" => ["92.1.3", 8495840020455261],
  "P0127439637" => ["92.1.3", 8155600386311784],
  "P0130482072" => ["92.1.3", 8499100024861022],
}
Data with 3 or more values:
{}
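Since the sample data gave an empty result for the final step, here is a minimal, self-contained sketch of the same technique where some keys do appear three times. The keys (k1, k2, k3) and their values are invented purely for illustration, and the three strings in @sources are hypothetical stand-ins for the raw files, read through in-memory filehandles so nothing on disk is needed. I've used plain print rather than Data::Dump to keep it core-only.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical stand-ins for the three raw files (made-up keys and values).
my @sources = (
    "k1 => 92.1.3\nk2 => 91.1.2\nk3 => 3.2p10s1\n",
    "k1 => 1001\nk3 => 1003\n",
    "k1 => alpha\nk2 => beta\nk3 => gamma\n",
);

my %trio;

for my $source (@sources) {
    # Opening a reference to a scalar gives an in-memory filehandle.
    open my $fh, '<', \$source or die "Cannot open in-memory file: $!";

    while (<$fh>) {
        chomp;
        my ($k, $v) = split / => /;
        push @{$trio{$k}}, $v;    # collect every value seen for this key
    }

    close $fh;
}

# Hash-slice delete: remove every key that collected fewer than 3 values.
delete @trio{grep @{$trio{$_}} < 3, keys %trio};

for my $k (sort keys %trio) {
    print "$k => [", join(', ', @{$trio{$k}}), "]\n";
}
```

With this made-up input, k2 only collects two values and is removed; k1 and k3 remain. Note that the count is per occurrence, not per file: if a key could appear more than once in the same file, counting array elements would overstate how many files it came from, and you'd need to track the source file as well.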
— Ken