Hi All,
It's been a decade or so but my love of Perl continues.
The background of the Why? for this question can be found at https://fixcovid19.com/about.html which I recommend you read, particularly if you take any medication. It might just save your life.
The data collection tool to which it applies can be found at https://fixcovid19.com. We are gathering this data because the US, Chinese and Canadian CDC's are not and when the Italians partially gathered this medication data it showed 73% of all COVID-19 deaths occurred in the ~3% of people taking two specific classes of medication. The Turkish data release a few days later found similarly - 68.8% of deaths occurred in this really small group.
The problem to be solved is the safe, deidentified release of IP addresses and Browser strings to fulfil the requirements of HIPAA, GDPR and CCPA. We simply do not gather any other PII (personally identifiable information) so it is impossible for this to leak. Age range, sex, disease severity and outcome and medications are the other data points.
These 2 identifiers will assist researchers in assessing if the crowdsourced data we are gathering is "gamed" or "believable". We have taken steps to make automated submission difficult, but as we hackers know, virtually nothing is impossible if you really put your mind to it...
So the task to hand is to convert an IP and Browser String into a cryptographically secure hash that can not be reversed or revealed with a rainbow table. IP addresses and Browser strings both exist in a small finite search space.
Given this data will be released publicly and is timestamped it is trivial for an attacker to correlate a known IP address and Browser string to these hashes. Given the secret packing data I don't believe this would allow any quantity of computational power to elucidate the packing data and thus create a lookup table, but if that is incorrect I would be good to know now and fix it before our impending first data release.
While I am expert in the field of medicine my knowledge of crypto and how you attack it is less. SHA3 was chosen for its resistance to length extension attacks and the packing data size to give enough random data for the resultant hash to spread evenly across that space. Maybe that's good enough, maybe it can/should/must be done better.
Here is a draft version of those hashing functions. Expert commentary appreciated, particularly from cryptographers.
#!/usr/bin/env perl package SHA3; use strict; use warnings; use Digest::SHA3 qw(sha3_256_hex); use Digest::MD5 qw(md5_hex); use Socket qw( inet_pton AF_INET AF_INET6 ); my @packing = qw( fa13a941b76466850c2558d9ae5d969f e71ab0d8bb54c75b37ad23a449050121 6736564ec6bc9bbc8ba42df565317443 c3e088a5cf247ec0df971c5cb9ee6eec 6cf20d548878cdd82b8f207192f58c80 660a311b8d75d5fb28c73f7e2ec5d25e 377f92899b81ad7c5e1d08b81ccc8904 8e1f27dee8ae3374ae5c462adf37bba5 ccd558ff6b9de48ca22023ead2dbd7a2 ff228ef28ae8544155323180ba070d1b ); print SHA3::sha3_ip('1.2.3.4'), "\n"; print SHA3::sha3_ip('1.2.3.5'), "\n"; print SHA3::sha3_ip('2001:0db8:0000:0000:0000:8a2e:0370:7334'), "\n"; print SHA3::sha3_ip('2001:0db8:0000:0000:0000:8a2e:0370:7335'), "\n"; print SHA3::sha3_bs('Mozilla'), "\n"; print SHA3::sha3_bs('Win32'), "\n"; =head 2 sha3_ip { Expects a dot quad or an IPv6 address and returns a SHA3_256_hex string or null string for invalid input =cut sub sha3_ip { my $ip = shift; my $pack_format; if ( $ip =~ m/^\d+\.\d+\.\d+\.\d+$/ ) { my $bytes = pack("H32 a4 H32", $packing[0], inet_pton( AF_INET +, $ip ), $packing[7]); my $hash = sha3_256_hex($bytes); return $hash; } elsif ( $ip =~ /^([0-9a-f]{0,4}:){0,7}([0-9a-f]{0,4})$/i ) { my $bytes = pack("H32 a16 H32", $packing[0], inet_pton( AF_INE +T6, $ip ), $packing[7]); my $hash = sha3_256_hex($bytes); return $hash; } warn "Invalid IP:$ip\n"; return ''; } =head 2 sha3_bs { Expects a browser string and returns a SHA3_256_hex string or a null string for invalid input =cut sub sha3_bs { my $bs = shift; unless (length $bs > 4 ) { warn "Insufficient data in browser string $bs"; return ''; } my $bytes = pack("H32 H32 H32", $packing[1], md5_hex($bs), $packin +g[8]); my $hash = sha3_256_hex($bytes); return $hash; }
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |