brand new to hashing...i'm trying to make the script search for a partial email address
Hashing (as in Perl's hashes) provides fast lookup by exact key match only, and is entirely the wrong mechanism for partially matching anything.
If your keys are -- like your example "smith" -- actually *always* whole words, then you could index your data by whole surnames:
[0] Perl> @x = split( '[, ]', $_ ), push @{ $bySurname{ $x[0] }{ $x[1] } }, [ @x[ 2, 3 ] ] for split /\n\s/, <<'END'
Smith,John (248)-555-9430 jsmith@aol.com
 Hunter,Apryl (810)-555-3029 april@showers.org
 Stewart,Pat (405)-555-8710 pats@starfleet.co.uk
 Ching,Iris (305)-555-0919 iching@zen.org
 Doe,John (212)-555-0912 jdoe@morgue.com
 Jones,Tom (312)-555-3321 tj2342@aol.com
 Smith,John (607)-555-0023 smith@pocahontas.com
 Crosby,Dave (405)-555-1516 cros@csny.org
 Johns,Pam (313)-555-6790 pj@sleepy.com
 Jeter,Linda (810)-555-8761 netless@earthlink.net
 Garland,Judy (305)-555-1231 ozgal@rainbow.com
END
;;

[0] Perl> pp %bySurname;;
(
  "Jeter",
  { Linda => [["(810)-555-8761", "netless\@earthlink.net"]] },
  "Ching",
  { Iris => [["(305)-555-0919", "iching\@zen.org"]] },
  "Smith",
  {
    John => [
      ["(248)-555-9430", "jsmith\@aol.com"],
      ["(607)-555-0023", "smith\@pocahontas.com"],
    ],
  },
  "Crosby",
  { Dave => [["(405)-555-1516", "cros\@csny.org"]] },
  "Jones",
  { Tom => [["(312)-555-3321", "tj2342\@aol.com"]] },
  "Doe",
  { John => [["(212)-555-0912", "jdoe\@morgue.com"]] },
  "Johns",
  { Pam => [["(313)-555-6790", "pj\@sleepy.com"]] },
  "Hunter",
  { Apryl => [["(810)-555-3029", "april\@showers.org"]] },
  "Garland",
  { Judy => [["(305)-555-1231", "ozgal\@rainbow.com\n"]] },
  "Stewart",
  { Pat => [["(405)-555-8710", "pats\@starfleet.co.uk"]] },
)
Which would allow you to find all those with "smith" (provided you lc the keys, which I didn't above), but won't let you find those with "jo*" in the name.
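As a rough sketch of the lowercased-key variant mentioned above (the three sample records and the variable names are only illustrative), the index could be built and queried like this:

```perl
use strict;
use warnings;

# Illustrative records in the same "Surname,First phone email" layout.
my @lines = (
    'Smith,John (248)-555-9430 jsmith@aol.com',
    'Doe,John (212)-555-0912 jdoe@morgue.com',
    'Smith,John (607)-555-0023 smith@pocahontas.com',
);

my %bySurname;
for my $line ( @lines ) {
    my( $surname, $first, $phone, $email ) = split /[, ]/, $line;
    # lc the key so that lookups are case-insensitive.
    push @{ $bySurname{ lc $surname }{ $first } }, [ $phone, $email ];
}

# Exact, whole-key lookup: fast, but "smi" or "jo*" would find nothing.
my @smiths = map { @$_ } values %{ $bySurname{ lc 'SMITH' } || {} };
print "$_->[0] $_->[1]\n" for @smiths;
```

The lookup succeeds only because the whole surname is supplied; a fragment such as "smi" is simply a different key that does not exist.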
For small numbers of lines -- a few thousand or so -- I'd keep them in a single string and use a simple text search.
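One way the single-string approach might look, assuming the records are newline-separated (find_lines and the sample data are made up for illustration):

```perl
use strict;
use warnings;

# Illustrative data: the whole file held as one newline-separated string.
my $data = join '', map "$_\n",
    'Smith,John (248)-555-9430 jsmith@aol.com',
    'Jones,Tom (312)-555-3321 tj2342@aol.com',
    'Johns,Pam (313)-555-6790 pj@sleepy.com';

# Return every whole line that contains $fragment, ignoring case.
sub find_lines {
    my( $text, $fragment ) = @_;
    # \Q..\E quotes regex metacharacters in the fragment; /m lets ^ and $
    # match at line boundaries so each hit captures one complete record.
    return $text =~ /^(.*\Q$fragment\E.*)$/mig;
}

print "$_\n" for find_lines( $data, '@aol' );
```

This scans the whole string on every search, which is why it only suits modest amounts of data.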
For a fully wild-card search of many more lines than that, I'd probably build an index keyed on every run of 2 or 3 consecutive characters.
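A minimal sketch of such an index, assuming 3-character runs and fragments of at least 3 characters (the search sub, the sample records and the in-memory layout are assumptions, not a tuned implementation):

```perl
use strict;
use warnings;

# Illustrative records; the index maps each 3-character run to the
# positions of the records that contain it.
my @records = (
    'Smith,John (248)-555-9430 jsmith@aol.com',
    'Jones,Tom (312)-555-3321 tj2342@aol.com',
    'Johns,Pam (313)-555-6790 pj@sleepy.com',
);

my %trigram;
for my $i ( 0 .. $#records ) {
    my $rec = lc $records[ $i ];
    # Hash keys de-duplicate runs that repeat within one record.
    $trigram{ substr( $rec, $_, 3 ) }{ $i } = 1 for 0 .. length( $rec ) - 3;
}

# Find the records containing $fragment (3 or more characters).
sub search {
    my $fragment = lc shift;
    return if length( $fragment ) < 3;
    my @grams = map { substr( $fragment, $_, 3 ) } 0 .. length( $fragment ) - 3;
    # Candidates must contain every 3-character run of the fragment...
    my %count;
    $count{ $_ }++ for map { keys %{ $trigram{ $_ } || {} } } @grams;
    # ...but the runs might not be adjacent, so confirm with index().
    return map  { $records[ $_ ] }
           grep { $count{ $_ } == @grams
                  && index( lc( $records[ $_ ] ), $fragment ) >= 0 }
           sort { $a <=> $b } keys %count;
}

print "$_\n" for search( 'aol.com' );
```

The hash lookups narrow the search to a few candidate records, and the final index() call weeds out records that merely contain the same runs in a different arrangement.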
In reply to Re: The Art of Hashing
by BrowserUk
in thread The Art of Hashing
by Anonymous Monk