newbie hasher

prodevel has asked for the wisdom of the Perl Monks concerning the following question:

Re: newbie hasher by Zaxo (Archbishop) on Nov 20, 2003 at 05:18 UTC
Probably the easiest way is to split each line and explicitly add the pair to the hash:

```perl
my %hosts;
{
    local $_;
    open my $fh, '<', '/path/to/data.file' or die $!;
    while (<$fh>) {
        my @pair = split;
        $hosts{ $pair[0] } = $pair[1];
    }
    close $fh or die $!;
}
```

After Compline,
Zaxo
Re: Re: newbie hasher by ysth (Canon) on Nov 20, 2003 at 06:08 UTC
Though I'm responding to Zaxo, my comment applies to most of the answers I see. With code doing a split, check that `$pair[1]` (aka `$ip`) is actually getting set to a defined value. Otherwise, your hash value may be undef on bad input, but you won't get a warning about it until you actually try to use the hash value later on. Best to validate as you go:

```perl
while (<$fh>) {
    chomp;
    my ($host, $ip) = split;
    warn("bad hosts line: $_"), next if !defined $ip;
    $hosts{$host} = $ip;
}
```

Note that if you check defined($ip) the chomp is necessary, since otherwise $ip is set to the empty string (taken from the empty string following the newline character). If you use a regex to split up the line instead, make sure it actually requires both fields and check whether the match succeeds (which gets a little funky if you are using map):

```perl
%hosts = map {
    if (/^(\w+)\s(.+)/) { ($1 => $2) }
    else { (warn "bad hosts line: $_")[1..0]; }
} <FILE>;
```
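Whether the chomp matters seems to depend on which form of split is in use. The following is only a minimal sketch, assuming a hypothetical bad line that holds a host name and nothing else; the variable names are made up for illustration. It compares a bare-style split with the `split /\s+/, $_, 2` form used in several replies.

```perl
use strict;
use warnings;

# Hypothetical bad input: a host name with no IP, newline still attached.
my $line = "host1\n";

# Awk-style split on ' ': trailing empty fields are dropped, so $ip_a is undef.
my ($host_a, $ip_a) = split ' ', $line;

# Pattern split with a LIMIT of 2: the empty field after the newline is kept,
# so $ip_b is the empty string and a defined() check would not catch it.
my ($host_b, $ip_b) = split /\s+/, $line, 2;

# Same LIMIT-2 split after chomp: nothing follows the host name, so $ip_c is undef.
chomp(my $chomped = $line);
my ($host_c, $ip_c) = split /\s+/, $chomped, 2;

for my $case (
    [ 'awk-style split, no chomp', $ip_a ],
    [ 'LIMIT 2, no chomp',         $ip_b ],
    [ 'LIMIT 2, after chomp',      $ip_c ],
) {
    my ($label, $ip) = @$case;
    printf "%-26s ip is %s\n", $label, defined $ip ? "'$ip'" : 'undef';
}
```

In this sketch only the middle case yields a defined but empty `$ip`, which is the situation where a `defined` test alone would let a bad line through.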
Re: newbie hasher (No loops!) by BrowserUk (Patriarch) on Nov 20, 2003 at 05:49 UTC
Look Ma. No loops :)

```perl
#! perl -slw
use strict;
use Data::Dumper;

my %h = split ' ', do{ local $/; <DATA> };
print Dumper \%h;

__DATA__
host1 1.1.1.1
host2 2.2.2.2
host3 3.3.3.3
host4 4.4.4.4
```

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Re: newbie hasher by Anonymous Monk on Nov 20, 2003 at 05:19 UTC
If your lines are simply host and IP separated by a space, then it's pretty simple:

```perl
open my $fh, "<input.dat" or die "Couldn't open file: $!";
my %hosts;
while (<$fh>) {
    chomp;
    my ($host, $ip) = split;
    $hosts{$host} = $ip;
}

use Data::Dumper;
print Dumper \%hosts;
```
Re: newbie hasher by Roger (Parson) on Nov 20, 2003 at 05:32 UTC
Or you could write a one-liner to transform the input file into a hash.

Method 1 - (map with 'short-circuit', now handles empty lines)

```perl
use strict;
use Data::Dumper;

# Uh, well spotted, typo fixed with the extra +
# which would have no effect anyway. :-)
# ysth suggested that putting () in map short-circuits
# empty lines. It worked. Thanks. :-)
# Wow, even better, dropped that testing bit.
#my %hostlist = map { /^(\w+)\s(.*)/?($1,$2):() } (<DATA>);
my %hostlist = map { /^(\w+)\s(.*)/ } (<DATA>);
print Dumper(\%hostlist);

__DATA__
host1 1.1.1.1
host2 1.2.3.5
```

Method 2 - Better approach

```perl
use strict;
use Data::Dumper;

my %hostlist;
{
    local $/;
    %hostlist = <DATA> =~ /^(\w+)\s(.*)/gm;
}
print Dumper(\%hostlist);

__DATA__
host1 1.1.1.1
host2 1.2.3.5
```

Updated: added the 3rd method after seeing jonadab's suggestion. Here's my solution for capturing a real hosts file.

Method 3 - Capture from the /etc/hosts file

```perl
use strict;
use Data::Dumper;

my %hosts;
while (<DATA>) {
    next if /^\s*(?:#|$)/;     # ignore comments and empty lines
    /^([^\s#]+)\s+(.*)/;       # capture ip address and names
    $hosts{$_} = $1 foreach split /\s+/, $2;
}
print Dumper(\%hosts);

__DATA__
# IP Masq gateway:
192.168.0.80 pedestrian
# Primary desktop:
192.168.0.82 raptor1
# Family PC upstairs:
192.168.0.84 trex tyrannosaur family
# Domain servers:
205.212.123.10 dns1 brutus
208.140.2.15 dns2
156.63.130.100 dns3 cherokee
```
Re: Re: newbie hasher by prodevel (Scribe) on Nov 20, 2003 at 05:45 UTC
By no means am I discounting the other methods, but I really like this one-liner due to my haphazard knowledge of regex: `%hostlist = <DATA> =~ /^(\w+)+\s(.*)/gm;` This is exactly what I was looking for, even though I was nowhere near explicit about what I wanted. I'm already applying this method to a few more scripts. The map was cool too. I am somewhat curious as to the extensive use of Data::Dumper instead of print-ing %hash? Thanks all!
Re: Re: Re: newbie hasher by ysth (Canon) on Nov 20, 2003 at 06:17 UTC
"I am somewhat curious as to the extensive use of Data::Dumper instead of print-ing %hash?"

It just shows the structure better, and does a better job of showing up any undefs, tabs, newlines, wide characters, unprintable characters, etc. that sneak into your data.
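As a rough illustration of that point, here is a minimal sketch with made-up data: one value that kept its trailing newline (as if a line had been read without chomp) and one that is undef. The `Useqq`, `Sortkeys` and `Indent` settings are optional Data::Dumper configuration variables chosen here for readable output, not something the replies above used.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical hash contents: host1 kept its newline, host2 never got an IP.
my %hosts = (
    host1 => "1.1.1.1\n",
    host2 => undef,
);

local $Data::Dumper::Useqq    = 1;  # print control characters as escapes such as \n
local $Data::Dumper::Sortkeys = 1;  # keep key order stable between runs
local $Data::Dumper::Indent   = 1;  # compact indentation

my $dump = Dumper(\%hosts);
print $dump;
```

A plain `print %hash` would run the keys and values together and warn about the undef, whereas the dump shows each key with its value, so the stray newline and the missing value are visible.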
Re^2: newbie hasher by Coruscate (Sexton) on Nov 20, 2003 at 05:43 UTC
~~Look at your first regex again. `/^(\w+)+/` looks quite funny :)~~ aww he decided to remove the double '+'ing :( Anyhow... just another way of writing the regex:

```perl
my %hotlist = map { m#\A(\S+)\s+(.*)\Z#; $1 => $2 } <DATA>;
```
Re: Re: newbie hasher by davis (Vicar) on Nov 20, 2003 at 09:59 UTC
```perl
my %hostlist = map { /^(\w+)\s(.*)/; $1 => $2 } (<DATA>);
```

should be....

```perl
my %hostlist = map { /^(\w+)\s+(.*)/; $1 => $2 } (<DATA>);
```

(extra + after the whitespace metachar)

davis
It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.
Re: newbie hasher by davido (Cardinal) on Nov 20, 2003 at 05:35 UTC
Here's another way:

```perl
use strict;
use warnings;

my %hash = map { chomp; split /\s+/, $_, 2 } <DATA>;

print "$_, $hash{$_}\n" foreach keys %hash;

__DATA__
host1 1.1.1.1
host2 1.2.3.5
```

Fun huh? ;)

Dave
"If I had my life to live over again, I'd be a plumber." -- Albert Einstein
Re: newbie hasher by etcshadow (Priest) on Nov 20, 2003 at 05:24 UTC
```perl
# assuming you've already opened your file...
while (my $line = <FILE>) {
    chomp $line;
    my ($host,$ip) = split(/\s+/,$line,2);
    $hosts{$host} = $ip;
}
```

------------
:Wq
Not an editor command: Wq
Re: Re: newbie hasher by bart (Canon) on Nov 20, 2003 at 21:53 UTC
Close to perfect, IMO, except that I'd add a test for empty lines, like this:

```perl
my ($host,$ip) = split(/\s+/,$line,2) or next;
```
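The `or next` works because a list assignment used in a boolean context yields the number of elements on its right-hand side, and splitting an empty string yields no fields at all. A minimal sketch of that behaviour, assuming three made-up input lines held in an array rather than read from a file:

```perl
use strict;
use warnings;

# Hypothetical input; the middle line is empty apart from its newline.
my @lines = ("host1 1.1.1.1\n", "\n", "host2 2.2.2.2\n");

my (%hosts, @skipped);
for my $i (0 .. $#lines) {
    my $line = $lines[$i];
    chomp $line;

    # For an empty string split returns an empty list, so the assignment
    # counts 0 elements and the "or" branch records the line and skips it.
    my ($host, $ip) = split(/\s+/, $line, 2)
        or do { push @skipped, $i + 1; next };

    $hosts{$host} = $ip;
}

print "kept: ", join(', ', map { $_ . '=' . $hosts{$_} } sort keys %hosts), "\n";
print "skipped line(s): @skipped\n";
```

Note that this only catches lines that are truly empty after the chomp. A line holding nothing but spaces still produces fields, so it would need a separate check.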
Re: newbie hasher by DrHyde (Prior) on Nov 20, 2003 at 09:06 UTC
While I don't doubt that your file looks like that, it's not a regular hosts file. In /etc/hosts, each line has an IP followed by a name (and optionally more names), whereas you have a name followed by an IP. /etc/hosts can also have comments in it. To deal with a traditional /etc/hosts file ...

```perl
open(HOSTS, '/etc/hosts') || die("Out of cucumber error\n");
my %hosts;
while(<HOSTS>) {
    s/#.*//;
    next unless /(\S+)\s+(.*)/;
    my $host = $1;
    push @{$hosts{$host}}, split(/\s+/, $2);
}
use Data::Dumper;
print Dumper(\%hosts);
```
Re: newbie hasher by Coruscate (Sexton) on Nov 20, 2003 at 05:37 UTC
Since nobody else has used map() yet:

```perl
open my $hosts, '<', 'hosts.txt' or die "open failed: $!";
my %hosts = map { chomp; split /\s+/ } <$hosts>;
close $hosts or die "close failed: $!";
```

Update: Dope! Two people got maps in before me (somehow) lol
Re: newbie hasher by jonadab (Parson) on Nov 20, 2003 at 12:12 UTC
```perl
my %hosts = map{ split /\s+/, $_, 2 }<DATA>; # Or substitute your favourite filehandle here.
__DATA__
host1 1.1.1.1
host2 1.2.3.5
```

Note, however, that the format you give is not the standard format for hosts files. The usual format is more like...

```
# IP Masq gateway:
192.168.0.80 pedestrian
# Primary desktop:
192.168.0.82 raptor1
# Family PC upstairs:
192.168.0.84 trex tyrannosaur family
# Domain servers:
205.212.123.10 dns1 brutus
208.140.2.15 dns2
156.63.130.100 dns3 cherokee
```

This is easy enough to read too...

```perl
open HOSTS, "</etc/hosts" or die "Couldn't open hosts file: $!"; # Or "<C:\\WINDOWS\\hosts";
my %hosts = map{
    my ($ip, @hn);               # one IP, any number of host names
    if (not /^\s*#/) {           # skip comment lines
        chomp;
        s/\s*#.*$//;             # strip trailing comments
        ($ip, @hn) = split /\s+/, $_;
    }
    map { $_ => $ip } @hn;       # each name maps to the line's IP
}<HOSTS>;
```

`$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/`
Re: newbie hasher by Art_XIV (Hermit) on Nov 20, 2003 at 13:58 UTC
If you expect your input to be 'dirty', then regular expressions might serve you better:

```perl
use strict;
use Data::Dumper;

my %hosts;

while (<DATA>) {
    chomp;
    next if /^#/;
    # Test the match itself: $1 and $2 are not reset when a match fails.
    if (/^(\w+)\s(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) {
        $hosts{$1} = $2;
    }
    else {
        warn "Garbage at line $. of DATA: $_\n";
    }
}

print Dumper(\%hosts);

1;

__DATA__
host1 1.13.10.116 #comment
host2 21.90.30.31 #blah blah blah
host3 56.87.1.10
host4 13.16.17
#The following hosts are on network B
host5 57.98.18.10 #yadda yadda yadda
host6 106.10.3.12
```

Splits would be more efficient if you expect the input to be clean, though.

Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"

