promoting array to a hash

sleepingsquirrel has asked for the wisdom of the Perl Monks concerning the following question:

Re: promoting array to a hash by Zaxo (Archbishop) on Jun 13, 2004 at 05:05 UTC
I don't see anything wrong with what you have, but you may be looking for a hash slice. Leaving out the grep selection stuff: `my %words; while (<>) { @words{ split } = (); }` I'm not sure what you mean by sorted here; hashes don't support any stable order. After Compline, Zaxo
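The hash-slice idea above can be sketched on in-memory data rather than on `<>`. This is only an illustration: `unique_words` and the sample lines are hypothetical names invented here, not part of Zaxo's post.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the distinct whitespace-separated words
# from a list of lines using a hash slice.
sub unique_words {
    my @lines = @_;
    my %words;
    for my $line (@lines) {
        # Assigning the empty list to a slice creates one key per word,
        # each with an undef value; repeated words reuse the same key.
        @words{ split ' ', $line } = ();
    }
    # Hash order is arbitrary, so sort explicitly if order matters.
    return sort keys %words;
}

my @sample = ("the quick brown fox\n", "jumps over the lazy dog\n");
print join(' ', unique_words(@sample)), "\n";
```

The explicit `sort` is what gives a stable order; the hash only provides uniqueness.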
Re^2: promoting array to a hash by sleepingsquirrel (Chaplain) on Jun 13, 2004 at 05:29 UTC
Yeah, there's nothing wrong with my code above; it's just that I was wondering how to get rid of the unnecessary temporary variable `%words`. For example, the following snippet... `@a = keys (a=>1,b=>2,c=>3);` ...produces the following error... `Type of arg 1 to keys must be hash (not list), blah, blah, blah` ...but I'm willing to bet that there is some syntax to fix the problem. This doesn't work: `@a = keys %{(a=>1,b=>2,c=>3)};`
Re^3: promoting array to a hash by Zaxo (Archbishop) on Jun 13, 2004 at 05:41 UTC
Oh, OK, you almost have it: `@a = sort keys %{{a=>1,b=>2,c=>3}};` or, in terms of your original problem, `@a = sort keys %{{ map {$_ => undef} map {split} <> }};` Notice the replacement of parens with curlies. That turns the hashlike list into a reference to an anonymous hash holding its contents, and the outer `%{}` dereferences it. I agree with your desire to avoid temporary variables; I try to do that, too, in Perl. After Compline, Zaxo
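A small sketch of the anonymous-hash trick described above, run on literal data instead of `<>`. The variable names and sample words are made up for the illustration.

```perl
use strict;
use warnings;

# The inner { ... } builds an anonymous hash and yields a reference to it;
# the outer %{ ... } dereferences that reference, so keys() sees a real hash.
my @keys = sort keys %{{ a => 1, b => 2, c => 3 }};
print "@keys\n";

# The same shape de-duplicates a word list without a named temporary hash.
my @words  = qw(pear apple pear fig apple);
my @unique = sort keys %{{ map { $_ => undef } @words }};
print "@unique\n";
```

The temporary hash still exists at run time; it is merely anonymous, so the gain is in brevity rather than in memory.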
Re^4: promoting array to a hash by sleepingsquirrel (Chaplain) on Jun 13, 2004 at 06:07 UTC
Re: promoting array to a hash by dragonchild (Archbishop) on Jun 13, 2004 at 05:08 UTC
`sub unique { my %x; @x{@_} = @_; values %x } my @sorted_unique = sort { $a cmp $b } unique(split ' ', do { local $/; <> });` In other words, use the hash. ------ We are the carpenters and bricklayers of the Information Age. Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose I shouldn't have to say this, but any code, unless otherwise stated, is untested
Re^2: promoting array to a hash by BrowserUk (Patriarch) on Jun 13, 2004 at 05:24 UTC
A slightly quicker version of dragonchild's `unique()` sub: `sub uniq2{ my %x; @x{ @_ } = (); keys %x }` Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail
Re^3: promoting array to a hash by saskaqueer (Friar) on Jun 13, 2004 at 08:02 UTC
Very important update: Please see Re^4: promoting array to a hash by BrowserUk for why the following code is horrifically wrong. Of course, that said, it requires one very simple `s!my!our!` to correct.

I know that the map vs. slice benchmark has been done before, but just to do it again as a reminder :)

    #!perl -w
    use strict;
    use Benchmark ':all';

    my @unsorted = map {
        join('', map { ('a'..'z','A'..'Z',0..9)[rand 62] } 1..50)
    } 1..5000;

    sub uniq_dragonchild { my %x; @x{@_} = @_; values %x }
    sub uniq_BrowserUk   { my %x; @x{@_} = (); keys %x }
    sub uniq_Zaxo        { keys %{ { map { $_ => undef } @_ } } }

    cmpthese( timethese(-60, {
        uniq_dragonchild => 'my @x = uniq_dragonchild(@unsorted)',
        uniq_BrowserUk   => 'my @x = uniq_BrowserUk(@unsorted)',
        uniq_Zaxo        => 'my @x = uniq_Zaxo(@unsorted)'
    } ) );

    __END__
    C:\>uniq.pl
    Benchmark: running uniq_BrowserUk, uniq_Zaxo, uniq_dragonchild for at least 60 CPU seconds...
    uniq_BrowserUk: 64 wallclock secs (63.19 usr +  0.02 sys = 63.20 CPU) @ 421025.41/s (n=26610069)
     uniq_Zaxo: 59 wallclock secs (60.08 usr +  0.03 sys = 60.11 CPU) @ 186939.31/s (n=11237109)
    uniq_dragonchild: 64 wallclock secs (63.05 usr +  0.02 sys = 63.06 CPU) @ 399674.64/s (n=25204682)
                         Rate        uniq_Zaxo uniq_dragonchild   uniq_BrowserUk
    uniq_Zaxo        186939/s               --             -53%             -56%
    uniq_dragonchild 399675/s             114%               --              -5%
    uniq_BrowserUk   421025/s             125%               5%               --
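For reference, here is a sketch of the benchmark with the `s!my!our!` correction mentioned in the update applied. As far as I understand it, Benchmark compiles code strings outside the calling file's lexical scope, so a `my` array is invisible to them and each snippet would be timed against an empty package variable. The `-1` duration and the sanity check are assumptions added here to keep the run short; they are not from the original post.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# 'our' makes @unsorted a package variable, which the string snippets
# below can resolve when Benchmark evals them in this package.
our @unsorted = map {
    join '', map { ('a'..'z', 'A'..'Z', 0..9)[rand 62] } 1 .. 50
} 1 .. 5000;

sub uniq_dragonchild { my %x; @x{@_} = @_; values %x }
sub uniq_BrowserUk   { my %x; @x{@_} = (); keys %x }
sub uniq_Zaxo        { keys %{{ map { $_ => undef } @_ }} }

# Sanity check: the subs should now receive all 5000 strings.
my $count = () = uniq_BrowserUk(@unsorted);
print "unique strings seen: $count\n";

# A negative count means "run each snippet for at least that many CPU seconds".
cmpthese(-1, {
    uniq_dragonchild => 'my @x = uniq_dragonchild(@unsorted)',
    uniq_BrowserUk   => 'my @x = uniq_BrowserUk(@unsorted)',
    uniq_Zaxo        => 'my @x = uniq_Zaxo(@unsorted)',
});
```

Passing code references instead of strings (for example `sub { my @x = uniq_Zaxo(@unsorted) }`) is another way to keep a lexical array visible, since closures capture it.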
Re^4: promoting array to a hash by Roy Johnson (Monsignor) on Jun 13, 2004 at 13:58 UTC
Re^5: promoting array to a hash by dragonchild (Archbishop) on Jun 14, 2004 at 01:35 UTC
Re^5: promoting array to a hash by BrowserUk (Patriarch) on Jun 14, 2004 at 12:02 UTC
Re^4: promoting array to a hash by BrowserUk (Patriarch) on Jun 14, 2004 at 11:59 UTC
Re^5: promoting array to a hash by saskaqueer (Friar) on Jun 15, 2004 at 05:30 UTC
Re^3: promoting array to a hash by dragonchild (Archbishop) on Jun 14, 2004 at 01:30 UTC
It will be slightly quicker. However, it is of less use: my version will DWIM with references while yours won't.
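The point about references can be shown with a short sketch. Hash keys are always strings, so a reference used as a key is stringified, while the hash's values keep the reference intact. The sub names `uniq_values` and `uniq_keys` are labels chosen here for the two approaches from this thread.

```perl
use strict;
use warnings;

sub uniq_values { my %x; @x{@_} = @_; values %x }   # dragonchild's approach
sub uniq_keys   { my %x; @x{@_} = ();  keys   %x }  # BrowserUk's approach

my $ref = [1, 2, 3];

# Pass the same reference twice; both subs reduce it to one element.
my ($from_values) = uniq_values($ref, $ref);
my ($from_keys)   = uniq_keys($ref, $ref);

# values() hands back the original array reference ...
print "values: ", (ref $from_values ? "still a reference" : "plain string"), "\n";
# ... while keys() hands back its stringified form, e.g. "ARRAY(0x...)".
print "keys:   ", (ref $from_keys ? "still a reference" : "plain string"), "\n";
```

For plain strings the two subs return the same set, so the difference only matters when the input list can contain references or objects.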
Re^4: promoting array to a hash by BrowserUk (Patriarch) on Jun 14, 2004 at 06:03 UTC
Re^5: promoting array to a hash by dragonchild (Archbishop) on Jun 14, 2004 at 11:25 UTC
Some notes below your chosen depth have not been shown here
Re: promoting array to a hash by ambrus (Abbot) on Jun 13, 2004 at 10:22 UTC
Now Zaxo has solved your original problem, but let me raise a different question about your code. Do you realize that `grep /^[a-z]+$/, (split /\s/, join(" ",<>))` will return only those words that appear without punctuation in the text? For example, if you input "hello, world" to that program, it will output only world, as split splits it into "hello," and "world" but `/^[a-z]+$/` does not match the first one. If that's what you want, OK. If you want to match words adjacent to punctuation too, you should do something like `grep /^[a-z]+$/, (join(" ",<>)=~/(\w+)/g)` instead of the above grep. This makes the code look like this: `print "$_\n" for sort keys %{{ map {$_, 1} grep /^[a-z]+$/, (join(" ",<>)=~/(\w+)/g) }};` or, more simply, `print "$_\n" for sort keys %{{ map {$_, 1} join(" ",<>)=~/\b[a-z]+\b/g }};`

Also, instead of eliminating the temp hash, one could use a temp hash but eliminate map, which is IMO more elegant. (Update: I now see this has been brought up before.) `my %hash; $hash{$_}++ for join(" ",<>)=~/\b[a-z]+\b/g; print "$_\n" for sort keys %hash;`
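The "hello, world" case above can be checked directly. This sketch runs both extraction strategies on a literal string instead of `<>`; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $text = "hello, world";

# Splitting on whitespace leaves the comma attached to "hello,",
# so the anchored pattern rejects that token.
my @by_split = grep { /^[a-z]+$/ } split /\s/, $text;

# Extracting runs of word characters first drops the punctuation,
# so both words survive the same filter.
my @by_match = grep { /^[a-z]+$/ } $text =~ /(\w+)/g;

print "split: @by_split\n";
print "match: @by_match\n";
```

Which behaviour is right depends on whether a token such as "hello," should count as the word "hello" or be discarded.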
Re: promoting array to a hash by hsinclai (Deacon) on Jun 13, 2004 at 05:07 UTC
With an even number of elements, I thought you could just assign it:

    use strict;
    my @friends = ("noc", "john", "brightland", "christine", "marsh", "brandon");

    # create hash from array
    my %friends = @friends;

    foreach my $entry (keys %friends) {
        print "Company $entry has buddy $friends{$entry}\n";
    }

    __OUTPUT__
    Company brightland has buddy christine
    Company marsh has buddy brandon
    Company noc has buddy john

IIRC, for an odd number of array elements, the last key in the hash is assigned an undefined value.
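A quick sketch of the odd-length case mentioned above. The three-element list is invented for the demonstration, and the warning is silenced locally only to keep the output clean.

```perl
use strict;
use warnings;

my @pairs = ("noc", "john", "marsh");   # three elements: "marsh" has no partner

my %friends;
{
    # With warnings enabled, Perl reports "Odd number of elements in hash
    # assignment" here; switch that category off for this one statement.
    no warnings 'misc';
    %friends = @pairs;
}

my $exists  = exists $friends{marsh}  ? "yes" : "no";
my $defined = defined $friends{marsh} ? "yes" : "no";

print "marsh exists:  $exists\n";
print "marsh defined: $defined\n";
```

So the unpaired element becomes a key whose value is `undef`, not an empty string.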
Re^2: promoting array to a hash by davido (Cardinal) on Jun 13, 2004 at 05:33 UTC
What does your answer have to do with the question? He's asking about using the keys of a hash to generate a list of words with duplicates filtered out. Hashes are good for this. You're talking about assigning array elements to hash key/value pairs. Hashes are good for that too, but those are two different, mostly unrelated subjects. Dave
Re^3: promoting array to a hash by hsinclai (Deacon) on Jun 13, 2004 at 14:22 UTC
I totally missed the point, sorry for posting that.
Re: promoting array to a hash by Jasper (Chaplain) on Jun 14, 2004 at 12:38 UTC
If all you are doing is printing a list of unique words from stdin, why not save a lot of wasted code and do: `print "$_\n" for sort <> =~ /\b(\S+)\b(?!.*\b\1\b)/g` That is, use a negative lookahead to check that the word doesn't appear again. Saves you joining, splitting, grepping, and mapping :). I have not benchmarked it, though.
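To see what the negative lookahead does, here is a sketch on a literal string. It uses `[a-z]+` and the `/s` flag, as in the variant timed later in the thread, and splits the match and the sort into two statements for readability; the sample sentence is made up.

```perl
use strict;
use warnings;

my $text = "the cat and the hat and the bat";

# A word matches only if no later copy of it follows, so each distinct
# word is captured once, at its final occurrence. /s lets . cross newlines.
my @last_occurrences = $text =~ /\b([a-z]+)\b(?!.*\b\1\b)/sg;

my @unique = sort @last_occurrences;
print "@unique\n";
```

For every candidate word the lookahead may scan the rest of the input, which is why the cost grows much faster than with a hash.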
Re^2: promoting array to a hash by sleepingsquirrel (Chaplain) on Jun 14, 2004 at 17:10 UTC
Benchmarking is worthwhile in this instance. The regex backtracking turns an N log N problem (assuming the sort dominates) into an N^2 problem. Here's the result of applying the two algorithms to the Net-HOWTO (which is 100 times smaller than the data set I initially used).

    greg@spark:~/test$ cat sleepingsquirrel
    #!/usr/bin/perl
    print "$_\n" for sort keys %{{map {$_,()} grep /^[a-z]+$/, (split /\s/, join(" ",<>))}};
    greg@spark:~/test$ time sleepingsquirrel Net-HOWTO >words.txt

    real    0m0.178s
    user    0m0.158s
    sys     0m0.016s
    greg@spark:~/test$ cat jasper
    #!/usr/bin/perl
    $/=undef;
    print "$_\n" for sort <> =~ /\b([a-z]+)\b(?!.*\b\1\b)/sg
    greg@spark:~/test$ time jasper Net-HOWTO >words2.txt

    real    1m8.477s
    user    1m8.471s
    sys     0m0.003s

...only about 350x slower. YMMV