what I meant was that this bit of code in the program(in the subroutine process)
foreach $key(keys %tagcorpus){
    print "\n$key";
    return($key);
}

The reason for the difference between the for version of the loop and the while version is that in the for version, the keys operator generates a list of all the keys at the top of the loop and then gives you the first one of these. Your code prints out that (first) key and then returns it to the caller. This terminates the for loop after only one iteration, ignoring all the other keys.

You always get the same value, because the next time you call the sub, you re-enter the for loop at the top and call the keys function again. It re-builds the list of all the keys and then gives you the first one again. Hence, you always get the same (first) key returned from the sub, each time you call it.
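To see this for yourself, here is a minimal sketch. The contents of %tagcorpus are made-up sample data and the sub name first_key_only is hypothetical; only the shape of the loop matches your code.

```perl
use strict;
use warnings;

# Sample data standing in for the real %tagcorpus (an assumption).
my %tagcorpus = ( the => 'DET', cat => 'NOUN', sat => 'VERB' );

# Mirrors the loop in question: returns during the first iteration.
sub first_key_only {
    foreach my $key ( keys %tagcorpus ) {
        return $key;    # leaves the sub; the remaining keys are never visited
    }
    return;
}

# keys rebuilds the same list on every call, so every call yields the same key.
my @seen = map { first_key_only() } 1 .. 3;
print "@seen\n";    # one key repeated three times; which key is unpredictable
```

Which key you see depends on your perl, but it will be the same one on all three calls as long as the hash is not modified in between.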

However, in the while each version of the loop, the each function works in a completely different way. each acts as an iterator. That is to say, the first time you call each it gives you the first key/value pair, but it also remembers internally how far it has got, so the next time you call it, it gives you the next key/value pair and again remembers its position. It continues to give you the 'next' key/value pair each time you call it until it has given you them all, at which point it returns an empty list (undef in scalar context) to indicate that it has reached the end of the list. The next time after that, it will again give you the 'first' key.
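A short sketch of the iterator behaviour just described, using a small made-up hash (the variable names are only for illustration):

```perl
use strict;
use warnings;

my %h = ( a => 1, b => 2, c => 3 );    # sample data

# Each call to each() hands back the next pair; an empty list ends the loop.
my @first_pass;
while ( my ( $key, $value ) = each %h ) {
    push @first_pass, $key;
}

# The iterator has now run off the end, so the next call starts over.
my ($restart) = each %h;

print "one pass visited: @first_pass\n";
print "after the end, each() gave '$restart' again\n";

keys %h;    # void-context keys resets the iterator for any later code
```

The order of the keys in the first line of output is not predictable, but the key in the second line should match the first key of the pass.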

There are a couple of caveats with this.

'First' and 'next' in the above descriptions do not relate to the order in which you created the keys, nor to any ordering that you should try to predict or that your code should rely upon, as it can, and does, vary from version to version of perl.
If you use the each function to step part way through the list of key/value pairs of a given hash and then call either the keys or values function on that same hash somewhere else in your code, it will reset the 'memory' of the each iterator, and the next time you call each, it will start from the beginning of the list again.
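The second caveat is easy to demonstrate. This is only a sketch with made-up data:

```perl
use strict;
use warnings;

my %h = ( a => 1, b => 2, c => 3, d => 4 );    # sample data

my ($first)  = each %h;    # start iterating
my ($second) = each %h;    # now part way through the hash

my @all = keys %h;         # any call to keys (or values) resets the iterator

my ($after_reset) = each %h;    # back at the start, not at the third pair
print "first=$first second=$second after_reset=$after_reset\n";
```

The last value printed should be the same key as the first, because the call to keys threw away the iterator's position.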

You should also not modify the hash whilst iterating over it with each, as this will invalidate the iterator's memory of how far it got, but it won't notice and it won't warn you. E.g.

undef %h; 
@h{'a'..'m'}=1..13;

print each %h, ' - ' for 1..7; 
# Gives "e 5  - a 1  - m 13  - d 4  - j 10  - l 12  - c 3  -"
# The first seven key/value pairs in some order

print each %h, ' - ' for 1..7; 
# Gives "k 11  - h 8  - b 2  - g 7  - f 6  - i 9  -  -"  
# The last 6 + an empty pair to indicate the end of the list


undef %h; @h{'a'..'m'}=1..13;
print each %h, ' - ' for 1..7;
# Gives "e 5  - a 1  - m 13  - d 4  - j 10  - l 12  - c 3  -"
# The first seven as before

 @h{'n'..'z'} = 14..26; # Now modify the hash by adding some new stuff

# Now continue iterating them from where we were before
print each %h, ' - ' for 1..7; 
# Gives "p 16  - k 11  - h 8  - g 7  - f 6  - t 20  - i 9  -" 
# Looks good, no empty pair so it knows there are more
# and nothing is duplicated...

print each %h, ' - ' for 1..7; # Print the next batch
# "e 5  - n 14  - v 22  - m 13  - s 19  - l 12  - c 3  -"  Whoops!!
# Even though we haven't had the empty pair to indicate the end of the list,
# we're starting to see some elements being repeated.
# We've already seen e, m, l & c in the first batch.

print each %h, ' - ' for 1..7;
# Gives "p 16  - b 2  - q 17  - z 26  - o 15  -  - w 23  -" NOTE the empty pair!
# Now we have reached the end of the list, wrapped and are starting again.
# But the first one we get this time is 'w' rather than the 'e' that we got first time.
# And it isn't 'a' in either case.

None of this is a bug! This is all expected behaviour, and the anomaly that you perceive comes from a misunderstanding of the way things work. I'd recommend that you review your understanding by (re-)reading the documentation of perlfunc:each, perlfunc:keys and perlfunc:values.

A word of caution. If you ever find yourself returning from the middle of a loop within a subroutine, especially unconditionally, you should probably look twice at your code and ask whether the way you are approaching the problem is the best way. Occasionally it will be, but most times this is an indication that you should re-think the sub.
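One possible restructuring, assuming what the caller really wants is every key rather than one key per call, is to return the whole list and let the caller do the looping. The sub name all_keys and the sample data are hypothetical:

```perl
use strict;
use warnings;

# Sample data standing in for the real %tagcorpus (an assumption).
my %tagcorpus = ( the => 'DET', cat => 'NOUN', sat => 'VERB' );

# Hand the complete key list back to the caller instead of
# returning from inside a loop.
sub all_keys {
    my ($hash_ref) = @_;
    return keys %$hash_ref;
}

# The caller decides what to do with each key.
for my $key ( all_keys( \%tagcorpus ) ) {
    print "$key\n";
}
```

Whether this fits depends on what the rest of your program expects from the sub, which I can't tell from the snippets posted.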

Of course the words in the sentence are not properly ordered. For this I think we need to extract the keys of %tagcorpus in the insertion order and then write onto the file. This is what I need to do now :)

This is going to be a problem. There isn't any easy way of retrieving the keys or values from a standard perl hash "in their insertion order", and the need to do so usually indicates that you are using the wrong data structure. However, there are cases (and this may be one of them, but it's hard to tell from the snippets of code that you have posted:) where you want the fast lookup afforded by the hash, but you also need insertion-order retrieval. In this case, there is a module, Tie::IxHash, that will allow you to do this. I suggest you read its documentation and decide for yourself if this may be useful to you.
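If installing a module is not an option, the same effect can be approximated in plain Perl by recording each key in an array the first time it is inserted. Note that this is a different technique from Tie::IxHash, not an example of that module. The helper add_tag and the data are made up for illustration:

```perl
use strict;
use warnings;

my %tagcorpus;    # fast lookup by key
my @order;        # keys in the order they were first inserted

# Hypothetical helper: store a pair and remember when the key first appeared.
sub add_tag {
    my ( $word, $tag ) = @_;
    push @order, $word unless exists $tagcorpus{$word};
    $tagcorpus{$word} = $tag;
    return;
}

add_tag( the => 'DET' );
add_tag( cat => 'NOUN' );
add_tag( sat => 'VERB' );
add_tag( the => 'DET' );    # repeated key: its position is unchanged

# Walk the array, not the hash, to get insertion order.
print join( ' ', map { "$_/$tagcorpus{$_}" } @order ), "\n";
# prints "the/DET cat/NOUN sat/VERB"
```

The cost is that you must always go through the helper when adding keys, and deleting a key means removing it from the array as well.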

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

In reply to Re: Re: Re: Re: Re: File write by BrowserUk
in thread File write by perl_seeker
