in reply to Slow at sorting?

I have been thinking about the pros and cons of using a hash versus an array.

So I made myself some data with a script approximating something like orbital's example, using the rand() function for the CD#, the pdf name, the unknown digits in parentheses and the different strings in brackets.
I know it's not the real thing, but for what I'm trying to find out it seems OK to me.
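For the curious, a minimal sketch of the kind of generator I mean; the field widths, the "doc*.pdf" name pattern and the bracket strings here are made up for illustration, not the real data:

```perl
use strict;
use warnings;

my @tags = qw(foo bar baz quux);    # stand-ins for the bracketed strings

# Build one fake record: CD#, pdf-name, digits in parens, string in brackets.
sub gen_line {
    my $cd     = int rand 100;                          # CD#
    my $pdf    = sprintf "doc%05d.pdf", int rand 99_999; # pdf-name
    my $digits = int rand 10_000;                        # unknown digits
    my $tag    = $tags[ rand @tags ];
    return sprintf "%02d %s (%d) [%s]", $cd, $pdf, $digits, $tag;
}

print gen_line(), "\n" for 1 .. ( $ARGV[0] || 10 );
```

Run it with the wanted line count as the argument and redirect to a file.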

The two approaches I've tested are: using a hash, straightforward and easy on the programmer (see my previous post), or using an array and tucking the filepos onto the right side (see petral's post).
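Roughly, the two approaches look like this. This is only a sketch under my own assumptions (sorting on the first whitespace-separated field, a hypothetical file name "data.txt", keys at most 20 characters), not the exact code from either post:

```perl
use strict;
use warnings;

my $file = "data.txt";    # hypothetical input file

# 1) Hash: key => file position. Easy on the programmer,
#    but every key is stored as a separate hash key.
sub sort_with_hash {
    open my $fh, '<', $file or die "$file: $!";
    my ( %pos, $tell ) = ();
    $tell = 0;
    while ( my $line = <$fh> ) {
        my ($key) = split ' ', $line;
        $pos{$key} = $tell;          # where this line starts
        $tell = tell $fh;
    }
    return map { $pos{$_} } sort keys %pos;  # fileposes in key order
}

# 2) Array: pack key and filepos into one string, sort, unpack.
#    Only the packed strings live in memory -- no per-key hash overhead.
sub sort_with_array {
    open my $fh, '<', $file or die "$file: $!";
    my ( @packed, $tell ) = ();
    $tell = 0;
    while ( my $line = <$fh> ) {
        my ($key) = split ' ', $line;
        # A20 = space-padded key (truncated past 20 chars),
        # N   = 32-bit big-endian filepos, so a plain sort does the work.
        push @packed, pack "A20 N", $key, $tell;
        $tell = tell $fh;
    }
    return map { ( unpack "A20 N", $_ )[1] } sort @packed;
}
```

One behavioural difference worth knowing: with duplicate keys the hash keeps only the last line per key, while the packed array keeps them all (ties break on the big-endian filepos, i.e. file order).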

The tests were run on a measly Pentium 133 with 64MB RAM, and the results are as in the table below (YMWV).

The empty cells indicate heavy use of virtual memory.
Impatience being a virtue I only gave it double the expected time before I hit Ctrl-C.

kLines  Filesize (MB)  Hash (sec)  Array (sec)
   100           6.16          11           12
   150           9.24          16           18
   175          10.78          19           22
   190         11.704          22           24
   200          12.32          23           25
   210         12.936          24           26
   220         13.552          25           27
   230         14.168          62           29
   250           15.4                       31
   300          18.48                       37
   350          21.56                       44
   400          24.64                      107
   500           30.8

My interpretation is that if you've got the memory, the hash method is slightly faster, but the array method will take you about twice as far in terms of possible file sizes.

Thinking about it, that seems rather logical, considering that the hash stores both key and value whilst the array stores the value only.

It's also nice to see that both variants scale linearly with increasing volume until VM sets in, just as described in A Fresh Look at Efficient Perl Sorting by Uri Guttman and Larry Rosler.