So I made myself some data with a script approximating something like orbital's example, using the rand() function for the CD#, the pdf-name, the unknown digits in parentheses, and the different strings in brackets.
I know it's not the real thing, but for what I'm trying to find out it seems OK to me.
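A minimal sketch of such a generator (not the actual script — the field layout, widths, and the bracketed sample strings are stand-ins I made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for the "different strings in brackets" -- invented here.
my @strings = qw(alpha beta gamma delta);

# Build one fake record: CD#, pdf-name, digits in parentheses,
# and one bracketed string, all driven by rand().
sub make_line {
    sprintf "CD%02d doc%05d.pdf (%04d) [%s]",
        int(rand(50)),                      # CD#
        int(rand(99999)),                   # pdf-name
        int(rand(10000)),                   # unknown digits in parentheses
        $strings[ int(rand(@strings)) ];    # one of the bracketed strings
}

my $lines = shift // 5;                     # how many lines to emit
print make_line(), "\n" for 1 .. $lines;
```

Piping that to a file with a large enough line count gives a test file of roughly the sizes in the table below.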
The two approaches I've tested are either using a hash, which is straightforward and easy on the programmer (see my previous post), or using an array and tucking the filepos onto the right side (see petral's post).
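Roughly, the two approaches look like this (a sketch, not the actual code from either post — the key extraction, the 20-byte key width, and the file name are my assumptions):

```perl
use strict;
use warnings;

# Approach 1: hash -- both the key and the whole line stay in memory.
sub sort_with_hash {
    my ($file) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    my %h;
    while (<$fh>) {
        my ($key) = /^(\S+)/;   # assumed: first field is the sort key
        $h{$key} = $_;
    }
    close $fh;
    return map { $h{$_} } sort keys %h;
}

# Approach 2: array of "key + filepos" -- only the key stays in memory;
# each line is re-read with seek() after the sort.
sub sort_with_array {
    my ($file) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    my (@a, @out);
    my $pos = 0;
    while (<$fh>) {
        my ($key) = /^(\S+)/;
        push @a, pack 'A20 N', $key, $pos;  # key padded to 20 bytes, then offset
        $pos = tell $fh;                    # start of the next line
    }
    for (sort @a) {
        seek $fh, unpack('x20 N', $_), 0;   # skip the key, jump to the line
        push @out, scalar <$fh>;
    }
    close $fh;
    return @out;
}
```

Both return the records in key order; the array version trades a second pass of seeks for a much smaller in-memory footprint.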
The tests were run on a measly Pentium 133 with 64MB RAM, and the results are as in the table below (YMMV).
The empty cells indicate heavy use of virtual memory.
Impatience being a virtue, I only gave each run double the expected time before hitting Ctrl-C.
| kLines | Filesize (MB) | Hash (sec) | Array (sec) |
|--------|---------------|------------|-------------|
| 100    | 6.16          | 11         | 12          |
| 150    | 9.24          | 16         | 18          |
| 175    | 10.78         | 19         | 22          |
| 190    | 11.704        | 22         | 24          |
| 200    | 12.32         | 23         | 25          |
| 210    | 12.936        | 24         | 26          |
| 220    | 13.552        | 25         | 27          |
| 230    | 14.168        | 62         | 29          |
| 250    | 15.4          |            | 31          |
| 300    | 18.48         |            | 37          |
| 350    | 21.56         |            | 44          |
| 400    | 24.64         |            | 107         |
| 500    | 30.8          |            |             |
My interpretation is that if you've got the memory, the hash method is slightly faster, but the array method will take you about twice as far in terms of possible file sizes.
Thinking about it, that seems rather logical, considering that the hash holds both key and value whilst the array holds the value only.
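A back-of-envelope sketch of that difference, counting string bytes only and ignoring Perl's considerable per-scalar overhead (the key and record widths here are made up):

```perl
use strict;
use warnings;

my ($hash_bytes, $array_bytes) = (0, 0);
for my $i (1 .. 100_000) {
    my $key  = sprintf "key%06d", $i;                     # hypothetical sort key
    my $line = "$key and the rest of the record line\n";  # the full stored line
    $hash_bytes  += length($key) + length($line);  # hash keeps key AND line
    $array_bytes += length($key) + 4;              # array keeps key + 4-byte filepos
}
printf "hash ~%d bytes, array ~%d bytes of string data\n",
    $hash_bytes, $array_bytes;
```

The exact ratio depends on how big the records are relative to the keys, but the array side only ever pays for the key plus a fixed-size offset.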
It's also nice to see that both variants behave linearly with increasing volume until VM sets in, just as described in A Fresh Look at Efficient Perl Sorting by Uri Guttman and Larry Rosler.
In reply to Re: Slow at sorting? by guha, in thread Slow at sorting? by orbital