in reply to Re: Re: sorting a vec
in thread sorting a vec

Your welcome, but if you gave us a little more information, we could probably suggest a better solution.

For example, how many bits are in each of your integers?

How many do you need to sort?

Storing a million integers in a standard perl array and sorting them is an almost everyday procedure. Even on my 233Mhz laptop it takes less than five minutes using perls builtin sort. So, unless you have a very large number, or the integers are very large, the standard sort would probably suffice.

If you have a very large number of small integers, say 100_000_000 x 8-bits or less, or 1_000_000 x say 128 bit, this could still be done with a custom sort routine without too much trouble.

Knowing what your requirements are would allow someone here to suggest a solution.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!
Wanted!

Replies are listed 'Best First'.
Re: Re: Re: Re: sorting a vec
by Anonymous Monk on Nov 04, 2003 at 05:43 UTC
    I am going to be loading 50 million integers into the vec so I am using a bit parameter of 32 for my vec. Right now I am trying to figure out how to efficiently sort a randomly assigned vec containing integers from 1 - 50 mil. After that, I would then like to expand this sort to beable to sort this vec based on the integer values that exists in another vec. Why am I doing this? I am working on implementing suffix arrays in Perl so that I can expand a statistics package called nsp that currently uses hash tables. The problem with the hash table is that after a certain load it runs out of memory. Therefore, I am trying to see if there exists other data structures that will hold more data efficiently. So I am converting words to integers and then loading them into a suffix array. Sorry about the lack of information. I am not use to posting and am not certain where the point of too much information is. I hope this helps clarify more, and again thanks for the advice.

      There is no way to use the perl's built-in sort to achieve your goal.

      If your storing 32-bit integers in a string, then you effectively have a C-style array of ints. It would therefore be fairly easy (assuming that you compile your own perl) to use Inline::C to get at the base address of that C-style array, ie. the base address of the scalar, and use your C-RTL's qsort() routine to do the sorting. This would be far quicker than any Perl solution.

      If you use a binary distribution of perl, then it would still be possible, though awkward, to use the C-RTL qsort(). Getting the base address of the scalar can be done with pack, and (assuming AS perl), it should be possible to use Win32::API to get the entrypoint for qsort(). The problem then would be providing a callback address for the comparison function. Possibly possible :), but awkward and fragile. Mapping the pointers the qsort function gives to the compare function from one string to valid pointers into another would be possible, but messy.

      I had a play with implementing a pure perl qsort using vec to get at the individual integers. It works, and could be used to sort one string based upon the contents of another (although I'm not entirely sure I understand what you mean by that), but it wouldn't be quick by any stretch of the imagination.

      I tried it on 100,000 ints and it took a little under 5 minutes, which is an order of magnitude slower than using sort on a standard array. For contrast, the C-RTL qsort() function will do 1,000,000 in 4 seconds. I'm afraid that this is one of those things that there is simply no substitute for getting closer to the metal and using C or assember.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
      Hooray!
      Wanted!

        How do you do a perl quicksort ?