in reply to Re^2: Strategy for randomizing large files via sysseek
in thread Strategy for randomizing large files via sysseek

Are the initial files random, or are they already in some sorted order? If they are already random, just concatenate them into one giant load file. Once loaded, you can pull the rows out by rownum so they come out in the exact order they went into the table. Not truly random, but as random as the initial files.
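A rough sketch of the load-then-read-back idea, in Python with an in-memory SQLite table standing in for the real database. The table name bla, the id column, and both function names are my own assumptions for illustration; an explicit auto-increment column plays the role of rownum here so the read-back order does not depend on how the engine happens to store rows.

```python
import sqlite3

def load_in_order(chunks):
    """Load every line of every chunk into one table, in the order given.

    `chunks` is an iterable of iterables of lines, e.g. one open file
    handle per input file (hypothetical input shape).
    """
    conn = sqlite3.connect(":memory:")
    # The id column records load order explicitly (assumed schema).
    conn.execute(
        "CREATE TABLE bla (id INTEGER PRIMARY KEY AUTOINCREMENT, line TEXT)"
    )
    for chunk in chunks:
        conn.executemany(
            "INSERT INTO bla (line) VALUES (?)", ((ln,) for ln in chunk)
        )
    conn.commit()
    return conn

def dump_in_load_order(conn):
    """Return the lines in exactly the order they were inserted."""
    return [row[0] for row in conn.execute("SELECT line FROM bla ORDER BY id")]
```

Passing the input files in sequence concatenates them in effect, and the output is only as shuffled as the inputs already were.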

If the initial files are ordered, the above solution doesn't work. For a MySQL-based table, you could do something like

SELECT * FROM BLA ORDER BY RAND()

although I'm not sure of exactly how random the results will be.
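To try the random-ordering query without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. Note the assumptions: SQLite spells the function RANDOM() rather than RAND(), the table and function names are made up for the example, and the quality of the shuffle depends on the database's own generator, so this shows the shape of the approach rather than a guarantee about randomness.

```python
import sqlite3

def shuffled_rows(lines):
    """Load lines into a scratch table and read them back in random order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bla (line TEXT)")  # hypothetical table name
    conn.executemany(
        "INSERT INTO bla (line) VALUES (?)", ((ln,) for ln in lines)
    )
    # SQLite's RANDOM() stands in for MySQL's RAND() here.
    query = "SELECT line FROM bla ORDER BY RANDOM()"
    return [row[0] for row in conn.execute(query)]
```

Every input line comes back exactly once; only the order changes. For very large tables the sort on a random key can be expensive, since the database has to sort the whole result set.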

Re^4: Strategy for randomizing large files via sysseek
by RiotTown (Scribe) on Sep 09, 2004 at 15:23 UTC
    I should have read just a bit further on the MySQL page about RAND():

    Note that RAND() in a WHERE clause is re-evaluated every time the WHERE is executed. RAND() is not meant to be a perfect random generator, but instead a fast way to generate ad hoc random numbers that will be portable between platforms for the same MySQL version.