For bulk processing like this, I'd sidestep DBI and use command-line tools.
Create a file called getnames.sql containing:
select UserName from table where UserName <> '' and UserName not like 'deleted%'
And two Perl scripts. 1) partNames.pl:
#! perl -w
use strict;

my %fhs;    # one output file handle per leading character

while( <> ) {
    my $char = substr $_, 0, 1;
    # Open names.<char> the first time a name starting with <char> is seen
    if( not exists $fhs{ $char } ) {
        open $fhs{ $char }, '>', "names.$char" or die $!;
    }
    print { $fhs{ $char } } $_;
}
And 2) uniqNames.pl:
#! perl -w
use strict;

for my $fname ( glob 'names.*' ) {
    my %uniq;    # names already seen in this partition
    open my $fh, '<', $fname or die $!;
    while( <$fh> ) {
        print unless exists $uniq{ $_ };
        $uniq{ $_ } = 1;
    }
}
Then run the following pair of commands:
mysql --quick < getnames.sql | perl partNames.pl
perl uniqNames.pl > uniqedNames.txt
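Depending on your setup, the mysql invocation will probably also need connection details and a couple of extra switches; the host, user, and database names below are placeholders, and --skip-column-names just keeps the column header out of the name stream:

mysql --quick --batch --skip-column-names -h dbhost -u dbuser -p mydatabase < getnames.sql | perl partNames.pl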
The first command dumps the names and pipes them to the Perl script, which writes them out to a set of files (called names.?, where ? is the first character of the name).
The second script reads those files one at a time and writes the unique names to stdout, where they are redirected into a file called uniqedNames.txt. Partitioning by first character keeps each in-memory %uniq hash far smaller than holding all 34 million names at once, and because names in different partitions can never collide, per-file deduplication still yields globally unique names.
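As a tiny, purely hypothetical example of the second step: if names.j ended up containing

john
jane
john

then uniqNames.pl would emit john and jane once each into uniqedNames.txt.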
The whole (untested) process should take less time than sorting the 34 million names -- inside the DB or with your system sort utility. I'd hazard a guess that it would take less than 20 minutes.