I have to 'standardize' about a million records of demographic data that sit (alas) in SQL Server.
The old way processes roughly 6,000 records per hour on a good day.
Because I know Perl is the right way to go, but think that SQL Server is a cow, I'm using the baby-Perl code below inside an ActiveX script within a DTS package, working outside the db environment on two text files: a list of words to find and replace (250 lines; example: ADMINISTRATION=ADMIN) and the demographics themselves (1,000,000 lines; example: C00006 DR. TOM SMITH VETERANS ADMINISTRATION BUFFALO NEW YORK). It's still too slow, though.
Target speed: 1 million records in an hour!
#***********************
# Perl ActiveX Script
#***********************
sub Main {
    use Win32;   # allows for MsgBox functionality
    use OLE;

    my $file1 = "c:/foo/sub_list.txt";
    my $file2 = "c:/foo/cust_names_city.txt";
    my $file3 = ">c:/foo/cust_names_city_new.txt";

    open(SUBLIST,     $file1) || die MsgBox("Can't open $file1");
    open(CUSTLIST,    $file2) || die MsgBox("Can't open $file2");
    open(NEWCUSTLIST, $file3) || die MsgBox("Can't open $file3");

    my @pairs = <SUBLIST>;
    my @names = <CUSTLIST>;

    foreach my $pair (@pairs) {
        my ($key, $value) = split /=/, $pair;
        chomp $value;
        foreach my $name (@names) {
            # $name aliases the array element, so each pass edits @names
            # in place; \b keeps the surrounding whitespace intact (the
            # original \s$key swallowed the separating space) and \Q
            # quotes metacharacters like the "." in "DR."
            $name =~ s/\b\Q$key\E\b/$value/ig;
        }
    }

    # Print once, after all passes -- the original pushed every record
    # onto @newrecords inside the pair loop, multiplying the output by
    # the number of substitution pairs.
    print NEWCUSTLIST @names;

    close(SUBLIST);
    close(CUSTLIST);
    close(NEWCUSTLIST);
    return 0;   # DTSTaskExecResult_Success
}
#***********************
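For comparison, here is a sketch of a single-pass variant: read the pairs into a hash, compile one alternation regex, and scan each record once with an O(1) hash lookup per hit, instead of running 250 separate substitutions per record. This is an assumption about what would be faster, not the poster's code; the inline %subs table and the `standardize` helper are illustrative, and in the real script the table would be read from c:/foo/sub_list.txt.

```perl
use strict;
use warnings;

# Substitution table -- inline here for the demo; the real script would
# build it by reading c:/foo/sub_list.txt line by line.
my %subs = (
    'ADMINISTRATION' => 'ADMIN',
    'VETERANS'       => 'VET',   # hypothetical second pair
);

# One compiled alternation regex; longest keys first so a longer match
# (ADMINISTRATION) can never lose to a shorter one it contains (ADMIN).
my $alt = join '|', map quotemeta, sort { length $b <=> length $a } keys %subs;
my $re  = qr/\b($alt)\b/i;

sub standardize {
    my ($line) = @_;
    $line =~ s/$re/$subs{ uc $1 }/g;   # one scan, hash lookup per hit
    return $line;
}

# Streaming the big file line by line avoids slurping a million records:
# open my $in,  '<', 'c:/foo/cust_names_city.txt'     or die $!;
# open my $out, '>', 'c:/foo/cust_names_city_new.txt' or die $!;
# print {$out} standardize($_) while <$in>;
```

The win is that the cost per record stays roughly constant as the substitution list grows, since the regex engine does the dispatching instead of a Perl-level loop.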
This humble and frustrated initiate thanks you in advance for your help!
guopan@earthlink.net
In reply to Simply Too Slow Find and Replace by guopan