I need help with this.
I need to build a cross reference table from one file, keyed on, say, distributor ID and customer ID.
I have a second file, and for each record in it I need to check whether its distributor ID and customer ID combination already exists in the cross reference table I built. A record should be written to a new file only if the xref table doesn't have it.
I used grep, and since the files I am dealing with are huge, the grep takes 5-6 hours to complete.
Is there a better way to do this? Any help is appreciated.
Here is the sample code I have:

# Now read the values from XREF into an array
open XREF, "<$xref" or warn $!;
@xreflines=<XREF>;
# remove the blank lines from the array
@xreflines = grep /\S/, @xreflines;
print LOG "XREF lines are read into an array $now \n";
close XREF;

# Now read the good customer file created from VIPOUT and check it against
# the XREF file to see whether the distributor/customer combination already
# exists. If not, write the line to a new file; this is the one that gets
# loaded into BW.
open CUSTOMERFILE, "<$out_file" or warn $!;
@lines=<CUSTOMERFILE>;
close CUSTOMERFILE;
open XFILE, ">>$goodfile" or warn $!;
foreach $line (@lines) {
    ($DISTID1,$CUST1,$junk)=split('\;',$line);
    chomp $DISTID1;
    chomp $CUST1;
    # assumed: this assignment is missing from the code as posted
    $xrefvalue = $DISTID1 . $CUST1;
    print LOG "\$xrefvalue is $xrefvalue \n";
    # scans the whole xref array once per customer line
    $x=grep /$xrefvalue/, @xreflines;
    if ( $x == 0 ) {
        print LOG "\$x value is $x \n";
        print LOG "\$line is $line \n";
        print XFILE "$line \n";
    }
}
close XFILE;
Here is the sample xref file (combination of distributor ID & customer ID):
3036802849
3036802842
3036802854
3036802856
30368021983
3036802882
30368021703
3036802258
30368026951
30368022425
30368025243
and the data from the customer file:
3696693;5308;;BJS BREWHOUSE;2631 EDMONDSON RD;452091910;CINCINNATI;OH;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
1871781;01800;;BRADYS;25 UNION ST;045382116;BOOTHBAY HARBOR;ME;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
3172110;59475;;ARRIVEDERCI ITALIAN CUISINE;8900 E PINNACLE PEAK RD STE D1;852553647;SCOTTSDALE;AZ;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
3172110;26154;;KINGS MINI MART;4150 N 35TH AVE;850173858;PHOENIX;AZ;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
2534996;5830;;CAZADOREZ MEXICAN RESTAURANT;3900 N HARRISON ST;748041427;SHAWNEE;OK;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
2534996;3473;;ROUTE 66 NAMAN LIQUOR;4301 N SARA RD UNIT 106;730993223;YUKON;OK;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
1871316;8P670;;H L PENINSULA PEARL;1590 BAYSHORE HWY;940101601;BURLINGAME;CA;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
3138693;46526;;K & N MARKET;464 N BAILEY ST;480654710;ROMEO;MI;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
1870957;044816-SAME;;SHELL GAS;1752 WEBSTER AVE;104577341;BRONX;NY;;;;;US0109;;;;;;;;;;;110207;;;;;;;;;3;;;;;;;;;
The first 2 fields are the distributor ID and customer ID.
If the distributor ID and customer ID match an entry in the xref file, I need to ignore the line; otherwise, I need to write it to a new file. The output is the same line from the customer file.
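For reference, the matching rule described above could be sketched with a hash lookup instead of a grep over the whole array for every line. This is only a minimal sketch under assumptions: the sub name write_new_customers and the file names are hypothetical, the xref key is assumed to be the distributor ID followed directly by the customer ID, and the match is exact, whereas grep /$xrefvalue/ also matches substrings.

```perl
use strict;
use warnings;

# Hypothetical helper: copies to $good_file every customer line whose
# distributor ID + customer ID is not already present in $xref_file.
sub write_new_customers {
    my ($xref_file, $customer_file, $good_file) = @_;

    # Load the xref keys into a hash so each lookup is a single probe.
    my %seen;
    open my $xref_fh, '<', $xref_file or die "Cannot open $xref_file: $!";
    while (my $key = <$xref_fh>) {
        $key =~ s/^\s+|\s+$//g;      # trim whitespace and the newline
        next if $key eq '';          # skip blank lines
        $seen{$key} = 1;
    }
    close $xref_fh;

    open my $in,  '<',  $customer_file or die "Cannot open $customer_file: $!";
    open my $out, '>>', $good_file     or die "Cannot open $good_file: $!";
    my $written = 0;
    while (my $line = <$in>) {
        # The first two semicolon-separated fields are the IDs.
        my ($dist_id, $cust_id) = split /;/, $line;
        next unless defined $cust_id;            # skip malformed lines
        next if $seen{ $dist_id . $cust_id };    # already in the xref
        print {$out} $line;                      # output the line unchanged
        $written++;
    }
    close $in;
    close $out;
    return $written;
}

# Assumed usage: perl script.pl xref.txt customer.txt good.txt
write_new_customers(@ARGV) if @ARGV == 3;
```

The customer file is read one line at a time here, so only the xref keys are held in memory.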
In reply to best way to use grep by vbynagari