in reply to best way to use grep

print LOG "\$xrefvalue is $xrefvalue \n";
$x=grep /$xrefvalue/, @xreflines;

I don't see where $xrefvalue is declared and initialized. I assume it's built from $DISTID1 and $CUST1, something like the way Laurent_R shows here:
    my $xrefvalue = "$DISTID1$CUST1";
If so, there's a potential problem because two records like
    3696693;5308;;BJS BREWHOUSE;2631 EDMONDSON RD;...
    369669;35308;;HORSESHOE ROAD INN;12 3RD ST;...
will have the same $xrefvalue, "36966935308", unless there is some unstated rule that tells you this can never happen.
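
To see the collision for yourself, here is a minimal sketch. It assumes the key is built by plain concatenation as guessed above; naive_key is just a name I made up for illustration, and the records are the two sample lines cut down to their first few fields.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the key by plain concatenation,
# which is my guess at how $xrefvalue is being made.
sub naive_key {
    my ($distid, $cust) = @_;
    return "$distid$cust";
}

# The two sample records, truncated to the fields that matter here.
my @records = (
    '3696693;5308;;BJS BREWHOUSE',
    '369669;35308;;HORSESHOE ROAD INN',
);

my @keys;
for my $record (@records) {
    # Take only the first two semicolon-separated fields.
    my ($DISTID1, $CUST1) = (split /;/, $record)[0, 1];
    push @keys, naive_key($DISTID1, $CUST1);
}

# Both records produce the same key, so they cannot be told apart.
print "$_\n" for @keys;
print "collision!\n" if $keys[0] eq $keys[1];
```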

Better, IMHO, to use a non-numeric separator to guarantee unambiguous cross-ref values:
    my $separator = ';';
    ...
    my $xrefvalue = "$DISTID1$separator$CUST1";
A semicolon seems nice because it is already the CSV field separator, so (assuming fields are never quoted) it can't turn up inside either value.

The advice to build a cross-ref lookup hash seems very, very good. I imagine the rest of the code might look a lot like the code in Laurent_R's post except the split statement could be
    my ($dist, $cust) = (split $separator, $line)[0,1];
or
    my ($dist, $cust) = split $separator, $line, 3;
You don't say how big your database is, but a hash could accommodate tens of millions of cross references in system memory for very fast lookup; much more than that and you're looking at a database. (Other approaches, like using Text::CSV_XS or a regex field extractor, or perhaps emulated multidimensional hash keys, might be better (or sexier), but let's just take one step at a time!)
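
For what it's worth, here is a rough sketch of how the hash lookup might hang together. It is not Laurent_R's code and not your real program: the contents of @xreflines and the pairs being looked up are made up for illustration, and storing the whole line as the hash value is just one possible choice.

```perl
use strict;
use warnings;

my $separator = ';';

# Made-up cross-reference lines; real code would read these from the xref file.
my @xreflines = (
    '3696693;5308;;BJS BREWHOUSE;2631 EDMONDSON RD',
    '369669;35308;;HORSESHOE ROAD INN;12 3RD ST',
);

# Build the lookup hash once: key is "dist;cust", value is the whole line.
my %xref;
for my $line (@xreflines) {
    my ($dist, $cust) = (split $separator, $line)[0, 1];
    $xref{"$dist$separator$cust"} = $line;
}

# Each lookup is now a single hash access instead of a grep over every line.
# The third pair is deliberately absent from the cross-reference data.
for my $pair ([3696693, 5308], [369669, 35308], [1, 2]) {
    my ($DISTID1, $CUST1) = @$pair;
    my $xrefvalue = "$DISTID1$separator$CUST1";
    if (exists $xref{$xrefvalue}) {
        print "$xrefvalue found: $xref{$xrefvalue}\n";
    }
    else {
        print "$xrefvalue not found\n";
    }
}
```

Because the keys carry the separator, the two records that collided under plain concatenation end up as two distinct hash entries.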


Give a man a fish:  <%-{-{-{-<