in reply to Get data from a file to search and replace within a second file
First a few suggestions:
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Fake up a couple of files
    my $file_a = <<TXT;
    a1\tb1\tc1\td1
    a2\tb2\tc2\td2
    a3\tb3\tc3\td3
    TXT

    my $file_b = <<TXT;
    starting form the top of the file I need
    1. to get the the value in the 3rd column (c1)
    2. search in a second file (file_B.txt, not tab delimited and quite messy) all the matches for it.
    3. when a match is found, I would like to append to the current value (c1), the value of 4th column (d1) in file_A.txt, separated by a space.
    4. go back to the first file (file_A.txt), get the the value in the 3rd column in the second row (c2) and do another round of search and insert the value of d2 in the second file (file_B.txt).
    TXT

    # Now the 'real' work - \$file_b treats $file_b as a file
    open my $inB, '<', \$file_b or die "Can't open file_b: $!";
    my $fileBStr = do {local $/; <$inB>}; # Slurp in all of file_b
    close $inB;

    open my $inA, '<', \$file_a or die "Can't open file_a: $!";
    while (<$inA>) {
        chomp;
        my @parts = split /\t/;
        next if @parts < 4;
        $fileBStr =~ s/\b $parts[2] \b/$parts[2] $parts[3]/xgm;
    }
    close $inA;

    print $fileBStr;
Prints:
    starting form the top of the file I need
    1. to get the the value in the 3rd column (c1 d1)
    2. search in a second file (file_B.txt, not tab delimited and quite messy) all the matches for it.
    3. when a match is found, I would like to append to the current value (c1 d1), the value of 4th column (d1) in file_A.txt, separated by a space.
    4. go back to the first file (file_A.txt), get the the value in the 3rd column in the second row (c2 d2) and do another round of search and insert the value of d2 in the second file (file_B.txt).
Reading the file you are editing into memory is fine unless it runs to hundreds of megabytes. For very large files you probably need to turn the loop inside out: read all the edit information from file a and store it in memory, then read file b a line at a time and apply all the edits to the current line before writing it out and moving on to the next.
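For what it's worth, here is a rough sketch of that inside-out version. The helper names `load_edits` and `apply_edits` and the sample data are made up for illustration, and the output goes to an in-memory string where real code would open an output file. I have also added `\Q...\E` around the search text so regex metacharacters in column 3 are treated literally, which is an assumption about what you want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read the edit list (file a) into memory as
# [search, replacement] pairs, one pair per row with at least 4 columns.
sub load_edits {
    my ($fh) = @_;
    my @edits;
    while (my $line = <$fh>) {
        chomp $line;
        my @parts = split /\t/, $line;
        next if @parts < 4;
        push @edits, [$parts[2], "$parts[2] $parts[3]"];
    }
    return \@edits;
}

# Hypothetical helper: stream file b one line at a time, applying every
# edit to the current line before writing it out.
sub apply_edits {
    my ($in, $out, $edits) = @_;
    while (my $line = <$in>) {
        for my $edit (@$edits) {
            my ($find, $replace) = @$edit;
            # \Q...\E quotes any regex metacharacters in the search text
            $line =~ s/\b\Q$find\E\b/$replace/g;
        }
        print {$out} $line;
    }
    return;
}

# In-memory stand-ins for the two files (sample data, not the real files)
my $file_a = "a1\tb1\tc1\td1\na2\tb2\tc2\td2\n";
my $file_b = "first (c1) here\nthen (c2) and c1 again\n";
my $result = '';

open my $inA, '<', \$file_a or die "Can't open file_a: $!";
my $edits = load_edits($inA);
close $inA;

open my $inB,  '<', \$file_b or die "Can't open file_b: $!";
open my $outB, '>', \$result or die "Can't open output: $!";
apply_edits($inB, $outB, $edits);
close $inB;
close $outB;

print $result;
```

Only the edit list lives in memory, so the size of file b stops mattering. As with the slurping version, the edits are applied one after another, so a later search value can match text that an earlier edit inserted. If that matters for your data, you would need to combine the searches into a single pass.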