Re: compare 2 files and print to a third
by Anonymous Monk on Nov 05, 2003 at 21:55 UTC
|
Doesn't this node remind you of a Monty Python movie?
Headmaster: All right, settle down, settle down.
Now before I begin the lesson will those of you
who are playing in the match this afternoon move your clothes
down on to the lower peg immediately after lunch before you
write your letter home, if you're not getting your hair cut,
unless you've got a younger brother who is going out this
weekend as the guest of another boy, in which case collect his
note before lunch, put it in your letter after you've had your
hair cut, and make sure he moves your clothes down onto the
lower peg for you. Now...
Wymer: Sir?
Headmaster: Yes, Wymer?
Wymer: My younger brother's going out with Dibble this weekend,
sir, but I'm not having my hair cut today sir, so do I move my
clothes down or...
Headmaster: I do wish you'd listen, Wymer, it's perfectly simple.
If you're not getting your hair cut, you don't have to move
your brother's clothes down to the lower peg, you simply
collect his note before lunch after you've done your scripture
prep when you've written your letter home before rest, move
your own clothes on to the lower peg, greet the visitors, and
report to Mr Viney that you've had your chit signed. ...
| [reply] |
Re: compare 2 files and print to a third
by Roger (Parson) on Nov 05, 2003 at 22:49 UTC
|
Provided that the files are not overly large, the following algorithm should suffice -
Step 1 - open the first file. For each line, extract the $login and add the entry $login => $rest_of_the_line to a hash table.
Step 2 - open the second file. For each line, extract the $login and look it up in the hash table built from the first file.
Step 3 - if a match is found, compare/combine/whatever the data stored in the hash table with the data held in the current line, and output the result to the third file.
Ok, sounds like you need some code...
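use strict;
use warnings;
use IO::File;

my %data; # hash to store data from first file

my $f = IO::File->new("first.txt", "r") or die "Can't open file 1: $!";
while (<$f>) {
    chomp;
    my ($login) = /^(\w+):/ or next; # skip lines without a login field
    $data{$login} = $_;
}
undef $f; # dropping the last reference closes the file

# reuse $f rather than redeclaring it with a second "my"
$f = IO::File->new("second.txt", "r") or die "Can't open file 2: $!";
my $o = IO::File->new("third.txt", "w") or die "Can't create file 3: $!";
while (<$f>) {
    chomp;
    my ($login, $info) = /^(\w+)(:.*)/ or next;
    if (exists $data{$login}) {
        # $info keeps its leading colon, so plain concatenation joins the records
        print $o $data{$login} . $info, "\n";
    }
}
undef $f;
undef $o;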
And the files -
---- first.txt ----
0001:rec1:rec2:rec3
0002:recA:recB:recC
0003:recX:recY:recZ
---- second.txt ----
0001:rec4:rec5:rec6
0002:recD:recE:recF
The output file -
---- third.txt ----
0001:rec1:rec2:rec3:rec4:rec5:rec6
0002:recA:recB:recC:recD:recE:recF
| [reply] [d/l] [select] |
Re: compare 2 files and print to a third
by sauoq (Abbot) on Nov 05, 2003 at 22:54 UTC
|
Fellow monks, I have a question
I looked for a question mark. I didn't find one. In fact, terminal punctuation seemed rather scarce altogether in your post. Would you please rephrase your question using multiple sentences and, perhaps, paragraphs? It is very difficult to interpret as it stands.
-sauoq
"My two cents aren't worth a dime.";
| [reply] |
Re: compare 2 files and print to a third
by QM (Parson) on Nov 05, 2003 at 23:05 UTC
|
First, see How do I post a question effectively? to avoid comments as above.
Assuming that you have files that resemble this:
some_unique_key other stuff I want to grab
another_key more other stuff that's cool
wow_how_many_keys_are_there lots - why do you ask?
...and you want to append data from file2 onto the end of data from file1, paired up by keys, you might try something like this:
#!/your/perl/here
use strict;
use warnings;

my %file1;
my $key;
my $value;

usage() unless @ARGV == 3;
my $output_file = pop @ARGV;

# read first file
while (<>)
{
    chomp;
    ($key,$value) = split / /, $_, 2;
    unless ( defined $value ) # no space on the line, so no value part
    {
        warn "Bad data on line $. in file $ARGV, ";
        next;
    }
    $file1{$key} = $value;
}
continue
{
    # reset line numbers for warning messages
    # end loop
    if ( eof ) # note special form of eof
    {
        close ARGV;
        last;
    }
}

# read second file, appending its values to matching keys
while (<>)
{
    chomp;
    ($key,$value) = split / /, $_, 2;
    unless ( defined $value )
    {
        warn "Bad data on line $. in file $ARGV, ";
        next;
    }
    if ( exists( $file1{$key} ) )
    {
        $file1{$key} .= ' ' . $value;
    }
    else
    {
        warn "Can't find key matching <$key> (line $.) "
           . "in file <$ARGV>, ";
        $file1{$key} = $value;
    }
}
continue
{
    last if ( eof ); # note special form of eof
}

open( OUT, ">", $output_file )
    or die "Error opening $output_file for writing: $!, ";
foreach my $k ( keys %file1 )
{
    print OUT "$k $file1{$k}\n";
}
close( OUT )
    or die "Error closing $output_file: $!, ";

sub usage
{
    die "Need 3 filenames on the command line.\n"
      . "First 2 files are merged by key into the 3rd file.\n";
}
__END__
which with inputs of file1:
one one one one one
two two two two two
three three three three
and file2:
one ONE ONE ONE ONE
two TWO TWO TWO TWO
three THREE THREE THREE
yielded:
three three three three THREE THREE THREE
one one one one one ONE ONE ONE ONE
two two two two two TWO TWO TWO TWO
Update: doesn't handle duplicate keys in the same file. You'd need to create a parallel hash for counters, or complicate the %file1 hash.
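One way to "complicate the %file1 hash" for duplicate keys is to store an array reference per key instead of a single string. What follows is only a minimal sketch under that assumption, not part of the script above: collect_values is a hypothetical helper name, and the literal lines stand in for the two input files.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every value seen for each key, so a key
# that repeats within one file keeps all of its values.
sub collect_values {
    my ($store, @lines) = @_;
    for my $line (@lines) {
        my ($key, $value) = split / /, $line, 2;
        next unless defined $value;        # skip lines with no value part
        push @{ $store->{$key} }, $value;  # autovivifies the array ref
    }
    return $store;
}

my %merged;
collect_values(\%merged, 'one first', 'one again', 'two only');  # stands in for file1
collect_values(\%merged, 'one ONE');                             # stands in for file2

# print each key followed by all of its values, space separated
for my $key (sort keys %merged) {
    print "$key @{ $merged{$key} }\n";
}
```

Whether repeated values should be joined, counted, or reported as errors depends on the data, so the output step is the part most likely to need changing.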
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] [d/l] [select] |
Re: compare 2 files and print to a third
by allolex (Curate) on Nov 05, 2003 at 23:18 UTC
|
I took a liberty and rephrased your question. I hope I got it right.
I am wondering how to combine 2 different files whose lines each begin with a unique key called $login. This key is the only piece of information common to the records. I would like to go through the files, line by line, and concatenate the contents of all the lines with the same key, which I will output to a third file.
To do this, you will want to read in both files, and isolate $login and use it as a hash key. (So far, so good.) So use a regular expression or split (or something) and do
if ( exists ( $hash{$login} ) ) {
    $hash{$login} .= $delimiter . $_;
}
else {
    $hash{$login} = $_;
}
Then, once you are done, you will want to print out your hash.
foreach my $key (keys %hash) {
    print "$key : $hash{$key}\n";
}
| [reply] [d/l] [select] |
|
|
File 1:
key1 other stuff
key2 other stuff
...
File 2:
key2 more stuff
key1 more stuff
...
Desired result, File 3:
key1 other stuff more stuff
key2 other stuff more stuff
I tried this ....
but it's not working because ....
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!
Wanted!
| [reply] [d/l] [select] |
Re: compare 2 files and print to a third
by Anonymous Monk on Nov 05, 2003 at 21:54 UTC
|
Uhh, what have you tried so far?
| [reply] |
Re: compare 2 files and print to a third
by graff (Chancellor) on Nov 06, 2003 at 06:21 UTC
|
Look at this utility I posted a while back -- it does something quite close to what you want (and some other related stuff besides, which you might find helpful).
| [reply] |
Re: compare 2 files and print to a third
by ysth (Canon) on Nov 05, 2003 at 22:47 UTC
|
I think your overall structure will depend on what you want to do with the edge cases: key in file 1 but not file 2, or vice versa, and whether the files are in the same order.
You will want to either read one record at a time from both files together (which makes for a slightly messy loop structure) or read both files into one or two hashes, keeping a separate array of the keys if you need to preserve the original order(s).
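A rough sketch of the second option (two hashes plus arrays that remember key order) might look like the following. The names read_records and merge_records are made up for illustration, the records are assumed to be space-separated "key rest-of-line" lines, and the edge-case policy chosen here (keep unmatched keys from either file) is just one possibility.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "key rest-of-line" records into a hash and
# record the keys in first-seen order.
sub read_records {
    my (@lines) = @_;
    my (%rec, @order);
    for my $line (@lines) {
        my ($key, $rest) = split / /, $line, 2;
        next unless defined $rest;                    # skip malformed lines
        push @order, $key unless exists $rec{$key};   # remember original order
        $rec{$key} = $rest;
    }
    return (\%rec, \@order);
}

# Merge in file-1 order; keys found only in file 2 follow in file-2 order.
sub merge_records {
    my ($rec1, $order1, $rec2, $order2) = @_;
    my @out;
    for my $key (@$order1) {
        my @parts = ($key, $rec1->{$key});
        push @parts, $rec2->{$key} if exists $rec2->{$key};
        push @out, join ' ', @parts;
    }
    for my $key (@$order2) {
        push @out, "$key $rec2->{$key}" unless exists $rec1->{$key};
    }
    return @out;
}

# literal lines stand in for the contents of the two files
my ($rec1, $order1) = read_records('key1 other stuff', 'key2 other stuff');
my ($rec2, $order2) = read_records('key2 more stuff', 'key3 more stuff');
print "$_\n" for merge_records($rec1, $order1, $rec2, $order2);
```

Dropping unmatched keys instead only requires skipping the second loop and adding a `next unless exists $rec2->{$key}` to the first.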
If you are downvoting this for other than lack of interest, thanks for sending me a message or reply to let me know why so I can do better (or just stay quiet) next time.
| [reply] |