PerlMonks  

Re: matching information from two files and printing off the results -help needed

by graff (Chancellor)
on May 08, 2009 at 01:53 UTC ( [id://762738] )


in reply to matching information from two files and printing off the results -help needed

I'm guessing that the code you posted is doing a lot more work than it needs to do for the problem at hand -- or rather, you've written a lot more code than you needed to. It's a bit too messy for me to go into in detail (senseless use of multiple blank lines, random indentation, etc), so let me start over...
    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $Usage = "$0 file1 file2 > file2.new\n";
    die $Usage unless ( @ARGV==2 and -f $ARGV[0] );

    my ( $file1, $file2 ) = @ARGV;

    # file1: each line is "type number"; note which item numbers need editing,
    # in column 0 for type '1gtiA' and in column 1 for anything else
    my %edit;
    open( IN, "<", $file1 ) or die "open failed on $file1: $!\n";
    while (<IN>) {
        my ( $type, $line ) = split;
        my $offset = ( $type eq '1gtiA' ) ? 0 : 1;
        $edit{$line}[$offset]++;
    }
    close IN;

    # file2: the last two non-blank characters of each line are the two columns;
    # count the non-'-' items in each column and lowercase the flagged ones
    my @item_line = ( 0, 0 );
    open( TBL, "<", $file2 ) or die "open failed on $file2: $!\n";
    while (<TBL>) {
        my @edit_field = ( /(\S)(\S)\s*$/ );
        my $changed = 0;
        for my $f ( 0, 1 ) {
            if ( $edit_field[$f] ne '-' ) {
                $item_line[$f]++;
                if ( exists( $edit{$item_line[$f]}[$f] )) {
                    $edit_field[$f] = lc $edit_field[$f];
                    $changed++;
                }
            }
        }
        s/\S\S(\s*)$/join('',@edit_field,$1)/e if $changed;
        print;
    }
(updated to include "or die ..." on the open() calls, as per normal practice.)

I put your two sample data files into "f1" and "f2", saved the script shown here as "j.pl" and ran it like this in a bash shell:

j.pl f1 f2 > f3
and the contents of f3 matched what you posted as the desired output.
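To see the per-column counting in isolation, here is a minimal in-memory sketch of the same idea. The sub name mark_lines, the %flags layout, and the three sample lines are made up for illustration; they are not your data. It also tests the flag for truth rather than using exists(), so treat it as an approximation of the script above rather than a drop-in replacement.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase flagged items in the last two columns.
# $flags is a hash ref: { item_number => [ flag_for_col0, flag_for_col1 ] }
sub mark_lines {
    my ( $flags, @lines ) = @_;
    my @item_line = ( 0, 0 );    # running count of non-gap items per column
    my @out;
    for my $line (@lines) {
        my @field = ( $line =~ /(\S)(\S)\s*$/ );
        if ( @field == 2 ) {
            for my $f ( 0, 1 ) {
                next if $field[$f] eq '-';    # gaps do not advance the counter
                $item_line[$f]++;
                my $slot = $flags->{ $item_line[$f] };
                $field[$f] = lc $field[$f] if $slot and $slot->[$f];
            }
            $line =~ s/\S\S(\s*)$/$field[0]$field[1]$1/;
        }
        push @out, $line;
    }
    return @out;
}

# Made-up data: item 2 of column 0 and item 1 of column 1 are flagged.
my %flags  = ( 2 => [ 1, undef ], 1 => [ undef, 1 ] );
my @edited = mark_lines( \%flags, "1 AK\n", "2 -L\n", "3 CM\n" );
print @edited;
```

The '-' in the second sample line is skipped without bumping the counter for column 0, which is why the "C" on the third line counts as item 2 of that column and gets lowercased.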

Update to add some commentary on your code: Sorry about dissing your indentation. When I put your code into emacs, it was fine. Alas, you have to remember, when posting code here, that mixing tabs and spaces for indentation creates an unattractive appearance inside our beloved code tags; expand tabs to spaces (using 8-column tab stops) before posting.
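If your editor won't do the conversion for you, something along these lines should work. It is only a sketch: detab is a name I made up, and the two sample lines are invented. Text::Tabs is a core module, and its expand() pads each tab out to the next tab stop, so a tab that follows four spaces becomes four spaces, not eight.

```perl
use strict;
use warnings;
use Text::Tabs;    # core module; exports expand() and $tabstop

$tabstop = 8;      # the module's default, set here only to make it explicit

# Hypothetical wrapper: expand tabs in a list of lines, return the new lines.
sub detab {
    my @lines = @_;
    return expand(@lines);
}

# Invented sample: one line indented with a tab, one with 4 spaces plus a tab.
my @clean = detab( "\tmy \$x = 1;\n", "    \tprint \$x;\n" );
print @clean;
```

Both sample lines come out with the same 8-space indent. From the shell, perl -MText::Tabs -ne 'print expand($_)' yourfile should do the same job on a whole file.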

Apart from that, after removing all the unnecessary (commented out and blank) lines, it took a while to figure out which file name should be given first on the command line (i.e. what do the names "scorecons" and "csa" have to do with the file contents, and which is which anyway?) The code I posted would be improved if the $Usage message were rephrased to make this clear -- in my version, what you posted as "file1" should be the first file named on the command line. (But I think that's clear from the code itself, whereas it's a lot harder to tell in the code you posted -- all the more reason to make sure you provide a "$Usage" synopsis that is easy to get.)
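For what it's worth, a more explicit synopsis might look like the sketch below. The descriptions of the two files are my guesses at wording, and usage_text and check_args are hypothetical names. Unlike the script I posted, this version checks both arguments instead of only the first.

```perl
use strict;
use warnings;

# Hypothetical wording: the descriptions of the two files are assumptions.
sub usage_text {
    my ($prog) = @_;
    return <<"END";
Usage: $prog file1 file2 > file2.new
  file1  list of positions to mark (first argument)
  file2  table to be edited; the edited copy goes to STDOUT
END
}

# Returns an error string, or an empty string if the arguments look fine.
sub check_args {
    my @args = @_;
    return "expected exactly 2 file names, got " . scalar(@args) . "\n"
        unless @args == 2;
    for my $file (@args) {
        return "not a readable file: $file\n" unless -f $file and -r _;
    }
    return '';
}

# Intended use at the top of a script:
#   if ( my $problem = check_args(@ARGV) ) { die $problem . usage_text($0) }
```

Returning the error text rather than dying inside check_args keeps the check easy to reuse and lets the caller decide how to report it.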

You seem to be doing stuff that has no bearing at all on the task as you described it. On top of that, I didn't see anything in your code that would actually print out an edited version of file2 (which was supposed to be the task, right?). A "work in progress", indeed...
