PerlMonks  

Re: Matching and combining two text files

by GrandFather (Saint)
on Jan 23, 2012 at 04:27 UTC (#949304)


in reply to Matching and combining two text files

The standard technique for looking stuff up is to use a hash:

use strict;
use warnings;

# Sample contents of the first file: each parcel line is followed by its doc numbers
my $file1 = <<FILE1;
parcel# 12345
doc num 123
doc num 456
doc num 789
parcel# 67890
doc num 342
doc num 657
doc num 876
FILE1

# Sample contents of the second file: one doc number plus its data per line
my $file2 = <<FILE2;
doc num 342 data data data data data data data data
doc num 657 data data data data data data data data
doc num 876 data data data data data data data data
doc num 123 data data data data data data data data
doc num 456 data data data data data data data data
doc num 789 data data data data data data data data
FILE2

my %docs;    # doc number => parcel number
my $currParcel;

# Pass 1: remember which parcel each doc number belongs to
open my $f1In, '<', \$file1;
while (<$f1In>) {
    chomp;
    next if ! $_;

    if (/parcel#\s+(\d+)/) {
        $currParcel = $1;
        next;
    }

    next if ! defined $currParcel || ! /^doc num (\d+)/;
    $docs{$1} = $currParcel;
}
close $f1In;

# Pass 2: look up each doc number and print the combined line
open my $f2In, '<', \$file2;
while (<$f2In>) {
    chomp;
    next if ! /doc num\s+(\d+)\s+(.*)/;

    if (! exists $docs{$1}) {
        warn "Parcel not known for $1\n";
        next;
    }

    print "parcel# $docs{$1} doc num $1 $2\n";
}
close $f2In;

Prints:

parcel# 67890 doc num 342 data data data data data data data data
parcel# 67890 doc num 657 data data data data data data data data
parcel# 67890 doc num 876 data data data data data data data data
parcel# 12345 doc num 123 data data data data data data data data
parcel# 12345 doc num 456 data data data data data data data data
parcel# 12345 doc num 789 data data data data data data data data

However, this task looks like it should really be using a database. If there are more than a few hundred entries in the files and the data is likely to be referenced more than a handful of times, a database will make your life much happier (eventually).
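For a rough idea of what the database route might look like, here is a minimal sketch. It assumes DBI with DBD::SQLite is installed and uses an in-memory database. The table names, column names and the joinedLines helper are made up for illustration, and the rows are a cut-down version of the sample data plus one doc number (999) that has no parcel.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed. Table and column names are
# hypothetical, chosen only for this illustration.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    {RaiseError => 1, AutoCommit => 1});

$dbh->do('CREATE TABLE parcel_docs (doc_num INTEGER PRIMARY KEY, parcel INTEGER NOT NULL)');
$dbh->do('CREATE TABLE doc_data (doc_num INTEGER PRIMARY KEY, data TEXT)');

# Load a few rows shaped like the sample files
my $addParcelDoc = $dbh->prepare('INSERT INTO parcel_docs (doc_num, parcel) VALUES (?, ?)');
$addParcelDoc->execute(@$_) for [123, 12345], [456, 12345], [342, 67890];

my $addDocData = $dbh->prepare('INSERT INTO doc_data (doc_num, data) VALUES (?, ?)');
$addDocData->execute(@$_) for [342, 'data data'], [123, 'data data'], [999, 'data data'];

# The join takes the place of the hash lookup. LEFT JOIN keeps docs
# with no known parcel so they can still be reported.
sub joinedLines {
    my ($dbh) = @_;
    my $rows = $dbh->selectall_arrayref(<<'SQL');
SELECT p.parcel, d.doc_num, d.data
FROM doc_data d
LEFT JOIN parcel_docs p ON p.doc_num = d.doc_num
ORDER BY d.doc_num
SQL

    my @lines;
    for my $row (@$rows) {
        my ($parcel, $docNum, $data) = @$row;

        if (! defined $parcel) {
            warn "Parcel not known for $docNum\n";
            next;
        }

        push @lines, "parcel# $parcel doc num $docNum $data";
    }

    return @lines;
}

print "$_\n" for joinedLines($dbh);
```

This should print the combined lines for doc nums 123 and 342 and warn about 999. For real data you would point the DSN at a file (or at whatever database the company already uses) and load the tables once rather than on every run.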

True laziness is hard work

Re^2: Matching and combining two text files
by koolgirl (Hermit) on Jan 23, 2012 at 04:36 UTC

    Thanks, GrandFather. I suspected as much about the hash, but my experience with them is a bit limited, so I had a hard time envisioning how to match up the keys and values, although now it seems obvious. Yes, the company I'm working for is using a db; I'm actually writing the code to put the data there (creating a .csv out of all the collected data). Unfortunately, in doing so I have to deal with about half a million records, and even a small chunk of that to work on and test is mind boggling.

    Half of the time since I began working as a Perl programmer *sniff....koolgirl's growing up...* I feel brilliant; the other half of the time I feel like a complete dumb a$#. I guess it evens out eventually?
