Problem:
Inner file: Consists of roughly 154 records. If I could match all 154 against the outer file, I would be very happy.
Outer file: Consists of about 256 records which contain various info about a user.
Those two files are comma delimited and have some other info than names as well. They vary in their number of fields.
The only thing these two files have in common is some similarity in their names. To make things more difficult, the inner file is maintained manually, so a human error factor has to be taken into account. Also, a name in the outer file might include a middle name.
Pseudo code:
To make sure that I can achieve the maximum number of hits, I split the names of the inner file into:
lastname substr[$firstname, 0, 2]
so I can catch:
- Smith Steve
- Smith Stephan
and so on, but not:
- Smith Agnes
- Smith John
- Smith Adam
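The prefix idea above can be sketched as a small helper. This is only an illustration: the name `prefix_key` is made up for the example, and it assumes names are written as "Lastname Firstname" separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): build "LASTNAME FI"
# from a "Lastname Firstname" string, i.e. the last name plus the first two
# letters of the first name, upper-cased.
sub prefix_key {
    my ($name) = @_;
    $name =~ s/^\s+|\s+$//g;                   # trim leading/trailing whitespace
    my ($lname, $fname) = split /\s+/, $name;  # anything after the first name is ignored
    return unless defined $lname && defined $fname;
    return uc("$lname " . substr($fname, 0, 2));
}

for my $n ('Smith Steve', 'Smith Stephan', 'Smith Agnes') {
    printf "%-14s => %s\n", $n, prefix_key($n);
}
```

With this helper, Smith Steve and Smith Stephan both give the key SMITH ST, while Smith Agnes gives SMITH AG.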
Only important elements are left below to make it short:

    @file1 = <PHONEBOOK>;
    @file2 = <MATCHED>;

    foreach my $outer (@file1) {
        my @record = split(',', $outer);   # so I can get name only
        # ... $fullname is taken from @record here (left out to keep it short) ...
        $fullname =~ tr/()//d;             # I don't want ( or ) in it
        $fullname =~ s/\s+/ /g;            # Substitute one or more spaces anywhere with one space only (per space matched)

        # Now, I have $fullname with info I want to match against.
        foreach my $inner (@file2) {
            my @fields = split(',', $inner);                  # to get name only
            my ($lname, $fname) = split(/\s+/, $fields[5]);   # so I can do a substring on the first name
            my $name = "$lname " . substr($fname, 0, 2);
            $name =~ tr/a-z/A-Z/;   # You can tell me to use uc() function but for now I will use what I know

            # Anchored prefix match; \Q...\E keeps the name literal.
            if ($fullname =~ m/^\Q$name\E/) {
                # at this point if a string from my inner loop matches the one from the outer, print it out
            }
        }
    }

So, in theory this code should work (I know it should because it was working at some point) and from what I wrote above, doesn't it look like I understand what I want to do?

Using hashes here would be nice if each file only had one element per line, where an element from line 1 in the outer file would be a key for a value of the element from line 1 in the inner file, which is not the case here.
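For anyone who wants to try the nested-loop idea without the real files, here is a minimal self-contained sketch. The sample records are invented, and the column positions are assumptions: the outer name is taken from field 0 (the post does not say which field it is), and the inner name from field 5 as in the loop above. The sub name `match_names` is also made up for this example.

```perl
use strict;
use warnings;

# Made-up sample data; real files would be read from the two filehandles.
my @outer = (
    "SMITH  STEVEN (STEVE) J,x1234\n",
    "SMITH AGNES,x5678\n",
);
my @inner = (
    "a,b,c,d,e,Smith Steve\n",
    "a,b,c,d,e,Smith Stephan\n",
    "a,b,c,d,e,Smith John\n",
);

# Return [inner name, outer name] pairs where the outer name starts with
# "LASTNAME" plus the first two letters of the inner first name.
sub match_names {
    my ($outer, $inner) = @_;
    my @hits;
    for my $o (@$outer) {
        chomp(my $line = $o);
        my ($fullname) = split /,/, $line;   # assumed: name is field 0
        $fullname =~ tr/()//d;               # drop parentheses
        $fullname =~ s/\s+/ /g;              # collapse whitespace runs
        for my $i (@$inner) {
            chomp(my $rec = $i);
            my @fields = split /,/, $rec;
            my ($lname, $fname) = split /\s+/, $fields[5];
            next unless defined $fname;      # skip names without a first name
            my $name = uc("$lname " . substr($fname, 0, 2));
            # Anchored, literal prefix match. No /g here, so no match
            # position is carried over between inner-loop iterations.
            push @hits, [$fields[5], $fullname] if $fullname =~ /^\Q$name\E/;
        }
    }
    return @hits;
}

print "$_->[0] <-> $_->[1]\n" for match_names(\@outer, \@inner);
```

With this sample data, both Smith Steve and Smith Stephan hit the same outer record, which shows how loose a two-letter prefix is. Because the match is anchored at the start of the name, a trailing middle name or initial in the outer file does not get in the way.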
In reply to Re: Re: Comparing two files by bman
in thread Comparing two files by bman