Dear All,

Please find the script below which, although it doesn't have any immediate errors, is not achieving what I was hoping it would!

#!/usr/bin/perl -w

print("Importing two data files \n");

# Let's open up file A, which reads data from Names_Ticker to FILE_A
open(FILE_A, "< Names_Ticker.txt")
    or die "Couldn't open Names_Ticker for reading: $!\n";
@fileA = <FILE_A>;
close(FILE_A);
print "@fileA \n";

# Let's open up file B, which reads data from Names_Offer to FILE_B
open(FILE_B, "< Names_Offer.txt")
    or die "Couldn't open Names_Offer for reading: $!\n";
@fileB = <FILE_B>;
close(FILE_B);
print "@fileB \n";

# Print the lists generated with commas... sort the format out!
#sub commify_series {
#    my $sepchar = grep(/,/ => @_) ? ";" : ",";
#    (@_ == 0) ? '' :
#    (@_ == 1) ? $_[0] :
#    (@_ == 2) ? join(" and ", @_) :
#    join("$sepchar ", @_[0..($#_-1)], "and $_[-1]");
#}
#
#foreach $aref (@fileB) {
#    print "This lists contains: " . commify_series(@aref) . ".\n";
#
#}

print("Comparing company names between two files\n");

# Essentially we want to find elements that are in one array but not another,
# that is, find elements in @fileA that aren't in @fileB.
# Solution? Build hash keys of @fileB to use as a lookup table. Then check each
# element in @fileA to see if it is in @fileB.

%seen  = ();    # lookup table to test membership of @fileB
@aonly = ();    # answer

# build lookup table
foreach $item (@fileB) { $seen{$item} = 1 }

# find only elements in @fileA and not in @fileB
foreach $item (@fileA) {
    unless ($seen{$item}) {
        # it's not in %seen, so add to @aonly
        push(@aonly, $item);
    }
}

print "@aonly \n";

print("Saving matched company records as new file\n");

Ideally, as suggested in the previous email, I want to merge the two sets of records, @fileA and @fileB, into a single new file, which is given as @aonly.

Problems I face:
  1. I would like to convert the contents of @fileA and @fileB entirely to lower case.
  2. Can I add a search condition to this script that compares only the first 4 letters of each string in the arrays (taken either from the start or from the end of the line)?
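
To make the two points concrete, here is a minimal sketch of what I have in mind, not a tested solution. The helper names match_key and only_in_a, the $len and $from_end parameters, and the sample company names are all made up for illustration; the real data would come from Names_Ticker.txt and Names_Offer.txt. It lower-cases and trims each line, then compares only the first (or last) 4 characters:

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line into a comparison key.
# $len and $from_end are illustrative parameters, not part of the script above.
sub match_key {
    my ($line, $len, $from_end) = @_;
    my $key = lc $line;               # point 1: compare in lower case
    $key =~ s/^\s+|\s+$//g;           # trim whitespace, including the newline
    return $key if length($key) <= $len;
    # point 2: keep only $len characters from the start or the end
    return $from_end ? substr($key, -$len) : substr($key, 0, $len);
}

# Hypothetical helper: elements of @$list_a whose key is not among the keys of @$list_b.
sub only_in_a {
    my ($list_a, $list_b, $len, $from_end) = @_;
    my %seen;
    $seen{ match_key($_, $len, $from_end) } = 1 for @$list_b;
    return grep { !$seen{ match_key($_, $len, $from_end) } } @$list_a;
}

# Made-up sample data standing in for the contents of the two files
my @fileA = ("Acme Corp\n", "Globex Ltd\n", "Initech\n");
my @fileB = ("ACME Corporation\n", "initech inc\n");

my @aonly = only_in_a(\@fileA, \@fileB, 4, 0);
print @aonly;    # prints "Globex Ltd"
```

Passing 1 instead of 0 as the last argument would compare the last 4 characters instead. A 4-character key is a very coarse match, so unrelated names that share a prefix would be treated as the same company.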

Your advice will be appreciated.

regards

Edited by Chady -- cleaned up formatting

In reply to Fuzzy Matching by Tan
