G'day james28909,
"Anyway, let me start off by posting example code and files:"
For future reference, please post a short, representative sample of your data here. I tried to download the zip file you linked to, but the download failed:
$ wget https://dl.dropboxusercontent.com/u/64707444/monks/monks.zip
--2015-10-29 15:29:58-- https://dl.dropboxusercontent.com/u/64707444/monks/monks.zip
Resolving dl.dropboxusercontent.com (dl.dropboxusercontent.com)... 199.47.217.101
Connecting to dl.dropboxusercontent.com (dl.dropboxusercontent.com)|199.47.217.101|:443... connected.
ERROR: The certificate of `dl.dropboxusercontent.com' is not trusted.
ERROR: The certificate of `dl.dropboxusercontent.com' hasn't got a known issuer.
[Perhaps I could've tried harder to get this but I don't really have the time and I shouldn't have to, anyway.]
Here are some tips on the code you presented.
When opening files, always check for problems. Either use the autodie pragma or hand-craft messages (see open for examples).
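As a minimal sketch of both options (the sub names are made up for illustration, and I'm assuming you just want the lines of a text file), something along these lines:

```perl
use strict;
use warnings;

# Hand-crafted check: the message names the file and includes the OS error ($!).
sub read_lines_checked {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Can't open '$path' for reading: $!";
    chomp(my @lines = <$fh>);
    close $fh or die "Can't close '$path': $!";
    return @lines;
}

# The same thing with autodie: open and close throw an exception on failure,
# so no explicit "or die" is needed. autodie is lexically scoped to this sub.
sub read_lines_autodie {
    my ($path) = @_;
    use autodie;
    open my $fh, '<', $path;
    chomp(my @lines = <$fh>);
    close $fh;
    return @lines;
}
```

Either way, a missing or unreadable file stops the script with a useful message instead of silently giving you an empty list.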
Repeatedly opening files in a loop, and reading their entire contents multiple times, is rarely (if ever) a good idea. I see that you've done this in both a while and a for loop. Aim to open and read once. If you need to jump around in an opened file, consider seek and tell.
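Here's a rough sketch of the "open once, jump around" idea using seek and tell. The sub names are hypothetical and this isn't based on your actual data: it records where each line starts in a single pass, then jumps back to any line on the same filehandle.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# One pass over an already opened filehandle: remember where each line starts.
sub index_line_offsets {
    my ($fh) = @_;
    my @offsets;
    seek $fh, 0, SEEK_SET or die "Can't seek: $!";
    while (1) {
        my $pos = tell $fh;                    # position before reading the line
        defined(my $line = <$fh>) or last;     # stop at end of file
        push @offsets, $pos;
    }
    return \@offsets;
}

# Jump straight to a previously indexed line (0-based) without re-opening.
sub line_at {
    my ($fh, $offsets, $n) = @_;
    seek $fh, $offsets->[$n], SEEK_SET or die "Can't seek: $!";
    my $line = <$fh>;
    chomp $line;
    return $line;
}
```

If the file is small, simply reading it into an array or hash once is usually simpler than seeking.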
When you read "file1" (for the first time), it may be better to store the data in a hash. For example, instead of
push( @original, $rightside );
perhaps something closer to
++$original{$rightside};
You can then lose the "for (@original) {...}" loop altogether, and change
if ( $last =~ $_ ) {
to something like
if ($original{$last}) {
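Put together, the hash approach might look roughly like this. It's only a sketch: build_lookup and matching are names I've invented, and I'm assuming the $rightside and $last values have already been extracted from the two files.

```perl
use strict;
use warnings;

# Build the lookup once, from the values taken from "file1".
sub build_lookup {
    my (@rightsides) = @_;
    my %original;
    ++$original{$_} for @rightsides;    # value is the number of occurrences
    return \%original;
}

# Keep only the "file2" values that also appeared in "file1" (exact match).
sub matching {
    my ($original, @lasts) = @_;
    return grep { $original->{$_} } @lasts;
}
```

Each lookup is a single hash access, so the inner loop over @original disappears.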
Also, your use of a regex match ($last =~ $_) seems questionable. I haven't delved too deeply into this, but a straight equality check ($last eq $_) looks like it might be a better idea.
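To show why the two tests can disagree, here's a small sketch (the sub names and sample strings are mine, not from your data). A string used as a pattern matches anywhere in the target, and characters such as "." are treated as metacharacters.

```perl
use strict;
use warnings;

# $candidate is interpreted as a regex: substring matches and metacharacters count.
sub regex_hit { my ($last, $candidate) = @_; return $last =~ $candidate ? 1 : 0 }

# $candidate must be identical to $last.
sub exact_hit { my ($last, $candidate) = @_; return $last eq $candidate ? 1 : 0 }

print regex_hit('abc123', 'c1'), "\n";    # 1: 'c1' occurs inside 'abc123'
print exact_hit('abc123', 'c1'), "\n";    # 0: the strings differ
print regex_hit('axc', 'a.c'), "\n";      # 1: '.' matches any character
print exact_hit('axc', 'a.c'), "\n";      # 0: the strings differ
```

A candidate containing something like an unbalanced "(" would also make the regex version die with a compilation error, which eq never does.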
These suggestions have been intentionally vague. Without any input and only erroneous expected output (you wrote: "EDIT: It seems there is indeed an error in the output"), I am somewhat loath to suggest anything more concrete with regard to the actual processing.
If you do provide sample input and real expected output, I (or another monk) might be able to provide a better answer.
— Ken
In reply to Re: Getting data from second file, based on first files contents;
by kcott
in thread UPDATED - Getting data from second file, based on first files contents;
by james28909