
I am actually new to scripting languages, and I didn't quite follow what you meant in step 2.

Re^3: Filtering Output from two files
by roboticus (Chancellor) on Feb 04, 2018 at 14:25 UTC

    pvighneshmufc:

    He meant that the following lines are inside of a loop, like this:

        # read file1 into a %hash
        ... code to do that here ...

        # inside a loop
        while (my $line = <$file2>) {
            # read file2 line by line
            ... this was done in the loop condition above ...

            # split the $line to @fields at |

            # if the first $fields[0] exists in the %hash,
            # print the whole $line to file 3
        }
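The outline above could be filled in roughly as follows. This is only a minimal sketch under stated assumptions: the helper name `filter_by_keys` is hypothetical, the sample records are made up, and in-memory filehandles stand in for the real file1, file2 and file3 so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line of $data_fh whose first
# |-separated field appears as a line in $keys_fh to $out_fh.
sub filter_by_keys {
    my ($keys_fh, $data_fh, $out_fh) = @_;

    # Read file1 once, before the loop, into a lookup hash.
    my %wanted;
    while (my $key = <$keys_fh>) {
        chomp $key;              # drop the newline so keys compare equal
        next if $key eq '';      # skip blank lines
        $wanted{$key} = 1;
    }

    # Read file2 line by line and test its first field.
    while (my $line = <$data_fh>) {
        # | is special in a regex, so it is escaped; the -1 limit
        # keeps empty trailing fields, so $fields[0] is always defined.
        my @fields = split /\|/, $line, -1;
        print {$out_fh} $line if exists $wanted{ $fields[0] };
    }
    return;
}

# Made-up sample data standing in for the real files.
my $file1 = "COA213345\nCOA999999\n";
my $file2 = "COA213345|alpha|1\nCOA000001|beta|2\nCOA999999|gamma|3\n";
my $file3 = '';

open my $keys_fh, '<', \$file1 or die "Cannot open file1: $!";
open my $data_fh, '<', \$file2 or die "Cannot open file2: $!";
open my $out_fh,  '>', \$file3 or die "Cannot open file3: $!";

filter_by_keys($keys_fh, $data_fh, $out_fh);
close $out_fh;

print $file3;   # the COA213345 and COA999999 lines
```

With real files, the three `open` calls would take file names instead of references to strings; the rest stays the same.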

    This is a relatively common question, so LanX gave you the outline of a good solution to the problem.

    A frequent mistake is to try to read *both* files inside the loop, giving one of two bad outcomes:

    • Either the first file is read to the end during the first pass of the loop, so the code can only find a match for the first line of the second file, or
    • the code re-opens the first file each time, and therefore can find all the matches, but runs extremely slowly(1) because it reads the first file completely for each line in the second file.

    (1) Extremely slowly in the relative sense--for small files you may not notice it. But if your files get large enough, you'll wonder why such a fast computer is so freakin' slow.
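The first of those outcomes is easy to reproduce. The sketch below uses made-up data and two hypothetical helpers, `nested_read_matches` and `hash_lookup_matches`, with in-memory filehandles in place of real files; it is an illustration of the pitfall, not anyone's actual code from this thread.

```perl
use strict;
use warnings;

# Made-up sample data; every ID in file2 also appears in file1.
my $file1 = "A\nB\nC\n";
my $file2 = "C|x\nA|y\nB|z\n";

# Pitfall: both files are read inside the loop, and the handle on
# file1 is never rewound or re-opened.
sub nested_read_matches {
    my ($keys, $data) = @_;
    open my $fh1, '<', \$keys or die "Cannot open keys: $!";
    open my $fh2, '<', \$data or die "Cannot open data: $!";

    my @matches;
    while (my $line = <$fh2>) {
        my ($id) = split /\|/, $line, 2;
        # After the first outer pass $fh1 is at end-of-file,
        # so this inner loop never runs again.
        while (my $key = <$fh1>) {
            chomp $key;
            push @matches, $line if $key eq $id;
        }
    }
    return @matches;
}

# Suggested shape: file1 goes into a hash once, before the loop.
sub hash_lookup_matches {
    my ($keys, $data) = @_;
    open my $fh1, '<', \$keys or die "Cannot open keys: $!";
    open my $fh2, '<', \$data or die "Cannot open data: $!";

    my %wanted;
    while (my $key = <$fh1>) {
        chomp $key;
        $wanted{$key} = 1;
    }

    my @matches;
    while (my $line = <$fh2>) {
        my ($id) = split /\|/, $line, 2;
        push @matches, $line if exists $wanted{$id};
    }
    return @matches;
}

my @nested = nested_read_matches($file1, $file2);
my @hashed = hash_lookup_matches($file1, $file2);

printf "nested read: %d match(es)\n", scalar @nested;   # 1
printf "hash lookup: %d match(es)\n", scalar @hashed;   # 3
```

Re-opening file1 inside the outer loop would make the nested version find all three matches, but at the cost of reading file1 once per line of file2, which is the second outcome described above.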

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      The file has around 7 lakh (700,000) lines, so yeah :p

        If your File1 has 700,000 lines, rest assured that a hash can easily accommodate this number of lines or even many more — if the lines are sufficiently short, i.e., not thousands of characters!

        One pitfall to avoid: When you read each line from your File1, it will have some kind of line-end delimiter, typically a newline. This has to be removed (see chomp) before the line is added to the lookup hash, because a COA213345 field split from the beginning of a line in File2 will have no such delimiter. (Looking up a key in a hash (see exists) is essentially an exact string-equality (eq) comparison.)
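A small sketch of that pitfall, assuming made-up sample lines and a hypothetical helper `build_lookup` that builds the hash either with or without chomp:

```perl
use strict;
use warnings;

# Hypothetical helper: build a lookup hash from key lines, with or
# without chomp, so the two cases can be compared side by side.
sub build_lookup {
    my ($text, $use_chomp) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

    my %lookup;
    while (my $key = <$fh>) {
        chomp $key if $use_chomp;
        $lookup{$key} = 1;
    }
    return \%lookup;
}

# Made-up File1 contents; the second ID is invented for the example.
my $file1 = "COA213345\nCOA213346\n";

# First field of a made-up File2 line: "COA213345", no newline.
my ($id) = split /\|/, "COA213345|some|data\n", 2;

my $raw     = build_lookup($file1, 0);   # keys look like "COA213345\n"
my $chomped = build_lookup($file1, 1);   # keys look like "COA213345"

print "without chomp: ", (exists $raw->{$id}     ? "found" : "not found"), "\n";
print "with chomp:    ", (exists $chomped->{$id} ? "found" : "not found"), "\n";
```

Without chomp the stored key still ends in a newline, so the lookup fails even though the IDs look identical when printed.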

        Perhaps take a look at some of the articles in the Input and Output section of the Monastery's Tutorials.


        Give a man a fish:  <%-{-{-{-<

        Lakh is not really a standard English word!

        Maybe you should show some effort in programming AND in your communication skills?

        So yeah :p

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Wikisyntax for the Monastery

        PS: We are still happy to help you fix your code, as soon as you show some.

Re^3: Filtering Output from two files
by LanX (Saint) on Feb 04, 2018 at 12:17 UTC
      Thanks a lot. Could you suggest some tutorials to start with? I am very new to this, so please suggest something for beginners :p

        There's the tutorials section of this site to start with.

Re^3: Filtering Output from two files
by hippo (Archbishop) on Feb 04, 2018 at 12:38 UTC