freakpea has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys. Sorry, I'm a bit new to this, so I'm struggling a bit. I have a data file with strings that I need to search for in another file. At the moment the match is against the whole line, but I would like to change it so that the data can match just a piece of the line.

DATA file

    hostname
    banner exec

search file

    hostname XXXXX
    banner exec ******************************************************************************
    banner exec
    banner exec This device belongs to the

Here is my code below

    use strict;
    use warnings;

    my %file2;
    open my $file2, '<', shift or die;
    while ( my $line = <$file2> ) {
        ++$file2{$line};
    }

    open my $file1, '<', shift or die;
    while ( my $line = <$file1> ) {
        print $line unless $file2{$line};
    }

Re: searching string from one file in another
by hippo (Archbishop) on Feb 28, 2017 at 13:25 UTC

    Calling the files in your code $file2 and $file1 doesn't help us to see which one is supposed to be "data" and which is supposed to be "search". Is this the sort of thing you are after:

        #!/usr/bin/env perl
        use strict;
        use warnings;

        my $data = <<EOT;
        hostname
        banner exec
        EOT

        my $search = <<EOT;
        hostname XXXXX
        banner exec ******************************************************************************
        banner exec
        banner exec This device belongs to the
        EOT

        for my $substr (split /\n/, $data) {
            for my $line (split /\n/, $search) {
                print "Matched '$substr' in '$line'\n" if index ($line, $substr) > -1;
            }
        }

      ++ on the bad variable names. The fact that the variables have numbers in them is a first sign that the names are probably not well chosen, but having two variables with the same name (a hash and a scalar both named file2) should absolutely be avoided.

      Your proposal works fine if the lines in $data are unique (the fact that they are counted using $file2{$line}++ makes me think they may not be), but it will print duplicates otherwise.
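
One way to avoid the duplicates, as a minimal sketch: drop repeated data strings first and stop at the first hit for each line. The helper name matching_lines and the sample arrays are hypothetical, and it assumes both files have already been read into arrays of chomped lines.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each line of @$search that contains at least
# one string from @$data, once per line, in the original order.
sub matching_lines {
    my ($data, $search) = @_;
    my %seen;
    # Drop empty strings (they would match every line) and repeated strings.
    my @needles = grep { length($_) && !$seen{$_}++ } @$data;
    my @found;
    for my $line (@$search) {
        for my $needle (@needles) {
            if (index($line, $needle) > -1) {
                push @found, $line;
                last;    # stop at the first hit so the line is reported once
            }
        }
    }
    return @found;
}

# Sample input for illustration only; 'hostname' is deliberately repeated.
my @data   = ('hostname', 'banner exec', 'hostname');
my @search = ('hostname XXXXX', 'banner exec This device belongs to the', 'interface Vlan1');
print "$_\n" for matching_lines(\@data, \@search);
```

With this shape a search line is printed once even if several data strings occur in it.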

Re: searching string from one file in another
by Eily (Monsignor) on Feb 28, 2017 at 13:29 UTC

    Your post formatting is a bit off (and there's that trailing "This device belongs to the") but I've seen it updated once since I first opened it, so I suppose you are working on it. If you are having difficulties, you can look at Markup in the Monastery, or ask (maybe in the Chatterbox).

    Anyway, the simplest way to check that a string is included in another is to use the index function (it will return the position of the match when there is one, or -1 when the substring can't be found). This would be something like:

        my %file2;
        open my $file2, '<', shift or die;
        while ( my $line = <$file2> ) {
            chomp($line);    # Do not keep the "\n" at the end of $line
            ++$file2{$line};
        }

        open my $file1, '<', shift or die;
        while ( my $line = <$file1> ) {
            SEARCH: for my $search (keys %file2) {
                # The parentheses matter: "index $line, $search > -1" would
                # parse as index($line, ($search > -1)).
                print $line and last SEARCH if index($line, $search) > -1;    # Edit: GotToBTru pointed out I forgot "index"
            }
        }

    The for loop can be written with grep instead:

        print $line if grep { index($line, $_) > -1 } keys %file2;

    You should use whatever is easier for you to understand and edit.

    ++ to hippo on the variable names (if you have to add a number to a variable name, it's probably not the right name).

    Edit2: s/a bit of\K/f/; Thanks to Lotus1.

      I think the intention is clearer if you use 'any' (from List::MoreUtils) rather than 'grep'. A secondary benefit is that it is probably slightly faster, because it can quit as soon as it finds one match.
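
A minimal sketch of the 'any' version follows. To keep it runnable without a CPAN install it imports any from the core List::Util module, which is an assumption on my part (it needs List::Util 1.33 or newer); the any from List::MoreUtils is called the same way here. The helper name contains_any and the sample strings are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(any);    # assumption: List::Util 1.33 or newer

# Hypothetical helper: true if $line contains at least one of @needles.
# any() stops at the first needle that matches.
sub contains_any {
    my ($line, @needles) = @_;
    return any { index($line, $_) > -1 } @needles;
}

# Sample input for illustration only.
my @needles = ('hostname', 'banner exec');
for my $line ('hostname XXXXX', 'interface Vlan1') {
    print "$line\n" if contains_any($line, @needles);
}
```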

      Update: Corrected typo and added link.

      Bill
      Hi. What should the code be if I am trying to do a regex match (substring match), i.e. look for "public" from file2 in "public sector union" in one of the lines in file1? Thanks a lot.

        Hello stray_tachyon. Please post your question in a new post with the following information:

        • What your input looks like (a short example)
        • What your expected output looks like (from the example input)
        • What you have tried so far
        • How that failed and where you are stuck
        You can also read How do I post a question effectively?