rakesh01 has asked for the wisdom of the Perl Monks concerning the following question:

How do I search for every occurrence of strings within an array and display them?

Hi, I have a text file called keywords.txt that contains the keywords call,please,urgent. I have another text file called Skypelogs.txt, which contains a log of all the Skype chat conversations that have occurred so far. I need to search Skypelogs.txt for every occurrence of every keyword located in keywords.txt (call, please, urgent) and display each matching line on the screen. When it is finished, it should print "Scan completed" on the screen. I came up with a program, but for some reason I am not able to get the split function to split the keywords in keywords.txt on the delimiter ',' and store them in an array. Can someone help me out with it, as it is very important and urgent? Any help is much appreciated.

    #!/usr/bin/perl
    open FILE,"keywords.txt" or die $!;
    # read file into an array
    my @data = <FILE>;
    my @values = split(' ', @data, 9);
    close (FILE);
    open FILE,"Skypelogs.txt" or die $!;
    # read file into another array
    my @array = <FILE>;
    $found;
    foreach $var (@data) {
        foreach $line(@array) {
            if (index($var, $line) != "") {
                $found .= $line. "\n";
            }
        }
    }
    print $found;
    # close file
    close (FILE);

Thanks in advance,
Rakesh

Re: How do I search for every occurrence of strings within an array and display them?
by AnomalousMonk (Archbishop) on Sep 27, 2010 at 22:42 UTC

    Thank you for updating to properly format the code. Preliminary inspection reveals a few problems:

    • The scalar  $found; is sitting all alone before the first  foreach block and was probably meant to be a  my $found = ''; statement. This suggests you are not running with warnings and strictures. Do yourself a big favor and  use warnings; and  use strict; in all your programs. Additionally,  use diagnostics; is often very helpful to Perl beginners.
    • If there is or might be a newline at the end of the line you read from the keywords file, you don't deal with it. See chomp.
    • In your example, you have only a single line in the keyword file, but you read the file to an array (which you then improperly try to split). I don't understand this approach. If there may be multiple lines in the keyword file, keyword processing can be done in a loop, on a line-by-line basis:
      • Try to read a line;
      • If read was successful, pre-process line (e.g., remove any newline);
      • split keyword line on delimiter;
      • Add individual keywords to  @values array. Update: Changed to  @values from  @data
    • The split statement
          my @values = split(' ', @data, 9);
      has problems.
      • You appear to be splitting on a single space, not the ',' (comma) delimiter that is shown in the example in the OP.
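      • split only splits on a string, not on an array, @data in this case. (split evaluates its second argument in scalar context, so @data yields the number of lines read, not their contents.)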
      • You are using a split limit count of 9, which I don't understand in the context of the stated problem.
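
    The line-by-line keyword processing described above could be sketched roughly as follows. This is only an illustration under assumptions: parse_keywords is a hypothetical helper name (it is not in the OP's code), keywords are assumed to be comma-separated with optional spaces around the commas, and the demo reads from an in-memory string so it runs without keywords.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: collect comma-separated keywords from every line
# of an already opened filehandle.
sub parse_keywords {
    my ($fh) = @_;
    my @keywords;
    while ( my $line = <$fh> ) {
        chomp $line;                  # remove the trailing newline
        next unless $line =~ /\S/;    # skip blank lines
        # split takes a string (one line), not an array of lines
        push @keywords, grep { length } split /\s*,\s*/, $line;
    }
    return @keywords;
}

# Demo on an in-memory file (an assumption made so the sketch is
# self-contained); with a real file, open keywords.txt instead.
my $sample = "call,please,urgent\n";
open my $fh, '<', \$sample or die "Can't open in-memory file: $!";
my @keywords = parse_keywords($fh);
close $fh;

print join( ' ', @keywords ), "\n";    # prints: call please urgent
```

    Collecting into one array this way works the same whether the keyword file has one line or many.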

    You might want to break the problem apart: hard-code the  @values array with some keywords and then get the Skype file read-and-scan operation right, then go back and attack the keyword read-parse-prepare part. (From a quick glance, I would say that the Skype file read-scan is very close to working. Oops: except that you are looping over the keyword file line array  @data instead of the extracted keywords  @values array. Better variable naming (e.g., @keywords instead of @values) might have prevented this. Also: improper index offset comparison spotted by johngg.)

    Update: Added section on multi-line keyword file.

Re: How do I search for every occurrence of strings within an array and display them?
by johngg (Canon) on Sep 27, 2010 at 22:46 UTC

    It is difficult for us to work out what you are trying to do; your description does not seem to gel with the code you show. For instance, you describe (I think) three keywords, viz. "call", "please" and "urgent", in a comma-separated string which you try to split, yet your code is split'ing on whitespace, trying to work on an array rather than a scalar, and you seem to be looking for a maximum of nine fields. Puzzling.

    Some points to consider:-

    • Always use strict; and use warnings; in your code to force some coding discipline and help catch typos.

    • Use meaningful variable names and avoid things like @array and FILE as they convey nothing much to others trying to understand your code.

    • As I mention above, split works on scalars, not arrays.

    • index returns integers indicating position (-1 for not found) not strings so "" will never be returned.

    • Use ne, eq etc. when doing string comparisons, use !=, == etc. for numerical comparisons.

    • Consider using a regular expression rather than index to find your keywords. You can build an expression with alternation, something like if( $line =~ /(keyword1|keyword2|keyword3)/ ) { print "Found $1 in $line\n" } so you can find any of the keywords in a single test rather than having to keep looping over individual keyword searches.
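
    To make the index and regex points concrete, here is a minimal sketch. The keyword list is hard-coded, the sub names line_has_keyword_index and first_keyword_in are made up for illustration, and case-insensitive matching is an assumption about what the OP wants.

```perl
use strict;
use warnings;

my @keywords = ( 'call', 'please', 'urgent' );    # hard-coded for the sketch

# index(STRING, SUBSTRING) returns a position, or -1 when the substring
# is not found, so the test is a numeric comparison against -1.
sub line_has_keyword_index {
    my ( $line, @kw ) = @_;
    for my $kw (@kw) {
        return 1 if index( lc $line, lc $kw ) != -1;
    }
    return 0;
}

# One regex with alternation replaces the inner loop; quotemeta guards
# against keywords that contain regex metacharacters.
my $alternation = join '|', map { quotemeta($_) } @keywords;
my $keyword_re  = qr/($alternation)/i;

# Returns the first keyword found in the line, or undef if none matches.
sub first_keyword_in {
    my ($line) = @_;
    return $line =~ $keyword_re ? $1 : undef;
}

for my $line ( 'Call failed. Aborted.', 'None of the keys appear here.' ) {
    my $hit = first_keyword_in($line);
    print defined $hit ? "Found '$hit' in: $line\n" : "No keyword in: $line\n";
}
```

    Both approaches match substrings, so "call" would also be found inside "recalled".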

    I hope these points are useful and help you towards a solution. If I have misunderstood what you are trying to achieve please ask further and perhaps give a brief example of your data.

    Cheers,

    JohnGG

Re: How do I search for every occurrence of strings within an array and display them?
by ww (Archbishop) on Sep 27, 2010 at 21:27 UTC

    Update: OP (promptly!) fixed the formatting.
    ++ rakesh01.
    Retained my comments, from "---" to "------" only for future newcomers

    ---

    Your "very important and urgent" (twice, yet!) would come a whole lot closer to being my "important and urgent" had you taken time to format your post properly. The input page provides the minimal markup needed here; the preview function should have told you that you were making a mess.

    Please fix it. The Monks are most generous with their assistance... when they read an appropriate question -- and yours appears to be appropriate (though it skirts perilously close to "fix it for me") -- with minimal difficulty (which is NOT the case above).

    Please see Writeup Formatting Tips, Markup in the Monastery, Perl Monks Approved HTML tags (re markup) and On asking for help and How do I post a question effectively? (re the preferred approach to posting a question).

    ------

    Now we could be of more help, I believe, were we provided samples (inside <code> tags, of course) of the keywords file and the logfile, where a few sample lines will suffice.

    Update2: Added in light of the fine replies by AnomalousMonk and johngg

    The analysis and suggestions by these stalwarts are right on the money: strict and warnings (esp. in light of the bare "$found" and the improper use of split); meaningful variable names; and, in this case, use of a regex rather than index.

    And given the disconnects between narrative and code, I've made some assumptions... reflected in these source files:

    keywords.txt:

    call,please,urgent

    Skypelogs.txt:

    001 This is a convstn with "please" and "urgent" in it. SHOULD MATCH
    002 This is another conversation without any of OP' selected keys
    003 Call failed. Aborted. SHOULD MATCH.
    004 Call includes "call." This is foolish but SHOULD MATCH
    005 None of the keys appear here.

    all of which is leadup to the code:

    #!/usr/bin/perl
    # 862294
    use strict;
    use warnings;

    my $file = "keywords.txt";
    open(my $fh_keys, '<', $file) or die "Can't open $file for read: $!";
    my $keys = <$fh_keys>;
    close ($fh_keys);
    chomp $keys;
    my @keys = split(',', $keys);
    my ($keyword1, $keyword2, $keyword3) = @keys;

    my $logfile = "Skypelogs.txt";
    open (my $fh_log, '<', $logfile) or die $!;
    my @log = <$fh_log>;
    my $line;
    for $line(@log) {
        if ($line =~ /($keyword1|$keyword2|$keyword3)/i) {
            my $found .= $line . "\n";
            print $found;
        }
    }

    =head OUTPUT:
    001 This is a convstn with "please" and "urgent" in it. SHOULD MATCH
    003 Call failed. Aborted. SHOULD MATCH.
    004 Call includes "call." This is foolish but SHOULD MATCH
    =cut

    Note, also, the manner in which this uses open() or die; the way we get the individual keywords out of $keys...@keys; and the reset of $found inside the loop (it would be very easy to write this in such a way as to print each line multiple times -- I know; I did it and then had to puzzle a bit to see why).

    + + votes to AnomalousMonk and johngg, please!

Re: How do I search for every occurrence of strings within an array and display them?
by oko1 (Deacon) on Sep 28, 2010 at 03:24 UTC

    Like the good Monks that came before me, I assume that your keywords are a) on one line, b) are separated by commas (maybe even with some space around them), and c) that you want a case-insensitive match. Especially for the latter, a regex is the way to go.

    #!/usr/bin/perl -w
    use strict;

    open my $f, "keywords.txt" or die "keywords.txt: $!\n";
    chomp(my $kw = join '|', split /\s*,\s*/, <$f>);
    close $f;
    my $re = qr/$kw/i;

    open my $s,"Skypelogs.txt" or die "Skypelogs: $!\n";
    while (<$s>){
        print if /$re/;
    }
    close $s;

    --
    "Language shapes the way we think, and determines what we can think about."
    -- B. L. Whorf

Re: How do I search for every occurrence of strings within an array and display them?
by JavaFan (Canon) on Sep 28, 2010 at 12:22 UTC

    Assuming you want to match "words", and not part of words (that is, if one of your keywords is "perl", you don't want it to trigger on "properly"), I'd do the following (untested):

    use 5.010;
    use strict;
    use warnings;
    use autodie;

    my %keywords;    # keyword => count lookup table

    open my $kwh, "keywords.txt";
    while (<$kwh>) {
        chomp;
        $keywords{$_}++ for split /,/;
    }
    close $kwh;

    open my $fh, "Skypelogs.txt";
    while (<$fh>) {
        # print the line if any "word" in it is a known keyword
        print if grep {$keywords{$_}} /(\pL(?:\S*\pL)?)/g;
    }
    close $fh;

    Note the definition of a "word": any sequence of non-space characters that starts and ends with a letter. This is far from perfect.
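
    For comparison, a different way to get whole-word matching (swapped in here for illustration; it is not the approach in the code above) is to wrap a keyword alternation in \b word-boundary anchors. This is a rough sketch with a hard-coded keyword list and a made-up sub name, matches_whole_word. It assumes every keyword begins and ends with a word character, because \b is defined in terms of \w.

```perl
use strict;
use warnings;

my @keywords = ( 'call', 'please', 'urgent' );    # hard-coded for the sketch

# \b anchors keep "call" from matching inside "recalled" or "callback";
# quotemeta escapes any regex metacharacters in the keywords.
my $alternation = join '|', map { quotemeta($_) } @keywords;
my $word_re     = qr/\b(?:$alternation)\b/i;

# Returns 1 if the line contains any keyword as a whole word, else 0.
sub matches_whole_word {
    my ($line) = @_;
    return $line =~ $word_re ? 1 : 0;
}

for my $line ( 'Call failed. Aborted.', 'She recalled the meeting.' ) {
    print matches_whole_word($line) ? "MATCH:    $line\n" : "no match: $line\n";
}
```

    Unlike the hash lookup, this version is case-insensitive, so "Call" at the start of a sentence still counts.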