How to club different lines of program into one

ashok13123 has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to club different lines of program into one by McDarren (Abbot) on May 25, 2009 at 14:42 UTC
I think part of your question is missing. That is, you haven't specified which of the words in your file are supposed to be present and which are not. But in any case, consider the following:

```
darren@dino:~/perl$ cat words.txt
january
february
egypt
moon
saturday
darren@dino:~/perl$ cat missing.pl
#!/usr/bin/perl
use strict;
use warnings;

my $word_file = 'words.txt';
my @required_words = qw/january larry_wall february holiday egypt moon saturday/;
my %words;

open my $in, '<', $word_file or die "$!\n";
while (my $line = <$in>) {
    chomp $line;
    $words{$line}++;
}
close $in;

for my $word (@required_words) {
    if ($words{$word}) {
        print "Required word $word is present\n";
    }
    else {
        print "Required word $word is missing\n";
    }
}
darren@dino:~/perl$ perl missing.pl
Required word january is present
Required word larry_wall is missing
Required word february is present
Required word holiday is missing
Required word egypt is present
Required word moon is present
Required word saturday is present
```

Hope this helps,
Darren
Re: How to club different lines of program into one by arc_of_descent (Hermit) on May 25, 2009 at 14:42 UTC
Create a hash of elements to look for. You already have a list of items fetched from the file. Then read up on the exists function. This will help: How can I tell whether a certain element is contained in a list or array?
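A minimal sketch of the hash-plus-`exists` idea suggested above. The word lists are made up for illustration (they are not the OP's data), and `@found_in_file` simply stands in for whatever list was read from the file.

```perl
use strict;
use warnings;

# Hypothetical data: words already read from the file, and words to look for.
my @found_in_file = qw(january february egypt moon saturday);
my @wanted        = qw(january holiday moon);

# Build a lookup hash keyed on the words from the file.
my %seen = map { $_ => 1 } @found_in_file;

# exists tests for the key itself, regardless of the value stored under it.
my @report = map { exists $seen{$_} ? "$_ is present" : "$_ is missing" } @wanted;

print "$_\n" for @report;
```

Building the hash once means each lookup avoids rescanning the whole list, which is the main reason to prefer it over repeated `grep` calls.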
Re: How to club different lines of program into one by AnomalousMonk (Archbishop) on May 25, 2009 at 14:47 UTC
One possible approach:

```
>perl -wMstrict -le
"my @terms = qw(january foo february egypt moon saturday bar);
 my $search = qr{ @{[ join '|', @terms ]} }xms;
 my $str = 'in february the moon is not visible in egypt on saturday night';
 my %count;
 ++$count{$1} while $str =~ m{ ($search) }xmsg;
 for my $term (@terms) {
   print qq{$term is }, $count{$term} ? '' : 'NOT ', 'present';
 }
"
january is NOT present
foo is NOT present
february is present
egypt is present
moon is present
saturday is present
bar is NOT present
```
Re: How to club different lines of program into one by GrandFather (Saint) on May 25, 2009 at 23:48 UTC
If you use a hash to record your results you don't need to slurp the file and can generate a little more information. Consider:

```
use strict;
use warnings;

my %matches = map {$_ => 0} qw(january february egypt);

while (<DATA>) {
    chomp;
    ++$matches{$_} if exists $matches{$_};
}

for my $word (sort keys %matches) {
    if (! $matches{$word}) {
        print "Didn't find $word.\n";
    } elsif (1 == $matches{$word}) {
        print "Found $word once.\n";
    } else {
        print "Found $word $matches{$word} times.\n";
    }
}

__DATA__
january
february
january
moon
saturday
```

Prints:

```
Didn't find egypt.
Found february once.
Found january 2 times.
```

True laziness is hard work
Re: How to club different lines of program into one by akho (Hermit) on May 25, 2009 at 14:39 UTC
```
use strict;
use warnings;

my @words = qw{january february egypt};
open my $file, '<', 'file.txt';
my $text = do { local $/; scalar <$file>; };
close $file;

for my $word (@words) {
    print (($text =~ /\Q$word\E/) ? $word : "$word not present");
}
```

Didn't test it, however.

upd: fixed syntax error
Re^2: How to club different lines of program into one by ww (Archbishop) on May 25, 2009 at 17:34 UTC
...and the lack of testing shows.

```
syntax error at 766040.pl line 7, near "close"
```

You're missing a terminal semicolon at the end of line 6.

Here, file.txt is:

```
001: january
002: february
003: egypt
004: moon
005: saturday
```

The presence or absence of the line numbers reflects laziness and slow downloads but makes no difference here.

Suggestion: Use 3-arg opens and test each one (`... || die "Can't open $file: $!\n";`).

Also, IMO, McDarren's response below strikes an appropriate chord. If the list of "wanted" words is in file.txt, then testing for their presence merely burns cycles and inconveniences electrons to no purpose whatsoever. Hence, one might infer that OP failed to specify the issue adequately, and that leads to another question: is the intent to find the "wanted" words anywhere within the text, or is it to test the text line-by-line and report per line?

One might guess from OP's wording that it's the former. Update for clarity: in that case, slurping the file is fine (size issues aside), but it would NOT be a good approach in the latter case. In any case, leaving the reader guessing doesn't always get the best answer.

But, all that said, a question (perhaps ignorant) for akho: why `scalar <$file>;` for this application?
Re^3: How to club different lines of program into one by akho (Hermit) on May 25, 2009 at 19:40 UTC
I was writing this on a machine without Perl, thus the syntax error and no testing; I also tried to be extra safe with context in the `do`. That `scalar` is not necessary. Sorry for the confusion, if there was any.

As for the OP's intent: it is hard to understand. But the title question was "How to club different lines of program into one", so I tried to do the same thing the OP's code is doing, but in fewer lines.

And I don't have an excuse for not testing my `open` except that I usually `use autodie`.
Re: How to club different lines of program into one by Marshall (Canon) on May 26, 2009 at 01:47 UTC
"slurping" in a whole file into a scalar variable sounds like a good idea in this case. $text = <IN>; looks fine to me. There is no need for @text = <IN>. I would suggest the use of the Perl function index() rather than regex in this situation. index requires an exact match and so you should case search term and search text to be the same. But this "casing" operation is very fast. The index function will quit on the first match which is an advantage over regex this situation. As always, your mileage may vary! Short "how to" is shown below. #!/usr/bin/perl -w use strict; my @listOfWords = qw (january february egypt moon saturday zoos zoo thingies thing ); my $text = "moon. I love full moons but this it has been a long thing since yesterday on the beach. And a whole buch of BLAH.\nYet another february line.\n More jan stuff goes here. What a zoo this text searching thing can be!"; print"\n\nUsing ListOfWord Tokens\n"; foreach my $word (@listOfWords) { if ( index($text,"$word")>= 0) { print "word: $word\t found\n"; } else { print "word: $word\t NOT found\n"; } } __END__ Using ListOfWord Tokens word: january NOT found word: february found word: egypt NOT found word: moon found word: saturday NOT found word: zoos NOT found word: zoo found word: thingies NOT found word: thing found [download]	[reply] [d/l]
Re^2: How to club different lines of program into one by akho (Hermit) on May 26, 2009 at 16:41 UTC
`$text = <IN>;` will not work like you want it to if you do not undef `$/`.

Regexen stop after the first match; you may gain a performance benefit, but not for the reason you cite.
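To illustrate the point about `$/`, here is a sketch that reads the same data twice: once with the default record separator and once with `$/` localized to undef. The in-memory filehandle and the sample data are assumptions made so that the example is self-contained.

```perl
use strict;
use warnings;

my $data = "january\nfebruary\negypt\n";

# Default $/ is "\n", so a scalar-context read returns only the first line.
open my $in, '<', \$data or die "Can't open in-memory file: $!";
my $first_line = <$in>;
close $in;

# With $/ undefined inside the do block, one read returns the whole file.
open $in, '<', \$data or die "Can't open in-memory file: $!";
my $whole = do { local $/; <$in> };
close $in;

print "line read:  ", length($first_line), " chars\n";
print "slurp read: ", length($whole), " chars\n";
```

Using `local` inside the `do` block restores `$/` automatically when the block exits, so later reads elsewhere in the program are unaffected.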
Re^3: How to club different lines of program into one by Marshall (Canon) on May 26, 2009 at 17:50 UTC
Thanks for the clarifications! As far as performance goes, it probably doesn't make any difference. So here we just have another way of doing things.
Re: How to club different lines of program into one by ig (Vicar) on May 26, 2009 at 20:06 UTC
For yet another alternative that allows you to include regular expressions in the patterns to be found:

```
use strict;
use warnings;

my $filename = 'file.txt';
open(FH, '<', $filename) or die "$filename: $!";
my $text = do { local $/; <FH> };
close(FH);

my @patterns = qw(january february egypt a.e <sample>(.?)</sample> etc);

print map { $text =~ /$_/ ? "$_: found\n" : "$_: not found\n" } @patterns;

__END__
january: found
february: found
egypt: found
a.e: not found
<sample>(.?)</sample>: not found
etc: not found
```

Depending on how you want your patterns interpreted you might want to add the s or m modifiers to the pattern match, to change the behavior of '^', '$' and '.'. See perlre for details.
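A small sketch of how the `s` and `m` modifiers mentioned above change matching against slurped, multi-line text. The sample string is made up for illustration.

```perl
use strict;
use warnings;

my $text = "january\nfebruary\n";

# Without /s, '.' does not match a newline; with /s it does.
my $dot_plain = $text =~ /january.february/  ? 1 : 0;
my $dot_s     = $text =~ /january.february/s ? 1 : 0;

# Without /m, '^' anchors only at the start of the string;
# with /m it also matches just after each embedded newline.
my $caret_plain = $text =~ /^february/  ? 1 : 0;
my $caret_m     = $text =~ /^february/m ? 1 : 0;

print "dot:   plain=$dot_plain s=$dot_s\n";
print "caret: plain=$caret_plain m=$caret_m\n";
```

With a slurped file the whole text is one string, so these modifiers matter more than they do when matching line by line.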