Re: Perl script to Parse a file for a string
by GrandFather (Saint) on Jun 05, 2014 at 20:45 UTC
sub find_string {
    my ($file, $string) = @_;
    open my $fh, '<', $file;
    while (<$fh>) {
        return 1 if /\Q$string/;
    }
    die "False :Test result is FAIL";
    find_string('search.txt', 'name: abc ');
}
It is clear that the die prevents the recursive call to find_string: the call sits after the die, so it is never reached.
However, unless the file is huge (say, bigger than a few hundred megabytes), just slurp the whole thing into a string and run the regex against it directly:
sub find_string {
    my ($file, $string) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    my $text = do {local $/; <$fh>};
    return $text =~ /\Q$string/;
}
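Because the slurping version returns a true or false value instead of dying, the caller can test for either presence or absence. A minimal sketch, assuming a throwaway file created with File::Temp; the file contents and the 'name: xyz' string are made up for illustration, and scalar() is added only so the sub returns a plain boolean in any context:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Slurping version adapted from above: true if $string occurs anywhere in $file.
sub find_string {
    my ($file, $string) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    my $text = do { local $/; <$fh> };
    return scalar($text =~ /\Q$string/);
}

# Hypothetical sample data, written to a temporary file for the demo.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "name: abc \nname: def\n";
close $tmp or die "Can't close '$path': $!\n";

# Positive check: the string must be present.
print "PASS: 'name: abc ' is present\n" if find_string($path, 'name: abc ');

# Negative check: the string must be absent.
print "PASS: 'name: xyz' is absent\n" unless find_string($path, 'name: xyz');
```

The decision about what counts as a failure then lives in the calling script rather than inside the sub.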
Perl is the programming world's equivalent of English
Re: Perl script to Parse a file for a string
by ww (Archbishop) on Jun 06, 2014 at 01:44 UTC
Others have suggested far better ways.
However, in light of your repeated notes that you really want only a mod to your existing sub:
#!/usr/bin/perl -w
use strict;
use 5.016;
# 1088912
my @args=@ARGV;
find_string(@args);
# mod of OP's original:
sub find_string {
    my $file = shift;
    my $string = shift;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    while ( <$fh> ) {
        if ($_ =~ /\Q$string/) {
            warn "Test result is FAIL (i.e. $string is present in $file)";
        } else {
            # the else branch already means "no match", so no second test is needed
            say "\t Did NOT find $string in $_";
        }
    }
}
But your statement (" Most of my scripts use that function") and request that we give you a modification rather than a different approach puzzles me: I don't see why modifying your existing code in multiple scripts creates less workload than replacing it -- in the same number of scripts -- with something including the improvement for which you asked.
Re: Perl script to Parse a file for a string
by neilwatson (Priest) on Jun 05, 2014 at 21:13 UTC
Negatives are tricky, but I think this will help you.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my $data_start = tell DATA;    # remember where __DATA__ begins so each call can rewind
my $return = find_string(
    file           => "/path/to/file",
    positive_match => 1,
    string         => "abc"
);
say "did match? $return";
$return = find_string(
    file           => "/path/to/file",
    negative_match => 1,
    string         => "abc"
);
say "did not match? $return";
sub find_string
{
    my %params = @_;
    my $found  = 0;
    # assumes that file is not tainted.
    # open my $fh, '<', $params{file} or die "Cannot open $params{file}, $!";
    seek DATA, $data_start, 0;    # rewind, otherwise a second call would see no data
    while (<DATA>) { $found++ if m/\Q$params{string}/ }
    return 1 if ( $params{positive_match} && $found > 0 );
    return 0 if ( $params{positive_match} && $found == 0 );
    return 1 if ( $params{negative_match} && $found == 0 );
    return 0 if ( $params{negative_match} && $found > 0 );
}
__DATA__
abc
|
Thanks for your code. It would be great if you could help me edit the function that I have posted. Most of my scripts use that function, so it would be really helpful if you could help me modify it to check negatives as well.
Re: Perl script to Parse a file for a string
by thanos1983 (Parson) on Jun 05, 2014 at 21:24 UTC
As everybody suggests, I think a regex will help you a lot. The short version of the documentation is perlrequick and the complete one is perlre.
A few months ago I implemented something similar; here is what I did. I hope it helps; it worked just fine for my needs. I have included some comments to help you understand the process. If anything is still unclear, just let me know and I will try to explain more.
#!/usr/bin/perl
use warnings;    # warn about undefined values and other likely mistakes
use strict;      # require variables to be declared (lexically scoped pragma)
use constant ARGUMENTS => 3;
$| = 1;          # flush output immediately
# Check the argument count before touching any files.
if (@ARGV != ARGUMENTS) {
    my $problem = @ARGV > ARGUMENTS ? "Too many" : "Need more";
    die "$problem argument inputs\nCorrect syntax: $0 <source file> <pattern> <file to store the data>\n";
}
my ($source, $patern, $target) = @ARGV;
my @found;
open(READ, "<", $source)
    or die "Can not open ".$source.": $!\n";    # < read mode
open(WRITE, ">", $target)
    or die "Can not open ".$target.": $!\n";    # > write mode
while (my $lines = <READ>) {
    chomp ($lines);    # remove the trailing \n so the last field compares cleanly
    my @word = split (/\s+/, $lines);    # \s+ = one or more white-space characters
    foreach my $value (@word) {
        if ($value eq $patern) {
            push (@found, $value);
            print "I have found one matching pattern: $value\n";
        }
    }
}
my $elements = scalar(@found);
print "I found number of occurrences: ".$elements."\n";
print WRITE "I found pattern: $patern, number of occurrences: $elements\n";
close (READ) or die "READ did not close: $!\n";
close (WRITE) or die "WRITE did not close: $!\n";
You can modify it to meet your needs: wrap it in a sub, change the loop, print error conditions, etc.
Seeking for Perl wisdom...on the process...not there...yet!
Re: Perl script to Parse a file for a string
by james28909 (Deacon) on Jun 06, 2014 at 01:45 UTC
excuse me for butting in here, or dont :P
But wouldn't a simple test do? For not present:
if ( $string !~ /search pattern here/ ) {
    print ( "didnt find match" );
}
for if present:
if ( $string =~ /search pattern here/ ) {
    print ( "found string" );
} else {
    print ( "didnt find string" );
}
please keep in mind i am new at perl but i think this would work for finding or not finding patterns in strings :)
Edit: seems i just reposted some code from above lol.
Re: Perl script to Parse a file for a string
by taint (Chaplain) on Jun 05, 2014 at 20:57 UTC
Greetings, user786.
I guess I might try something similar to the following, given your example:
my $string = "abc"; # for illustration purposes only; could also be undef or empty
if (defined $string and $string ne "") {
    print "found STRING";
} else {
    print "string NOT found";
}
Then you can simply change the logic, depending upon the results you desire. You might even want to turn the entire routine into a sub -- OO people would call it a method. :)
All the best.
--Chris
¡λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH
Re: Perl script to Parse a file for a string
by 2teez (Vicar) on Jun 05, 2014 at 20:35 UTC
if(condition){
..
}
else{
...
}
within the while loop work? Except, I don't get the question you are asking.
Then why call the subroutine within itself?
If you tell me, I'll forget.
If you show me, I'll remember.
if you involve me, I'll understand.
--- Author unknown to me
|
"Then why call the subroutine within itself?"
Sorry, that was a mistake. I have updated the code; I'm not calling it within itself.
Re: Perl script to Parse a file for a string
by perlfan (Parson) on Jun 05, 2014 at 20:35 UTC
You could always take the regex you wish to return on as a secondary parameter. The rub is that you would have to manage regexes outside of this function and create a more generic die message.
Also if you wish to make this some sort of alt logic recursive method, there are better ways to get out of the recurse loop than a die.
Another option is to generate anonymous subroutines to fit the mode you wish to execute. There are a variety of ways to do what I think you want to do. This solution could be recursive or operate over a queue (list) of generated subs (non-recursive).
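One way to read the anonymous-subroutine idea, as a rough sketch rather than a definitive design: a generator builds one closure per rule, and the caller runs a plain list of them without any recursion. The names make_check and run_checks, the 'present'/'absent' modes, and the sample text are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Build an anonymous sub that checks a text against $regex.
# $mode is 'present' (must match) or 'absent' (must not match).
sub make_check {
    my ($regex, $mode) = @_;
    return sub {
        my ($text) = @_;
        my $matched = $text =~ $regex ? 1 : 0;
        return $mode eq 'absent' ? !$matched : $matched;
    };
}

# Run every check in the queue; return the number that failed.
sub run_checks {
    my ($text, @checks) = @_;
    my $failures = 0;
    for my $check (@checks) {
        $failures++ unless $check->($text);
    }
    return $failures;
}

# Hypothetical sample text and rules for the demo.
my $text  = "name: abc\nname: def\n";
my @queue = (
    make_check(qr/\Qname: abc\E/, 'present'),
    make_check(qr/\Qname: xyz\E/, 'absent'),
);

my $failures = run_checks($text, @queue);
print $failures ? "FAIL ($failures checks failed)\n" : "PASS\n";
```

Keeping the regexes outside the checking function is exactly the management cost mentioned above; the generator just gives it one place to live.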
Re: Perl script to Parse a file for a string
by Lennotoecom (Pilgrim) on Jun 05, 2014 at 21:47 UTC
/abc/ and print "error\n" and last for <DATA>;
__DATA__
123
456
abc
670
Uuuuuum, no ?
Re: Perl script to Parse a file for a string
by hexcoder (Curate) on Jun 06, 2014 at 14:28 UTC
Hi, if you need to lookup multiple strings in the same file, I would generalize the detection routine like this:
use strict;
use warnings;
sub check_strings {
    my ($file, $r_stringsNeeded, $r_stringsToAvoid) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    my %foundNeeded = map { $_ => 0 } @{$r_stringsNeeded};
    my %foundToAvoid = map { $_ => 0 } @{$r_stringsToAvoid};
    while (<$fh>) {
        for my $string (@{$r_stringsNeeded}) {
            ++$foundNeeded{$string} if /\Q$string/;
        }
        for my $string (@{$r_stringsToAvoid}) {
            ++$foundToAvoid{$string} if /\Q$string/;
        }
    }
    my $foundAllNeeded = 0 == scalar grep { $_ == 0 } values %foundNeeded;
    my $foundNoneOfAvoided = 0 == scalar grep { $_ != 0 } values %foundToAvoid;
    return $foundAllNeeded && $foundNoneOfAvoided;
}
if (!check_strings('search.txt', ['first_neededString', 'second_neededString'],
    ['firstStringToAvoid', 'secondStringToAvoid'])) {
    die "file did not contain all needed Strings or contained forbidden strings\n";
}
Here the file is only read once. You could supply multiple strings you want to have present in the first array reference, and multiple strings in the second array reference where none of them should be present in the file.
This simple subroutine only gives a boolean answer, but it could be extended to give more detailed answers about what was matched and what was not.
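Such an extension might look something like the following sketch, which returns the lists of missing and forbidden strings instead of a single boolean. The name check_strings_report and the shape of the returned hash are assumptions, and the sample file is created on the fly for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Like check_strings, but reports which strings caused a failure.
# Returns a hash ref: { missing => [...], forbidden => [...] }.
sub check_strings_report {
    my ($file, $r_needed, $r_avoid) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    my %seen;
    while (my $line = <$fh>) {
        for my $string (@{$r_needed}, @{$r_avoid}) {
            $seen{$string} = 1 if $line =~ /\Q$string/;
        }
    }
    close $fh;
    my %report = (
        missing   => [ grep { !$seen{$_} } @{$r_needed} ],
        forbidden => [ grep {  $seen{$_} } @{$r_avoid} ],
    );
    return \%report;
}

# Hypothetical sample data for the demo.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\nbeta\ngamma\n";
close $tmp or die "Can't close '$path': $!\n";

my $report = check_strings_report($path, ['alpha', 'delta'], ['gamma', 'omega']);
print "missing: @{ $report->{missing} }\n";
print "forbidden: @{ $report->{forbidden} }\n";
```

The file is still read only once; the caller treats the check as passed when both lists are empty.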
Hope that helps.
Re: Perl script to Parse a file for a string
by Anonymous Monk on Jun 06, 2014 at 13:34 UTC
OK, I can't resist. I take it the signature for the original functionality can not change. It must change for the inverted functionality of course, unless you have a Vulcan Mind Meld interface. With that in mind, how about this?
sub find_string {
    my ( $file, $string, $negative ) = @_;
    return 1
        if _find_string( $file, $string, $negative );
    die "False :Test result is FAIL";
}
sub find_without_string {
    $_[2] = 1;
    goto &find_string;
}
sub _find_string {
    my ($file, $string, $negative) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    while (<$fh>) {
        return !$negative if /\Q$string/;
    }
    return $negative;
}
# Original functionality
find_string('search.txt', 'name: abc ');
# Inverted functionality
find_string('search2.txt', 'name: abc ', 1 );
# or
find_without_string('search2.txt', 'name: abc ' );
Of course, now that find_string() is a wrapper, you can play with the signature of _find_string() however you like. For example, you could pass it qr{\Q$string}, as was suggested earlier.
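A sketch of that last idea, assuming a hypothetical variant named _find_pattern that takes a compiled regex instead of a plain string. The wrapper builds qr{\Q$string} itself, so existing callers are unaffected. The temporary file and its contents are made up for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical variant of _find_string: takes a compiled regex.
sub _find_pattern {
    my ($file, $regex, $negative) = @_;
    open my $fh, '<', $file or die "Can't open '$file': $!\n";
    while (<$fh>) {
        return !$negative if /$regex/;
    }
    return $negative;
}

# The wrapper keeps the original signature and quotes the literal string.
sub find_string {
    my ($file, $string, $negative) = @_;
    return 1 if _find_pattern($file, qr{\Q$string}, $negative);
    die "False :Test result is FAIL";
}

# Hypothetical sample data for the demo.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "name: abc \n";
close $tmp or die "Can't close '$path': $!\n";

print "literal present\n" if find_string($path, 'name: abc ');
print "literal absent\n"  if find_string($path, 'name: xyz', 1);

# Callers that want a real pattern can bypass the wrapper.
print "pattern present\n" if _find_pattern($path, qr{^name:\s+\w+});
```

Only the wrapper knows about quoting, so a script that needs a genuine pattern does not have to fight \Q.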