Re: Lost in Perl - text file searching and printing
by GrandFather (Saint) on Jul 23, 2011 at 23:58 UTC
Show us what you have tried. The tool you need is Perl's regular expression engine - regex. See perlrequick, perlretut and perlre.
True laziness is hard work
| [reply] |
Re: Lost in Perl - text file searching and printing
by zentara (Cardinal) on Jul 23, 2011 at 22:06 UTC
I'm a regex novice too, but at least the technique is crude enough for your teacher to believe it's really your work :-) Have a nice weekend!
#!/usr/bin/perl
use strict;
use warnings;
while (<DATA>) {
    if ( $_ =~ s/([?.,]$)/$1/ ) { print "$_ $1\n" };
}
# puts ?., in a character class [] anchored at the
# end $, and () captured to $1, a
# regex internal variable.
__DATA__
4 score and 7 years ago?
our forefathers came,
upon a continent.
foo
bar bazz
__END__
| [reply] [d/l] |
Thanks for your replies!
I am getting closer to understanding this... a bit of clarification:
Yes, the text file can contain lines that do not contain the word "it".
So, I need it to look for lines that end in the punctuation and also contain the word "it". Based on the previous reply and some modification, I now have it selecting the correct text, but I can't figure out how to print only the text AFTER that word... any advice? (I hope to understand this stuff yet... Perl regular expressions are a bit cryptic!!!)
Thanks again!
Eric
| [reply] |
Well, here is a two-step solution. It may still have a problem if you encounter lines with two or more occurrences of 'it' in them. That's called regex greediness; see greediness. I'll leave fixing the possible greediness issue to you as an exercise. :-)
#!/usr/bin/perl
use strict;
use warnings;
while (<DATA>) {
    if ( $_ =~ m/([?.,]$)/ ) {
        # print " matching first criteria $_";
        # now check for 'it', using capture parentheses
        # for the text before and after 'it'
        if ( $_ =~ m/(.*)it(.*)/ ) {
            print "$2\n";    # the second pair of capture parentheses
        }
    };
}
__DATA__
4 score, and 7 years ago?
our ? forefathers came,
upon it -- a continent.
it was it was it ? foo?
.bar bazz
__END__
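To make the greediness issue concrete, here is a small sketch. It is not part of the script above, and the two sub names are made up for illustration. It compares the greedy pattern with a non-greedy one that also uses \b word boundaries, so that the 'it' inside a word such as 'sites' is not mistaken for the word "it":

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: same pattern as in the script above. The leading
# (.*) is greedy, so the match is made at the LAST 'it' on the line,
# even when that 'it' sits inside another word.
sub after_last_it {
    my ($line) = @_;
    return $line =~ m/(.*)it(.*)/ ? $2 : undef;
}

# Hypothetical helper: the non-greedy .*? stops at the FIRST 'it', and
# \b on both sides only accepts 'it' as a standalone word.
sub after_first_it {
    my ($line) = @_;
    return $line =~ m/^.*?\bit\b(.*)/ ? $1 : undef;
}

for my $line (
    'it was it was it ? foo?',
    'Server (PAUSE) and it was happily feeding modules through to the CPAN archive sites.',
) {
    print "greedy:     '", after_last_it($line),  "'\n";
    print "non-greedy: '", after_first_it($line), "'\n";
}
```

For the second sample line the greedy version prints only 'es.', because the last 'it' it can find is the one inside 'sites'.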
| [reply] [d/l] |
Uhm, it's a bit pointless to replace what you match by itself. m// instead of s/// is way better. But what's worse, you're printing the matching line, followed by the comma, dot, or question mark that was matched. The OP wants to print everything following the word it. There's no it in your pattern.
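A quick sketch of the first point (untested against the OP's file; the sub name is just for illustration): the s/// leaves the line exactly as it was, while a plain m// captures the same character without pretending to modify anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the trailing '?', '.' or ',' of a line,
# or undef if the line does not end in one of them. The $ anchor also
# matches just before a final newline, so no chomp is needed here.
sub trailing_punct {
    my ($line) = @_;
    return $line =~ m/([?.,])$/ ? $1 : undef;
}

my $line = "our forefathers came,\n";

# The substitution replaces the match with itself: a no-op.
( my $copy = $line ) =~ s/([?.,]$)/$1/;
print "s/// left the line unchanged\n" if $copy eq $line;

# A plain match is enough to capture the punctuation.
print "m// captured: ", trailing_punct($line), "\n";
```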
| [reply] [d/l] [select] |
Thanks again for the input!
The text file I am reading in is a couple of paragraphs of text from an article about Perl. One of the lines that the program should find is:
Server (PAUSE) and it was happily feeding modules through to the CPAN archive sites.
The output of the program, seeing the punctuation at the end and the word "it" in the line, would be: was happily feeding modules through to the CPAN archive sites.
My program finds that line now, but it prints the entire line, not just what comes after "it".
Any thoughts?
Thanks!
Eric
| [reply] |
Re: Lost in Perl - text file searching and printing
by JavaFan (Canon) on Jul 23, 2011 at 22:04 UTC
The exercise does not say what needs to be done with a line that ends in a period, question mark, or comma but does not contain the word "it". So, I assume this program runs in a universe that does not have such lines. So, I'd write (untested):
perl -nlwe 'print $1 if /\bit\b(.*[.?,])$/' text-file
Or:
perl -nlwe '/[.?,]$/ && /\bit\b/ && print $'"'" text-file
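For anyone who prefers a script to a one-liner, the same two checks could be spelled out roughly like this. This is only a sketch: the sub name and the sample lines are made up for illustration, and it uses a capture group instead of $' to get the text after the first standalone "it".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text after the first standalone 'it'
# if the line ends in '.', '?' or ','; return undef otherwise.
sub text_after_it {
    my ($line) = @_;
    chomp $line;
    return undef unless $line =~ /[.?,]$/;         # must end in punctuation
    return $line =~ /\bit\b(.*)/ ? $1 : undef;     # text following 'it'
}

# Sample lines standing in for the OP's text file.
my @lines = (
    "Server (PAUSE) and it was happily feeding modules through to the CPAN archive sites.\n",
    "This line has it but no closing punctuation\n",
    "No match here.\n",
);

for my $line (@lines) {
    my $rest = text_after_it($line);
    print "$rest\n" if defined $rest;
}
```

Only the first sample line passes both checks. Like the one-liners, it keeps the space that follows "it".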
| [reply] [d/l] [select] |