Re: How keep the count...
by bart (Canon) on Mar 04, 2004 at 00:34 UTC
The basis can be quite simple:
/\blazy\b(.*?)\bdog\b/
If it matches, then $1 will hold the string between these words. From there on, you can calculate the number of words between them, which seems to be more or less what you're after.
The full sample code becomes:
while (<DATA>) {
    if (/\blazy\b(.*?)\bdog\b/) {
        my $length = length $1;
        my @words = grep length, split /\W+/, $1;
        printf "Distance is %d characters, stepping %d words\n", $length, 1+@words;
    }
}
__DATA__
The brown fox jumps over the lazy dog
The lazy brown fox jumps over the quick dog
The lazy fox jumps over the dog
The lazy fox dog
The fox jumps over the dog...
This prints:
Distance is 1 characters, stepping 1 words
Distance is 32 characters, stepping 7 words
Distance is 20 characters, stepping 5 words
Distance is 5 characters, stepping 2 words
Re: How keep the count...
by arden (Curate) on Mar 03, 2004 at 16:23 UTC
Since you are only looking for two specific words, just use two variables, $lazy and $dog. If you might later want to know how many times any other word (say "jumps") shows up, then use a hash. . .
This smells strongly of homework btw. . .
- - arden.
Update: Sorry, I misread your original post. Since what you are looking for is the offset between the two words in every sentence, why not just use an array? I don't really recommend using a hash where the keys are sequential numbers.
This reeks of homework. Provide samples of what you have tried so far, and please explain the task you're trying to accomplish so we know we're not just doing your homework.
Re: How keep the count...
by Limbic~Region (Chancellor) on Mar 03, 2004 at 16:35 UTC
Anonymous Monk,
First you have to define a "word". You also need to specify whether the order of the words is important and what to do if the words appear multiple times in the same line: do you want the minimum or the maximum offset? The code below is also not the most efficient way, but as it smells like homework to me as well, it will be left as an exercise for the reader to improve upon it.
my @offsets;
while ( my $line = <INPUT> ) {
    chomp $line;
    my @words = split " ", $line;
    # Skip lines that contain neither word. The parentheses keep && from
    # being absorbed into the first grep's argument list.
    next if !( grep { $_ eq 'lazy' } @words ) && !( grep { $_ eq 'dog' } @words );
    my $first;
    for ( 0 .. $#words ) {
        my $word = $words[$_];
        if ( $word eq 'dog' || $word eq 'lazy' ) {
            # Test definedness: index 0 is a valid first position but is false.
            if ( defined $first ) {
                push @offsets, $_ - $first + 1;
                last;
            }
            $first = $_;
        }
    }
}
print "The number of matches is : ", scalar @offsets, "\n";
print "The offsets are :\n";
print "$_\n" for @offsets;
Cheers - L~R
Thank you Monk
Maybe my question was not clear. What I need to know is how to keep track of those sentences where the offset was x (2, 3, etc.), for example. Should I use hashes, and if so, how? Thanks again!
Anonymous Monk,
Your insistence on using a hash strengthens my feeling that this is a homework problem. While I do not mind homework problems nearly as much as other monks, if this is homework you should state:
- That it is homework
- What the specific requirements are
- What you have tried so far
- What you "think" may work but do not know how to code
Now to answer your questions, there is no need to use a hash. You could change:
# push @offsets, $_ - $first + 1;
# to
push @offsets, [ $line , $_ - $first + 1 ];
# and
# print "$_\n" for @offsets;
# to
print "$_->[0] : $_->[1]\n" for @offsets;
Cheers - L~R
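To illustrate the follow-up question (retrieving only the sentences with a given offset), here is a minimal sketch that filters such [ line, offset ] pairs with grep. The sample pairs are hard-coded stand-ins for what the loop would collect, and $wanted and @hits are names assumed for this example only.

```perl
use strict;
use warnings;

# Hypothetical sample data: [ sentence, offset ] pairs of the kind the
# modified push would collect.
my @offsets = (
    [ 'The brown fox jumps over the lazy dog', 2 ],
    [ 'The lazy fox dog',                      3 ],
    [ 'The lazy fox jumps over the dog',       6 ],
    [ 'A lazy old dog',                        3 ],
);

# $wanted is an assumed name for the offset being looked up.
my $wanted = 3;

# Keep only the pairs whose second element equals the wanted offset.
my @hits = grep { $_->[1] == $wanted } @offsets;

print "$_->[0]\n" for @hits;
```

Because each element keeps the sentence next to its offset, no hash is needed for the lookup; a linear grep is enough for a file of modest size.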
Re: How keep the count...
by matija (Priest) on Mar 03, 2004 at 16:28 UTC
> Should I create a hash with a sentenceIndex as key and the offset as a value, to then do my calculations?
No point. If a sentence matches, just push the offset onto an array; the number of elements in the array ($#arr+1, or scalar @arr) is then the number of matching sentences.
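A rough sketch of that idea follows. The matching rule (counting the words between "lazy" and "dog" with a regex) is only a stand-in assumed for the example, as are the sample sentences and the names @sentences, @between and $count.

```perl
use strict;
use warnings;

# Hypothetical input sentences.
my @sentences = (
    'The brown fox jumps over the lazy dog',
    'The lazy fox dog',
    'The fox jumps over the dog',
);

my @arr;
for my $sentence (@sentences) {
    # Assumed matching rule: "lazy" followed later by "dog".
    next unless $sentence =~ /\blazy\b(.*?)\bdog\b/;

    # Count the non-empty words between the two and store that as the offset.
    my @between = grep { length $_ } split /\W+/, $1;
    push @arr, scalar @between;
}

# An array in scalar context gives its element count, same as $#arr + 1.
my $count = @arr;
print "Matched $count sentences; offsets: @arr\n";
```

The array does double duty here: its contents are the offsets, and its length is the number of matching sentences.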
Thanks, I am going to try that, but in case I need to retrieve those sentences where the offset was 3, I need a hash eh?
No, you don't need a hash for that either. You can have another array, which will contain the number of the sentence in which the offset was found.
Or, if you absolutely insist on using hashes, you could, instead of two arrays, use an array of hashes, where each element of an array would be a hash, with two keys: offset and sentence_number.
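Both layouts could look roughly like this. The @found input is made-up data for illustration, and every variable name here is an assumption except the two hash keys, offset and sentence_number, which come from the description above.

```perl
use strict;
use warnings;

# Hypothetical results: [ sentence_number, offset ] for each matching sentence.
my @found = ( [ 1, 2 ], [ 2, 3 ], [ 4, 3 ] );

my ( @offsets, @sentence_numbers );   # option 1: two parallel arrays
my @records;                          # option 2: one array of hashes

for my $pair (@found) {
    my ( $sentence_number, $offset ) = @$pair;

    # Parallel arrays: the same index in both arrays describes one match.
    push @offsets,          $offset;
    push @sentence_numbers, $sentence_number;

    # Array of hashes: each element carries both values under named keys.
    push @records, { offset => $offset, sentence_number => $sentence_number };
}

# Sentence numbers where the offset was 3, looked up both ways.
my @from_arrays =
    map  { $sentence_numbers[$_] }
    grep { $offsets[$_] == 3 } 0 .. $#offsets;
my @from_records =
    map  { $_->{sentence_number} }
    grep { $_->{offset} == 3 } @records;

print "Parallel arrays: @from_arrays\n";
print "Array of hashes: @from_records\n";
```

The parallel arrays are lighter, but the two must be kept in step by hand; the array of hashes keeps each offset and its sentence number together in one element.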
Re: How keep the count...
by flyingmoose (Priest) on Mar 03, 2004 at 21:22 UTC
rant removed
So he says it's not a homework question. Never mind then. For future questions, I recommend leading with context ("this is not homework, this is for a project on language...") whenever the example seems as suspicious as "a quick brown fox".