Re: Repeating the same command in different portions of input

tobyink has provided an excellent solution. For your future reference--and in case the need arises again--there are Perl modules that can be used for parsing the kind of text you have. Here's an example that uses Mojo::DOM to parse your <a> tags:

use strict;
use warnings;
use Mojo::DOM;

my $text = <<END;
<a>
word1
word2
word3
</a>
<a>
word4
word5
</a>
<a>
word6
word7
</a>
END

my $dom = Mojo::DOM->new($text);

my $i = 1;
for my $chunk ( $dom->find('a')->each ) {
    print 'Chunk ' . $i++ . ': ' . $chunk->text . "\n";
}

Output:

Chunk 1: word1 word2 word3
Chunk 2: word4 word5
Chunk 3: word6 word7

Thus, each group that you need to analyze is available as $chunk->text inside the for loop.

Hope this helps!

Re^2: Repeating the same command in different portions of input by albascura (Novice) on Jan 15, 2013 at 20:50 UTC
It really helps, thanks. I see that `$chunk->text` doesn't preserve the newline at the end of each word. Since I need to check things line by line (I simplified my code a little in the previous example), I was wondering whether I could do something like this:

for my $chunk ( $dom->find('s')->each ) {
    my @values = split('\n', $chunk);
    foreach my $line (@values) {
        # do stuff on every line
    }
}

I'm trying it right now. I hope it works. Thanks again!
Re^3: Repeating the same command in different portions of input by Kenosis (Priest) on Jan 15, 2013 at 21:03 UTC
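Yes, `split`ting the 'chunk' is a good solution! However, since you've noticed the chunk text lacks newlines, change:

my @values = split('\n', $chunk);

to

my @values = split /\s+/, $chunk->text;

A few notes:

- This `split`s on whitespace.
- It `split`s `$chunk->text` rather than `$chunk` itself, since the element object stringifies to its markup, tags included.
- It uses a regex, not a string literal. (Also, '\n' in single quotes is the two characters backslash and n rather than a newline character. `split` still treats that string as the pattern /\n/, but a regex literal makes the intent explicit.)
- Parentheses are optional.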
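
To compare the `split` forms discussed above, here is a minimal sketch in core Perl only. The variable `$chunk_text` is a hypothetical stand-in for one chunk's text, and it assumes the newlines were kept (as far as I know, recent Mojo::DOM releases no longer trim whitespace in `text`, so this case can come up).

```perl
use strict;
use warnings;

# Hypothetical stand-in for one chunk's text, assuming newlines were kept.
my $chunk_text = "\nword1\nword2\nword3\n";

# A single-quoted '\n' is still compiled by split as the pattern /\n/.
my @by_string = split '\n', $chunk_text;

# Splitting on a whitespace regex leaves an empty leading field,
# because the text starts with a newline.
my @by_regex = split /\s+/, $chunk_text;

# The special pattern ' ' skips leading whitespace, so no empty field.
my @by_space = split ' ', $chunk_text;

print "split '\\n':  [", join('|', @by_string), "]\n";
print "split /\\s+/: [", join('|', @by_regex),  "]\n";
print "split ' ':   [", join('|', @by_space),  "]\n";
```

With this input, the first two forms return four fields, the first of which is empty, while `split ' '` returns just the three words. If the text might start with whitespace, `split ' '` is probably the more convenient choice.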