in reply to Re: Re: Extracting a substring of N chars ignoring embedded HTML
in thread Extracting a substring of N chars ignoring embedded HTML
Thanks for the feedback!

    while ( my $token = $p->get_token ) {
        if ( $token->is_text ) {
            my $text = $token->as_is;
            $text =~ s/\s+/ /g;
            if ( length($text) + $total <= 200 ) {
                $doc2  .= $text;
                $total += length($text);
            }
            else {
                for ( split / /, $text ) {
                    if ( $total + length($_) <= 200 ) {
                        $doc2  .= $_ . ' ';
                        $total += length($_) + 1;
                    }
                    else {
                        last;
                    }
                }
                chop($doc2) if $doc2 =~ /\s$/;
                last;
            }
        }
        else {
            $doc2 .= $token->as_is;
        }
    }

This line:

    $doc2 .= substr( $tkntext, 0, rindex( $tkntext, ' ', $maxlen );

should be:

    $doc2 .= substr( $tkntext, 0, rindex( $tkntext, ' ', $maxlen ) );

It was missing a closing bracket. =)
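
For anyone who wants to try the substr/rindex idea on its own, here is a minimal sketch using only core Perl. The sub name `truncate_at_word` and the hard-cut fallback are my own assumptions for illustration, not something from the thread. The fallback matters because rindex returns -1 when it finds no space, and a negative length passed to substr only trims characters off the end of the string instead of cutting it at the limit.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): cut $text to at most $maxlen
# characters, backing up to the last space so that no word is split.
sub truncate_at_word {
    my ( $text, $maxlen ) = @_;

    # Nothing to do if the text already fits.
    return $text if length($text) <= $maxlen;

    # rindex searches backwards from position $maxlen and returns the
    # position of the last space at or before it, or -1 if there is none.
    my $cut = rindex( $text, ' ', $maxlen );

    # Assumption: with no space to break on, fall back to a hard cut.
    $cut = $maxlen if $cut < 0;

    return substr( $text, 0, $cut );
}

print truncate_at_word( 'The quick brown fox jumps', 12 ), "\n";   # prints "The quick"
```

In the token loop you would call something like this on the text token that overflows the limit, passing whatever room is left as `$maxlen`.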