in reply to Re: Problem: how to split long words
in thread Problem: how to split long words

Two small issues: 'ju[color=1]s[/color]t a tesss[color=2]t[/color]' returns a string ending with a '-', and '[2345678901234567890' (an unclosed bracket) yields a 20-letter word.

Re^3: Problem: how to split long words
by jbware (Chaplain) on Aug 24, 2004 at 15:55 UTC

    I noticed the trailing '-' too. I added a section to ccn's code that handles it; it's tagged with the "# ADDED" comment.

    ++ btw ccn, it took me a while to wrap my head around this one.

    my $brackets = qr(\[[^\]]*\]);   # text enclosed in brackets
    my $char     = qr([^\[\]\s]);    # not spaces or brackets

    s{
        (                            # group to $1
            (?:
                $char                # a char
                (?:$brackets*)       # followed by any number of brackets
            ) {3}                    # 3 times
        )
        (?!(?:$brackets*)(?:\s|\Z))  # ADDED: eliminates '-' on the end of words w/ a multiple of 3 chars
        (?=                          # looking forward to ensure that
            [^\[\]]*                 # we are not inside of a brackets
            (?: \[ | \z )
        )
    }
    {$1-}gx;
    print;
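A quick way to check the "# ADDED" lookahead is to run the substitution against the test string from the parent post. The sketch below wraps the same pattern in a sub; the name `hyphenate` is hypothetical (not from the thread), and it works on its argument rather than on `$_`.

```perl
use strict;
use warnings;

my $brackets = qr(\[[^\]]*\]);   # text enclosed in brackets
my $char     = qr([^\[\]\s]);    # not spaces or brackets

# Hypothetical wrapper around the substitution above.
sub hyphenate {
    my ($text) = @_;
    $text =~ s{
        ( (?: $char (?:$brackets*) ){3} )   # three chars, each with trailing tags
        (?! (?:$brackets*) (?:\s|\Z) )      # skip if the word ends here
        (?= [^\[\]]* (?: \[ | \z ) )        # skip if we are inside a tag
    }{$1-}gx;
    return $text;
}

print hyphenate('ju[color=1]s[/color]t a tesss[color=2]t[/color]'), "\n";
# prints: ju[color=1]s[/color]-t a tes-ss[color=2]t[/color]
```

The last three visible characters of "tesss[color=2]t[/color]" are followed only by a tag and the end of the string, so the negative lookahead rejects that match and no trailing '-' is added.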


    -jbWare
      Thank you very much! ccn, jbware
      Your solution is great.
      I've made a one-liner from it and it looks like
      $text=~s{((?:([^\[\]\s])(?:(\[[^\]]*\])*)){3})(?!(?:(\[[^\]]*\])*)(?:\s|\Z))(?=[^\[\]]*(?:\[|\z))}{$1-}gx;
      It's so nice! Can be used in some obfuscation code... Thanks again!
Re^3: Problem: how to split long words
by ccn (Vicar) on Aug 24, 2004 at 19:42 UTC

    Here is a non-regexp solution. In addition to the main task, it validates the input.

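    sub splitlong {
        my $text = shift;
        my @result;
        my ($cnt, $inside) = (0, 0);   # chars in current chunk; inside-brackets flag

        for (split //, $text) {
            # a closing bracket ends the tag; it is still copied by the continue block
            ($inside = 0), next if $inside and $_ eq ']';
            $inside = 1 if !$inside and $_ eq '[';
            next if $inside;           # tag text is copied but never counted

            $cnt = /\s/s ? 0 : $cnt + 1;   # whitespace starts a new word
            # a 4th char arrived: hyphenate before it; it becomes char 1 of the
            # next chunk (the original reset to 0, which made later chunks 4 long)
            ($cnt = 1), push @result, '-' if $cnt > 3;
        }
        continue {
            push @result, $_;          # runs for every char, even after 'next'
        }
        warn "invalid input: $text" if $inside;   # unclosed bracket
        return join '', @result;
    }

    my $text = 'ju[color=1]s[/color]t a sam[color=2]p[/color]le';
    print splitlong($text);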