$_ = 'ju[color=1]s[/color]t a sam[color=2]p[/color]le';
my $brackets = qr(\[[^\]]*\]); # text enclosed in brackets
my $char = qr([^\[\]\s]); # not spaces or brackets
s{
( # group to $1
(?:
$char # a char
(?:$brackets*) # followed by any number of brackets
)
{3} # 3 times
)
(?:$brackets*) # followed by any number of brackets
(?= # looking forward to ensure that
$char # we have at least one char
[^\[\]]* # and we are not inside of a brackets
(?:
\[
|
\z
)
)
}
{$1-}gx;
print;
Update: fixed "last '-'" issue pointed by ikegami
P.S. Concerning '[2345678901234567890' I suggest that is not valid input. | [reply] [d/l] [select] |
| [reply] [d/l] [select] |
I noticed the end '-' thing too. I did add a section to ccn's code that handles that. Its tagged w/ the "# ADDED" comment.
++ btw ccn, took me awhile to wrap my head around this one.
my $brackets = qr(\[[^\]]*\]); # text enclosed in brackets
my $char = qr([^\[\]\s]); # not spaces or brackets
s{
( # group to $1
(?:
$char # a char
(?:$brackets*) # followed by any number of brackets
)
{3} # 3 times
)
(?!(?:$brackets*)(?:\s|\Z)) # ADDED: eliminates '-' on the end of wo
+rds w/ a multiple of 3 chars
(?= # looking forward to ensure that
[^\[\]]* # we are not inside of a brackets
(?:
\[
|
\z
)
)
}
{$1-}gx;
print;
-jbWare
| [reply] [d/l] |
sub splitlong {
my $text = shift;
my @result;
my ($cnt, $inside) = (0, 0);
for (split //, $text) {
($inside = 0), next if $inside and $_ eq ']';
$inside = 1 if !$inside and $_ eq '[';
next if $inside;
$cnt = /\s/s ? 0 : $cnt + 1;
($cnt = 0), push @result, '-' if $cnt > 3;
}
continue {
push @result, $_;
}
warn "invalid input: $text" if $inside;
return join '', @result;
}
my $text = 'ju[color=1]s[/color]t a sam[color=2]p[/color]le';
print splitlong($text);
| [reply] [d/l] |
Thank your for you code.
It works fine with a valid input i.e [tag].
If we do not check for correct [] pairs and fix it, it doesn't work. I understand it was not the task but still... We cannot just count the number of brackets in the string to check. The check has to be more complicated checking actually tags. To open a tag use '[', to close the tag use ']' not ']]' or not '[['.
I think a little check might help. Split the word followed by ']' if the last bracket is not '['.
$_ = 'sss[[]sssss]sssssss';
print $_."\n";
my $brackets = qr(\[[^\]]*\]); # text enclosed in brackets
my $char = qr([^\[\]\s]); # not spaces or brackets
s{
( # group to $1
(?:
$char # a char
(?:$brackets*) # followed by any number of brackets
)
{3} # 3 times
)
(?!(?:$brackets*)(?:\s|\Z)) # ADDED: eliminates '-' on the end of wo
+rds w/ a multiple of 3 chars
(?= # looking forward to ensure that
[^\[\]]* # we are not inside of a brackets
(?:
\[
|
\z
)
)
}
{$1-}gx;
print;
Thank you. | [reply] [d/l] [select] |
$text = 'ju[color=1]s[/color]t a tes[color=2]t[/color]';
$text =~ s/
(
(?:
(?: \[ [^]]* \] )*
[^ []
){3}
(?: \[ [^]]* \] )*
)
(?=[^ ])
/$1-/gx;
print($text, "\n");
# ju[color=1]s[/color]-t a tes[color=2]-t[/co-lor-]
It's not quite there yet, but I have to go.
| [reply] [d/l] |
I think this problem and it's solutions should be copied to Q&A when we have everything said and done.
It will be a 'must-have' solution for everyone who is doing some kind of web-chats or discussion boards. | [reply] |
The following fix for the code provided by ccn.
Should be
($cnt = 1), push @result, '-' if $cnt > 3;
not
($cnt = 0), push @result, '-' if $cnt > 3;
If $cnt = 0 then the 2nd, 3rd... group of characters in each word will be (N+1) long. In the following example the first group will be 3 characters long and all the rest groups will be 4 characters long sub splitlong {
my $text = shift;
my @result;
my ($cnt, $inside) = (0, 0);
for (split //, $text) {
($inside = 0), next if $inside and $_ eq ']';
$inside = 1 if !$inside and $_ eq '[';
next if $inside;
$cnt = /\s/s ? 0 : $cnt + 1;
($cnt = 1), push @result, '-' if $cnt > 3;
}
continue {
push @result, $_;
}
warn "invalid input: $text" if $inside;
return join '', @result;
}
my $text = 'ju[color=1]s[/color]t a sam[color=2]p[/color]le';
print splitlong($text);
| [reply] [d/l] [select] |