in reply to Re: regexp question
in thread regexp question
    $string = "now is the time for all good men to come to the aid of their country while the gratuitously extralongwordwithmorethantwentycharactersexdtendson and on.";
    grep { push(@arr,substr($_,0,19)) } split (" ",$string);
    print join("\n",@arr);

prints:

    now
    is
    the
    time
    for
    all
    good
    men
    to
    come
    to
    the
    aid
    of
    their
    country
    while
    the
    gratuitously
    extralongwordwithmo    ###1
    and
    on.

while tfoertsch's "this works for me"

    while ( $string =~ /(\S.{0,19})(?=\s|$)/g ) {    # NB: "=~" here, rather than a simple "=" in original.
        push(@arr, $1);
    }
    for my $arr(@arr) {
        print $arr . "\n";
    }

prints:

    now is the time for
    all good men to come
    to the aid of their
    country while the
    gratuitously
    charactersexdtendson    ###2
    and on.
Note that tfoertsch's output does most of what's specified in the OP, BUT both ###1 and ###2 truncate the "extra long word": ###1 keeps only its first 19 characters (losing the tail), while ###2 keeps only its last 20 (losing the head). That's a problem only if the source data can't be relied upon to use words of more ordinary length. In non-technical English, this isn't likely to be a problem, but I wouldn't want to bet on it in German or any of the other Germanic/Low Countries/Scandinavian languages.
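If the truncation matters for your data, one possible workaround (a sketch only, not something proposed by anyone in this thread) is to add a second alternative to tfoertsch's regex so that a word too long to fit is kept whole on a line of its own. The sub name `wrap_keep_long_words` and its `$width` parameter are my own invention for illustration, and I'm assuming that letting an overlong word overflow the width is acceptable.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap $text to lines of at most
# $width characters, but let any single word longer than $width stand alone
# on its own line, untruncated.
sub wrap_keep_long_words {
    my ($text, $width) = @_;
    my $rest = $width - 1;    # characters allowed after the leading \S
    my @lines;
    # First alternative: tfoertsch's pattern, a chunk of up to $width
    # characters that ends at whitespace or end of string.
    # Second alternative: tried only when the first fails at this position,
    # i.e. at the start of a word that cannot fit; take the whole word.
    while ( $text =~ /(\S.{0,$rest})(?=\s|$)|(\S+)/g ) {
        push @lines, defined $1 ? $1 : $2;
    }
    return @lines;
}

my $string = "now is the time for all good men to come to the aid of their country while the gratuitously extralongwordwithmorethantwentycharactersexdtendson and on.";
print "$_\n" for wrap_keep_long_words($string, 20);
```

With the sample string, the long word should come out intact on its own line between "gratuitously" and "and on.", with the other lines unchanged. Whether an overflowing line is better than a truncated word depends on what the output is for.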
Update, in light of the estimable swampyankee's comment below: This may be an example of one of the cases swampyankee had in mind when opting for split.