Re: Splitting a string into words
by tye (Sage) on Aug 11, 2005 at 18:01 UTC
|
Fun. Some edge cases got missed several times. Strings under 12 characters and a first word of over 12 characters were the most common, I think. Though what to do when the first word is over 12 characters is not clear from the problem statement. A strict interpretation of the problem statement would end up with an empty string for that case. More likely, the first 12 characters of the first word are appropriate.
But I think you can still do this and keep it pretty simple:
my( $first )= $message =~ /^\s*(.{1,12}\s|.{0,12})/;
Many ways to do this. For such a tiny, simple operation, I could envision a half-dozen tests in the UT suite off the top of my head. (:
Note that I don't collapse internal spaces, which is a nice touch in Transient's solution.
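For what it's worth, here is a rough sketch of what those half-dozen tests might look like with Test::More. The first_chunk wrapper and the sample strings are made up for illustration; only the regex comes from the line above. The last case is worth a look: a short multi-word string with no trailing whitespace stops at its internal space, because the first alternative is tried before the second.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical wrapper around the one-liner, so the tests have something to call.
sub first_chunk {
    my ($message) = @_;
    my ($first) = $message =~ /^\s*(.{1,12}\s|.{0,12})/;
    return $first;
}

# Note that the captured chunk keeps the whitespace that ended it.
is( first_chunk('This is a test message'), 'This is a ',   'stops at the last space that fits' );
is( first_chunk('short'),                  'short',        'single word under 12 characters' );
is( first_chunk('supercalifragilistic'),   'supercalifra', 'long first word is cut at 12 characters' );
is( first_chunk('   leading spaces here'), 'leading ',     'leading whitespace is skipped' );
is( first_chunk(''),                       '',             'empty string gives an empty chunk' );
is( first_chunk('this is'),                'this ',        'short multi-word string stops at its internal space' );

done_testing();
```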
| [reply] [d/l] |
Re: Splitting a string into words
by pbeckingham (Parson) on Aug 11, 2005 at 17:00 UTC
|
This will extract anything up to 12 characters (greedy) that is followed by whitespace. It does what you described, but needs work to be of more general use.
#! /usr/bin/perl
use strict;
use warnings;
my $string = 'this is a test message';
my ($chunk) = $string =~ /(.{1,12})\s/;
print $chunk, "\n";
pbeckingham - typist, perishable vertebrate.
| [reply] [d/l] |
|
|
(?:\s|$) works better than \s.
Another alternative is \b, which will chop after a word, but before punctuation.
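A small sketch to make the difference visible. The two sample strings are made up for illustration; the three patterns are the parent's regex with each terminator swapped in.

```perl
use strict;
use warnings;

# Same capture each time; only the terminator after it changes.
my @patterns = (
    [ '\s'       => qr/(.{1,12})\s/       ],
    [ '(?:\s|$)' => qr/(.{1,12})(?:\s|$)/ ],
    [ '\b'       => qr/(.{1,12})\b/       ],
);

for my $string ( 'short text', 'Hello world, again' ) {
    print "'$string'\n";
    for my $p (@patterns) {
        my ( $label, $re ) = @$p;
        my ($chunk) = $string =~ $re;
        printf "  %-9s => '%s'\n", $label, defined $chunk ? $chunk : '(no match)';
    }
}

# 'short text':          \s gives 'short'; (?:\s|$) and \b give 'short text'
# 'Hello world, again':  \s and (?:\s|$) give 'Hello world,'; \b gives 'Hello world'
```

With plain \s the short string loses its last word, since nothing follows it; (?:\s|$) fixes that. Note that \b can also match at the start of a word, so with other inputs it may leave a trailing space in the chunk.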
| [reply] [d/l] [select] |
|
|
Depends on what a word is. If a word is \w+, this variant might be of use: ((\s|\w){0,11})(\s|$)
| [reply] [d/l] |
Re: Splitting a string into words
by Transient (Hermit) on Aug 11, 2005 at 17:08 UTC
|
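#!/usr/bin/perl
use strict;
use warnings;
my $max_str_len = 12;
my $string = "This is a test message";
$string =~ tr/ / /s;    # squeeze runs of spaces down to a single space
$string =~ s/^\s*//;    # drop leading whitespace
# take at most $max_str_len characters, ending at whitespace or end of string
$string =~ /(.{0,$max_str_len})(\s|$)/;
my $trunc_string = $1;
my @words = split /\s+/, $trunc_string;
print "Word: ", $_, "\n" foreach @words;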
| [reply] [d/l] |
Re: Splitting a string into words
by chester (Hermit) on Aug 11, 2005 at 17:22 UTC
|
Text::Wrap isn't necessarily overkill. What about words that are longer than 12 characters? The answers given so far don't handle that, so far as I can tell. Text::Wrap isn't the greatest, but at least doesn't lose any letters.
This code is also a bit more readable, in my opinion.
use warnings;
use strict;
use Text::Wrap qw(wrap);
my $phrase = 'This is a test message with lots of words in it. Tintinabulations!';
$Text::Wrap::columns = 13;
my $wrapped = wrap('','',$phrase);
print $wrapped;
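If only the first chunk is wanted, one option is to take the first wrapped line. The sketch below also checks the "doesn't lose any letters" claim by comparing the text with all whitespace removed before and after wrapping. It assumes the default $Text::Wrap::huge setting of 'wrap', spelled out here for clarity; the variable names are mine.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

my $phrase = 'This is a test message with lots of words in it. Tintinabulations!';

$Text::Wrap::columns = 13;      # lines come out at most 12 characters wide
$Text::Wrap::huge    = 'wrap';  # documented default: break words that are too long

my @lines = split /\n/, wrap( '', '', $phrase );
my $first = $lines[0];

# Strip all whitespace from both versions to compare the letters only.
( my $before = $phrase ) =~ s/\s+//g;
( my $after = join '', @lines ) =~ s/\s+//g;

printf "%-12s (%d)\n", $_, length($_) for @lines;
print "first chunk: '$first'\n";
print "letters preserved: ", ( $before eq $after ? "yes" : "no" ), "\n";
```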
| [reply] [d/l] |
Re: Splitting a string into words
by cog (Parson) on Aug 11, 2005 at 16:51 UTC
|
| [reply] |
Re: Splitting a string into words
by borisz (Canon) on Aug 11, 2005 at 17:25 UTC
|
my $string = 'this is a test message';
my ($chunk) = $string =~ /(.{1,12})\b/;
| [reply] [d/l] |
Re: Splitting a string into words
by ysth (Canon) on Aug 11, 2005 at 17:00 UTC
|
$msg = "This is a test message";
@lines = $msg =~ /(?>\s*)(.{0,11}\S)(?!\S)/g;
Update: this strips leading characters if any \S+ "word" is longer than 12.
| [reply] [d/l] |
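ysth's update is easy to confirm by running the pattern on a message that contains a word longer than 12 characters. A small sketch; the second message is made up for illustration.

```perl
use strict;
use warnings;

for my $msg ( 'This is a test message', 'a supercalifragilistic word' ) {
    # Each match skips whitespace, then grabs up to 12 characters
    # ending on a non-space that is not followed by another non-space.
    my @lines = $msg =~ /(?>\s*)(.{0,11}\S)(?!\S)/g;
    print join( ' | ', @lines ), "\n";
}

# Prints:
# This is a | test message
# a | ifragilistic | word
```

In the second message the leading "supercal" is silently dropped: no match can start at the beginning of the long word, so the regex engine moves its start position forward until the last 12 characters of the word fit.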
Re: Splitting a string into words
by tphyahoo (Vicar) on Aug 12, 2005 at 08:41 UTC
|
The "natural" way to do it:
No more than 12 characters: ^.{0,12}
Words: ^(\w+(\s|$))*
Both must be true:
HOW?
Is there maybe some way to do this with a Perl 6 rule, or Parse::RecDescent?
| [reply] [d/l] [select] |