Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I am trying to find the longest word in a sentence and, if it is more than 30 characters long, insert a space to break it up so that it does not break the layout where it is displayed. I am trying to use a regular expression, but at this point I have lost track and can't get it to work. Here is some code where a loop finds the longest string in this test sentence, but the regular-expression version fails. Any help would be nice!
    #!/usr/bin/perl
    my $str2 = "Be careful what you wish for you just might get it ---------------test-----------results----------.";
    my @words = split(/\b/, $str2);
    my $longestwordlength = 30;
    my $longestword;
    foreach my $word (split(/\s+/, $str2)) {
        if (length($word) > $longestwordlength) {
            $longestword = $word;
            print "\n$longestword\n\n";
        }
    }
Here is where the magic should happen, but it doesn't:
    (my $break2 = $str2) =~ s/(length($str2)>30)/$1\n/mg;
    print "\n$break2\n";
Thanks for looking!

Replies are listed 'Best First'.
Re: Finding the largest word in a string help!
by moritz (Cardinal) on Dec 12, 2011 at 16:11 UTC
    (my $break2 = $str2)=~ s/(length($str2)>30)/$1\n/mg;

    You can't just use any code inside a regular expression and expect it to work.
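
    To illustrate the point (a small sketch, with $word and the test strings made up for the example): inside a pattern, length(...) is not a function call. The word "length" and ">30" are matched as literal text, the parentheses are ordinary groups, and the variable is interpolated as part of the pattern:

    ```perl
    use strict;
    use warnings;

    my $word = 'short';    # hypothetical sample value

    # This is NOT a call to length(): "length" and ">30" are literal
    # text, the parentheses only group, and $word is interpolated.
    my $pattern = qr/(length($word)>30)/;

    # The pattern matches the literal text it spells out ...
    print "literal text matches\n" if 'lengthshort>30' =~ $pattern;

    # ... and has nothing to do with how long a string is.
    print "a 40-character string does not match\n"
        unless ('a' x 40) =~ $pattern;
    ```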

    I'd approach the problem like this:

        sub break_word {
            my $str = shift;
            # break up $str into separate words,
            # and return the result
            ...
        }

        # search for all words longer than 30 characters:
        s/(\S{31,})/ break_word($1) /eg;
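
    A minimal sketch of how the elided body of break_word might be filled in, assuming the goal is simply to put a space after every 30 characters of an over-long word. The $max and $min variables, the chunking logic, and the sample string are assumptions for illustration, not moritz's code:

    ```perl
    use strict;
    use warnings;

    my $max = 30;          # assumed limit, taken from the question
    my $min = $max + 1;    # only words longer than $max are touched

    # Hypothetical body for break_word: cut the word into chunks of at
    # most $max characters and rejoin them with single spaces.
    sub break_word {
        my $str    = shift;
        my @chunks = $str =~ /(.{1,$max})/gs;
        return join ' ', @chunks;
    }

    my $text = 'might get it ---------------test-----------results----------.';

    # /e evaluates the replacement as Perl code, so break_word() is
    # called for every run of more than $max non-whitespace characters.
    (my $broken = $text) =~ s/(\S{$min,})/ break_word($1) /eg;

    print "$broken\n";
    ```

    Because the last chunk is whatever is left over, a word of exactly 30 or 60 characters does not pick up a trailing space.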
Re: Finding the largest word in a string help!
by Anonymous Monk on Dec 12, 2011 at 16:20 UTC
    Splitting is unnecessary.

        $_ = 'might get it ---------------test-----------results----------.';
        s/(\S{30})/$1 /g;
        # $_ is now
        # 'might get it ---------------test----------- results----------.'
      I like this approach. I am trying to avoid inserting a break if the string is a link starting with "http", but it gets complicated, I guess:
          my $str2 = "Be careful what you wish for you just might get it http://www.perlmonks.com/?parent=943112;node_id=3333 ---------------test-----------resutls----------!";
          (my $break3 = $str2) =~ s/([^http.*?\s+]\S{30})/$1 /g;
          print "\n\n$break3\n\n";

        I like this approach. I am trying to avoid inserting a break if the string is a link starting with "http", but it gets complicated, I guess:

        Um, are you trying to invent the syntax or guess?

        This is your regex

            use YAPE::Regex::Explain;
            print YAPE::Regex::Explain->new( qr/([^http.*?\s+]\S{30})/ )->explain;
            __END__
            The regular expression:

            (?-imsx:([^http.*?\s+]\S{30}))

            matches as follows:

            NODE                     EXPLANATION
            ----------------------------------------------------------------------
            (?-imsx:                 group, but do not capture (case-sensitive)
                                     (with ^ and $ matching normally) (with . not
                                     matching \n) (matching whitespace and #
                                     normally):
            ----------------------------------------------------------------------
              (                        group and capture to \1:
            ----------------------------------------------------------------------
                [^http.*?\s+]            any character except: 'h', 't', 't',
                                         'p', '.', '*', '?', whitespace (\n, \r,
                                         \t, \f, and " "), '+'
            ----------------------------------------------------------------------
                \S{30}                   non-whitespace (all but \n, \r, \t, \f,
                                         and " ") (30 times)
            ----------------------------------------------------------------------
              )                        end of \1
            ----------------------------------------------------------------------
            )                        end of grouping
            ----------------------------------------------------------------------

        Neither approach is going to work

        Try

            s{(\S{30,})}{
                my $ret = $1;
                if ( $ret !~ /http/ ) {
                    $ret .= " ";
                }
                $ret;
            }ge;

        see also perlintro, perlretut, perlre#(?<!pattern)
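
        One way to combine the URL check with a break inside the long word itself, sketched under the assumption that a link is any whitespace-delimited word starting with "http". The variable names and the sample string are made up for the example:

        ```perl
        use strict;
        use warnings;

        my $max = 30;    # assumed limit, taken from the question

        my $str = 'get it http://www.perlmonks.com/?parent=943112;node_id=3333 '
                . '---------------test-----------results----------!';

        # Visit every whitespace-delimited word of at least $max characters.
        # Words that start with "http" are returned unchanged; any other
        # word gets a space after each full chunk of $max characters that
        # is followed by more non-whitespace.
        (my $broken = $str) =~ s{(\S{$max,})}{
            my $word = $1;
            $word =~ s/(\S{$max})(?=\S)/$1 /g unless $word =~ /^http/;
            $word;
        }ge;

        print "$broken\n";
        ```

        The lookahead (?=\S) keeps a word of exactly 30 characters from getting a trailing space.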

      Problem is, you end up adding a space even if the string is exactly 30 characters, resulting in a potential double-space.
Re: Finding the largest word in a string help!
by TJPride (Pilgrim) on Dec 12, 2011 at 19:12 UTC
    Splitting into chunks of no more than an arbitrary length is easy:

        use strict;
        use warnings;

        my $max = 5;
        my $str = '123456789012';
        $str =~ s/(\S{$max})(?=\S)/$1 /g;
        print $str;    # 12345 67890 12

    But what if you want to split into equal pieces?

        use strict;
        use warnings;

        my $max = 5;
        my $str = '123456789012';
        $str =~ s/(\S\S{$max,})/mySplit($1)/eg;
        print $str;    # 1234 5678 9012

        sub mySplit {
            my $s = $_[0];
            my ($n, $i, @p, @r);

            ### Calculate number of pieces
            $n = ceil (length($s) / $max);

            ### Calculate length of pieces
            for ($i = 0; $i < length $s; $i++) {
                $p[$i % $n]++;
            }

            ### Get pieces
            for (@p) {
                $s =~ s/(.{$_})//;
                push @r, $1;
            }

            return join ' ', @r;
        }

        sub ceil {
            return $_[0] == int $_[0] ? $_[0] : int ($_[0] + 1);
        }
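
    A variant sketch of the same idea, assuming you would rather use ceil from the core POSIX module than a hand-written one, and substr instead of repeated substitutions. The name split_evenly and the sample string are hypothetical:

    ```perl
    use strict;
    use warnings;
    use POSIX qw(ceil);

    # Hypothetical helper: split $word into the fewest pieces of at most
    # $max characters, with piece lengths differing by at most one.
    sub split_evenly {
        my ($word, $max) = @_;
        my $len    = length $word;
        my $pieces = ceil($len / $max);
        my $base   = int($len / $pieces);    # minimum piece length
        my $extra  = $len % $pieces;         # this many pieces get one more

        my @parts;
        my $pos = 0;
        for my $i (0 .. $pieces - 1) {
            my $size = $base + ($i < $extra ? 1 : 0);
            push @parts, substr($word, $pos, $size);
            $pos += $size;
        }
        return join ' ', @parts;
    }

    my $max = 5;
    my $str = '123456789012 abc 1234567';

    # Words of exactly $max characters come back as a single piece.
    $str =~ s/(\S{$max,})/split_evenly($1, $max)/eg;
    print "$str\n";    # 1234 5678 9012 abc 1234 567
    ```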