harishnuti has asked for the wisdom of the Perl Monks concerning the following question:


Hello Monks,
I have the following requirement.
I have a string of n characters and want to break it into 2 lines, with neither line longer than 60 characters.
This is what I have done so far:
#!/usr/bin/perl
use strict;
use warnings;

my $maxlen = 60;

print "Check the no of characters in each smstext \n";
die "Pls supply SMS Text as Argument\n" if ( $#ARGV < 0 );

my $smstext     = $ARGV[0];
my $finalbuffer = $smstext;    # short text is printed unchanged
if ( length($smstext) > $maxlen ) {
    $finalbuffer = subdivide( $smstext, $maxlen );
}
print $finalbuffer, "\n";

# This will divide into chunks of $n characters each
sub subdivide {
    my ( $s, $n ) = @_;
    $s =~ s/\G(.{$n})(?!\Z)/$1\n/g;
    return $s;
}

# Another method
print join( "\n", unpack( "(a$maxlen)*", $smstext ) ), "\n";

The code can be called like this:
ersms.pl "This sentence will have more than 120 characters and i want to truncate this string into two lines containing 60 characters each and ignore characters above 140 in length"
My problem is that with the above methods I can cut the line at 60 characters, but sometimes the line ends in the middle of a word, which is not intended.
I need each line to be at most 60 characters long and to end on a word boundary; if a word does not fit, it should move to the next line.
Please help.

Replies are listed 'Best First'.
Re: Breaking String
by moritz (Cardinal) on Oct 17, 2008 at 07:37 UTC
    You are looking for Text::Wrap, core module since perl 5.002.

      Exactly. I used the Text::Wrap module and it does solve my problem, but I only need the first 2 lines and want to ignore the rest.
      For this I have to remove every line after the second from $text, which is built as below.
      $Text::Wrap::columns = 60;
      my $text = wrap('', '', $smstext);
      print $text,"\n";
        for this i have remove lines greater than 2 in $text
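        To extract just 2 lines, you can do ($top2) = $text =~ /^(.*\n.*\n)/; (the parentheses on the left give list context, so $top2 receives the captured text rather than a true/false result).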

        You can avoid unnecessary work in Text::Wrap if you limit the string you submit to it to a bit over the absolute maximum length of 2 lines together (your text says 140 characters, indeed a bit over 2x60 + 2 for the newlines), because the rest will be cut off anyway.

        my $smstext = "This sentence will have more than 120 characters and i want to truncate this string into two lines containing 60 characters each and ignore characters above 140 in length";
        use Text::Wrap;
        local $Text::Wrap::columns = 60;
        my($text) = wrap('', '', substr($smstext, 0, 140)) =~ /^(.*\n.*\n)/;
        print $text;
        __END__
        Output:
        This sentence will have more than 120 characters and i want
        to truncate this string into two lines containing 60

      The code below takes the first two lines of the string.
      print "Now using Wrap method \n";
      $Text::Wrap::columns = 60;
      my $text = wrap('', '', $smstext);
      my @chunks = split(/\n/,$text);            # split $text into chunks
      print $chunks[0]."\n".$chunks[1],"\n";     # take first two lines
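      One thing to watch with the split approach: if the text wraps to a single line, $chunks[1] is undefined and the print warns under "use warnings". A minimal sketch of a helper that returns at most N lines of an already wrapped string; first_lines is a made-up name for illustration, not something from Text::Wrap:

```perl
use strict;
use warnings;

# first_lines() is a hypothetical helper name used only for this sketch.
# It expects text that has already been wrapped (lines separated by "\n").
sub first_lines {
    my ($wrapped, $count) = @_;
    my @lines = split /\n/, $wrapped;
    $#lines = $count - 1 if @lines > $count;   # drop everything after line $count
    return join "\n", @lines;
}

print first_lines("line one\nline two\nline three", 2), "\n";
print first_lines("only one line", 2), "\n";
```

      Because the array is only shortened when it is too long, a short message comes back unchanged instead of producing an undefined element.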

      Any other methods are welcome.
        Using the Text::Wrap module is the better way. Here is a solution that does not use the module.
        #!/usr/bin/perl
        use strict;
        use warnings;

        my $maxlen = 60;
        my $text   = defined $ARGV[0] ? $ARGV[0] : '';
        my @lines;
        my $line = '';

        # Add whole words to the current line while they still fit.
        # A single word longer than $maxlen ends up alone on an overlong line.
        for my $word (split ' ', $text) {
            if ($line eq '') {
                $line = $word;
            }
            elsif (length($line) + 1 + length($word) <= $maxlen) {
                $line .= " $word";
            }
            else {
                push @lines, $line;    # line is full, start the next one
                $line = $word;
            }
        }
        push @lines, $line if $line ne '';    # keep the last partial line

        print join("\n", @lines), "\n";

        Since other modules might be using Text::Wrap, it's a good idea to localise your change to $Text::Wrap::columns. This ensures that other modules aren't affected by your change to this variable.

        local $Text::Wrap::columns = 60;

        Kudos to bart for already mentioning this in their response.
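
        To illustrate what local buys you, here is a small sketch. wrap_at is a hypothetical helper, not part of Text::Wrap; it adds 1 to the requested width because Text::Wrap keeps every line to at most $columns - 1 characters.

```perl
use strict;
use warnings;
use Text::Wrap ();

# wrap_at() is a hypothetical helper name used only for this sketch.
sub wrap_at {
    my ($width, $text) = @_;
    # The change is visible only until this sub returns.
    local $Text::Wrap::columns = $width + 1;
    return Text::Wrap::wrap('', '', $text);
}

my $before  = $Text::Wrap::columns;
my $wrapped = wrap_at(20, 'the quick brown fox jumps over the lazy dog');
print "$wrapped\n";
print "columns is back to $before\n" if $Text::Wrap::columns == $before;
```

        Any other code that calls wrap() afterwards still sees whatever value $Text::Wrap::columns had before the call.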

Re: Breaking String
by periapt (Hermit) on Oct 17, 2008 at 19:27 UTC

    Text::Wrap is usually the best way to go in these matters. Here is a simple regex that has worked for me in the past.

    use strict;
    use warnings;
    use diagnostics;

    my $inp = 'This sentence will have more than 120 characters and i want to '.
              'truncate this string into two lines containing 60 characters each '.
              'and ignore characters above 140 in length';
    my $linelen = 60;
    my $matchlen = $linelen - 1;

    # this is what you get with your basic substring parsing;
    my $line01 = substr($inp,0,60);
    my $line02 = substr($inp,60,60);
    my $line03 = substr($inp,120);
    print '[',length($line01),'] : %',$line01,'%',"\n";
    print '[',length($line02),'] : %',$line02,'%',"\n";
    print '[',length($line03),'] : %',$line03,'%',"\n";
    print "\n";

    # this regex splits on a max of 59 chars plus one "terminating" char;
    # in this case this is just another word char (as defined by \w). this is really
    # only relevant if the line of non-breaking characters is exactly linelen in size.
    # the very last capture is the rest of the text so, theoretically, you could
    # repeat the pattern for as many match sequences as desired. useful if the "terminator"
    # changes for each field. as a side note, the regex will discard all non-word chars
    # at the split location. if needed, you could change the \W+ to any class of chars
    # such as [\s,:-]+ (hyphen last, so it is a literal and not a range)
    $inp =~ m/^(.{0,$matchlen}\w??)\W+(.{0,$matchlen}\w??)\W+(.*)/;
    $line01 = ($1 || '');
    $line02 = ($2 || '');
    $line03 = ($3 || '');
    print '[',length($line01),'] : %',$line01,'%',"\n";
    print '[',length($line02),'] : %',$line02,'%',"\n";
    print '[',length($line03),'] : %',$line03,'%',"\n";
    print "\n";

    # the added advantage of this construct is that it allows you to do something
    # with the last matching character if needed
    $inp =~ m/^(.{0,$matchlen})(\w)??\W+(.{0,$matchlen})(\w)??\W+(.*)/;
    $line01 = ($1 || '').($2 || '');
    $line02 = ($3 || '').($4 || '');
    $line03 = ($5 || '');
    print '[',length($line01),'] : %',$line01,'%',"\n";
    print '[',length($line02),'] : %',$line02,'%',"\n";
    print '[',length($line03),'] : %',$line03,'%',"\n";
    print "\n";
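
    The pattern above is written out for exactly two lines plus a remainder. One possible way to repeat the same idea for any number of lines is a \G-anchored match in a loop. This is only a sketch under my own assumptions: wrap_words is a hypothetical name, it splits on whitespace only (not on every \W character as the regex above does), and it keeps a word longer than the width whole rather than breaking it.

```perl
use strict;
use warnings;

# wrap_words() is a hypothetical helper name used only for this sketch.
sub wrap_words {
    my ($text, $width) = @_;
    (my $flat = $text) =~ s/\s+/ /g;    # collapse whitespace runs to one space
    $flat =~ s/^ | $//g;                # trim both ends
    my $rest = $width - 1;
    my @lines;
    # Each pass takes the longest run of up to $width characters that is
    # followed by a space or the end of the text. A single word longer
    # than $width falls through to the second branch and is kept whole.
    while ($flat =~ /\G(?:(\S.{0,$rest})(?: |\z)|(\S+)(?: |\z))/g) {
        push @lines, defined $1 ? $1 : $2;
    }
    return @lines;
}

my $sms = 'This sentence will have more than 120 characters and i want to '
        . 'truncate this string into two lines containing 60 characters each '
        . 'and ignore characters above 140 in length';
my @lines = wrap_words($sms, 60);
my $last  = $#lines < 1 ? $#lines : 1;      # index of the last line to print
print "$_\n" for @lines[0 .. $last];        # only the first two lines
```

    Taking the first two elements of the returned list gives the two SMS lines; the remaining elements are the text that would be dropped.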

    PJ
    use strict; use warnings; use diagnostics;
Re: Breaking String
by UnderMine (Friar) on Oct 17, 2008 at 13:13 UTC
    Maybe I have this wrong.
    use strict;
    use warnings;
    my $maxlen = 60;
    my $finalbuffer = undef;
    my $smstext = $ARGV[0];
    if ($smstext=~s/^(.{0,$maxlen})(?:\s|$)//) {
        $finalbuffer.=$1;
        $finalbuffer.="\n".$1 if ($smstext=~s/^(.{0,$maxlen})//);
    }
    print $finalbuffer
      This code splits the second line in the middle of the word "characters". On the other hand, if you use the same regex as in the first match, it seems to work.
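
      A sketch of that suggestion, with the first pattern applied to both lines. two_lines is a hypothetical name used for illustration; note that if the text starts with a word longer than the limit the pattern does not match, so nothing is returned in that case.

```perl
use strict;
use warnings;

# two_lines() is a hypothetical helper name used only for this sketch.
sub two_lines {
    my ($text, $maxlen) = @_;
    my @lines;
    while (@lines < 2 && length $text) {
        # Same pattern for every line: up to $maxlen characters that are
        # followed by whitespace or the end of the string.
        last unless $text =~ s/^(.{0,$maxlen})(?:\s|$)//;
        push @lines, $1;
    }
    return join "\n", @lines;
}

my $sms = 'This sentence will have more than 120 characters and i want to '
        . 'truncate this string into two lines containing 60 characters each '
        . 'and ignore characters above 140 in length';
my $out = two_lines($sms, 60);
print "$out\n";
```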

      PJ
      use strict; use warnings; use diagnostics;