Re: Truncating after the last period
by jwkrahn (Abbot) on Aug 22, 2011 at 17:05 UTC
|
( my $new_string = substr $string, 0, 400 ) =~ s/[^.]*\z//;
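A minimal sketch of how that one-liner behaves, wrapped in a hypothetical helper (`truncate_at_period` and the sample text are illustrative assumptions, and a limit of 20 stands in for 400). Note that if the first `$limit` characters contain no period at all, the substitution strips everything and the result is an empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the first $limit characters of $string,
# then strip everything that follows the last period in that copy.
sub truncate_at_period {
    my ($string, $limit) = @_;
    ( my $new_string = substr $string, 0, $limit ) =~ s/[^.]*\z//;
    return $new_string;
}

# Illustrative sample text only.
my $text = 'First sentence. Second sentence. Third one runs long.';
print truncate_at_period($text, 20), "\n";
```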
| [reply] [d/l] |
Re: Truncating after the last period
by ikegami (Patriarch) on Aug 22, 2011 at 17:26 UTC
|
If you're trying to wrap text, you might want Text::Wrap. Even if you're not, you might want to wrap the text then just keep the first line.
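A rough sketch of the wrap-then-keep-the-first-line idea, assuming the core Text::Wrap module; the helper name `first_wrapped_line` and the sample text are made up for illustration. Text::Wrap breaks at whitespace rather than at periods, so this gives a cut at a word boundary, not necessarily at the end of a sentence.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: wrap $text so that no line exceeds $limit
# characters, then return only the first line.
sub first_wrapped_line {
    my ($text, $limit) = @_;
    # Text::Wrap produces lines of at most $columns - 1 characters.
    local $Text::Wrap::columns = $limit + 1;
    my ($first) = split /\n/, wrap('', '', $text);
    return $first;
}

# Illustrative sample text only.
my $text = 'First sentence. Second sentence. Third one.';
print first_wrapped_line($text, 20), "\n";
```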
| [reply] |
Re: Truncating after the last period
by AR (Friar) on Aug 22, 2011 at 17:01 UTC
|
Can you show us what you've tried? We can help you with any holes in your knowledge or point out any incorrect assumptions.
| [reply] |
Re: Truncating after the last period
by BrowserUk (Patriarch) on Aug 22, 2011 at 17:06 UTC
|
$string =~ s[^.{1,400}\.\K.*?$][];
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
|
|
Let's assume the text is in the ASCII character encoding and that there is at least one period among the first 400 characters in the text.
The code you posted, and then posted again, has a defect in it. I demonstrated the defect to you in the complete, ready-to-run Perl script I posted. You haven't fixed the defect yet.
| [reply] |
|
|
#!/usr/bin/perl
use strict;
use warnings;
use open qw( :encoding(UTF-8) :std );
use Modern::Perl;
my $string = ('X' x 400) . '.';
say length $string; # Prints 401
$string =~ s[^.{1,400}\.\K.*?$][];
say length $string; # Prints 401
The first 400 characters are all 'X's. Ergo, your code demonstrates exactly nothing!
And I can safely assume that dropping your \X stuff like a hot brick means you've finally realised that it is a dead end as well.
So, I was right. Nothing more than PST.
| [reply] [d/l] |
|
|
#!/usr/bin/perl
use strict;
use warnings;
use open qw( :encoding(UTF-8) :std );
use Modern::Perl;
my $string = ('X' x 400) . '.';
say length $string; # Prints 401
$string =~ s[^.{1,400}\.\K.*?$][];
say length $string; # Prints 401
It will also truncate a string in the middle of a character. And it won't truncate a string that doesn't have a FULL STOP (U+002E) in it.
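The 401 comes from `.{1,400}\.` allowing 400 characters before the period, so the kept text can be 401 characters long. One possible variant, sketched here as an assumption rather than as anyone's posted fix, allows at most 399 characters before the period; the helper name `truncate_400` and the added `/s` modifier (so that `.` also matches newlines) are my own choices. `\K` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical variant: at most 399 characters before the period, so the
# kept text (period included) is never longer than 400 characters.
# A string with no period in its first 400 characters is left unchanged.
sub truncate_400 {
    my ($string) = @_;
    $string =~ s[^.{0,399}\.\K.*][]s;
    return $string;
}

# Periods at positions 301 and 401: the original pattern would keep 401
# characters, this variant backs off to the period at position 301.
my $string = ('X' x 300) . '.' . ('Y' x 99) . '.' . ('Z' x 10);
print length( truncate_400($string) ), "\n";
```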
| [reply] [d/l] |
|
|
die 'Bad data' unless $s =~ s[^.{1,400}\.\K.*$][];
"It will also truncate a string in the middle of a character."
Is that really a possibility?
Cos if it is, it means Perl's Unicode handling must be even more broken than I thought.
I've just had a go at making it happen and failed, but maybe I'm just not clever enough.
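One way to read that claim, assuming "character" means a user-perceived character (an extended grapheme cluster) rather than a code point: `.`, `length` and `substr` all count code points, while `\X` matches a whole grapheme cluster. A small sketch of the difference, using an illustrative two-code-point string:

```perl
use strict;
use warnings;

# "e" followed by COMBINING ACUTE ACCENT (U+0301):
# two code points that display as one user-perceived character.
my $string = "e\x{301}";

my $code_points = length $string;           # counts code points
my $graphemes   = () = $string =~ /\X/g;    # counts extended grapheme clusters
my $cut         = substr $string, 0, 1;     # keeps the bare "e", drops the accent

printf "%d code points, %d grapheme cluster(s)\n", $code_points, $graphemes;
```

Whether the posted regex can actually hit this case depends on the data: it always cuts immediately after a period, so it would take a combining mark directly following that period.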
| [reply] [d/l] |
|
|
Re: Truncating after the last period
by Anonymous Monk on Aug 22, 2011 at 19:39 UTC
|
"Remove all characters after the last period" can also be restated as "keep all characters up to and including the last period," relying on the default greedy behavior of the regex to match as many characters as it can. Keep what the regular expression captures; if it matches nothing, the string contains no period at all, so leave it unchanged.
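A minimal sketch of that greedy restatement; the helper name `keep_through_last_period` and the sample input are illustrative assumptions, and the `/s` modifier is added so that `.` also matches newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: keep everything up to and including the last period.
# A string with no period is returned unchanged.
sub keep_through_last_period {
    my ($string) = @_;
    # Greedy .* first runs to the end of the string, then backtracks
    # until the next character is a period: that is the last one.
    return $string =~ /\A(.*\.)/s ? $1 : $string;
}

print keep_through_last_period('One. Two. Three'), "\n";
```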
| [reply] |