FatDog has asked for the wisdom of the Perl Monks concerning the following question:
I have several million long-text descriptions that I need to truncate to 1000 characters. This often leaves me with a block of text that looks like:
"...one of his best tracks. It's good. His other notew"
I want to identify these fragments and remove the partial sentence or all chars past the last period to get this:
"...one of his best tracks. It's good."
I have used split to create an array based on "." chars, then truncated the last member off and rejoined, but this adds a 20-fold increase in processing time.
if (length($lRow) > 1000) {
    $lRow = substr($lRow, 0, 1000);
    $lRow =~ tr/\. xxxx$//; # here is where I have trouble
}
I also know I have to be careful of greedy pattern matching, but I am unsure how to use the non-greedy "+?" regexps.
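For readers following along, here is a minimal sketch of one way to do what the question describes. It is not taken from any of the replies. The helper names truncate_to_sentence and truncate_to_sentence_re are hypothetical, and the code assumes that every "." ends a sentence, so abbreviations and decimals would be cut too. The first version uses rindex and substr, which avoids building the intermediate list that split and join create. The second uses a regex in which greedy matching is actually what is wanted: the greedy .* backtracks from the end of the string to the last period, whereas a non-greedy .*? would stop at the first one.

```perl
use strict;
use warnings;

# Hypothetical helper: cut $text to at most $max characters, then drop any
# trailing partial sentence by cutting just after the last period.
sub truncate_to_sentence {
    my ($text, $max) = @_;
    return $text if length($text) <= $max;

    my $cut  = substr($text, 0, $max);
    my $last = rindex($cut, '.');       # position of the last '.', or -1

    # No period at all: keep the hard cut rather than return an empty string.
    return $cut if $last < 0;

    return substr($cut, 0, $last + 1);  # keep the period itself
}

# Regex variant (also hypothetical): the greedy .* inside the capture
# backtracks from the end of the string to the last period.
sub truncate_to_sentence_re {
    my ($text, $max) = @_;
    return $text if length($text) <= $max;

    my $cut = substr($text, 0, $max);
    $cut =~ s/^(.*\.).*\z/$1/s;         # /s lets . match newlines as well
    return $cut;                        # unchanged if there is no period
}

# Made-up sample text based on the fragment in the question.
my $text = "...one of his best tracks. It's good. "
         . "His other noteworthy songs are listed here.";
print truncate_to_sentence($text, 53), "\n";
print truncate_to_sentence_re($text, 53), "\n";
```

With the limit set to 53 instead of 1000, the hard cut ends at "notew", and both subs then trim back to the period after "good". Text with no period in the first $max characters is returned as the plain hard cut.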

Replies are listed 'Best First'.

Re: Truncating Last Sentence
by japhy (Canon) on May 14, 2004 at 20:38 UTC

Re: Truncating Last Sentence
by Enlil (Parson) on May 14, 2004 at 19:35 UTC

Re: Truncating Last Sentence
by sacked (Hermit) on May 14, 2004 at 19:36 UTC

Re: Truncating Last Sentence
by Belgarion (Chaplain) on May 14, 2004 at 19:34 UTC

Re: Truncating Last Sentence
by FatDog (Beadle) on May 14, 2004 at 21:00 UTC
by Belgarion (Chaplain) on May 14, 2004 at 21:15 UTC
by paulbort (Hermit) on May 17, 2004 at 18:19 UTC

Re: Truncating Last Sentence
by qq (Hermit) on May 14, 2004 at 23:12 UTC
by dimar (Curate) on May 17, 2004 at 18:52 UTC

Re: Truncating Last Sentence
by Enlil (Parson) on May 14, 2004 at 20:00 UTC