Truncate Data from MySQL

rich731 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Truncate Data from MySQL by jwkrahn (Abbot) on Jul 07, 2009 at 11:28 UTC
`#pull out only the first 15 words from the pubText field @copy = ( split ' ', $data->{ pubText } )[ 0 .. 14 ];` [download]	[reply] [d/l]
Re: Truncate Data from MySQL by ww (Archbishop) on Jul 07, 2009 at 12:31 UTC
TIMTOWTDI (clumsier, but only slightly different): `#!/usr/bin/perl use strict; use warnings; # 777834 my (@copy, $copy, $i); @copy = split (/\s/, <DATA>, 16); for (0..14) { print $copy[$_] . " "; } __DATA__ Pull out only the first 15 words from the pubText field. This is where + I need suggestions, the code below does not work.` [download]	[reply] [d/l]
Re^2: Truncate Data from MySQL by mzedeler (Pilgrim) on Jul 07, 2009 at 18:51 UTC
`/\s/` should be `/\s+/` unless the empty string between two spaces counts as a word.	[reply] [d/l] [select]
Re^3: Truncate Data from MySQL by ww (Archbishop) on Jul 07, 2009 at 22:19 UTC
That certainly is the right way to go... and cheap at the price. ++! Some minor quibbles though: OP offers no indication of actually having double spaces between sentences but that is a not uncommon occurance, which is why your observation is so valuable: Put two spaces rather than one in "...field. This..." in my `__DATA__` and my `split` pattern does NOT DWIM) whereas yours does. The sample I used, from the OP, has no doubled spaces. Whether or not the db's text field has doubled spaces depends on how it was created. If it was simply scraped from a webpage, odds are that it has none, since browsers (and I believe, browser-substitutes) do not render but one in any string of literal whitespaces (character entities are, of course, a differnt matter). For some reason, your "`...unless the empty string between two spaces counts as a word.`" does not parse to anything plausible (possible blind spot?) for me. FMI, is there a way to persuade split to treat the empty string between two spaces as a word boundary (`\b`) or a not_word boundary (`\B`)? Update: Oversight addendum: "the empty string between two spaces" is a position (despite cf `perldoc -f split` at "As a special case for "split", using the empty pattern "//"....")	[reply] [d/l] [select]
Re^4: Truncate Data from MySQL by mzedeler (Pilgrim) on Jul 09, 2009 at 19:59 UTC
Re: Truncate Data from MySQL by CountZero (Bishop) on Jul 07, 2009 at 19:35 UTC
Wrapped up in a subroutine: `use strict; my $string = 'one two three four five six seven eight nine ten eleven twelve +thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty' +; { local $, = ', '; print first_x( $string, 10 ), "\n"; } print scalar first_x( $string, 10 ), "\n"; sub first_x { return wantarray ? ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 ] : join ' ', ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 +]; }` [download] Output: `one, two, three, four, five, six, seven, eight, nine, ten, one two three four five six seven eight nine ten` [download] If called in list context, it returns a list with the "x" first words. If called in scalar context it returns a string of the first "x" words joined by spaces. CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James	[reply] [d/l] [select]
Re^2: Truncate Data from MySQL by wol (Hermit) on Jul 07, 2009 at 22:27 UTC
Going off on a bit of a tangent, it just occurred to me that it should be possible to write the first_x function recursively, to reduce code duplication. Ie: `sub first_x { return wantarray ? ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 ] : join ' ', first_x( @_ ); }` [download] Maybe a similar pattern could be applied in other circumstances as well. -- use JAPH; print JAPH::asString();	[reply] [d/l]
Re^3: Truncate Data from MySQL by spazm (Monk) on Jul 08, 2009 at 20:01 UTC
I'm not particularly fond of the extra trip through the sub when we could use a lexical to store the data: `sub first_x { my @data = ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 ]; return wantarray ? @data : join ' ' , @data ; }` [download] then again, I don't like the 0..n slice. Why not just use pop? `sub first_x { my @data = split /\s+/, $_[0], $_[1] + 1 ; pop(@data); #throw away last element return wantarray ? @data : join ' ' , @data ; }` [download] peace, spazm	[reply] [d/l] [select]
Re: Truncate Data from MySQL by poolpi (Hermit) on Jul 07, 2009 at 12:58 UTC
text to text: `$data->{pubText} =~ /\A((?:\w+\s){14}\w+).+/; my $text = $1 if defined $1; # ...` [download] hth, PooLpi 'Ebry haffa hoe hab im tik a bush'. Jamaican proverb	[reply] [d/l]
Re^2: Truncate Data from MySQL by kulls (Hermit) on Jul 07, 2009 at 14:16 UTC
Hi, I guess you can try with mySQl string functions() , so that you can do it through your sql query itself instead of fetching all the values from the db. - Raja	[reply]
Re^3: Truncate Data from MySQL by wol (Hermit) on Jul 07, 2009 at 22:18 UTC
I wondered about that as well, but having had a look at the MySQL reference, I can't see anything that would do what the OP needs. I'm no expert though. Is it possible? Can anyone enlighten? -- use JAPH; print JAPH::asString();	[reply]
Re^4: Truncate Data from MySQL by stevemayes (Scribe) on Jul 08, 2009 at 05:46 UTC
Re^5: Truncate Data from MySQL by zebedee (Pilgrim) on Jul 08, 2009 at 07:53 UTC


Do you know where your variables are?
	PerlMonks