rich731 has asked for the wisdom of the Perl Monks concerning the following question:
Dear Monks,
I am seeking advice on how to limit the amount of words that print to when I bring in data from a MySQL text field. Any suggestions on where to start would be much appreciated. Below is a sample of code with comments on what I'm trying to do.
Thanks,
Rich
#get data from DB
my $sth = $dbh->prepare("SELECT * FROM publications") or error($cgi, "
+Error #1");
$sth->execute( ) or error($cgi, "Error #2: ". $dbh->errstr( ));
#loop through data
while (my $data = $sth->fetchrow_hashref) {
#pull out only the first 15 words from the pubText field
#this is where I need suggestions, the code below does not work
while ($count < 15){
@copy = split " ", $data->{pubText};
$count ++
}
#print resulting data
print qq{
<td width="195" align="left" valign="top"><div class
+="newsBox"><span class="newsTitle">$data->{pubTitle}</span><br />
@copy
<a href="http://linkToFullArticle.html">More...<
+/a>
</div></td>
};
}
Re: Truncate Data from MySQL
by jwkrahn (Abbot) on Jul 07, 2009 at 11:28 UTC
|
#pull out only the first 15 words from the pubText field
@copy = ( split ' ', $data->{ pubText } )[ 0 .. 14 ];
| [reply] [d/l] |
Re: Truncate Data from MySQL
by ww (Archbishop) on Jul 07, 2009 at 12:31 UTC
|
TIMTOWTDI (clumsier, but only slightly different):
#!/usr/bin/perl
use strict;
use warnings;
# 777834
my (@copy, $copy, $i);
@copy = split (/\s/, <DATA>, 16);
for (0..14) {
print $copy[$_] . " ";
}
__DATA__
Pull out only the first 15 words from the pubText field. This is where
+ I need suggestions, the code below does not work.
| [reply] [d/l] |
|
| [reply] [d/l] [select] |
|
That certainly is the right way to go... and cheap at the price. ++!
Some minor quibbles though:
- OP offers no indication of actually having double spaces between sentences but that is a not uncommon occurance, which is why your observation is so valuable: Put two spaces rather than one in "...field. This..." in my __DATA__ and my split pattern does NOT DWIM) whereas yours does.
- The sample I used, from the OP, has no doubled spaces.
- Whether or not the db's text field has doubled spaces depends on how it was created. If it was simply scraped from a webpage, odds are that it has none, since browsers (and I believe, browser-substitutes) do not render but one in any string of literal whitespaces (character entities are, of course, a differnt matter).
For some reason, your "...unless the empty string between two spaces counts as a word." does not parse to anything plausible (possible blind spot?) for me. FMI, is there a way to persuade split to treat the empty string between two spaces as a word boundary (\b) or a not_word boundary (\B)?
Update: Oversight addendum: "the empty string between two spaces" is a position (despite cf perldoc -f split at "As a special case for "split", using the empty pattern "//"....")
| [reply] [d/l] [select] |
|
Re: Truncate Data from MySQL
by CountZero (Bishop) on Jul 07, 2009 at 19:35 UTC
|
Wrapped up in a subroutine: use strict;
my $string
= 'one two three four five six seven eight nine ten eleven twelve
+thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty'
+;
{
local $, = ', ';
print first_x( $string, 10 ), "\n";
}
print scalar first_x( $string, 10 ), "\n";
sub first_x {
return
wantarray
? ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 ]
: join ' ', ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1
+];
}
Output:one, two, three, four, five, six, seven, eight, nine, ten,
one two three four five six seven eight nine ten
If called in list context, it returns a list with the "x" first words. If called in scalar context it returns a string of the first "x" words joined by spaces.
CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
| [reply] [d/l] [select] |
|
Going off on a bit of a tangent, it just occurred to me that it should be possible to write the first_x function recursively, to reduce code duplication. Ie:
sub first_x {
return
wantarray
? ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 ]
: join ' ', first_x( @_ );
}
Maybe a similar pattern could be applied in other circumstances as well.
--
use JAPH;
print JAPH::asString();
| [reply] [d/l] |
|
I'm not particularly fond of the extra trip through the sub when we could use a lexical to store the data:
sub first_x {
my @data = ( split /\s+/, $_[0], $_[1] + 1 )[ 0 .. $_[1] - 1 ];
return
wantarray
? @data
: join ' ' , @data
;
}
then again, I don't like the 0..n slice. Why not just use pop?
sub first_x {
my @data = split /\s+/, $_[0], $_[1] + 1 ;
pop(@data); #throw away last element
return
wantarray
? @data
: join ' ' , @data
;
}
peace,
spazm | [reply] [d/l] [select] |
Re: Truncate Data from MySQL
by poolpi (Hermit) on Jul 07, 2009 at 12:58 UTC
|
$data->{pubText} =~ /\A((?:\w+\s){14}\w+).+/;
my $text = $1 if defined $1;
# ...
hth, PooLpi
| [reply] [d/l] |
|
Hi,
I guess you can try with mySQl string functions() , so that you can do it through your sql query itself instead of fetching all the values from the db.
- Raja
| [reply] |
|
| [reply] |
|
|
|
|