Counting fields from database

htmanning has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Counting fields from database (updated) by haukex (Archbishop) on Oct 02, 2020 at 21:49 UTC
I need to know the total number of threadIDs It's a little unclear to me from your description whether you just need to know this information, in which case you can ask the database to provide it with a `GROUP BY` clause, or whether you're looping through all messages anyway, and you just want to count thread IDs. In the latter case, since I assume the thread IDs are a unique identifier per thread, you can do this with a hash. Here are examples of both: use warnings; use strict; use Data::Dump; use DBI; # set up a dummy database my $dbh = DBI->connect("dbi:SQLite::memory:", undef, undef, { RaiseError=>1 }); $dbh->do(<<'END'); CREATE TABLE messages ( msgID INTEGER, threadID INTEGER ) END my $sth_i = $dbh->prepare( 'INSERT INTO messages (msgID, threadID) VALUES (?,?)'); $sth_i->execute($_,$_>>4) for 0..127; # ask the DB to give us per-thread information dd $dbh->selectall_arrayref( 'SELECT threadID, COUNT() FROM messages GROUP BY threadID'); # => [ [0, 16], [1, 16], [2, 16], [3, 16], # [4, 16], [5, 16], [6, 16], [7, 16], ] # loop through rows and count threads ourselves my $sth_s = $dbh->prepare( 'SELECT msgID, threadID from messages'); $sth_s->execute; my %threadIDs; while ( my $row = $sth_s->fetchrow_hashref ) { $threadIDs{ $row->{threadID} }++; } dd \%threadIDs; # => { "0" => 16, "1" => 16, "2" => 16, "3" => 16, # "4" => 16, "5" => 16, "6" => 16, "7" => 16 } [download] * Update: Upon rereading, it's probably the latter. By the way, `$current_count_all <= $stopcount_all` sounds like you want to limit the number of records returned, which you can also do in the database, e.g. in MySQL with a `LIMIT` clause. Update 2: For completeness, there are two issues with the approach you showed in the OP. First, you need to be certain that `$threadID` doesn't ever contain the separator character (`","` in this case), and your regex is too simple in that it will also match partial IDs (e.g. `/234/` will match in `"1234,5678"`), which you'd need to prevent by anchoring the regex appropriately - for example, if the IDs are integers, you could say `/\b\Q$threadID\E\b/)` (note I've used `\Q...\E` to escape any special characters in `$threadID` even though I just said they're integers; it's just to play it extra safe). But the method of searching the string for the `$threadID` will be way less efficient than a hash, so I would always recommend that instead. Also modified the first example to supply the threadID in addition to the count.	[reply] [d/l] [select]
Re^2: Counting fields from database (updated) by htmanning (Friar) on Oct 03, 2020 at 00:49 UTC
Thanks for this. I've updated my regex with what you suggested. You're right, mine was too simple. I used this example from someone else below and it works, but I still have to query the database twice. I can't seem to figure out how to do it in one loop, but it works. I loop through to count the total threads, then loop through again to print the rows.	[reply]
Re^2: Counting fields from database (updated) by htmanning (Friar) on Oct 03, 2020 at 01:31 UTC
I think my logic is flawed. This is a better example of what I'm doing: This is the query of the database: `Select * from messages where (username='$username' or toname='$usernam +e') and message!='' order by dateadded desc, ID desc` [download] This follows: `$titlesperpage = 10; $currentpage = 1; $startcount = ($currentpage - 1) * $titlesperpage + 1; $stopcount = $currentpage * $titlesperpage; $current_count = 0; $threadcount=0; $threadIDlist = ""; while ($pointer3 = $sth3->fetchrow_hashref){ $current_count++; $threadID = $pointer3->{'threadID'}; $threads{$pointer3->{'threadID'}}++; if ($threadIDlist !~ /$threadID/) { $threadIDlist=$threadID . "," . $threadIDlist; $current_count--; $threadcount++; } }` [download] I try to only print 10 records per page. It works perfectly for the first page, but when I come back to print the second page it duplicates some threadIDs because they are not yet in the $threadIDlist var. The second page is called with this query_string: `?currentpage=2` [download]	[reply] [d/l] [select]
Re^3: Counting fields from database by haukex (Archbishop) on Oct 03, 2020 at 07:29 UTC
Sorry, but your code is not runnable, and doesn't seem to be representative of the problem you're having. Please see How do I post a question effectively?, I know what I mean. Why don't you?, and Short, Self-Contained, Correct Example. This is a guess of what you want that I've pieced together from your various posts. use warnings; use strict; use Data::Dump; use DBI; my $titlesperpage = 10; # set up a dummy database my $dbh = DBI->connect("dbi:SQLite::memory:", undef, undef, { RaiseError=>1 }); $dbh->do(<<'END'); CREATE TABLE messages ( msgID INTEGER, threadID INTEGER ) END my $sth_i = $dbh->prepare( 'INSERT INTO messages (msgID, threadID) VALUES (?,?)'); $sth_i->execute($_,$_>>4) for 0..127; for my $currentpage (1..5) { # simulate requests for different pages my $startcount = ($currentpage - 1) * $titlesperpage + 1; my $stopcount = $currentpage * $titlesperpage; dd $startcount, $stopcount; my $sth_s = $dbh->prepare('SELECT msgID, threadID FROM messages'); $sth_s->execute; my %threadIDs; my $rowcnt = 1; while ( my $row = $sth_s->fetchrow_hashref ) { $threadIDs{ $row->{threadID} }++; if ( $rowcnt >= $startcount && $rowcnt <= $stopcount ) { dd $row->{msgID}; } } continue { $rowcnt++ } dd \%threadIDs; } [download] But this loops through all results for every request. You can have the database do the pagination, even though you have to hit the database twice it still doesn't require you to loop through all records, only those for the current page. `for my $currentpage (1..5) { # simulate requests for different pages my $startcount = ($currentpage - 1) * $titlesperpage + 1; dd $dbh->selectall_arrayref( 'SELECT threadID, COUNT(*) FROM messages GROUP BY threadID'); my $sth_s = $dbh->prepare('SELECT msgID, threadID FROM messages ' .'ORDER BY msgID LIMIT ?,?'); $sth_s->execute($startcount, $titlesperpage); while ( my $row = $sth_s->fetchrow_hashref ) { dd $row->{msgID}; } }` [download]	[reply] [d/l] [select]
Re^4: Counting fields from database - (Updated) by htmanning (Friar) on Oct 03, 2020 at 21:34 UTC
Re: Counting fields from database by wazat (Monk) on Oct 02, 2020 at 23:53 UTC
If all you want is a count of unique theadIDs, you can use a hash. `rows=$sth->rows; %threads; while (($pointer = $sth->fetchrow_hashref) && ($current_count_all <= $ +stopcount_all)){ $threads{$pointer->{'threadID'}}++; } my $threadCount = scalar keys %threads;` [download]	[reply] [d/l]
Re^2: Counting fields from database by htmanning (Friar) on Oct 03, 2020 at 00:50 UTC
Thanks! Your example with the hash works but I still have to hit the database twice. Once to get the total threadcount and then again to print the content. I wasn't able to get it to work in one loop, but that's okay it works. Thanks.	[reply]
Re^3: Counting fields from database by BillKSmith (Monsignor) on Oct 03, 2020 at 18:29 UTC
Clearly you cannot print any counts until after you have read all the data at least once. As you read the DB, compute the counts and store all data (including counts) in a convenient data structure. Pass that structure to a subroutine which formats and prints it any way you like. Any data structure that saves all the data will 'work', but a well designed structure will greatly reduce the amount of code you must write. You have not told us enough about your data or your print requirements for us to offer much help. Bill	[reply]
Re: Counting fields from database by Anonymous Monk on Oct 03, 2020 at 14:22 UTC
`SELECT DISTINCT threadId FROM ... ... SELECT COUNT(DISTINCT threadID) ...`	[reply]
Re^2: Counting fields from database by Anonymous Monk on Oct 05, 2020 at 16:22 UTC
Garbage as usual.	[reply]