Good day monks. I have text stored in a postgres database. For each record, I also have a JSONB column containing a hash of keywords previously found in the text. I want to count the number of a particular subset of keywords in each record. My question is whether it is faster to do it using option 1:
my @kwlist = qw(foo bar baz qux); my $kwre = join('|',@kwlist); my @items = getfromdb("select id,text from table"); # I have a sub tha +t does this foreach my $i (0..$#items) { my ($id,$text) = @{$items[$i]}; my $count = 0; while($text =~ /$kwre/g) { $count++; } }
or using option 2:
use JSON::XS; my @kwlist = qw(foo bar baz qux); my @items = getfromdb("select id,keywordhash from table"); # I have a +sub that does this foreach my $i (0..$#items) { my ($id,$temp) = @{$items[$i]}; my $kws = decode_json($temp); my $count = 0; foreach my $k (@kwlist) { if (defined($kws->{$k})) { $count++; } } }
Or is there some faster way to do it I haven't thought of? I have no idea what is going on under the hood in these two options that would affect execution time but I feel confident some other monks do.

BTW I realize there may be some way to do it with an exotic postgres query, but it's beyond me and anyway the list of keywords I'm looking for is rather long so it would be unwieldy.


In reply to Faster of two options for finding keywords in a database by cormanaz

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.