Re: Alternative to querying a database to save time

You -could- do this with Perl, problem is you'll need to load all your data into memory to do efficient searching, and unless you want to keep it there (which will eat hundreds of MB of RAM minimum...), you'll have to recreate the hash every time you want to do a lookup, which probably won't be that fast either. I'd personally use a database if you're going to be doing more than a few searches per day or if your data updates often or if it has to be accurate in real time. If you search very infrequently and your data is more or less static, a Perl solution will work fine.

As always, giving us samples of your data is going to help if you want specific queries / code.

Comment on Re: Alternative to querying a database to save time

Replies are listed 'Best First'.
Re^2: Alternative to querying a database to save time by mbethke (Hermit) on Dec 29, 2011 at 15:09 UTC
You -could- do this with Perl, problem is you'll need to load all your data into memory to do efficient searching, and unless you want to keep it there (which will eat hundreds of MB of RAM minimum...), you'll have to recreate the hash every time you want to do a lookup, which probably won't be that fast either. As far as I understood, he just needs the set of abstracts from a static DB for every set of <file1, file2> that are both associated with one of the keys in file1 and match one of the terms in file2. To do that, he could fetch the abstracts for each of the keys, do the searching in Perl and depending on the result either output the abstract or just throw the record away. That way he only has to keep one record at a time in memory and would likely be faster than with MySQL's horribly inefficient regexen, not to mention one separate query per key times search terms. As always, giving us samples of your data is going to help if you want specific queries / code. True. The Stackoverflow link has some snippets though, if not the clearest.	[reply]
Re^3: Alternative to querying a database to save time by smandape1 (Acolyte) on Dec 29, 2011 at 19:36 UTC
Thank you mbethke. Yeah, this is exactly what I want to do. I'm following the code from the stackoverflow by stevenl. It works, but takes much longer time. One more thing worth to mention is my abstracts are full-text-indexed. And after searching in the abstract, I just want the associated primarykey and not the abstract. I have put the terms in an array and matching them with the abstract in mysql. Can you please help to modify the code. `my $against = join ' ', map {qq/"$_"/} @foo; while (my $PK = <FILE1>) { chomp $PK; my $sth = $dbh->prepare(qq{ SELECT value FROM value_table WHERE primary_key = ? # no '+' so it can match against at least one of the words i +n the list AND MATCH (column_text_indexed) AGAINST (? IN BOOLEAN MODE) }); $sth->execute($PK, $against);` [download] FILE1 has list of all primarykeys and FILE2 has the list of terms. Thank you for your help.	[reply] [d/l]
Re^4: Alternative to querying a database to save time by Anonymous Monk on Dec 29, 2011 at 20:34 UTC
Um, that code is doing the preparing in a loop. You are supposed to do it just a single time. Besides, the parametrised query for `$against` is counter-productive here (it will waste loads of time if `$against` is big as you seem to have here). I recommend embedding it to the SQL statement (quote it with `$dbh->quote`!)	[reply] [d/l] [select]
Re^3: Alternative to querying a database to save time by smandape1 (Acolyte) on Dec 29, 2011 at 19:39 UTC
I am sorry. I didn't see your answer. I will try it.	[reply]