PerlMonks  

Re: speeding up a file-based text search

by Thelonius (Priest)
on May 06, 2003 at 18:19 UTC ( [id://255977]=note )


in reply to speeding up a file-based text search

You should see whether you can use a more sophisticated indexing system. I don't remember whether Glimpse speeds up within-file searches, but if it does, you could use it together with "agrep". (Google(TM) it.)

I haven't worked with the module Search::InvertedIndex, but you could still use it, or a similar approach. You need to keep a list of all the indexed words so that you can do a fast serial scan over that list (I don't know whether Search::InvertedIndex allows this) and see which of the words your pattern matches. Then you look those words up in the inverted index to get the list of actual matches. You should probably do a merge/sort of all the matches before you retrieve them from the actual data file, so that duplicates are dropped and the file is read in order.
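The approach above can be sketched roughly as follows. This is only an illustration in Python, not the Search::InvertedIndex API (which I haven't checked). The function names `build_index` and `search` are made up for the example, and it assumes that each record is one line of the data file and that the index stores the byte offset of each line.

```python
import heapq
import re
from collections import defaultdict


def build_index(path):
    """Map each lowercased word to a sorted list of byte offsets
    of the lines that contain it (assumption: one record per line)."""
    index = defaultdict(list)
    offset = 0
    with open(path, "rb") as fh:
        for raw in fh:
            text = raw.decode("utf-8", "replace").lower()
            # set() so that a line is recorded once per word
            for word in set(re.findall(r"\w+", text)):
                index[word].append(offset)
            offset += len(raw)
    return dict(index)


def search(path, index, pattern):
    """Return the lines containing any indexed word that matches pattern."""
    rx = re.compile(pattern)
    # 1. Serial scan over the list of indexed words.
    words = [w for w in index if rx.search(w)]
    # 2. Look up each matching word, and
    # 3. merge the already-sorted offset lists, dropping duplicates.
    offsets = []
    for off in heapq.merge(*(index[w] for w in words)):
        if not offsets or offsets[-1] != off:
            offsets.append(off)
    # 4. Retrieve the records from the data file in ascending offset order.
    lines = []
    with open(path, "rb") as fh:
        for off in offsets:
            fh.seek(off)
            lines.append(fh.readline().decode("utf-8", "replace").rstrip("\n"))
    return lines
```

The pattern is applied to individual indexed words, not to whole lines, so this only helps when the search can be expressed as a match against single words. In a real setup the index would be stored on disk instead of being rebuilt for every search.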
