Perl_Love has asked for the wisdom of the Perl Monks concerning the following question:
Hello, everyone! I am developing a crawler program. My environment is as follows:
CentOS Linux release 7.2.1511(Core)
This is perl 5, version 20, subversion 3 (v5.20.3) built for x86_64-linux
I currently use the BerkeleyDB module to store the URL queue. The script works, but it is not efficient: when I run multiple scripts against the same database, they cannot read it at the same time. They block, and each script only gets its turn after the previous one has finished.
My question: I need to store the URL queue in a file-based database, not in memory. Which module is non-blocking, simple, and efficient?
I want to read a URL from the database and have that URL deleted immediately, or locked so that other scripts cannot read it at the same time. Thank you!
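To make the "read one URL and delete it immediately" requirement concrete, here is a minimal sketch using only core Perl (`flock` from `Fcntl`) on a plain-text queue file with one URL per line. It is not a BerkeleyDB solution and not a recommendation of a particular module; the helper name `claim_next_url` and the file layout are assumptions made purely for illustration. Each caller holds an exclusive lock only for the short time needed to take the first URL and rewrite the file, so two scripts should never receive the same URL.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# Hypothetical helper: remove and return the first URL from a plain-text
# queue file (one URL per line). Returns undef when the queue is empty.
sub claim_next_url {
    my ($queue_file) = @_;

    # Open for read/write without truncating, then take an exclusive lock.
    # Other scripts calling this function wait here until the lock is free.
    open(my $fh, '+<', $queue_file) or die "Cannot open $queue_file: $!";
    flock($fh, LOCK_EX)             or die "Cannot lock $queue_file: $!";

    my @urls = <$fh>;        # read the whole queue
    my $url  = shift @urls;  # claim the first entry (undef if empty)

    # Rewrite the file without the claimed URL while still holding the lock.
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $queue_file: $!";
    truncate($fh, 0)       or die "Cannot truncate $queue_file: $!";
    print {$fh} @urls;

    # Closing the handle flushes the data and releases the lock.
    close($fh) or die "Cannot close $queue_file: $!";

    chomp $url if defined $url;
    return $url;
}

# Usage (the file name is an assumption):
#   while (defined(my $url = claim_next_url('queue.txt'))) { ... fetch $url ... }
```

Rewriting the whole file on every claim costs time proportional to the queue length, so this only suits small queues. For a large queue, a file-based database with transactions would be a better fit for the same claim-then-delete pattern.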
Re: The question of URL queue selection module ?
by beech (Parson) on Aug 12, 2016 at 23:00 UTC
by Perl_Love (Acolyte) on Aug 13, 2016 at 01:58 UTC
by beech (Parson) on Aug 13, 2016 at 09:30 UTC