Hello, everyone! I am developing a crawler program. My environment is as follows:
CentOS Linux release 7.2.1511(Core)
This is perl 5, version 20, subversion 3 (v5.20.3) built for x86_64-linux
I currently use the BerkeleyDB module to store the URL queue. The script works, but it is not efficient: when I run several scripts against the same database, they cannot read it at the same time. They block each other, and each script only gets its turn after the previous one has finished.
My question: I need to keep the URL queue in a file-based database, not in memory. Which module is non-blocking, simple, and efficient for this?
I want to read one URL from the database and have it deleted immediately, or locked so that no other script can read it at the same time. Thank you!
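The claim-and-delete behaviour described above could look roughly like the following minimal sketch. It is only an illustration under stated assumptions, not a module recommendation: it assumes a plain text file with one URL per line instead of BerkeleyDB, and it uses core Perl's `flock` so that a worker holds the lock only for the short moment it takes to remove one URL. The name `claim_next_url` and the sample URLs are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: take the first URL out of a queue file
# (one URL per line) and remove it, all under an exclusive lock.
# Returns undef when the queue is empty.
sub claim_next_url {
    my ($path) = @_;

    open(my $fh, '+<', $path) or die "Cannot open $path: $!";

    # Other workers wait here only while one worker is inside this sub,
    # not for the whole lifetime of that worker's script.
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    my @urls = <$fh>;
    my $url  = shift @urls;

    if (defined $url) {
        # Rewrite the file without the claimed URL.
        seek($fh, 0, SEEK_SET) or die "Cannot seek in $path: $!";
        truncate($fh, 0)       or die "Cannot truncate $path: $!";
        print {$fh} @urls;
        chomp $url;
    }

    # close() flushes the buffered writes and releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return $url;
}

# Demo with a temporary queue file and made-up URLs.
my ($tmp_fh, $queue_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "http://example.com/a\n", "http://example.com/b\n";
close($tmp_fh) or die "Cannot close $queue_path: $!";

while (defined(my $url = claim_next_url($queue_path))) {
    print "claimed $url\n";   # a real worker would fetch the page here
}
```

Each worker script would call `claim_next_url` in a loop like the one at the end. Rewriting the whole file on every claim costs time proportional to the queue length, so this only suits small queues; for larger ones, the same lock-briefly-then-release pattern would need a store that supports deleting a single record.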
In reply to The question of URL queue selection module ? by Perl_Love