sulfericacid has asked for the wisdom of the Perl Monks concerning the following question:

The database isn't storing the last word it comes across each time through the loop. In fact, the database appears to be empty when I loop through it at the top (commented out now).

For the life of me I can't figure out why the database isn't storing the newest word. I've tried everything. Twice. My only hypothesis is it's not storing because I CTRL+C my way out of the program each time because the dictionary is so large that it'll take quite some time to work through.

Could this be the problem? If so, how can I stop the script safely so that the database saves the last word?

The dictionary file is not blank. The array is fine. The entire script works properly except for the database. When the script is done, there will be a sleep between each page parse so as to be nice to GoDaddy.

Any suggestions?

#!/usr/bin/perl
use warnings;
use strict;
use DB_File;
use POSIX;

my $dbase = "dbase.db";
my %dbase;
tie (%dbase, 'DB_File', $dbase, O_CREAT|O_RDWR, 0644) || die "Died tying database\nReason: $!\n";

my $dict = "dictionary.txt";
my $saved = "saved.txt";

#foreach (keys %dbase)
#{ print "$_ => $dbase{$_}\n ."; }
#exit;

my $search = "https://www.godaddy.com/gdshop/default.asp";

open(DICT, $dict) or die "Cannot open $dict because $!";
my $words = <DICT>;
my @words = split(/ /, $words);
close(DICT) or die "Cannot close $dict because $!";

if ($ARGV[0] =~ m/con/i) {
    my $start = $dbase{"word"};
    shift @words until $words[0] =~ m/$start/;
    print "con found!\n";
}

my $count = -1;
open(SAVED, ">>$saved") or die "Cannot open $saved because $!";
foreach my $word (@words) {
    $count++;
    print "Searching $word..\n";
    use WWW::Mechanize;
    my $mech = WWW::Mechanize->new();
    $mech->get($search);
    $mech->submit_form(
        form_name => "LookupForm",
        fields    => { domainToCheck => "$word" }
    );
    $dbase{"word"} = "$word";
    my $results = $mech->content;
    if ($results =~ m/This domain name IS AVAILABLE/i) {
        print SAVED "$word.com\n";
        print "\t $word AVAILABLE!\n\a";
    }
    else {
        print "\t $word TAKEN\n";
    }
}
close(SAVED) or die "Cannot close $saved because $!";
untie %dbase;


"Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

sulfericacid

Re: database not storing data
by Tanktalus (Canon) on Oct 29, 2005 at 03:14 UTC

    You could try untie'ing and retie'ing inside the loop - that should force the whole thing to flush at the expense of reopening the db each time. Then again, if you're going to grab a webpage each time, this overhead probably isn't that big.
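The untie/retie idea above could look roughly like the sketch below. It is only an illustration: the temporary directory, the helper names remember_word and last_word, and the three sample words are made up for the example, and the web lookup is left out entirely.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(O_CREAT O_RDWR);
use File::Temp qw(tempdir);

# Hypothetical scratch location so the sketch does not touch a real dbase.db.
my $dir    = tempdir(CLEANUP => 1);
my $dbfile = "$dir/dbase.db";

# Store one value, then untie right away so DB_File closes the file
# and writes out anything it was still buffering.
sub remember_word {
    my ($file, $word) = @_;
    my %dbase;
    tie(%dbase, 'DB_File', $file, O_CREAT|O_RDWR, 0644)
        or die "Cannot tie $file: $!";
    $dbase{word} = $word;
    untie %dbase;
}

# Re-open the file and read the stored value back.
sub last_word {
    my ($file) = @_;
    my %dbase;
    tie(%dbase, 'DB_File', $file, O_CREAT|O_RDWR, 0644)
        or die "Cannot tie $file: $!";
    my $word = $dbase{word};
    untie %dbase;
    return $word;
}

for my $word (qw(alpha beta gamma)) {    # stand-in for the dictionary words
    remember_word($dbfile, $word);
}
print "last stored word: ", last_word($dbfile), "\n";
```

Keeping the tie inside a small sub means the file is never left open between words, at the cost of one open and close per word.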

      Thank you. That was actually a neat idea to try. Unfortunately it still isn't storing to the database though.

      sulfericacid
Re: database not storing data
by ChemBoy (Priest) on Oct 29, 2005 at 05:01 UTC

    If the only possible cause you can think of is that you're exiting via interrupt (which seems very plausible), then try not exiting via interrupt. A shorter dictionary file (for testing only) is probably the easiest way to do that, though something along the lines of last if $count > 10; would also work. Alternatively, you could step through in the debugger, which offers a slightly more graceful way to exit early when you're bored with stepping through.
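If exiting via interrupt does turn out to be the cause, one possible way to keep Ctrl-C usable is to catch the signal, set a flag, and let the loop finish the current word before the normal cleanup runs. The sketch below assumes a Unix-like system; process_words and its callback are hypothetical stand-ins for the real lookup loop, and the self-sent signal only simulates a key press.

```perl
use strict;
use warnings;

my $interrupted = 0;
$SIG{INT} = sub { $interrupted = 1 };    # only set a flag; do no real work here

# Process words until the list ends or Ctrl-C has been pressed.
# $check is a hypothetical callback standing in for the web lookup.
# Returns the last word that was fully processed.
sub process_words {
    my ($words, $check) = @_;
    my $last_done;
    for my $word (@$words) {
        last if $interrupted;    # leave the loop cleanly instead of dying mid-run
        $check->($word);
        $last_done = $word;      # this is where $dbase{word} would be updated
    }
    return $last_done;           # the caller can now untie and close normally
}

# Demonstration: simulate Ctrl-C by signalling ourselves during the third word.
my @words = qw(apple banana cherry date elder);
my $last  = process_words(\@words, sub {
    my ($word) = @_;
    kill 'INT', $$ if $word eq 'cherry';
});
print "stopped after: $last\n";
```

Because the handler does nothing but set a flag, the script reaches its close and untie calls exactly as it would at the end of a full run.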



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

Re: database not storing data
by graff (Chancellor) on Oct 29, 2005 at 19:38 UTC
    I'm a little baffled about your "dictionary.txt" file. You say it is large, but you read it like this:
    open(DICT, $dict) or die "Cannot open $dict because $!";
    my $words = <DICT>;
    my @words = split(/ /, $words);
    close(DICT) or die "Cannot close $dict because $!";
    Normally, I'd assume that a "dictionary.txt" file would have a line-feed (or CRLF, depending on how the file was created) at the end of each word, rather than a space. That makes it easier to know how many words are present and to add or remove words, because counting lines is easy, and adding or (de)selecting lines is easier than adding or (de)selecting words within a single, very long line. But these issues might not be important to you, and you can do it whatever way suits you.

    Just curious: is there a line-feed (or CRLF) at the end of that text file? If so, you don't seem to be doing anything about it (it remains attached to the end of the last word), and maybe this is affecting how that last word in the list is being handled when you put it into the submit_form request.
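To illustrate the trailing line-ending concern, here is a small sketch comparing split(/ /, ...) with the special split ' ' form, which splits on any run of whitespace. The sample line and the helper name words_from_line are made up for the example.

```perl
use strict;
use warnings;

# Split one line of text into words on any run of whitespace.
# split ' ' (a single-space string, not a regex) skips leading whitespace
# and treats spaces, tabs, CR and LF alike, so no trailing "\n" survives.
sub words_from_line {
    my ($line) = @_;
    return split ' ', $line;
}

my $line = "aardvark abacus  abandon\r\n";    # made-up sample with CRLF at the end

my @naive = split(/ /, $line);           # as in the original script
my @clean = words_from_line($line);

# The naive split keeps an empty field for the double space and leaves
# "\r\n" attached to the last word; the whitespace split does neither.
printf "naive: %d fields, last is %d chars\n", scalar @naive, length $naive[-1];
printf "clean: %d fields, last is %d chars\n", scalar @clean, length $clean[-1];
```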

      Actually, this was a free dictionary file from http://www.orchy.com/dictionary/ (with all the first part cropped out, though). So it is space-separated data and not line-fed.

      I'm not so sure about the end of the file, though I doubt that's the problem, because it's only testing the first 5-10 words of this 150k-word file. It never gets close to the end of the dictionary.

      Thank you.

      sulfericacid
Re: database not storing data
by Jenda (Abbot) on Oct 29, 2005 at 21:07 UTC

    Maybe I am missing something, but the only key you seem to use with the hash %dbase is "word". The string "word", not the contents of the $word variable. If all you want to do is to store the last processed word, it's not necessary to use a tied hash. Anyway, if you do want to keep it and make sure the word does get stored in the file, you have to flush it.

    my $dbase_obj = tie (%dbase, 'DB_File', $dbase, O_CREAT|O_RDWR, 0644) || die "Died tying database\nReason: $!\n";
    ...
    $dbase{word} = $word;
    $dbase_obj->sync()
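A fuller, self-contained version of the sync() idea might look like the sketch below. The temporary directory, the sample words, and the second tie used to read the value back are assumptions added for the demonstration; they are not part of the original script.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(O_CREAT O_RDWR);
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);    # hypothetical scratch directory
my $dbfile = "$dir/dbase.db";

my %dbase;
my $dbase_obj = tie(%dbase, 'DB_File', $dbfile, O_CREAT|O_RDWR, 0644)
    or die "Died tying database\nReason: $!\n";

for my $word (qw(alpha beta gamma)) {    # stand-in for the dictionary words
    $dbase{word} = $word;
    # sync() returns 0 on success, so a true value signals a failure.
    $dbase_obj->sync() and die "sync failed: $!";
}

# Read the file through a second, independent tie while the first is
# still open, to check that the value really reached the disk.
my %check;
tie(%check, 'DB_File', $dbfile, O_RDWR, 0644) or die "Cannot re-tie: $!";
my $seen = $check{word};
untie %check;

undef $dbase_obj;    # drop the extra reference before untie to avoid a warning
untie %dbase;
print "word on disk: $seen\n";
```

Calling sync() after every store means a Ctrl-C can lose at most the word being processed at that moment.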

    Also, please do NOT enclose variables in double quotes unless you need to stringify a reference or something like that. "$variable" is almost always better written as $variable. If you are lucky, the double quotes will just slow the script down; if not, they will break it!
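A small made-up example of the breaking case: quoting a variable that holds a reference turns it into a plain string, and the data behind it can no longer be reached through that copy.

```perl
use strict;
use warnings;

my @list = (1, 2, 3);
my $ref  = \@list;

my $copy   = $ref;      # still an array reference
my $quoted = "$ref";    # now a plain string of the form ARRAY(0x...)

print "copy is a ",   (ref $copy   ? "reference" : "plain string"), "\n";
print "quoted is a ", (ref $quoted ? "reference" : "plain string"), "\n";
print "elements via copy: ", scalar @$copy, "\n";
# @$quoted would die under "use strict" because $quoted is only a string.
```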

    And there is another problem, the

    shift @words until $words[0] =~ m/$start/;
    looks for the first word that CONTAINS the last processed word. I don't think that's what you meant. I think you wanted this:
    shift @words until $words[0] eq $start;
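The difference between the two forms can be seen with a short made-up word list. The helper skip_to is hypothetical; it adds a guard, not present in either one-liner, so the loop stops when the start word is missing instead of shifting an empty array forever.

```perl
use strict;
use warnings;

# Drop words from the front of the list until $start is the first element.
# The extra @$words test keeps the loop from running forever (and from
# comparing against undef) when $start is not in the list at all.
sub skip_to {
    my ($words, $start) = @_;
    shift @$words while @$words && $words->[0] ne $start;
    return scalar @$words;    # number of words left to process
}

my $start    = 'cat';                    # made-up "last processed word"
my @by_regex = qw(bobcat cat catalog);
my @by_eq    = qw(bobcat cat catalog);

shift @by_regex until $by_regex[0] =~ m/$start/;    # stops at "bobcat"
skip_to(\@by_eq, $start);                           # stops at "cat"

print "regex resumes at: $by_regex[0]\n";
print "eq resumes at:    $by_eq[0]\n";
```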

    BTW, it's a shame some people don't agree with your sig. Well, to hell with some people! It's a shame my Squirrel thinks age does have some importance :-(

    Jenda
    XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.

Re: database not storing data
by Anonymous Monk on Oct 29, 2005 at 11:21 UTC
    I've tried everything. Twice.
    What is everything?