in reply to Improve password solver

Here are some extremely minor points:

if (exists $tried{$guess}) { print "\tSkipping $guess - already attempted\n\n"; }
It seems like you are not really skipping anything -- you are just printing a message; this IO might be slowing you down a little.

If you don't print your progress message ("Guessing: ...) that may also save a tiny bit of time.

I wonder if your shuffle sub could be more efficient if you eliminated all the array ref and wantarray checks, since you always seem to be passing an array and returning an array. Or maybe try shuffle from the Core List::Util module.

This shorter code may save a tiny bit:

my @temp_chars; for (1 .. $length) { push @temp_chars, (shuffle(@chars))[0]; }
my $guess = "@temp_chars";

Update: Nevermind that last line (thanks AnomalousMonk)

Replies are listed 'Best First'.
Re^2: Improve password solver
by tprocter (Sexton) on Jul 03, 2009 at 01:03 UTC
    "It seems like you are not really skipping anything"

    Agreed. Normally, the CPU-intensive part of a password cracker is not the listing of passwords to guess, but in performing the password calculation related to the application. Since you are just comparing strings, the program should be greatly faster than your algorithm is producing.

    In this case, I suspect the majority of your time is spent on making duplicate guesses. The further along the program gets, the more likely that a random guess has already been tried. At some point you'll have a fraction of a percent chance of guessing a password that you haven't already guessed, meaning that more than 99% of your guesses are a waste of time. Every increment you apply to the length will make this exponentially a more wasteful algorithm.

    Your guess generator needs to stop guessing strings that it's already tried. You could try to pre-generate all of the the guesses in a shuffled list and pop them off. Or you could try a sequential list (but randomly define what that sequence is). In fact a smart sequence might attempt common characters before less common ones. Another solution might be to keep track of partial character strings and mark them as dead ends as you go, deleting the children to save memory.

    These are just ideas. I see your goal in taking a random approach at the guesses, but you need to step it up a bit, because right now, someone doing a sequential scan, and making all the wrong guesses first, will out-perform you.

    And yes, I/O is slow. Definitely don't print every single guess. Maybe print counters every hundred or thousand cycles. This might give you a clearer picture of how the algorithm slows down over time.

    Good Luck!