I agree about the steps like "initialize variables". What variables? Everything should be encapsulated!
Think "functional decomposition" and think "encapsulation".
The robots file (robots.txt) can be important. Without it you can get stuck in a loop, chasing down dynamically generated pages that have different URLs but the same content.
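
Here is a rough sketch of what I mean, not a definitive implementation. The class name `CrawlFilter`, its method names, and the `example-bot` user agent are all made up for illustration. It keeps the crawl state inside one object instead of loose "initialized variables", checks robots rules with the standard library's `urllib.robotparser`, and hashes page bodies so the same page under a different URL gets noticed. Exact hashing is an assumption on my part; it only catches byte-identical pages.

```python
import hashlib
from urllib.robotparser import RobotFileParser


class CrawlFilter:
    """Hypothetical helper: owns the robots rules and the seen-page state."""

    def __init__(self, robots_txt: str, user_agent: str = "example-bot"):
        self._user_agent = user_agent
        self._robots = RobotFileParser()
        # parse() takes the robots.txt content as a list of lines.
        self._robots.parse(robots_txt.splitlines())
        self._seen_urls = set()
        self._seen_bodies = set()

    def may_fetch(self, url: str) -> bool:
        """False if the URL was already visited or robots.txt disallows it."""
        if url in self._seen_urls:
            return False
        return self._robots.can_fetch(self._user_agent, url)

    def is_new_page(self, url: str, body: bytes) -> bool:
        """Record a fetched page; False if the same body was seen before."""
        self._seen_urls.add(url)
        digest = hashlib.sha256(body).hexdigest()
        if digest in self._seen_bodies:
            return False
        self._seen_bodies.add(digest)
        return True
```

The crawl loop would call `may_fetch(url)` before each request and `is_new_page(url, body)` after it, following links only when the second call returns `True`.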