Keep configuration details at the top of the script, rather than expecting people to read all of the code to find where they need to change things like file paths.
Anticipate large logs. Assuming you can fit all or most of a log into memory works fine until you get slashdotted.
Test to see if your regex does, in fact, match. I've seen fringe cases show up in my logs that would break the regex you're using.
Assuming that the "userid" field will be "-" works almost all of the time. When it is something else, don't assume that the userid won't contain a blank.
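
To make those four points concrete, here is a rough sketch in Python (your script may well be in another language, so treat this as an illustration rather than a drop-in fix). It assumes the log is in Apache's Common Log Format; the names LOG_PATH, REPORT_UNMATCHED, LINE_RE, parse_line, count_statuses and summarize are all made up for the example, and the path is a placeholder.

```python
import re
import sys

# --- Configuration: everything a user might need to change lives here ---
LOG_PATH = "access.log"    # hypothetical path; adjust for your server
REPORT_UNMATCHED = True    # echo lines the pattern could not parse to stderr

# Assumed layout (Common Log Format):
#   host ident userid [timestamp] "request" status bytes
# The userid is matched lazily up to the " [" that opens the timestamp,
# so a userid containing a blank does not shift the later fields.
LINE_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<userid>.*?) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>.*?)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None


def count_statuses(lines):
    """Tally status codes one line at a time.

    Returns (counts, unmatched), where unmatched is the number of
    lines the pattern failed on.
    """
    counts = {}
    unmatched = 0
    for line in lines:
        fields = parse_line(line.rstrip("\n"))
        if fields is None:
            # Check the match explicitly instead of assuming it worked.
            unmatched += 1
            if REPORT_UNMATCHED:
                print("unparsed line: " + line.rstrip("\n"), file=sys.stderr)
            continue
        counts[fields["status"]] = counts.get(fields["status"], 0) + 1
    return counts, unmatched


def summarize(path=LOG_PATH):
    """Stream the log from disk rather than loading it whole."""
    # Iterating over the file object reads one line at a time,
    # so memory use stays flat however large the log grows.
    with open(path, encoding="utf-8", errors="replace") as fh:
        return count_statuses(fh)
```

Calling summarize() gives you the per-status counts plus a count of lines that failed to match, which is a cheap way to find out whether your pattern is quietly dropping the odd entries.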