The discussion "Parsing badly formed RSS or XML" might help you find an answer to your first two questions. I use the code I described in that discussion and find it copes with most common RSS errors, although I'm well aware of its faults and wouldn't use it in any serious application where robustness or accuracy is required.