in reply to (german) region code detection - request for thoughts
If you can rule out erroneous region codes, you could drop from the list all codes of whichever length occurs most often (probably 5 or 6). Any region code you then fail to find (whether looked up in a hash or matched by a regex) must have that length.
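Here is a rough sketch of what I mean, in Python, not a finished implementation. It assumes the input is valid, that no region code is a prefix of another, and that numbers are plain digit strings with the leading 0. The names compact_codes, split_number and SAMPLE_CODES are made up for this example, and the sample codes are only illustrative, not a verified list.

```python
from collections import Counter

def compact_codes(all_codes):
    """Drop the codes of the most frequent length; return (kept codes, that length)."""
    length_counts = Counter(len(code) for code in all_codes)
    default_len = length_counts.most_common(1)[0][0]
    kept = {code for code in all_codes if len(code) != default_len}
    return kept, default_len

def split_number(number, kept, default_len):
    """Split a valid number into (region code, rest).

    Assumes the number really starts with a valid region code; if none of
    the kept codes matches, the code must have the dropped (default) length.
    """
    for length in sorted({len(code) for code in kept}):
        if number[:length] in kept:
            return number[:length], number[length:]
    return number[:default_len], number[default_len:]

# Illustrative sample values only (assumption, not a complete or checked list).
SAMPLE_CODES = ["030", "040", "0221", "06151", "06152", "06155"]

kept, default_len = compact_codes(SAMPLE_CODES)
print(split_number("03012345678", kept, default_len))
print(split_number("0615112345", kept, default_len))
```

The catch is the one already mentioned: an invalid number is silently treated as having a code of the dropped length, so this only works if bad input has been filtered out beforehand.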