Actually, that is not the issue either. I am aware that cmp is a string (ASCII) comparison (read the original post, and actually look at the data I am sorting). The fact that the regex pulls in ASCII text when it's there is not the point. The third 'lump' of data contains only years (look at the data). Change the \w\s to \d if you like. It still doesn't work.
I've realised the issue is that the regex is capturing the \| as well, and the numeric sort doesn't like that.
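For anyone hitting the same thing, here is a minimal sketch of the fix, assuming pipe-delimited records with the year in the third field. The sample rows and the `year_of` helper are made up for illustration, not my real data. The idea is to keep the `\|` outside the capture group so the sort key is digits only. As far as I can tell, a key that starts with `|` numifies to 0 under `<=>`, so every key compares equal and the order never changes.

```perl
use strict;
use warnings;

# Hypothetical pipe-delimited records (illustrative only).
my @rows = (
    'gamma|Some Title|2003|x',
    'alpha|Other Title|1999|y',
    'beta|Third Title|2010|z',
);

# Return the third field as a bare number. The \| delimiters are matched
# but sit outside the parentheses, so the capture holds digits only.
sub year_of {
    my ($row) = @_;
    my ($year) = $row =~ /^[^|]*\|[^|]*\|(\d+)\|/;
    return $year;
}

# Numeric sort on the extracted year.
my @sorted = sort { year_of($a) <=> year_of($b) } @rows;
print "$_\n" for @sorted;
```

With `use warnings` on, Perl also prints an "isn't numeric" warning whenever a key like `|2003` reaches `<=>`, which is a quick way to spot a stray delimiter in the capture.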