in reply to Using HOP::Lexer to parse a document

You want your TEXT regex to include the line feeds (which your text transformer function will discard):
[ 'TEXT', qr/(?s:.*)/, \&text ],
If the lexer cannot tokenize a piece of the string, it returns that piece as a plain string, not as an array ref "token".
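Here is a rough sketch of both halves of that, without loading HOP::Lexer itself. The body of text() and the describe() helper are my own guesses for illustration, not code from your post. I am also assuming the transformer is called with the label and the matched text and hands back an array ref.

```perl
use strict;
use warnings;

# Hypothetical transformer for the TEXT token. Assumes it receives the
# token label and the matched text, and returns the finished token.
sub text {
    my ($label, $value) = @_;
    $value =~ tr/\n//d;    # discard the line feeds the regex let in
    return [ $label, $value ];
}

# Hypothetical helper: tell a real token (array ref) apart from an
# untokenized leftover (plain string).
sub describe {
    my ($item) = @_;
    return ref($item) eq 'ARRAY'
        ? "$item->[0]: $item->[1]"
        : "untokenized: $item";
}

print describe( text( 'TEXT', "foo\nbar\n" ) ), "\n";    # TEXT: foobar
print describe('@@'), "\n";                              # untokenized: @@
```

In the loop that drains your lexer, the same ref() check lets you decide whether to process a token or complain about a leftover string.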