Ah! Then this is how I would do it:
- Loop over the lines of your input.
- Extract all word characters from the beginning of the line for the text field.
- Extract all word characters at the end of the line for the type field.
- Build the <text>...</text> part and the <annotation>...</annotation> part independently, appending to two separate strings as you go.
- After the loop, write the two strings to an XML file.
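
The steps above could be sketched roughly like this in Python. I'm guessing at your format here: I assume each input line starts with the word and ends with its type, and the `<document>` wrapper and `<entity>` element with its `text`/`type` attributes are names I made up for illustration, so swap in whatever your schema actually needs. The function names `build_xml` and `write_xml` are hypothetical too.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def build_xml(lines):
    """Build the <text> and <annotation> parts from an iterable of lines."""
    text_parts = []        # pieces that end up inside <text>...</text>
    annotation_parts = []  # pieces that end up inside <annotation>...</annotation>

    for line in lines:
        line = line.strip()
        head = re.match(r"\w+", line)    # word characters at the start of the line
        tail = re.search(r"\w+$", line)  # word characters at the end of the line
        if not head or not tail:
            continue  # skip blank lines or lines that do not fit the assumed pattern

        word, kind = head.group(), tail.group()
        text_parts.append(word)
        # <entity> and its attributes are hypothetical; quoteattr adds the quotes
        annotation_parts.append(
            f"<entity text={quoteattr(word)} type={quoteattr(kind)}/>"
        )

    text = "<text>" + escape(" ".join(text_parts)) + "</text>"
    annotation = "<annotation>" + "".join(annotation_parts) + "</annotation>"
    # <document> is an assumed root element so the output has a single root
    return "<document>" + text + annotation + "</document>"


def write_xml(lines, path):
    """Write the combined result to an XML file after the loop has finished."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(build_xml(lines))
```

You would call it with something like `write_xml(open("input.txt", encoding="utf-8"), "output.xml")`, where both file names are placeholders. I collect the pieces in two lists and join them at the end, which amounts to the same thing as appending to two strings.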