Thanks for your clarification. I did understand Grandfather's code; I think I just used the wrong terminology in my question. As you said, what I wanted was the proper regex to search for that four-digit year. Your additions (as well as your modification of the $bibData field) did that beautifully.
I do expect to come upon a number of rough spots, especially as I'm expecting to edit all of my research notes in a file that is equally human- and machine-readable. Quite a dream, isn't it?
One immediate problem I see with this is that the script only recognizes bibliographic data between quotation marks. So a journal article title, which sits between quotes, will get picked up, while a book title, which conventionally doesn't have quotes, will not. This effectively excludes about a third of my data from the XML output.
I think I might go back and edit the raw text file so that the bibliographic info on each line is between | characters.
My question is, what regex could I use to replace ^([^"]* "[^"]+".*?) so that $bibData identifies all text between | characters?
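To make the question concrete, here is a minimal sketch of the kind of pattern I have in mind. It assumes each line holds exactly one pair of | delimiters and that the bibliographic text itself never contains a | character. The helper name extract_bib_data and the sample line are made up for illustration and are not part of the existing script. Note that, unlike the current pattern, it captures only the delimited text and not whatever precedes it on the line.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the first pair of
# | characters on a line, or undef if the line has no such pair.
sub extract_bib_data {
    my ($line) = @_;

    # [^|]+ matches one or more characters that are not a pipe, so the
    # capture stops at the closing delimiter instead of running past it.
    my ($bib) = $line =~ /\|([^|]+)\|/;
    return $bib;
}

# Made-up sample line; the real notes file may be laid out differently.
my $line    = 'Notes on chapter 2 |Smith, J. A Book Title. 1999| p. 42';
my $bibData = extract_bib_data($line);

print defined $bibData ? "$bibData\n" : "no bibliographic data found\n";
```

If the text before the first | also needs to end up in $bibData, as it does with the current pattern, the capture group would have to be widened accordingly.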
Thanks again. I'll be sure to show everyone the final product once I'm finished.
In reply to Re^4: Converting a Text file to XML by monk8148n038
in thread Converting a Text file to XML by monk8148n038