The format is definitely not optimal, but this is the only way I can get this data from the source I'm using. I can lose 10% of the data without being in trouble, though, so I'm trying to make do. Text::CSV is only there for writing the sorted references to an output file--sorry to have included this red herring here!
Thank you so much for the capturing advice! I didn't know that captures were stored into arrays in this context, but after blocking them everything works. I've never considered putting in named sub-regexes--thanks for the tip.
This is one of several steps in sorting the messy data--other areas deal with references with no authors. Some of the entries don't have capital initials, which is why I had A-z rather than A-Z. I didn't want to post a ton of lines of code (or examples--there are thousands) when my problem was just one regex operation.
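For anyone who finds this later, here is a minimal sketch of the two points above: how capture groups in a split pattern end up in the returned list, and what A-z matches compared with A-Za-z. The sample string and variable names are made up for illustration; this is not my real data or my actual regex.

```perl
use strict;
use warnings;

# Hypothetical sample line: two author entries separated by a semicolon.
my $line = 'Smith, J.; Jones, a.b.';

# A capturing group in the split pattern is returned along with the fields.
my @with_capture = split /(;)\s*/, $line;
# -> ('Smith, J.', ';', 'Jones, a.b.')

# A non-capturing group (?:...) keeps the grouping but returns only the fields.
my @without_capture = split /(?:;)\s*/, $line;
# -> ('Smith, J.', 'Jones, a.b.')

# In ASCII, the range A-z also covers the six characters between 'Z' and 'a'
# ([ \ ] ^ _ `), so A-Za-z is the safer way to allow lowercase initials.
my $loose_matches_underscore  = ('_' =~ /^[A-z]$/)    ? 1 : 0;   # 1
my $strict_matches_underscore = ('_' =~ /^[A-Za-z]$/) ? 1 : 0;   # 0

print join('|', @with_capture), "\n";
print join('|', @without_capture), "\n";
print "A-z matches '_': $loose_matches_underscore\n";
print "A-Za-z matches '_': $strict_matches_underscore\n";
```

The same idea applies with match operators in list context: each set of capturing parentheses adds an element to the result, so grouping that is only there for alternation or quantifiers is better written as (?:...).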
In reply to Re^2: Regex for splitting a string on a semicolon (conditionally)
by grouse
in thread Regex for splitting a string on a semicolon (conditionally)
by grouse