in reply to Re: Regex for splitting a string on a semicolon (conditionally)
in thread Regex for splitting a string on a semicolon (conditionally)
The format is definitely not optimal, but this is the only way I can get this data from the source I'm using. I can lose 10% of the data without being in trouble, though, so I'm trying to make do. Text::CSV is for filing the sorted references into an output file--sorry to have included this red herring here!
Thank you so much for the capturing advice! I didn't know that captured groups end up in the returned list in this context, but after making the groups non-capturing everything works. I'd never considered using named sub-regexes--thanks for the tip.
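For anyone finding this later, here is a minimal sketch of the capture behaviour as I understand it. It assumes the operation in question is `split`. The sample string and the variable names are made up for illustration, and the pattern is much simpler than the real one.

```perl
use strict;
use warnings;

# Hypothetical sample string; not taken from the real data.
my $refs = 'Smith, J.; Jones, A.';

# Capturing group in the split pattern: the captured separator
# is returned as an extra element between the fields.
my @with_capture = split /(;)\s*/, $refs;

# Non-capturing group (?:...): only the fields come back.
my @without_capture = split /(?:;)\s*/, $refs;

print scalar(@with_capture), " vs ", scalar(@without_capture), "\n";
```

With the capturing version the semicolon itself shows up as a list element, which is easy to mistake for a stray field.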
This is one of several steps in sorting the messy data--other areas deal with references with no authors. Some of the entries don't have capital initials, which is why I had A-z rather than A-Z. I didn't want to post a ton of lines of code (or examples--there are thousands) when my problem was just one regex operation.
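One caveat on the A-z range, offered as a sketch rather than a diagnosis of the actual regex. In ASCII the range from A to z also covers the six punctuation characters that sit between Z and a, so `[A-Za-z]` is the usual way to accept both cases. The variable names below are just for illustration.

```perl
use strict;
use warnings;

# The six ASCII characters between 'Z' and 'a': [ \ ] ^ _ `
my @punct = ('[', '\\', ']', '^', '_', '`');

# [A-z] spans code points 65..122, so it accepts all of them.
my @loose = grep { /^[A-z]$/ } @punct;

# [A-Za-z] accepts letters only, so none of them match.
my @strict = grep { /^[A-Za-z]$/ } @punct;

print scalar(@loose), " vs ", scalar(@strict), "\n";
```

If the data never contains those characters next to an initial, the looser class is harmless; the explicit class just removes the doubt.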
Replies are listed 'Best First'.

Re^3: Regex for splitting a string on a semicolon (conditionally)
by AnomalousMonk (Archbishop) on Feb 12, 2015 at 18:26 UTC

Re^3: Regex for splitting a string on a semicolon (conditionally)
by Anonymous Monk on Feb 12, 2015 at 09:19 UTC