As someone who's had to parse author lists before, let me just say that unless they're clean coming in, you can run into a *lot* of problems if you just try to split. You already have the 'Susan L.' example of a multi-part given name, but you might also run into a 'Frank de Leo', where the last name would be 'de Leo', not 'Leo'.
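To make the 'de Leo' problem concrete, here is a rough sketch of splitting on several delimiters and then keeping surname particles attached to the family name. The helper names (split_authors, split_name) and the particle list are my own inventions for illustration, not anything taken from Biblio::Citation::Parser, and the sketch assumes every name arrives in "Given Family" order. Real data will break that assumption sooner or later.

```perl
use strict;
use warnings;

# Hypothetical, incomplete list of lowercase surname particles; extend as needed.
my %PARTICLE = map { $_ => 1 } qw(de del della der di du la le van von);

# Split an author list on commas, semicolons, or the word "and".
# Assumes names are NOT in "Family, Given" form, since the comma would split them.
sub split_authors {
    my ($list) = @_;
    $list =~ s/^\s+|\s+$//g;    # trim the ends of the whole string
    return grep { length } split /\s*(?:[;,]|\band\b)\s*/, $list;
}

# Split one "Given Family" name into (given, family), pulling any
# trailing particles ("de", "van der") onto the family name.
sub split_name {
    my ($name) = @_;
    my @words = split ' ', $name;
    return ('', '') unless @words;
    my @family = (pop @words);    # the last word is always part of the family name
    while (@words > 1 && $PARTICLE{ lc $words[-1] }) {
        unshift @family, pop @words;
    }
    return (join(' ', @words), join(' ', @family));
}

for my $author (split_authors('Susan L. Smith, Frank de Leo and Jan van der Berg')) {
    my ($given, $family) = split_name($author);
    print "$family, $given\n";
}
# prints:
#   Smith, Susan L.
#   de Leo, Frank
#   van der Berg, Jan
```

Even this falls over on capitalized particles that are really given names, on suffixes like 'Jr.', and on corporate authors, which is why a purpose-built parser is worth a look.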

It looks like Biblio::Citation::Parser hasn't seen an update in 7 years, but it's likely that it's a solved problem and isn't in need of updates (unless someone wants to add DOI or other ID handling). It's intended to take a full citation, so you might have to look at how it parses the author string; look for sub find_authors in Biblio::Citation::Parser::Citebase.

As there are quite a few people in libraries using Perl, you could ask whether there are any better parsers out there on the code4lib mailing list, which has lots of Perl folks on it, or on perl4lib, which is lower volume (but more focused in scope).


In reply to Re: split function using multiple delimiters by jhourcle
in thread split function using multiple delimiters by maha
