in reply to Re^3: Partitioning a set of strings by regular expressions
in thread Partitioning a set of strings by regular expressions

Well, "tag soup" sounds more chaotic than it turned out to be. At the moment I have a text file with more than 2M citations recorded by several different catalogers. If you let your eyes run down that list, you do see that finding patterns, each of which matches quite a large bunch of these citations, should be possible, and tackling the task as you suggested was my first intuition. However, I felt there might be a less tedious way to do it :-)

I am well aware of the various efforts to standardize citations but the core problem seems to be their variety...

Re^5: Partitioning a set of strings by regular expressions
by hippo (Archbishop) on May 12, 2020 at 15:44 UTC
    However, I felt there might be a less tedious way to do it :-)

    Well, there's always the brute force approach.

    I am well aware of the various efforts to standardize citations but the core problem seems to be their variety...

    ObXkcd