I stand corrected.

As for those advanced heuristics, my first instinct would be to look into the "Open/Import" functionality of all those open source Spreadsheet tools like LibreOffice. Those developers spent the last few decades writing software that can make sense of user provided, badly formatted data files.

As far as it concerns myself, those self-"learning" AI/heuristics/statistics tools might be somewhat interesting for occasional hobby use. But i wouldn't consider them for production use. If something goes wrong (e.g. "a bug happens"), it's easy enough to debug (and verify/certify) a handcrafted parser. If an AI goes wrong, all you can do is tweak the training data, retrain the model and pray to a $DEITY of your choice that

  1. this has fixed the current problem
  2. the change in your training data hasn't introduced new problems

Advanced statistics (including what we commonly refer to AI) is an amazing tool by itself. But when is goes wrong, you basically have to find an error (or omission) in what boils down to a formula with possibly tens of millions of variables. I mean, winning a Nobel price is nothing to sneer at, but i'm not sure how one would do it on a typical IT department budget ;-)

PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP

In reply to Re^7: Module for parsing tables from plain text document by cavac
in thread Module for parsing tables from plain text document by GrandFather

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.