I am trying to use Perl to parse my mysqldump backup files.

The format of the column name heading list is:
(`col1_name`, `col2_name`, `col3_name`,...)
The format of a general table is:
(col1_entry, col2_entry, col3_entry),(col1_entry, col2_entry, col3_entry),(col1_entry, col2_entry, col3_entry)...
where each parenthesized list represents a row and rows are separated by commas. Numerical entries are written as-is, while string entries are enclosed in single quotes. A single quote within a string is escaped with a backslash. Commas and parentheses are treated as ordinary string characters when they appear within a quoted string.
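The rules above can be sketched as a small hand-written tokenizer that walks the text with `\G` and `/gc` matches, so that a quoted string is always consumed as one token. This is only a minimal sketch: `parse_rows` is a hypothetical name, and the table of backslash escapes is an assumed subset rather than a complete list of what mysqldump emits.

```perl
use strict;
use warnings;

# Escapes assumed to occur inside quoted strings. Any other character after a
# backslash (for example \' or \\) is taken literally.
my %UNESCAPE = (n => "\n", r => "\r", t => "\t", 0 => "\0");

# parse_rows is a hypothetical helper, not part of any module.
# Input:  text such as "(1, 'a'),(2, 'b')"
# Output: reference to an array of rows, each row an array of values.
sub parse_rows {
    my ($text) = @_;
    my (@rows, $row);
    pos($text) = 0;
    while (pos($text) < length $text) {
        if ($text =~ /\G\s+/gc) {
            next;                        # whitespace between tokens
        }
        elsif ($text =~ /\G\(/gc) {
            $row = [];                   # a new row starts
        }
        elsif ($text =~ /\G\)/gc) {
            push @rows, $row;            # the current row ends
            undef $row;
        }
        elsif ($text =~ /\G,/gc) {
            next;                        # separator between values or rows
        }
        elsif ($text =~ /\G'((?:[^'\\]|\\.)*)'/gcs) {
            # Quoted string: commas, parentheses and \' stay inside the token.
            my $str = $1;
            $str =~ s/\\(.)/exists $UNESCAPE{$1} ? $UNESCAPE{$1} : $1/ges;
            push @$row, $str;
        }
        elsif ($text =~ /\G([^\s,()']+)/gc) {
            push @$row, $1;              # bare value such as a number, kept as-is
        }
        else {
            die "Unexpected input at offset " . pos($text) . "\n";
        }
    }
    return \@rows;
}

my $rows = parse_rows(q{(1, 'O\'Brien'),(2, 'a, (b)')});
print scalar(@$rows), " rows; second field of row 1: $rows->[0][1]\n";
```

Because every branch uses `/gc`, a failed match leaves the position untouched and the next branch is tried from the same offset.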

It seems like the most logical way to store the data would be either as an array of arrays (without any explicit column header names) or as an array of hashes where each hash is keyed by the column names.
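To make the two layouts concrete, here is a minimal sketch that pulls the names out of a backtick-quoted header and uses a hash slice to turn an array of arrays into an array of hashes. Both helper names (`parse_columns`, `rows_to_hashes`) and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: extract names from a header such as "(`id`, `name`)".
sub parse_columns {
    my ($header) = @_;
    return [ $header =~ /`([^`]+)`/g ];
}

# Hypothetical helper: convert an array of arrays into an array of hashes
# keyed by column name.
sub rows_to_hashes {
    my ($columns, $rows) = @_;
    my @records;
    for my $row (@$rows) {
        my %record;
        @record{@$columns} = @$row;      # hash slice pairs names with values
        push @records, \%record;
    }
    return \@records;
}

my $columns = parse_columns('(`id`, `name`, `city`)');
my $records = rows_to_hashes($columns, [ [1, 'Ann', 'Oslo'], [2, 'Bob', 'Rome'] ]);
print "$records->[1]{name} lives in $records->[1]{city}\n";   # prints "Bob lives in Rome"
```

The array of arrays is the cheaper structure; the array of hashes costs one hash per row but lets later code refer to fields by name.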

I am interested in trying both ways. However, I don't know how best to handle the single quotes and backslash-escaped quotes, so that parentheses, commas, or an escaped quote inside a single-quoted string are treated as string characters and not as separators.

I would also be interested in a slick way of slurping this in without having to use some generic heavy-duty Perl parser module. It seems like the parsing shouldn't be too difficult since the rules are simple, but I don't know the best way...
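As a rough sketch of the reading step, the loop below scans a dump line by line and picks out the table name, the optional column list, and the raw row text of each INSERT statement. It assumes each INSERT sits on a single line in the form shown in the comment; `read_inserts` is a hypothetical name, and the sample dump is invented.

```perl
use strict;
use warnings;

# read_inserts is a hypothetical helper. It assumes each statement is on one
# line and looks like:
#   INSERT INTO `table` (`col1`, `col2`) VALUES (...),(...);
# where the column list may be absent.
sub read_inserts {
    my ($fh) = @_;
    my @statements;
    while (my $line = <$fh>) {
        next unless $line =~
            /^INSERT INTO `([^`]+)`\s*(\([^)]*\))?\s*VALUES\s*(.*?);\s*$/;
        # columns is undef when the statement has no column list
        push @statements, { table => $1, columns => $2, values => $3 };
    }
    return \@statements;
}

# An in-memory filehandle stands in for the real dump file here.
my $dump = "INSERT INTO `people` (`id`, `name`) VALUES (1,'Ann'),(2,'Bob');\n";
open my $fh, '<', \$dump or die "Cannot open in-memory handle: $!";
my $found = read_inserts($fh);
print "$found->[0]{table}: $found->[0]{values}\n";
```

Reading one line at a time keeps memory use bounded by the longest statement rather than the whole file.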

Has anyone parsed the mysqldump format (or a similar format) before?

In reply to Parsing mysqldump files by puterboy
