Does anyone know of a module to pretty-print SQL statements?

I'm writing a utility to analyze postgresql log files. I'd like to be able to do two things:

  1. Identify parameters in a SQL statement to be able to identify families of statements that are essentially "the same" but use different constants. Just using bone-headed parsing of numbers and tick-delimited strings, I can do OK with this one, but I know I'm not doing it perfectly. One obvious thing I'm not handling yet is embedded ticks in strings, but maybe there are other things.

  2. Format the SQL statements in a more readable way, as an option. E.g. have each major clause start on a new line and maybe try to break up lengthy boolean clauses somehow. Again, it's easy to fake this by just looking for keywords, but it would be nice to be able to do it right.

Thanks,
Kevin

P.S. Right now my script's usage looks like this:

analyze_pg_log [options] file ... Options: --sort-by-count, -c Sort by the number of each type of quer +y (default) --sort-by-sql, -s Sort by the query text itself --sort-by-dupes, -d Sort by the ratio of instances of a que +ry to unique instances of a query --sort-by-weight, -w Sort by the total time spent on each ty +pe of query --params, -p Show the unique parameter sets used for + each query type --one-pid, -o Ignore entries for all but the first pi +d encountered. analyze_pg_log --help: Show this help message. analyze_pg_log --man: Show the full manual page. analyze_pg_log --version: Show the version number.

The output of the script looks like this (when not using --params) (I haven't added even primitive pretty-printing yet):

murphy@genometest$ analyze_pg_log --one-pid egdata.2.2.log There are 21 unique basic queries: Fam Count Uniques TotalTime Avg-Time Query ----- ----- ------- --------- -------- ----- A 1 1 0.001548 0.001548 SELECT main.* +FROM CustomFieldValues main WHERE ((main.CustomField = ?String-1?)) O +RDER BY main.SortOrder ASC B 1 1 0.005947 0.005947 SELECT main.* +FROM Tickets main WHERE ((main.EffectiveId = main.id)) AND ((main.Sta +tus != ?String-1?)) AND ((main.Type = ?String-2?)) AND\ ( ( (main.Status = ?String-3?)OR(main.Status = ?String-4?) ) AND ( (m +ain.Queue = ?String-5?) ) ) ORDER BY main.Priority DESC LIMIT ?Num-6? ... ---- ----- ----- --------- Totals: 161 0.091403 Approx. elapsed time of this section of log is 4.594 seconds.

In reply to SQL pretty printer? by haricothoriz

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.