I have a collegue who is a proficient programmer but new to Perl and has been charged with producing a simple cross-tab from an SQL query in HTML (and potentially in future also as an Excel workbook or other format).

I decided to have a go at this myself to see how it would look. And it looks ghastly! I would dearly love a bit of a code critique before I present this as a good solution. I can't help feeling I must be missing some shortcut or CPAN module. (Note: I've considered and largely rejected DBIx::Xtab and it would take quite a stong case to make me consider it again).

The raw output of the SQL query looks a bit like this:

A B C D E F P Q
A1 B1 C1 D1 E1 F1 One Two
A1 B1 C1 D1 E1 F2 Five Six
A1 B1 C1 D2 E1 F1 Three Four
A1 B1 C2 D1 E1 F1 Seven Eight
A1 B1 C2 D1 E2 F1 Nine Ten
A1 B1 C3 D2 E3 F2 Eleven Twelve
A1 B2 C1 D1 E1 F1 Foo Bar
A1 B2 C1 D1 E1 F1 Floob Blab
A1 B2 C2 D2 E1 F1 Thirteen Fourteen
A1 B2 C2 D2 E2 F1 Fifteen Sixteen

And I'd like output a bit like

A: A1
B: B1
EE1E2E3
FF1F2F1F2
CDPQPQPQPQ
C1 D1One Two       
C1D1  Five Six     
D2Three Four       
C2D1Seven Eight   Nine Ten   
A: A1
B: B2
EE1E2
FF1F1
CDPQPQ
C1D1Foo Floob Bar Blab   
C2D2Thirteen Fourteen Fifteen Sixteen

My code looks like this

use strict; use warnings; sub first_diff { my ($a1,$a2) = @_; # Hmmm... do I really want to destinguish undef and '' here? # I don't do so in terms of the column headers. no warnings 'uninitialized'; for ( 0..$#$a1 ) { return $_ if $_ > $#$a2 || $a1->[$_] ne $a2->[$_] || defined $a1->[$_] != defined $a2->[$_]; } return; } sub quote_ids { join ',' => map qq{"$_"}, @_; } sub make_xtab { my %args = @_; my ( $db, $page_headers, $row_headers, $col_headers, $values ) = @args{'DB','PAGES','ROWS','COLS','VALUES'}; my $page_info; { my $c = quote_ids(@$page_headers,@$col_headers); my $sth = $db->prepare ("SELECT DISTINCT $c\nFROM $args{QUERY} AS Q\nORDER BY $c"); $sth->execute; my @counts; while ( my @data = $sth->fetchrow_array ) { # First find the page... my $r = \$page_info; $r = \$$r->{shift @data} for 1 .. @$page_headers; # ... then within page analyse the columns my $xtab_data_cols = \$$r->{COL_COUNT}; my $headers = \@{$$r->{HEADERS}}; $r = \$$r->{COL_INDEX}; # Index of headers to column my $i = 0; for ( @data ) { $r = \$$r->{$_}; if ( $$r ) { # We've seen this before $headers->[$i][-1]{SPAN}++; } else { # First time we've seen this combination push @{$headers->[$i]} => { VALUE => $_, SPAN => 1, }; } $i++; } $$r = $$xtab_data_cols++; } } my $sth = $db->prepare ("SELECT ".quote_ids(@$page_headers,@$row_headers,@$col_headers,@$ +values)."\n". "FROM $args{QUERY} AS Q\n". "ORDER BY ".quote_ids(@$page_headers,@$row_headers)); $sth->execute; my ($xtab_data,@prev_row_header_values,$row_change_depth); my $this_page_info = 0; my $flush_xtab_data = sub { $args{EMIT_ROW}->(\%args, $row_change_depth, \@prev_row_header_values, $xtab_data) if $xtab_data; undef $xtab_data; }; QUERY_ROW: while ( my @query_data = $sth->fetchrow_array ) { my @page_header_values = splice @query_data, 0, @$page_headers; { my $p = $page_info; $p = $p->{$_} for @page_header_values; # Should never happen if we perform the two queries as a # single SQL transaction; next QUERY_ROW unless $p; unless ( $p == $this_page_info ) { $args{EMIT_FOOTER}->(\%args,$this_page_info) if $this_page_info; $this_page_info = $p; $flush_xtab_data->(); @prev_row_header_values = (); $row_change_depth = 0; $args{EMIT_HEADER}->(\%args,$this_page_info,\@page_header_valu +es); } } my @row_header_values = splice @query_data, 0, @$row_headers; if ( defined ( my $d = first_diff(\@row_header_values,\@prev_row_h +eader_values))) { $flush_xtab_data->(); $row_change_depth = $d; # print STDERR "@row_header_values - @prev_row_header_values\n +"; @prev_row_header_values = @row_header_values; } my $xtab_col = $this_page_info->{COL_INDEX}; $xtab_col = $xtab_col->{shift @query_data} for 1..@$col_headers; # Should never happen if we perform the two queries as a # single SQL transaction; next QUERY_ROW unless defined $xtab_col; $xtab_data ||= [ map { [] } 1 .. $this_page_info->{COL_COUNT} * @$ +values ]; $xtab_col *= @$values; for ( @$xtab_data[ $xtab_col .. $xtab_col+$#$values ] ) { push @$_ => shift @query_data; } } $flush_xtab_data->(); $args{EMIT_FOOTER}->(\%args,$this_page_info) if $this_page_info; } sub emit_html_header { my ($args,$page_info,$header_values) = @_; $args->{ROW_NO}=0; $args->{HTML_BUFFER} = []; require CGI; for (@{$args->{PAGES}}) { print "<div>$_: ",CGI->escapeHTML(shift @$header_values),"</div>\n +"; } print qq{<table border="1">\n}; my ($row_headers,$col_headers,$values) = @$args{'ROWS','COLS','VALUES'}; my $print_values = @$values > 1; for ( 0 .. $#$col_headers ) { my $rowspan=''; if ( !$print_values && $_ == $#$col_headers ) { $rowspan = ' rowspan="2"'; } print '<tr><td colspan="'. scalar(@$row_headers). '">'. $col_headers->[$_]. '</td>'; for ( @{$page_info->{HEADERS}[$_]} ) { print '<td colspan="' . $_->{SPAN} * @$values . '"'. $rowspan .'>' . CGI->escapeHTML($_->{VALUE}) . '</td>'; } print "</tr>\n"; } print '<tr>'; for ( @$row_headers ) { print "<td>$_</td>"; } if ( $print_values ) { for ( @{$page_info->{HEADERS}[-1]} ) { for ( @$values ) { print "<td>$_</td>"; } } } print "</tr>\n"; } sub emit_html_row { my ($args,$depth,$header_values,$data) = @_; my $buffer = $args->{HTML_BUFFER}; my $row_no = ++$args->{ROW_NO}; # @$buffer holds an array of partially built HTML at each level of # row header so that we can go back and fill in the rowspans after # the fact. Yes this is ugly but avoids lots of large, complex # intermediate structures. if (@$buffer) { # I wish I could think of something more elegant than putting # the <tr> on the front then ripping it off again. my $html = ''; for ( reverse splice @$buffer, $depth ) { $html = '<tr><td rowspan="' . ($row_no - $_->{START_ROW}) . '">'. CGI->escapeHTML($_->{HEADER_VALUE}). '</td>'. substr($_->{HTML} . $html ,4); } if ( $depth ) { $buffer->[-1]{HTML} .= $html; } else { print $html; } } return unless $header_values; # If called from emit_html_footer for ( @$header_values[@$buffer .. $#$header_values] ) { push @$buffer => { START_ROW => $row_no, HEADER_VALUE => $_, HTML => '', }; } no warnings 'uninitialized'; $buffer->[-1]{HTML} = '<tr>'. join('', map { "<td>$_</td>" } map { length() ? $_ : '&nbsp;' } map { CGI->escapeHTML("@$_") } @$data ). "</tr>\n"; } sub emit_html_footer { my ($args) = @_; emit_html_row($args,0); print "</table>\n"; } require DBI; # Assume $ENV{DBI_DSN} et al are set... my $db = DBI->connect or die $DBI::errstr; $db->{RaiseError} = 1; make_xtab( DB => $db, QUERY => 'TEST_DATA', PAGES => ['A','B'], ROWS => ['C','D'], COLS => ['E','F'], VALUES => ['P','Q'], EMIT_ROW => \&emit_html_row, EMIT_HEADER => \&emit_html_header, EMIT_FOOTER => \&emit_html_footer, );

In reply to SQL to HTML cross-table code critique by nobull

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.