I'm impressed. I see that XLSperl even does the right thing when cells contain unicode characters! I managed to find one issue that seems to limit the usefulness of the "XLSprint" function: it looks like it should be easy to do one-liner editing to create a modified xls file, like this:
XLSperl -lane 's/fubar/foobar/ for(@F); XLSprint @F' < old.xls > new.x +ls
and that does in fact work well -- except when the input xls file has two or more non-empty worksheets one or more non-empty worksheets with a name other than "Sheet1", in which case all worksheets get glommed together into "Sheet1" in the output xls file. (If there is a way to preserve worksheet names and boundaries in this sort of edit proces, someone please enlighten me.)

I noticed the problem when trying verify the results of such an edit process, using a one-liner like this:

XLSperl -lne '$h{"$WS:$CELL: $_"}++;END{print for(grep {$h{$_}==1} key +s %h)}' old.xls new.xls
That turns out to be very handy for listing cell-value differences between two xls files -- provided that they both contain the same worksheets and are mostly similar except for value differences in particular cells.

But if the two files don't have the same quantity and names of worksheets, this sort of comparison is useless (or rather, it reports an awful lot of diffs, which actually makes it good for discovering problems, like the loss of worksheet boundaries caused by simple cell-editing operations.)

UPDATE: Here's a slightly more verbose but more informative version of the one-liner for comparing two xls files:

XLSperl -lne '$h{"$WS:$CELL:$_"}.="in $ARGV"; END{print "$_ ($h{$_})" for(grep{$h{$_}!~/\.xlsin /}sort keys %h)}' +a.xls b.xls
It sorts the cells in the output listing, and tells which file contained each of the unique cell values.

In reply to Re: MS Excel One-Liner Challenge by graff
in thread MS Excel One-Liner Challenge by spectre9

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.