DeductionPro is a software package that is used by many U.S. residents to help determine the value of items donated to charity. Unfortunately, the UI does not provide a good search function and finding items within the hierarchical tree of categories can be tedious. It did not take me long to get frustrated enough to take matters into my own hands (with a bit of Perl, of course).

The data file used by the program is a simple, albeit awkward, XML file. I sprinkled a bit of XML::Twig over it and produced a tab-separated text file that can be searched more easily.

I'm quite sure there is a more efficient, or at least more Perl-ish, way of doing this. I'd be interested in other approaches, especially since I'm not very familiar with Twig.

    use strict;
    use warnings;
    use XML::Twig;

    # Could specify $infile in @ARGV, but this is a specialized use case
    my $infile = 'DPNoncashDetails.xml';

    #*************************************************

    open( my $outfh, '>', $infile . '.txt' ) or die $!;

    my %data;  # holds item and pricing data

    my $twig = XML::Twig->new( start_tag_handlers => { Item => \&item } );
    $twig->parsefile( $infile );

    my @fields = ( 'name', 'Like New', 'Minor Wear', 'Average Wear' );
    print $outfh '# ', join( "\t", 'Category', @fields ), "\n";

    foreach my $treestr ( sort { $a cmp $b } keys %data ) {
        my $h = $data{$treestr};
        foreach my $id ( sort { $h->{$a}{name} cmp $h->{$b}{name} } keys %$h ) {
            print $outfh join( "\t", $treestr, @{ $h->{$id} }{ @fields } ), "\n";
        }
    }

    #*************************************************

    # Called for each <Item> start tag: store name and price by category path
    sub item {
        my ( $twig, $elt ) = @_;

        my $tree = get_category_tree( $elt );
        $tree = join( " => ", @$tree );

        my $href = $elt->atts;
        verify_id( $tree, $href );

        $data{$tree}{ $href->{itemNum} }{name} = $href->{name};
        $data{$tree}{ $href->{itemNum} }{ $href->{quality} } = $href->{fmv};
    }

    # Walk up from the item and collect the enclosing Category names, outermost first
    sub get_category_tree {
        my ( $elt ) = @_;

        my @tree;
        while( my $parent = $elt->parent ) {
            last if $parent->tag eq 'NonCashDetails';
            # Advance before testing the tag; otherwise 'next' would retry
            # the same parent forever on a non-Category ancestor
            $elt = $parent;
            next if $parent->tag ne 'Category';
            unshift( @tree, $parent->att('name') );
        }

        return \@tree;
    }

    # Warn if an item id is reused within a category under a different name
    sub verify_id {
        my ( $tree, $href ) = @_;

        my $id   = $href->{itemNum};
        my $name = $href->{name};

        if( exists $data{$tree} && exists $data{$tree}{$id} ) {
            if( $name ne $data{$tree}{$id}{name} ) {
                print "Warning: about to overwrite data due to record mismatch\n";
                print " $tree, item id = $id:\n";
                print " [existing]: $data{$tree}{$id}{name}\n";
                print " [new]: $name\n";
            }
        }
    }
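For completeness, here is a rough sketch of how the resulting tab-separated file might be searched. It is only an illustration: `search_items` is a hypothetical helper name, the default file name assumes the script above was run unchanged (so the output is `DPNoncashDetails.xml.txt`), and the search term is treated as a literal, case-insensitive string matched against the category path and item name.

```perl
use strict;
use warnings;

# Hypothetical helper: return the data lines whose category path or item
# name contains $term (literal, case-insensitive match).
sub search_items {
    my ( $file, $term ) = @_;

    open( my $fh, '<', $file ) or die "Cannot open $file: $!";

    my @hits;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^#/;    # skip the header line

        # First two columns are the category path and the item name
        my ( $category, $name ) = split /\t/, $line;
        next unless defined $name;

        push @hits, $line if "$category\t$name" =~ /\Q$term\E/i;
    }
    close $fh;

    return @hits;
}

# Command-line use: perl search.pl TERM [FILE]
if ( @ARGV ) {
    my ( $term, $file ) = @ARGV;
    $file = 'DPNoncashDetails.xml.txt' unless defined $file;    # assumed default
    print "$_\n" for search_items( $file, $term );
}
```

Saved as, say, `search.pl`, it would be run as `perl search.pl lamp`. Plain `grep -i` on the text file does much the same job; the sketch only restricts the match to the first two columns so that prices never produce false hits.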


In reply to Reformat DeductionPro 2008 Data File by bobf
