Lord Rau has asked for the wisdom of the Perl Monks concerning the following question:

Could anyone offer any advice on how I might be able to build a parser? I'm working with Dazel Output server and I need to parse the printer attribute lists for template (example below). My problem is this, I'm upgrading from 3.1 to 3.2 and I need to find out where the templates differ between the versions, so I have to parse the same template for both version and compare them. Any advice would be greatly appreciated! This is what the attributes in a template would look like...
-input-trays-supported : top, middle, bottom, large-capacity, tray1, tray2, tray3, tray4 -descriptor : hp5si PS TCP Template -printer-name : hp5si_ps_tcp_template -printer-model : pjl -printer-realization : 1 -medium-default : north-american-letter-white -output-bin-names-supported : top, face-down, face-up -start-sheet-default : none -separator-sheet-default : none -character-sets-supported : iso-latin-1, ascii -character-set-default : iso-latin-1

Replies are listed 'Best First'.
Re: Need to build attribute parser
by swiftone (Curate) on Sep 28, 2000 at 21:29 UTC
    Look into Parse::RecDescent. It should do everything you need, and comes with plenty of documentation.
Re: Need to build attribute parser
by japhy (Canon) on Sep 28, 2000 at 21:37 UTC
    Here's a simple parser for you:
    $/ = "\n-"; open INPUT, "lp.attr" or die "... $!"; # or whatever while (<INPUT>) { chomp; my ($name,$attr) = split /\s*:\s*/, $_, 2; $name = "-$name" if $. != 1; # the first record has a - my @attrs = split /,\s+/, $attr; $CONFIG{$name} = (@attrs == 1) ? $attrs[0] : \@attrs; } close INPUT;
    This would create a hash, %CONFIG, where the keys are strings like '-input-trays-supported' or '-descriptor', and the values are either strings, or references to an array of strings: $CONFIG{'-input-trays-supported'} is [ 'top', 'middle', 'bottom', ... ], and $CONFIG{-descriptor} is 'hp5si PS TCP Template'.

    $_="goto+F.print+chop;\n=yhpaj";F1:eval
Re: Need to build attribute parser
by runrig (Abbot) on Sep 28, 2000 at 21:38 UTC
    As long as the templates are consistent, its pretty simple. You probably have each printer template in a separate file, so read each file into a hash of hash of hashes. Something like:
    use File::Basename; opendir(DIR, "/template/dir") or die "can't open dir: $!"; # I'm guessing at the template extension if there is one @templates = grep { /\.tmpl$/ } readdir DIR; closedir DIR; my %printers; for my $file (@templates) { open (FH, $file) or die "Error opening $file: $!"; my $name; while (<FH>) { chomp; my ($tmp_name, $value) = split /\s*:\s*/; $name = $tmp_name if $tmp_name; $value =~ s/,$//; $printers{basename($file)}{$name}{$value} = undef; } close FH; } # Now %printers contains all valid printer attributes # and their values
    Update: I didn't write code to compare versions, but lets say you do the above once for the old version, once for the new version, so you have two hashes:
    my (%old, %new); ... for my $printer (sort keys %old) { print("Printer $printer not in new\n"), next unless exists $new{$printer}; for my $name (sort keys %{$old{$printer}}) { print("Attribute $name not in new for printer $printer\n"), next unless exists $new{$printer}{$name}; for my $val (sort keys %{$old{$printer}{$name}}) { print("Val $val not in new for printer $printer attrib $name\n"), next unless exists $new{$printer}{$name}{$val}; } } } # Then do the reverse comparision looping over %new # I'm sure you can figure that out
Re: Need to build attribute parser
by mirod (Canon) on Sep 28, 2000 at 21:43 UTC

    It might be stupid, but why not do a diff?

    Otherwise I guess this would work:

    #!/bin/perl -w use strict; my( $file1, $file2)= @ARGV; my %v1= parse_att( $file1); my %v2= parse_att( $file2); $\="\n"; my @new= grep { !$v2{$_} } keys %v1; my @deleted= grep { !$v1{$_} } keys %v2; print "new: ", join "\n ", @new; print "deleted: ", join "\n ", @deleted; sub parse_att { my $file= shift; my %atts; my $field=''; open( FILE, "<$file") or die "I'd better tell you that the open su +cceded or I'll get --'ed into oblivion, I'll even report why I failed +: $!"; while( <FILE>) { chomp; my $val; if( m/^\s*-([\w-]+) \s*:\s*([^,]*),?$/) { ($field, $val)= ($1, $2); } elsif( m/^\s*([^,]*),?$/) { $val= $1; } else { die "uh-oh, looks like this line has a problem: $_"; } # why bother with clever data structures $atts{"$field $val"}=1; } return %atts; }
      OK... So I tried the above and that was great but in my rush to post. I didn't accurately represent the sample data. So the script is dying on certain unparsable lines. (See below.)

      Any advice? Lines like this also exist!
      -frame-print-settings-path : !{dazel-install-directory}!/lib/fm_pr +int_settings.doc -font-directories : !{dazel-install-directory}!/lib/FONTS +/Soft_Horizons -ps-init-directories : !{dazel-install-directory}!/lib/PS -footer-text-default : Printed by DAZEL, Page !{page-number}!, !{started-printing-time}! -header-footer-font : Helvetica-Bold -header-footer-font-size : 12 -ps-prologue-file : ps_prologue.psc, hp8500dn.ps -physical-device-type : ps-printer -output-document-format-default: ps -output-bin-xlate : {face-down, 1}, {top, 1}, {face-up, 2}, {left, 2} -input-trays-xlate : {tray1, 0}, {tray2, 2}, {tray3, 3}, {tray4, 4}, {large-capacity, 4} -ps-printer-echoes-eof : true

        The problem is probably that some lines include commas, so the [^,]* stops at the first comma instead of the last one

        so you need to replace:

        if( m/^\s*-([\w-]+) \s*:\s*([^,]*),?$/) { ($field, $val)= ($1, $2); } elsif( m/^\s*([^,]*),?$/) { $val= $1; }

        by

        if( m/^\s*-([\w-]+) \s*:\s*(.*)$/) { ($field, $val)= ($1, $2); } elsif( m/^\s*(.*)?$/) { $val= $1; } $val=~s/,$//;