kathys39 has asked for the wisdom of the Perl Monks concerning the following question:

I am parsing a basic XML file with XML::Simple. I need to get the name of the stylesheet, as there are several different versions in use around the company. In other words, I am trying to get the value of the href attribute here:
<?xml-styleshet type="text/xsl" href="stylesheet1.xsl"?>
I can't find anything that will help me get the value of these processing instructions, short of grepping.

Replies are listed 'Best First'.
Re: parse xml; get name of xsl file
by ww (Archbishop) on Apr 21, 2009 at 12:32 UTC
    If your xml is valid and compliant, the first part of the pointer should be invariant...
    hence:

    grep and/or a regex will work, but parsing HTML or XML with a regex is not usually a good idea, because a small departure from compliant XML or HTML is apt to make your effort fail in any number of unpleasant ways.
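
    A minimal sketch of the regex route mentioned above, with the same caveat that it is not a substitute for a real parser. The helper name stylesheet_hrefs and the sample document are hypothetical, not from the thread. The pattern accepts both the spelling used in the question (xml-styleshet) and the standard xml-stylesheet target, and assumes the href value is quoted. It uses core Perl only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the href value of every
# stylesheet processing instruction found in a string of XML.
sub stylesheet_hrefs {
    my ($xml) = @_;
    my @hrefs;

    # Find each <?xml-styleshet ...?> (or <?xml-stylesheet ...?>) PI and
    # capture everything between the target and the closing "?>".
    while ( $xml =~ /<\?xml-styleshee?t\s+(.*?)\?>/gs ) {
        my $data = $1;

        # Pull href="..." or href='...' out of the PI data.
        if ( $data =~ /\bhref\s*=\s*(["'])(.*?)\1/ ) {
            push @hrefs, $2;
        }
    }
    return @hrefs;
}

# Made-up sample input, modelled on the line in the question.
my $xml = <<'__XML__';
<?xml version="1.0" encoding="UTF-8"?>
<?xml-styleshet type="text/xsl" href="stylesheet1.xsl"?>
<?xml-styleshet type="text/xsl" href="stylesheet2.xsl"?>
<config />
__XML__

# Prints one href per line.
print "$_\n" for stylesheet_hrefs($xml);
```

    A pattern like this breaks if the processing instruction appears inside a comment or CDATA section, which is one reason the parser-based approach is the safer choice.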

    A parser is likely to do a better job, at the price of a brief learning curve for whichever module you select. XML::Twig is often suggested among the several hundred XML modules available from CPAN (or, if you are on Windows with ActiveState Perl, via PPM).

    (And if your goal is checking the style sheet, note the XSL::... modules.)

Re: parse xml; get name of xsl file
by Anonymous Monk on Apr 21, 2009 at 12:40 UTC
    #!/usr/bin/perl --
    use strict;
    use warnings;

    my $xml = <<'__XML__';
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-styleshet type="text/xsl" href="stylesheet1.xsl"?>
    <?xml-styleshet type="text/xsl" href="stylesheet2.xsl"?>
    <?xml-styleshet type="text/xsl" href="stylesheet3.xsl"?>
    <config />
    __XML__

    use XML::Twig;
    my @style;
    {
        XML::Twig->new(
            twig_handlers => {
                '?xml-styleshet' => sub {
                    my ( $t, $pi, $data ) = @_;
                    push @style, $data;
                    return;
                },
            },
        )->parse($xml);
    }
    print join "\n", @style;
    __END__
    type="text/xsl" href="stylesheet1.xsl"
    type="text/xsl" href="stylesheet2.xsl"
    type="text/xsl" href="stylesheet3.xsl"

    #!/usr/bin/perl --
    use strict;
    use warnings;
    use XML::Twig;

    sub get_style {
        my @style;
        XML::Twig->new(
            twig_handlers => {
                '?xml-styleshet' => sub {
                    my ( $t, $pi, $data ) = @_;
                    XML::Twig->new(
                        twig_handlers => {
                            stylesheet => sub {
                                push @style, $_->atts;
                            },
                        },
                    )->parse("<stylesheet $data />");
                    return;
                },
            },
        )->parse(@_);
        return @style;
    }

    my $xml = <<'__XML__';
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-styleshet type="text/xsl" href="stylesheet1.xsl"?>
    <?xml-styleshet type="text/xsl" href="stylesheet2.xsl"?>
    <?xml-styleshet type="text/xsl" href="stylesheet3.xsl"?>
    <config />
    __XML__

    use Data::Dumper;
    print Data::Dumper->new( [ get_style($xml) ] )->Indent(1)->Dump;
    __END__
    $VAR1 = {
      'href' => 'stylesheet1.xsl',
      'type' => 'text/xsl'
    };
    $VAR2 = {
      'href' => 'stylesheet2.xsl',
      'type' => 'text/xsl'
    };
    $VAR3 = {
      'href' => 'stylesheet3.xsl',
      'type' => 'text/xsl'
    };