Twig / Simple / xmlgrep --help

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Sir,
I have an XML file as follows:

##################################################################################

<?xml version="1.0" encoding="ISO-8859-1"?>

<bookstore>

<book category="COOKING">
  <title>Everyday Italian</title>
  <lang>en</lang>

  <author>Giada De Laurentiis</author>
  <year>2005</year>
  <price>30.00</price>
</book>

<book category="CHILDREN">

  <title>Harry Potter</title>

  <lang>en</lang>

  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

<book category="WEB">
  <title>XQuery Kick Start</title>

  <lang>fr</lang>

  <author>James McGovern</author>
  <author>Per Bothner</author>
  <author>Kurt Cagle</author>
  <author>James Linn</author>
  <author>Vaidyanathan Nagarajan</author>
  <year>2003</year>
  <price>49.99</price>
</book>

<book category="WEB">
  <title>Learning XML</title>

  <lang>br</lang>

  <author>Erik T. Ray</author>
  <year>2003</year>
  <price>39.95</price>
</book>

</bookstore>

##################################################################

I am writing a Perl script using XML::Twig. What I want is as follows:

[root@localhost MyPrgs]# MyScript.pl COOKING

This should fetch the corresponding TITLE. I also have another simple text file (Simple.txt), which is as follows:

#################################################################################

TITLE                                    COUNTRY    HITS    PUBLICATION

Everyday Italian     USA    101  Stallion


vryday Italian       SA     11  Stallion1


Everyday Italian1     USA2   01  Stallion2


Everyday Italian2     USA3   121  Stallion3


Everyday Italian3     USA4   151  Stallion4

#################################################################################

The script should then look up "Everyday Italian" (the TITLE) in Simple.txt and print the following to STDOUT:

####################################################################################################

TITLE                                    COUNTRY    AUTHOR                                    PUBLICATION        PRICE

Everyday Italian     USA    Giada De Laurentiis            Stallion   30.00

####################################################################################################

Is XML::Twig the right approach, or can this be done with XML::Simple or xmlgrep?

I am a beginner. Could you help me out with this, and point me in the right direction?

--
Best Regards,

Cherry

20070815 Janitored by Corion: Removed excessive formatting, added code tags, as per Writeup Formatting Tips


Re: Twig / Simple / xmlgrep --help
by Jenda (Abbot) on Aug 15, 2007 at 22:15 UTC

I know you did not ask, but ... ;-)
You could use XML::Rules to extract, massage and filter the data to make it easy to use later. This will read the XML and create a hash indexed by the book title:

use XML::Rules;

my $parser = XML::Rules->new(
    rules => [
        _default => 'content',
        book => sub {
            my $title = delete $_[1]->{title};
            delete $_[1]->{'_content'};
            $title => $_[1],
        },
        bookstore => 'pass no content',
    ]
);

my $data = $parser->parse(\*DATA);

use Data::Dumper;
print Dumper( $data);

__DATA__
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
...
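To keep only the books in a category given on the command line, pass the category to parse() and check it in the rule: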

use XML::Rules;

my $parser = XML::Rules->new(
    rules => [
        _default => 'content',
        book => sub {
            return unless $_[1]->{category} eq $_[4]->{parameters};
            my $title = delete $_[1]->{title};
            delete $_[1]->{'_content'};
            $title => $_[1],
        },
        bookstore => 'pass no content',
    ]
);

my $category = $ARGV[0] or die "Usage: BookStore2.pl category\n";

my $data = $parser->parse(\*DATA, $category);

use Data::Dumper;
print Dumper( $data);

__DATA__
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
...

P.S.: Do not let the $_[1]->{tag_or_attr_name} and $_[4]->{parameters} scare you. I was just too lazy to assign the arguments of the anonymous subroutine to named variables. With them named it would look like this:

...
        book => sub {
            my ($tag, $attr, $context, $parents, $parser) = @_;
            return unless $attr->{category} eq $parser->{parameters};
            my $title = delete $attr->{title};
            delete $attr->{'_content'};
            $title => $attr,
        },
...

Jenda
Support Denmark!
Defend the free world!
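
The remaining step from the original question, looking the title up in Simple.txt and printing the combined line, can be sketched in core Perl. This is only a sketch under stated assumptions: the subroutine names read_listing and merge_report are made up for illustration, the columns in Simple.txt are assumed to be separated by two or more spaces (so that a title may contain single spaces), and the book hash is assumed to have the title => { author => ..., price => ... } shape that the rules above aim for.

```perl
use strict;
use warnings;

# Parse lines of the form "TITLE  COUNTRY  HITS  PUBLICATION".
# Assumption: columns are separated by two or more spaces.
sub read_listing {
    my ($fh) = @_;
    my %row_for;
    while ( my $line = <$fh> ) {
        chomp $line;
        $line =~ s/^\s+|\s+$//g;                           # trim both ends
        next if $line eq '' or $line =~ /^TITLE\b/;        # skip blanks and the header
        my ( $title, $country, $hits, $publication ) = split /\s{2,}/, $line;
        next unless defined $publication;                  # skip malformed lines
        $row_for{$title} = {
            country     => $country,
            hits        => $hits,
            publication => $publication,
        };
    }
    return \%row_for;
}

# Join the parsed books with the listing; only exact title matches are kept.
sub merge_report {
    my ( $books, $listing ) = @_;
    my @lines;
    for my $title ( sort keys %$books ) {
        my $row = $listing->{$title} or next;
        my $author = $books->{$title}{author};
        $author = join ', ', @$author if ref $author eq 'ARRAY';
        push @lines, join "\t", $title, $row->{country}, $author,
            $row->{publication}, $books->{$title}{price};
    }
    return @lines;
}

# Sample data standing in for the parser output and for Simple.txt.
my $books = {
    'Everyday Italian' => { author => 'Giada De Laurentiis', price => '30.00' },
};
my $text = <<'TXT';
TITLE                COUNTRY    HITS    PUBLICATION

Everyday Italian     USA    101  Stallion
Everyday Italian1     USA2   01  Stallion2
TXT

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
print "$_\n" for merge_report( $books, read_listing($fh) );
```

With a real file, the in-memory handle would be replaced by open my $fh, '<', 'Simple.txt'. Output is tab-separated here; printf with fixed widths would be needed to reproduce the aligned columns shown in the question.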


Re^2: Twig / Simple / xmlgrep --help

by Anonymous Monk on Aug 16, 2007 at 06:45 UTC

Thanks a lot for the suggestion. Actually, I have many XML files in a directory, and I cannot hard-code an element name (such as title) in the code. I want my script to be generic.
--
Best Regards,
Cherry


Re^3: Twig / Simple / xmlgrep --help

by Jenda (Abbot) on Aug 16, 2007 at 08:27 UTC

Then build the ruleset in the script, something like:

use XML::Rules;

@ARGV == 5 or die "Usage: BookStore3.pl roottag datatag idtag filtertag filtervalue\n";

my ( $roottag, $datatag, $idtag, $filtertag, $filtervalue) = @ARGV;

my $parser = XML::Rules->new(
    rules => [
        _default => 'content',
        $datatag => sub {
            # skip records whose filter tag/attribute does not match
            return unless $_[1]->{$filtertag} eq $_[4]->{parameters};
            my $id = delete $_[1]->{$idtag};
            delete $_[1]->{'_content'};
            return $id => $_[1]
        },
        $roottag => 'pass no content',
    ]
);


my $data = $parser->parse(\*DATA, $filtervalue);

use Data::Dumper;
print Dumper( $data);

__DATA__
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>

BookStore3.pl bookstore book title category CHILDREN

Jenda
Support Denmark!
Defend the free world!


Re^4: Twig / Simple / xmlgrep --help

by Anonymous Monk on Aug 16, 2007 at 09:13 UTC

Re^5: Twig / Simple / xmlgrep --help

by Jenda (Abbot) on Aug 16, 2007 at 09:59 UTC


Re: Twig / Simple / xmlgrep --help
by Anonymous Monk on Aug 15, 2007 at 03:29 UTC

Yes
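
For the XML::Twig route asked about in the question, a minimal sketch might look like the following. It assumes the bookstore layout from the original post; the subroutine name titles_in_category is hypothetical, and the XML is inlined as a string only to keep the example self-contained (parsefile() would be the usual call for a file on disk).

```perl
use strict;
use warnings;
use XML::Twig;

# Return the titles of all <book> elements whose category attribute
# equals $category. $xml is a string holding the whole document.
sub titles_in_category {
    my ( $xml, $category ) = @_;
    my @titles;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # called once for each complete <book> element
            book => sub {
                my ( $t, $book ) = @_;
                my $cat = $book->att('category');
                push @titles, $book->first_child_text('title')
                    if defined $cat and $cat eq $category;
                $t->purge;    # release the elements parsed so far
            },
        },
    );
    $twig->parse($xml);
    return @titles;
}

my $xml = <<'XML';
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book category="COOKING">
  <title>Everyday Italian</title>
  <author>Giada De Laurentiis</author>
  <price>30.00</price>
</book>
<book category="CHILDREN">
  <title>Harry Potter</title>
  <author>J K. Rowling</author>
  <price>29.99</price>
</book>
</bookstore>
XML

# Category from the command line, defaulting to COOKING for the demo.
my $category = shift @ARGV || 'COOKING';
print "$_\n" for titles_in_category( $xml, $category );
```

To make this generic in the sense asked for later in the thread, the element and attribute names ('book', 'category', 'title') could be taken from @ARGV instead of being written into the handler.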
