One good method for parsing XML into Perl arrays is to use XML::Twig handlers. In my opinion, Twig has a better user interface than XML::Parser, and it has an excellent tutorial and documentation. One advantage of Twig over XML::Simple is that it can be used with a wider variety of XML formats.

Since you did not specify what data you wanted to go into what arrays, I decided to demonstrate how you could stuff all your data into a single Hash-of-Hashes data structure. Here is a snippet:

my %books;

my $twig = XML::Twig->new(
    twig_handlers => { book => \&books }
);
$twig->parse($xmlStr);

print Dumper(\%books);

exit;

# Called once for each <book> element: store its fields under the book's id
sub books {
    my ($twig, $book) = @_;
    my $id = $book->att('id');
    $books{$id}{'author'}       = $book->first_child('author')->text();
    $books{$id}{'title'}        = $book->first_child('title')->text();
    $books{$id}{'genre'}        = $book->first_child('genre')->text();
    $books{$id}{'price'}        = $book->first_child('price')->text();
    $books{$id}{'publish_date'} = $book->first_child('publish_date')->text();
}
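If you would rather not repeat one line per field, here is a rough variation on the same handler idea. It is only a sketch: the @fields list and the store_book name are my own choices, and the two-book catalog is a cut-down copy of the data used in this post. It uses first_child_text, which returns an empty string when the child element is missing, and purge, which frees the part of the document that has already been handled.

```perl
use strict;
use warnings;
use XML::Twig;

# Cut-down sample in the same shape as the catalog used in this post
my $xml = <<'XML';
<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
  </book>
</catalog>
XML

# Assumption: these are the child elements you care about; adjust as needed
my @fields = qw(author title genre price publish_date);

my %books;

my $twig = XML::Twig->new(
    twig_handlers => { book => \&store_book },
);
$twig->parse($xml);

# Handler (hypothetical name): copy each listed field into %books
sub store_book {
    my ($t, $book) = @_;
    my $id = $book->att('id');
    # first_child_text returns '' if the child element does not exist
    $books{$id}{$_} = $book->first_child_text($_) for @fields;
    # Release memory for everything parsed so far
    $t->purge;
}

print "$_: $books{$_}{title}\n" for sort keys %books;
```

The purge call only matters for large files, but it is cheap to leave in.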

Here is the full example:

use strict;
use warnings;
use XML::Twig;
use Data::Dumper;

my $xmlStr = <<XML;
<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
   </book>
   <book id="bk103">
      <author>Corets, Eva</author>
      <title>Maeve Ascendant</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-11-17</publish_date>
      <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description>
   </book>
   <book id="bk104">
      <author>Corets, Eva</author>
      <title>Oberon's Legacy</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-03-10</publish_date>
      <description>In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant.</description>
   </book>
   <book id="bk105">
      <author>Corets, Eva</author>
      <title>The Sundered Grail</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-09-10</publish_date>
      <description>The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon's Legacy.</description>
   </book>
   <book id="bk106">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-09-02</publish_date>
      <description>When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled.</description>
   </book>
   <book id="bk107">
      <author>Thurman, Paula</author>
      <title>Splish Splash</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>A deep sea diver finds true love twenty thousand leagues beneath the sea.</description>
   </book>
   <book id="bk108">
      <author>Knorr, Stefan</author>
      <title>Creepy Crawlies</title>
      <genre>Horror</genre>
      <price>4.95</price>
      <publish_date>2000-12-06</publish_date>
      <description>An anthology of horror stories about roaches, centipedes, scorpions and other insects.</description>
   </book>
   <book id="bk109">
      <author>Kress, Peter</author>
      <title>Paradox Lost</title>
      <genre>Science Fiction</genre>
      <price>6.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>After an inadvertant trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum.</description>
   </book>
   <book id="bk110">
      <author>O'Brien, Tim</author>
      <title>Microsoft .NET: The Programming Bible</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-09</publish_date>
      <description>Microsoft's .NET initiative is explored in detail in this deep programmer's reference.</description>
   </book>
   <book id="bk111">
      <author>O'Brien, Tim</author>
      <title>MSXML3: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-01</publish_date>
      <description>The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more.</description>
   </book>
   <book id="bk112">
      <author>Galos, Mike</author>
      <title>Visual Studio 7: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>49.95</price>
      <publish_date>2001-04-16</publish_date>
      <description>Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C++, C#, and ASP+ are integrated into a comprehensive development environment.</description>
   </book>
</catalog>
XML

my %books;

my $twig = XML::Twig->new(
    twig_handlers => { book => \&books }
);
$twig->parse($xmlStr);

print Dumper(\%books);

exit;

# Called once for each <book> element: store its fields under the book's id
sub books {
    my ($twig, $book) = @_;
    my $id = $book->att('id');
    $books{$id}{'author'}       = $book->first_child('author')->text();
    $books{$id}{'title'}        = $book->first_child('title')->text();
    $books{$id}{'genre'}        = $book->first_child('genre')->text();
    $books{$id}{'price'}        = $book->first_child('price')->text();
    $books{$id}{'publish_date'} = $book->first_child('publish_date')->text();
}

Here is the output:

$VAR1 = {
          'bk111' => {
                       'publish_date' => '2000-12-01',
                       'price' => '36.95',
                       'title' => 'MSXML3: A Comprehensive Guide',
                       'author' => 'O\'Brien, Tim',
                       'genre' => 'Computer'
                     },
          'bk108' => {
                       'publish_date' => '2000-12-06',
                       'price' => '4.95',
                       'title' => 'Creepy Crawlies',
                       'author' => 'Knorr, Stefan',
                       'genre' => 'Horror'
                     },
          'bk105' => {
                       'publish_date' => '2001-09-10',
                       'price' => '5.95',
                       'title' => 'The Sundered Grail',
                       'author' => 'Corets, Eva',
                       'genre' => 'Fantasy'
                     },
          'bk102' => {
                       'publish_date' => '2000-12-16',
                       'price' => '5.95',
                       'title' => 'Midnight Rain',
                       'author' => 'Ralls, Kim',
                       'genre' => 'Fantasy'
                     },
          'bk112' => {
                       'publish_date' => '2001-04-16',
                       'price' => '49.95',
                       'title' => 'Visual Studio 7: A Comprehensive Guide',
                       'author' => 'Galos, Mike',
                       'genre' => 'Computer'
                     },
          'bk106' => {
                       'publish_date' => '2000-09-02',
                       'price' => '4.95',
                       'title' => 'Lover Birds',
                       'author' => 'Randall, Cynthia',
                       'genre' => 'Romance'
                     },
          'bk107' => {
                       'publish_date' => '2000-11-02',
                       'price' => '4.95',
                       'title' => 'Splish Splash',
                       'author' => 'Thurman, Paula',
                       'genre' => 'Romance'
                     },
          'bk103' => {
                       'publish_date' => '2000-11-17',
                       'price' => '5.95',
                       'title' => 'Maeve Ascendant',
                       'author' => 'Corets, Eva',
                       'genre' => 'Fantasy'
                     },
          'bk104' => {
                       'publish_date' => '2001-03-10',
                       'price' => '5.95',
                       'title' => 'Oberon\'s Legacy',
                       'author' => 'Corets, Eva',
                       'genre' => 'Fantasy'
                     },
          'bk109' => {
                       'publish_date' => '2000-11-02',
                       'price' => '6.95',
                       'title' => 'Paradox Lost',
                       'author' => 'Kress, Peter',
                       'genre' => 'Science Fiction'
                     },
          'bk101' => {
                       'publish_date' => '2000-10-01',
                       'price' => '44.95',
                       'title' => 'XML Developer\'s Guide',
                       'author' => 'Gambardella, Matthew',
                       'genre' => 'Computer'
                     },
          'bk110' => {
                       'publish_date' => '2000-12-09',
                       'price' => '36.95',
                       'title' => 'Microsoft .NET: The Programming Bible',
                       'author' => 'O\'Brien, Tim',
                       'genre' => 'Computer'
                     }
        };
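Since the original question asked about arrays, here is one possible way to pull plain arrays back out of a hash-of-hashes like the one dumped above. Treat it as a sketch: the %books literal holds just three entries copied from that output so the snippet runs on its own, and the array names (@ids, @titles, @prices, @fantasy) are made up for illustration.

```perl
use strict;
use warnings;

# Three entries copied from the Dumper output; the real hash would come from the parse
my %books = (
    bk101 => {
        author       => 'Gambardella, Matthew',
        title        => q{XML Developer's Guide},
        genre        => 'Computer',
        price        => '44.95',
        publish_date => '2000-10-01',
    },
    bk102 => {
        author       => 'Ralls, Kim',
        title        => 'Midnight Rain',
        genre        => 'Fantasy',
        price        => '5.95',
        publish_date => '2000-12-16',
    },
    bk103 => {
        author       => 'Corets, Eva',
        title        => 'Maeve Ascendant',
        genre        => 'Fantasy',
        price        => '5.95',
        publish_date => '2000-11-17',
    },
);

# Hash keys come back in no particular order, so sort the ids first
my @ids = sort keys %books;

# Parallel arrays: element N of each array belongs to book $ids[N]
my @titles = map { $books{$_}{title} } @ids;
my @prices = map { $books{$_}{price} } @ids;

# Ids of the books in one genre
my @fantasy = grep { $books{$_}{genre} eq 'Fantasy' } @ids;

print "$ids[$_]: $titles[$_] ($prices[$_])\n" for 0 .. $#ids;
print "Fantasy: @fantasy\n";
```

Sorting the keys is what keeps the arrays lined up with each other. Without it, the order can differ from run to run, as the unordered Dumper output shows.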

In reply to Re: Perl XML by toolic
in thread Perl XML by mecrazycoder
