PerlMonks  

Question: parse info in a Zipped XML document attached in an email

by lihao (Monk)
on Nov 19, 2008 at 22:18 UTC ( [id://724746]=perlquestion )

lihao has asked for the wisdom of the Perl Monks concerning the following question:

The raw ZIP file (which contains only one XML source file) is an email attachment that I can decode into a Perl string, $zipdata, using Mail::POP3Client and MIME::Base64. From there, I need to uncompress $zipdata into an XML stream and use XML::Simple to parse out the information I need. My questions are:

  • Which Perl modules are recommended for unzipping a standard ZIP stream held in a Perl scalar? Do I have to use a temporary file?
  • To parse the XML file in the ZIP stream, it seems that I need to know its filename, e.g. for Archive::Zip::MemberRead. But the XML filename contains arbitrary numbers that differ between emails and that I do not know beforehand. How should I handle this? :-)

I know I can save $zipdata to a ZIP file, use the Linux unzip command to uncompress it, check the filename of the XML document, and then go back to Perl to parse it. But how can I do all of this within Perl? Any suggestions?

Many thanks

lihao

Replies are listed 'Best First'.
Re: Question: parse info in a Zipped XML document attached in an email
by GrandFather (Saint) on Nov 19, 2008 at 22:46 UTC

    You can use Archive::Zip's member manipulation to obtain a list of files:

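    use strict;
    use warnings;
    use Archive::Zip qw(:ERROR_CODES);

    my $zippath = 'wibble.zip';
    my $zip     = Archive::Zip->new();

    # read() returns AZ_OK on success
    $zip->read($zippath) == AZ_OK
        or die "Cannot read $zippath\n";

    # List every member name in the archive
    my @members = $zip->members();
    print $_->fileName(), "\n" for @members;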

    Perl reduces RSI - it saves typing
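
Building on the member listing above, here is a minimal sketch of picking out the XML member when its exact name is not known in advance. It assumes the archive holds exactly one file ending in .xml; the archive is built in memory with addString() only to keep the example self-contained, and the member name and XML content are made up for illustration.

```perl
use strict;
use warnings;
use Archive::Zip;

# Build a small archive in memory so the sketch is self-contained.
# The member name and XML content are hypothetical examples.
my $zip = Archive::Zip->new();
$zip->addString('<report><id>4711</id></report>', 'report_20081119_4711.xml');

# Select members by pattern instead of by exact name.
my @xml_members = $zip->membersMatching('\.xml$');
die "expected exactly one XML member\n" unless @xml_members == 1;

my $member = $xml_members[0];
my $name   = $member->fileName();

# In scalar context, contents() returns just the uncompressed data.
my $xml = $member->contents();

print "found: $name\n";
print "$xml\n";
```

For an archive read from a file or file handle, the same membersMatching() and contents() calls should apply once the read has succeeded.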
Re: Question: parse info in a Zipped XML document attached in an email
by ig (Vicar) on Nov 20, 2008 at 19:49 UTC
    do I have to use a temporary file??

    If your zip data is small enough to fit in memory, then you do not have to use a temporary file. Archive::Zip supports reading from a file handle, but the file handle must be seekable.

    Since perl 5.8.0, one can open a file handle on an in-memory string. Unfortunately, the resulting file handle is not compatible with Archive::Zip.

    But IO::String gives compatible file handles, so you can use something like the following.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Archive::Zip;
    use IO::String;

    my $zipfile = "test.zip";

    # Slurp the raw ZIP bytes (stands in for the decoded attachment)
    open(my $fh, "<", $zipfile) or die "$zipfile: $!";
    binmode($fh);
    my $zipdata = do { local $/; <$fh> };
    close($fh);

    # Wrap the string in a seekable handle that Archive::Zip accepts
    my $io     = IO::String->new($zipdata);
    my $zip    = Archive::Zip->new();
    my $status = $zip->readFromFileHandle($io);
    print "\$status = $status\n";

    foreach my $member ($zip->members()) {
        print "member: " . $member->fileName() . "\n";
    }
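
As an alternative to the Archive::Zip approach in this thread (a different technique, not what the replies above use), IO::Uncompress::Unzip can read a ZIP directly from a scalar reference and opens the first member without needing its name. This is a rough sketch assuming a single-member archive; the test data is created with IO::Compress::Zip only so the example runs on its own, and the member name is hypothetical.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Stand-in for the decoded attachment: build a one-member ZIP in memory.
# The member name and XML content are hypothetical examples.
my $source = '<report><id>4711</id></report>';
my $zipdata;
zip(\$source => \$zipdata, Name => 'report_20081119_4711.xml')
    or die "zip failed: $ZipError\n";

# Open the archive straight from the scalar; the first member is
# selected without having to know its name beforehand.
my $unzip = IO::Uncompress::Unzip->new(\$zipdata)
    or die "unzip failed: $UnzipError\n";

# The member name is available from the header if it is needed.
my $name = $unzip->getHeaderInfo()->{Name};

# Slurp the uncompressed member into a string.
my $xml = do { local $/; <$unzip> };
$unzip->close();

print "member: $name\n";
print "$xml\n";
```

The resulting $xml string could then be handed to XML::Simple's XMLin(), which accepts a string of XML as well as a filename.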
