My first instinct was to suggest an XML parsing module. However, it looks like your data might not strictly follow XML formatting rules (see Khen1950fx's comment), so I decided to step into the trap of rolling my own code. The trap is that the code rests on assumptions drawn from your "scrubbed" data, and those assumptions may not hold for the real file.

Below is the code that I came up with and the output that it produced. Although you didn't ask for path information, it seemed like a natural next step, which is why I went ahead and added it into the code.

Code:

use strict;
use warnings;
use Data::Dumper;

my $file = "data.xml";
my @path;

open(XML,"<".$file) || die "Unable to open file '$file': $!\n";

while (<XML>) {
    if (/<ncp_directory/i) {
        my ($dir) = (/name=\"(.+?)\"/i);
        push @path, ($dir);
    }
    if (/<\/ncp_directory/i) {pop @path;}
    if (/ncp_file/i) {
        my ($fname) = (/name=\"(.+?)\"/i);
        my ($md5) = (/md5=\"(.+?)\"/i);
        my $full_path = join("\\",(@path),$fname);
        print "$md5 Name = $fname\n";
        print "(Full path = $full_path)\n\n";
    }
}

close(XML);

Output:

27A8QATED9I2Ox8F65OGEPPDCIV Name = cpwmon2k.dll
(Full path = $dir1\CutePDFWriter\cpwmon2k.dll)

5F7UGLCH9K3GKxBNML1LM0G3RNL Name = gsdll32.dll
(Full path = $dir1\CutePDFWriter\converter\GPLGS\gsdll32.dll)

7EDJ7V7QHMBQ1x6HLC54FG0OP6T Name = a010013l.pfb
(Full path = $dir1\CutePDFWriter\converter\GPLGS\a010013l.pfb)

E61K8P45E8D81x3T3E47C8QIP0U Name = GSSetup.exe
(Full path = $dir1\CutePDFWriter\converter\GPLGS\GSSetup.exe)

D6VRKCQ4IFOSTxCLRJ9GHN6KR6J Name = ICONLIB.DLL
(Full path = $dir1\CutePDFWriter\converter\Driver\ICONLIB.DLL)

9EJQCU5IT5H58xBCG8GIT8PQEBS Name = PS5UI.DLL
(Full path = $dir1\CutePDFWriter\converter\Driver\PS5UI.DLL)

3REVK8VG65NGUx88P61CUHUN603 Name = PS5UI.DLL
(Full path = $dir1\CutePDFWriter\converter\Driver\x64\PS5UI.DLL)

BIAA93SK0Q9VRxBT7BCG7U4L2F0 Name = PSCRIPT5.DLL
(Full path = $dir1\CutePDFWriter\converter\Driver\x64\PSCRIPT5.DLL)

1QBHH0SQPIJ2Cx1REJOP1QAUJJK Name = CUTEPDFW.PPD
(Full path = $dir1\CutePDFWriter\converter\Driver\CUTEPDFW.PPD)

9M25IA0L5NFKNx60S9M36K0FA6U Name = Cutepdfw.spd
(Full path = $dir1\CutePDFWriter\converter\Driver\Cutepdfw.spd)

5CD14FVPVPV2TxFDKG6C6OND5U1 Name = CPWSave.exe
(Full path = $dir1\CutePDFWriter\converter\CPWSave.exe)

DKARE24NS4V4AxCM8QQ89CSDFDI Name = install.bat
(Full path = $dir1\CutePDFWriter\converter\install.bat)
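To see the directory-stack idea in isolation, here is a minimal sketch that runs against made-up sample data instead of a file. The sample input, the sub name collect_files, and the one-tag-per-line layout are all assumptions for illustration; only the ncp_directory and ncp_file tag names and the name and md5 attributes come from the code above. It also skips lines where an attribute is missing, which the code above does not do.

```perl
use strict;
use warnings;

# Hypothetical sample input: one tag per line, as the code above assumes.
my $sample = <<'XML';
<ncp_directory name="top">
<ncp_file name="a.dll" md5="AAA"/>
<ncp_directory name="sub">
<ncp_file name="b.dll" md5="BBB"/>
</ncp_directory>
<ncp_file name="c.exe" md5="CCC"/>
</ncp_directory>
XML

# collect_files is a hypothetical helper: it walks the text line by line,
# keeping the currently open directories on a stack (@path).
sub collect_files {
    my ($text) = @_;
    my (@path, @found);
    for my $line (split /\n/, $text) {
        # Opening directory tag: push its name onto the stack.
        if ($line =~ /<ncp_directory\b/i) {
            my ($dir) = $line =~ /\bname="(.+?)"/i;
            push @path, defined $dir ? $dir : '';
        }
        # Closing directory tag: leave the innermost directory.
        pop @path if $line =~ m{</ncp_directory}i;
        # File tag: record the md5 and the path built from the stack.
        if ($line =~ /<ncp_file\b/i) {
            my ($fname) = $line =~ /\bname="(.+?)"/i;
            my ($md5)   = $line =~ /\bmd5="(.+?)"/i;
            next unless defined $fname && defined $md5;
            push @found, { md5 => $md5, path => join('\\', @path, $fname) };
        }
    }
    return @found;
}

print "$_->{md5}  $_->{path}\n" for collect_files($sample);
```

With the sample above this prints AAA for top\a.dll, BBB for top\sub\b.dll, and CCC for top\c.exe, which shows the stack shrinking again after the closing tag of "sub".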

In reply to Re^3: Having problems accessing individual attributes in xml by dasgar
in thread Having problems accessing individual attributes in xml by Gemenon
