Re: Parsing pseudo XML files

Not elegant, but kinda cute. It may work, depending on how strictly your data conforms to this format (minus the typo you had at reference1_title).

use strict;
use warnings;
use Data::Dumper;

my $line = qq(<Reference1> <reference1_name>jvdsj</reference1_name> <reference1_address>1234 gjrdkjpigkdj jkgpifodsjgi</reference1_address> <reference1_title>njhdaslj</reference1_title> <reference1_company> jhfdsalh</reference1_company> <reference1_csz>Los Angeles, CA,91406</reference1_csz><reference1_phone> 818-555-1212</reference1_phone> <reference1_email>wabbit\@acme.com</reference1_email></Reference1>);


my %record = reverse split /<\/(\w+)>/, $line;
foreach (keys %record) {
    $record{$_} =~ s/<[^>]+>//g; # remove start tags
    $record{$_} =~ s/^\s+//; # remove extra whitespace
    $record{$_} =~ s/\s+$//;
    delete $record{$_} unless length $record{$_}; # kill the outermost record tag (empty value); keeps a literal "0"
}

print Dumper(\%record);


PRINTS (key order may vary, since Perl hashes are unordered):
$VAR1 = {
          'reference1_name' => 'jvdsj',
          'reference1_address' => '1234 gjrdkjpigkdj jkgpifodsjgi',
          'reference1_title' => 'njhdaslj',
          'reference1_company' => 'jhfdsalh',
          'reference1_csz' => 'Los Angeles, CA,91406',
          'reference1_phone' => '818-555-1212',
          'reference1_email' => 'wabbit@acme.com'
        };
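For anyone wondering why the `reverse split` line works, here is a minimal sketch on a made-up two-field record (the `<rec>`, `<a>` and `<b>` tags are hypothetical, not the original data). It only illustrates the mechanism: a capturing group in the split pattern makes split return the captured closing-tag names in between the text chunks, so the list alternates value, key, and `reverse` flips that into the key, value order a hash assignment wants.

```perl
use strict;
use warnings;

# Hypothetical two-field sample, shaped like the record above.
my $sample = '<rec> <a>one</a> <b>two</b></rec>';

# Splitting on closing tags with a capture group keeps the tag names:
# ('<rec> <a>one', 'a', ' <b>two', 'b', '', 'rec')
my @parts = split /<\/(\w+)>/, $sample;

# The list runs value, key, value, key...; reversing it gives
# key, value pairs suitable for a hash.
my %record = reverse @parts;

# Values still carry their start tags at this point.
print "$_ => [$record{$_}]\n" for sort keys %record;
```

Two caveats follow from this, as far as I can tell. split only drops trailing empty fields, so anything left after the last closing tag (a newline you forgot to chomp, say) adds one extra element and shifts every key/value pair by one. And if a tag name repeats, the first occurrence in the line is the one that survives, because it comes last in the reversed list.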

BTW, sorry about the unnecessary 'consider' for code tags -- I need more coffee.

Updated: Thanks, ar0n. Blame it on the coffee again.

(ar0n) Re (2): Parsing pseudo XML files by ar0n (Priest) on Dec 17, 2001 at 23:48 UTC
It'd probably be more efficient to write this: `$record{$_} =~ s/<.*?>//g; # remove start tag` as this: `$record{$_} =~ s/<[^>]+>//g;` since `[^>]` will match anything but a '>', so the regex engine doesn't have to backtrack. [ ar0n -- want job (boston) ]
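If you'd rather measure than take that on faith, here is a rough sketch using the core Benchmark module. The sample value is hypothetical, and on strings this short the difference may well be negligible, so treat the numbers as specific to your own data and perl build.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample, shaped like one value before cleanup.
my $value = ' <reference1_name>jvdsj';

# Both patterns should strip the start tag the same way here.
(my $lazy    = $value) =~ s/<.*?>//g;
(my $negated = $value) =~ s/<[^>]+>//g;
print "same result\n" if $lazy eq $negated;

# Run each substitution for about one CPU second and compare rates.
cmpthese(-1, {
    lazy    => sub { (my $v = $value) =~ s/<.*?>//g },
    negated => sub { (my $v = $value) =~ s/<[^>]+>//g },
});
```

The two patterns are not strictly interchangeable: `<.*?>` also matches an empty `<>`, while `<[^>]+>` will match a tag that spans a newline, which `.` does not do without the /s modifier.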