in reply to Parsing pseudo XML files

Not elegant, but kinda cute. It may work, depending on how strictly your data conforms to this format (minus the typo you had at reference1_title).
    use Data::Dumper;

    my $line = qq(<Reference1> <reference1_name>jvdsj</reference1_name> <reference1_address>1234 gjrdkjpigkdj jkgpifodsjgi</reference1_address> <reference1_title>njhdaslj</reference1_title> <reference1_company> jhfdsalh</reference1_company> <reference1_csz>Los Angeles, CA,91406</reference1_csz><reference1_phone> 818-555-1212</reference1_phone> <reference1_email>wabbit\@acme.com</reference1_email></Reference1>);

    my %record = reverse split /<\/(\w+)>/, $line;

    foreach (keys %record) {
        $record{$_} =~ s/<[^>]+>//g;   # remove start tags
        $record{$_} =~ s/^\s+//;       # remove extra whitespace
        $record{$_} =~ s/\s+$//;
        delete $record{$_} unless $record{$_};   # kill the outermost record tag
    }

    print Dumper(\%record);

PRINTS:

    $VAR1 = {
        'reference1_name' => 'jvdsj',
        'reference1_address' => '1234 gjrdkjpigkdj jkgpifodsjgi',
        'reference1_title' => 'njhdaslj',
        'reference1_company' => 'jhfdsalh',
        'reference1_csz' => 'Los Angeles, CA,91406',
        'reference1_phone' => '818-555-1212',
        'reference1_email' => 'wabbit@acme.com'
    };
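
If this needs to run over more than one record, the same split/reverse trick could be wrapped in a sub, roughly as sketched here. The name parse_record and the sample input are made up for illustration and are not part of the original snippet. The one behavioral change is testing with length instead of truthiness, so a field whose value is a literal 0 is not deleted along with the empty outer tag.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): the same split/reverse
# trick, wrapped in a sub that returns a hash reference.
sub parse_record {
    my ($line) = @_;

    # Split on closing tags and capture the tag name. The resulting list
    # alternates (text, tag, text, tag, ...), so reversing it yields
    # tag => text pairs suitable for a hash.
    my %record = reverse split /<\/(\w+)>/, $line;

    foreach my $key (keys %record) {
        $record{$key} =~ s/<[^>]+>//g;   # remove start tags
        $record{$key} =~ s/^\s+//;       # trim leading whitespace
        $record{$key} =~ s/\s+$//;       # trim trailing whitespace
        # drop empty entries (the outermost tag), but keep a literal "0"
        delete $record{$key} unless length $record{$key};
    }
    return \%record;
}

my $rec = parse_record('<Ref> <name>jvdsj</name> <count> 0 </count></Ref>');
print "$_ => $rec->{$_}\n" for sort keys %$rec;
```

One caveat on the "how strictly your data conforms" point: the trick relies on the string ending exactly at the last closing tag. Any trailing text after it, even a newline, leaves an odd number of elements in the list and the tag/value pairs no longer line up, so it is probably worth trimming or chomping the input first.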


BTW, sorry about the unnecessary 'consider' for code tags -- I need more coffee.

Updated: Thanks, ar0n. Blame it on the coffee again.

(ar0n) Re (2): Parsing pseudo XML files
by ar0n (Priest) on Dec 17, 2001 at 23:48 UTC
    It'd probably be more efficient to write this
    $record{$_} =~ s/<.*?>//g; # remove start tag
    as this:
    $record{$_} =~ s/<[^>]+>//g;
    since [^>] matches any character except '>', so the regex engine doesn't have to backtrack.
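
A rough way to check this on your own data is sketched below, using the core Benchmark module. The sample string is made up for illustration, and the timings will vary by Perl version and input, so treat it as a starting point rather than a result. It also shows one behavioral difference between the two patterns: '.' does not match a newline without /s, so the lazy version leaves a tag alone if it wraps across lines.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data is made up for illustration.
my $text = '<reference1_name>jvdsj</reference1_name> ' x 1000;

# On single-line tags both patterns strip the same things.
(my $lazy    = $text) =~ s/<.*?>//g;
(my $negated = $text) =~ s/<[^>]+>//g;
print "same result on single-line tags\n" if $lazy eq $negated;

# They differ when a tag contains a newline: '.' does not match "\n"
# without the /s modifier, while [^>] does.
my $wrapped = "<tag\nattr>x";
(my $lazy2    = $wrapped) =~ s/<.*?>//g;
(my $negated2 = $wrapped) =~ s/<[^>]+>//g;
print "lazy pattern left the wrapped tag in place\n" if $lazy2 eq $wrapped;
print "negated class removed the wrapped tag\n"      if $negated2 eq 'x';

# Rough timing comparison: run each sub for about one CPU second.
cmpthese(-1, {
    lazy    => sub { (my $copy = $text) =~ s/<.*?>//g },
    negated => sub { (my $copy = $text) =~ s/<[^>]+>//g },
});
```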

    [ ar0n -- want job (boston) ]