in reply to Matching specific strings (are subs a good idea?)

Thank you for the tips so far guys, I can definitely stop feeling stuck now :-)

When posting this question I forgot that adding some snippets of the files used would help me get more specific advice, so here they are.

What the .xml files look like:

<contractor customergroup="PAR">
  <person>
    <firstname>Julie</firstname>
    <name>Keppen</name>
    <sex>F</sex>
    <birthdate></birthdate>
    <phone>0485651115</phone>
    <email>juliekeppen@gmail.com</email>
  </person>
</contractor>

What the .txt-file looks like:

FirstName Julie LastName Keppen Email juliekeppen@gmail.com BirthDate 1987-11-11
FirstName Amaury LastName Reinquin Email a.reinquin@outlook.com BirthDate 1991-08-24
FirstName Pierre LastName Vaucamps Email pierre.vaucamps@gmail.com BirthDate 1988-11-26
FirstName Stephanie-Katrien LastName Eggermont-Witpas Email eggermont.st@gmail.com BirthDate 1900-01-01

I have just changed the query to output this result in a more readable way:

Julie;Keppen;juliekeppen@gmail.com;1987-11-11
Amaury;Reinquin;a.reinquin@outlook.com;1991-08-24
Pierre;Vaucamps;pierre.vaucamps@gmail.com;1988-11-26
Stephanie-Katrien;Eggermont-Witpas;eggermont.st@gmail.com;1900-01-01

My gut tells me this last version might be easier when working with hashes.

I have looked up material on hashes, but it is not clear to me how to use them in this assignment. Every time I read something on hashes, the first thing they do is store specific strings in them, like this:

%HoA = (
    flintstones => [ "fred", "barney" ],
    jetsons     => [ "george", "jane", "elroy" ],
    simpsons    => [ "homer", "marge", "bart" ],
);

This is not very helpful to me: I'm not going to type in all the names manually. If I did that, I might as well look up the birthdates by hand and copy-paste them into the .xml files. But of course, I know I'm missing something essential in the explanations given about hashes. I just don't know what.
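The piece that tutorials tend to skip is that a hash does not have to be written out by hand; it can be filled inside a loop while the records are read. The following is only a minimal sketch, assuming the semicolon-separated format shown above and assuming the email address is a usable key. The names `@lines` and `%birthdate_for` are made up for illustration, and the two inline records stand in for the lines that would really come from the .txt file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample records in the semicolon-separated format shown above.
# In a real script these lines would be read from the .txt file.
my @lines = (
    'Julie;Keppen;juliekeppen@gmail.com;1987-11-11',
    'Amaury;Reinquin;a.reinquin@outlook.com;1991-08-24',
);

# Fill the hash while looping over the records: email => birthdate.
my %birthdate_for;
for my $line (@lines) {
    my ($first, $last, $email, $bdate) = split /;/, $line;
    $birthdate_for{$email} = $bdate;
}

# Look up a birthdate using the email found in an .xml file.
my $email = 'juliekeppen@gmail.com';
if (exists $birthdate_for{$email}) {
    print "$email was born on $birthdate_for{$email}\n";
}
```

Once the hash is filled this way, each .xml file only needs one lookup by its email value; nothing is typed in manually.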

Replies are listed 'Best First'.
Re^2: Matching specific strings (are subs a good idea?)
by choroba (Cardinal) on Sep 08, 2017 at 08:23 UTC
    This works for the data you posted:
    #!/usr/bin/perl
    use warnings;
    use strict;

    use XML::LibXML;

    my $text_file = shift;
    open my $In, '<', $text_file or die $!;

    my $extract = join '\s+(.*?)\s*',
                  qw( FirstName LastName Email BirthDate $ );
    $extract = qr/$extract/;

    my %birthday;
    while (<$In>) {
        my ($first, $last, $email, $bdate) = /$extract/;
        $birthday{$email} = $bdate;
    }

    while (my $xml_file = shift) {
        my $dom = 'XML::LibXML'->load_xml(location => $xml_file);
        my $email = $dom->findvalue('/contractor/person/email');
        unless (exists $birthday{$email}) {
            warn "No birthday for $email!\n";
            next
        }
        my $person = $dom->find('/contractor/person');
        my $bday = $person->[0]->addChild($dom->createElement('birthday'));
        $bday->appendText($birthday{$email});
        $dom->toFile($xml_file);
    }

    See XML::LibXML for documentation.
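The `join` in the script above builds one pattern out of the field labels, which can be hard to read at first glance. Here is a small sketch of what it expands to and what it captures, using the first record from the question. The variable names `$pattern` and `$record` are illustrative only.

```perl
use strict;
use warnings;

# Join the field labels with a capture group between each pair.
# The trailing '$' anchors the last capture at the end of the line.
my $pattern = join '\s+(.*?)\s*', qw( FirstName LastName Email BirthDate $ );
print "$pattern\n";

# One record in the original (non-semicolon) text format.
my $record = 'FirstName Julie LastName Keppen Email juliekeppen@gmail.com BirthDate 1987-11-11';

# In list context the match returns the four captured fields.
my ($first, $last, $email, $bdate) = $record =~ /$pattern/;
print join(';', $first, $last, $email, $bdate), "\n";
```

The second line printed should be the same semicolon-separated form as in the question, so either input format can feed the same hash.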

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re^2: Matching specific strings (are subs a good idea?)
by poj (Abbot) on Sep 08, 2017 at 08:11 UTC

    An example using XML::Twig

    #!/usr/bin/perl
    use strict;
    use autodie;
    use XML::Twig;
    use Data::Dumper;
    $Data::Dumper::Terse = 1;
    use constant DEBUG => 1;

    my $base = 'D:/Some/Specific/Folder';

    # birthdate lookup table
    my $oldqr = 'c:/temp/oldqr.txt';
    my $lookup = fetch_birthdates($oldqr);
    print '$lookup=',Dumper $lookup if DEBUG;

    # process XML files
    #my @files = glob( $base.'/ToUpload/Staging/*' );
    my @files = ('test.xml');
    print '@files=',Dumper \@files if DEBUG;

    for my $file (@files){
      add_birthdate($file,$lookup);
    }

    # build birthdate lookup table
    sub fetch_birthdates {
      my $infile = shift;
      my %hash = ();
      my $count = 0;

      open IN,'<',$infile; # autodie
      while (<IN>){
        ++$count;
        chomp;
        # old format
        #if (my @f = /FirstName(.*)LastName(.*)Email(.*)BirthDate(.*)/i){
        #  s/^\s+|\s+$//g for @f; # trim spaces
          my @f = split ';',$_; # new format
          my $pk = join ';',@f[0..2]; # create lookup key
          if (exists $hash{$pk}){
            die "ERROR Duplicate record in $infile '$pk' line $count\n";
          } else {
            $hash{$pk} = $f[3];
          }
        #}
      }
      close IN;
      print "$count records read from $infile\n";
      return \%hash;
    }

    # add birth date from lookup
    # using firstname;name;email as key
    sub add_birthdate {
      my ($file,$lookup) = @_;

      my $twig = XML::Twig->new( pretty_print => 'indented' );
      $twig->parsefile( $file );
      my ($person) = $twig->findnodes( 'person' );

      my $pk = join ";",
        $person->findvalue('firstname'),
        $person->findvalue('name'),
        $person->findvalue('email');

      print "Reading $file : ";
      if (exists $lookup->{$pk}){
        my $birthdate = $lookup->{$pk};
        my $node = $person->first_child( 'birthdate' );
        $node->set_text( $birthdate );
        print "$birthdate added for '$pk'\n";

        open my $out,'>',$file.'.modified'; #autodie
        $twig->print($out);
        close $out;
      } else {
        print "ERROR - no birthdate for '$pk'\n";
      }
    }
    poj
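The script above keys the lookup table on `firstname;name;email` rather than on the email alone. The sketch below isolates just that step, with no file or XML handling. The `%person` hash and its values are stand-ins (an assumption for illustration) for what would be read from one .xml file.

```perl
use strict;
use warnings;

# Lookup table keyed on "firstname;name;email", held as a hash reference.
my $lookup = {
    'Julie;Keppen;juliekeppen@gmail.com' => '1987-11-11',
};

# Stand-in for the values read from one .xml file.
my %person = (
    firstname => 'Julie',
    name      => 'Keppen',
    email     => 'juliekeppen@gmail.com',
);

# Build the same composite key from a hash slice and look it up.
my $pk = join ';', @person{qw( firstname name email )};
my $result = exists $lookup->{$pk} ? $lookup->{$pk} : 'no birthdate found';
print "$pk => $result\n";
```

A composite key of this kind only matches when all three fields agree exactly, so a spelling difference in a name between the two sources would cause a miss even though the email is the same.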