tousifp has asked for the wisdom of the Perl Monks concerning the following question:

Here's a simple question:

I have an XML file in which the following block repeats:

<event rev="1.2">
<date>2014-01-10-07:59:24.439+05:30I-----</date>
<outcome status="0">0</outcome>
<originator blade="webseald" instance="default"><component rev="1.4">authn</component>
<event_id>101</event_id>
<action>0</action>
<location>PosIntWebSeal1prod</location>
</originator>
<accessor name="">
<principal auth="IV_LDAP_V3.0" domain="Default">goldytelecom</principal>
<name_in_rgy>uid=GOLDYTELECOM,cn=external,cn=Users,o=vodafone,c=in</name_in_rgy><session_id>05262372-799f-11e3-96d8-00145ee78c6d</session_id><user_location>10.77.50.58< /user_location><user_location_type>IPV4</user_location_type></accessor><target resource="7"><object></object></target>
<authntype>formsPassword</authntype><data> </data>
</event>

I want to strip only the XML tags and keep the data between them. Kindly help, monks!

Re: XML to CSV conversion
by choroba (Cardinal) on Jan 14, 2014 at 09:50 UTC
    In XML::XSH2, a wrapper around XML::LibXML:
    open file.xml ; echo //text() ;

    I had to fix your XML to be well formed: "se-ssion" should not contain the hyphen and "/user_location" should not start on a new line.

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Sorry .. I didn't get your point. What are you trying to echo?

        All the text without tags. It is returned as a nodeset, so you can iterate over it and do whatever else with it.
        for my $t in //text() { perl { chomp $t; print "$t," } }
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
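    For readers without XML::XSH2, the same `//text()` idea can be sketched directly in XML::LibXML. This is only an illustration: the helper name `text_values` is made up for this sketch, the sample is an abbreviated copy of the event from the question with the markup already repaired, and whitespace-only text nodes are skipped on the assumption that they are just formatting.

```perl
use strict;
use warnings;
use XML::LibXML;

# Return every non-blank text node of an XML string, in document order.
# (text_values is a hypothetical helper name, not part of XML::LibXML.)
sub text_values {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my @values;
    for my $node ($doc->findnodes('//text()')) {
        my $text = $node->data;
        $text =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
        push @values, $text if length $text;  # skip whitespace-only nodes
    }
    return @values;
}

# Abbreviated from the event in the question, with the markup repaired.
my $sample = <<'XML';
<event rev="1.2">
<date>2014-01-10-07:59:24.439+05:30I-----</date>
<outcome status="0">0</outcome>
<originator blade="webseald" instance="default"><component rev="1.4">authn</component>
<event_id>101</event_id>
<location>PosIntWebSeal1prod</location>
</originator>
</event>
XML

print join(',', text_values($sample)), "\n";
```

    Joining with a bare comma is only safe while no value contains a comma itself; the `name_in_rgy` value in the full event does, so real CSV output would need quoting.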
Re: XML to CSV conversion
by hdb (Monsignor) on Jan 14, 2014 at 09:53 UTC

    The following translates your xml into a Perl structure that you can extract the data from and format as desired:

    use strict;
    use warnings;
    use Data::Dumper;
    use XML::Simple;

    $/ = undef;
    my $xml = <DATA>;
    my $ref = XMLin $xml;
    print Dumper $ref;

    __DATA__
    <event rev="1.2">
    <date>2014-01-10-07:59:24.439+05:30I-----</date>
    <outcome status="0">0</outcome>
    <originator blade="webseald" instance="default"><component rev="1.4">authn</component>
    <event_id>101</event_id>
    <action>0</action>
    <location>PosIntWebSeal1prod</location>
    </originator>
    <accessor name="">
    <principal auth="IV_LDAP_V3.0" domain="Default">goldytelecom</principal>
    <name_in_rgy>uid=GOLDYTELECOM,cn=external,cn=Users,o=vodafone,c=in</name_in_rgy><session_id>05262372-799f-11e3-96d8-00145ee78c6d
    </session_id><user_location>10.77.50.58</user_location><user_location_type>IPV4</user_location_type>
    </accessor><target resource="7"><object></object></target>
    <authntype>formsPassword</authntype><data> </data>
    </event>

      Thanks hdb!!! But is there any other alternative to this? Because the output is a bit lengthy!

        Well, you need to pick the pieces you want, like:

        print $ref->{originator}{location},"\n";
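        Picking several pieces at once could look like the sketch below. The helper name `event_fields` and the choice of fields are assumptions made for illustration. `KeyAttr => []` is passed so that XML::Simple does not fold anything on `name` attributes, and an element that carries attributes as well as text (such as `principal`) keeps its text under the `content` key.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Pull a handful of values out of one <event> string.
# (event_fields is a hypothetical helper; the field list is just an example.)
sub event_fields {
    my ($xml) = @_;
    # KeyAttr => [] disables folding on name/key/id attributes,
    # ForceArray => 0 keeps single child elements as plain scalars or hashes.
    my $ref = XMLin($xml, KeyAttr => [], ForceArray => 0);
    return (
        $ref->{date},
        $ref->{originator}{location},
        $ref->{originator}{event_id},
        $ref->{accessor}{principal}{content},  # element with attributes: text is under 'content'
        $ref->{accessor}{user_location},
        $ref->{authntype},
    );
}

# Abbreviated from the event in the question, with the markup repaired.
my $event = <<'XML';
<event rev="1.2">
<date>2014-01-10-07:59:24.439+05:30I-----</date>
<originator blade="webseald" instance="default">
<event_id>101</event_id>
<location>PosIntWebSeal1prod</location>
</originator>
<accessor name="">
<principal auth="IV_LDAP_V3.0" domain="Default">goldytelecom</principal>
<user_location>10.77.50.58</user_location>
</accessor>
<authntype>formsPassword</authntype>
</event>
XML

print join(',', event_fields($event)), "\n";
```

        Note that the XML::Simple documentation itself discourages using the module in new code, so this is best treated as a quick one-off.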
Re: XML to CSV conversion
by Discipulus (Canon) on Jan 14, 2014 at 10:54 UTC
    ..or you can even choose to use XML::Twig...
    use warnings;
    use strict;
    use XML::Twig;

    my $t = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            '_all_' => sub { print $_->text, "\n" },  # or print $_->text, " " as you prefer
        },
    );

    $/ = '';
    $t->parse(<DATA>);

    __DATA__
    <event rev="1.2">
    <date>2014-01-10-07:59:24.439+05:30I-----</date>
    <outcome status="0">0</outcome>
    <originator blade="webseald" instance="default"><component rev="1.4">authn</component>
    <event_id>101</event_id>
    <action>0</action>
    <location>PosIntWebSeal1prod</location>
    </originator>
    <accessor name="">
    <principal auth="IV_LDAP_V3.0" domain="Default">goldytelecom</principal>
    <name_in_rgy>uid=GOLDYTELECOM,cn=external,cn=Users,o=vodafone,c=in</name_in_rgy><session_id>05262372-799f-11e3-96d8-00145ee78c6d
    </session_id><user_location>10.77.50.58</user_location><user_location_type>IPV4</user_location_type>
    </accessor><target resource="7"><object></object></target>
    <authntype>formsPassword</authntype><data> </data>
    </event>
    Hth
    L*

    PS now we wait for Jenda's XML::Rules solution.. ;=)
    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
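    Because an `_all_` handler also fires for container elements such as `originator` and `event`, whose text includes the text of their children, the same values get printed more than once. A possible variation is to handle only `event` and pick named children. The sketch below assumes the repeated events sit under a single root element (called `log` here purely for illustration); `event_summaries` and the chosen fields are likewise made-up names for this example.

```perl
use strict;
use warnings;
use XML::Twig;

# Collect one array ref of selected fields per <event>.
# (event_summaries is a hypothetical helper; the field list is just an example.)
sub event_summaries {
    my ($xml) = @_;
    my @rows;
    my $twig = XML::Twig->new(
        twig_handlers => {
            event => sub {
                my ($t, $event) = @_;
                my $originator = $event->first_child('originator');
                my $accessor   = $event->first_child('accessor');
                push @rows, [
                    $event->first_child_text('date'),
                    $originator ? $originator->first_child_text('location') : '',
                    $accessor   ? $accessor->first_child_text('principal')  : '',
                    $event->first_child_text('authntype'),
                ];
                $t->purge;  # release the parsed event so large files stay cheap
            },
        },
    );
    $twig->parse($xml);
    return @rows;
}

# Two abbreviated events under an assumed <log> root.
my $log = <<'XML';
<log>
<event rev="1.2">
<date>2014-01-10-07:59:24.439+05:30I-----</date>
<originator blade="webseald" instance="default"><location>PosIntWebSeal1prod</location></originator>
<accessor name=""><principal auth="IV_LDAP_V3.0" domain="Default">goldytelecom</principal></accessor>
<authntype>formsPassword</authntype>
</event>
<event rev="1.2">
<date>2014-01-10-08:00:00.000+05:30I-----</date>
</event>
</log>
XML

print join(',', @$_), "\n" for event_summaries($log);
```

    `first_child_text` returns an empty string when the child is missing, so incomplete events still produce a row with the right number of columns.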

      There's enough on this site already, besides "all text between the tags without any formatting"? I don't think that's the real task.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

Re: XML to CSV conversion
by Anonymous Monk on Jan 14, 2014 at 13:51 UTC
    Or, if you are using LibXML anyway, you can use XPath expressions to drill right down to the portions of the XML data that you want ... without writing laborious Perl code to do it.
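    A minimal sketch of that XPath approach, aimed at the CSV output the thread title asks for. Several things here are assumptions rather than anything stated in the thread: that the events are wrapped in one root element, the particular columns chosen, and the helper names `csv_field` and `events_to_csv`. Fields are quoted only when needed, since `name_in_rgy` contains commas.

```perl
use strict;
use warnings;
use XML::LibXML;

# XPath expressions, relative to each <event>, for the columns to export.
# (This column choice is only an example.)
my @XPATHS = (
    'date',
    'originator/location',
    'accessor/name_in_rgy',
    'accessor/user_location',
);

# Quote a field for CSV only when it contains a comma, quote, or line break.
sub csv_field {
    my ($field) = @_;
    $field = '' unless defined $field;
    return $field unless $field =~ /[",\r\n]/;
    $field =~ s/"/""/g;          # double any embedded quotes
    return qq{"$field"};
}

# Return one CSV line per <event> element found in the XML string.
sub events_to_csv {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    my @lines;
    for my $event ($doc->findnodes('//event')) {
        push @lines, join ',', map { csv_field($event->findvalue($_)) } @XPATHS;
    }
    return @lines;
}

# One abbreviated event under an assumed <log> root.
my $input = <<'XML';
<log>
<event rev="1.2">
<date>2014-01-10-07:59:24.439+05:30I-----</date>
<originator blade="webseald" instance="default"><location>PosIntWebSeal1prod</location></originator>
<accessor name="">
<name_in_rgy>uid=GOLDYTELECOM,cn=external,cn=Users,o=vodafone,c=in</name_in_rgy>
<user_location>10.77.50.58</user_location>
</accessor>
</event>
</log>
XML

print "$_\n" for events_to_csv($input);
```

    For anything beyond a quick script, a dedicated CSV module would be a safer choice than hand-rolled quoting.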