Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
use strict;
use warnings;

my $tag;
my $output;
my $fh;
my $flag = '';
my $output_text;

while (<DATA>) {
    chomp;
    s/[\cA-\cZ]//g;
    s/\^[A-Z]//g;
    if (/^{(.*)}$/) {    # match {METATAG} line
        $fh = xml_output($output, $tag, $fh);
        $output = "";
        $tag = $1;
    }
    else {               # not a {TAG} line
        next unless ($tag);
        next if (/^\s*$/);
        $output .= ($output) ? " $_" : "<$tag>$_";
    }
}    # End of while loop

$fh = xml_output($output, $tag, $fh);
if ($fh) {
    print $fh "</ROOT>\n";
    close($fh);
}
exit(0);

# Subroutine to open the file with the filename as {TAG}
sub xml_output {
    my ($output, $tag, $fh) = @_;
    if ($output) {
        if ($output =~ m/<IT>(.*)/) {
            if ($fh) {
                print $fh "</ROOT>\n";
                close($fh);
            }
            open($fh, '>', "$1.xml") or die "$1.xml: $!";
            print $fh "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ROOT>\n";
        }
        $output =~ s/\s*(?=(<.+>|<.+\/>|<\/.+>|<\/.+><.+>))//g;
        print $fh "$output</$tag>\n";
    }
    return ($fh);
}    # End of subroutine

__DATA__
{IT}
R
{DATE}
050102
{TDATE}
Sunday, January 02, 2005
{EDITION}
6
{TAG}
0412270403
{BODY}
Certified Financial Planner for DiStefano Finacial Group in Westfield, MA.
{IT}
R
{DATE}
050102
{TDATE}
Sunday, January 02, 2005
{EDITION}
6
{PAGE}
H5
{TAG}
0412270405
{BODY}
Amdur - Rosenberg < Gabriela Rosenberg, the daughter of Anita and Samuel Rosenberg of Buenos
{IT} marks the start of each file.
With this code only one file, R.xml, is created. How can I create two separate files, each named after its {TAG} value? Please tell me how I can do this.

Re: Split file based on tag
by holli (Abbot) on Nov 27, 2009 at 07:54 UTC
    I could simply tell you how to quickfix your script. Instead, I will show you how to separate your problem into smaller reusable chunks.

    First, we define a class that holds the data of a single record in the input file.

    Next, we write a Parser class that reads our file and returns an array of hashes.

    We have now created a piece of code that we can use in any script we want, without having to think about the internals of the input file anymore. A script for the original problem then only has to work with the parsed records.

    Note the use of templating for the XML output. This does us two favours here: first, it makes the XML easier to maintain; second, the Template module does the work of writing our output files for us.
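    The code that accompanied this reply is not reproduced in the thread as shown here. As a rough sketch of the design it describes (not holli's actual code), the example below parses the input into an array of hashes and then writes one XML file per record, named after its {TAG} value. The names parse_records, xml_escape, record_to_xml and write_records are hypothetical. Plain string building stands in for the Template module mentioned above. The sketch assumes that each {NAME} marker sits on its own line, that its value follows on the next lines, and that {IT} starts a new record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read {NAME}/value lines from a filehandle and
# return a list of records, each a hash of field name => value.
# Assumption: every {IT} marker starts a new record.
sub parse_records {
    my ($in) = @_;
    my (@records, $field);
    while (my $line = <$in>) {
        chomp $line;
        $line =~ s/[\cA-\cZ]//g;              # strip control characters, as the original does
        if ($line =~ /^\{(.*)\}$/) {          # a {NAME} marker line
            $field = $1;
            push @records, {} if $field eq 'IT';
            next;
        }
        next unless @records && defined $field;   # ignore text before the first {IT}
        next if $line =~ /^\s*$/;                 # skip blank lines
        my $rec = $records[-1];
        # Join continuation lines of the same field with a single space.
        $rec->{$field} = defined $rec->{$field} ? "$rec->{$field} $line" : $line;
    }
    return @records;
}

# Hypothetical helper: escape the characters XML treats specially.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: render one record as an XML document string.
# Fields come out in sorted order because a plain hash keeps no input order.
sub record_to_xml {
    my ($rec) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n<ROOT>\n};
    for my $name (sort keys %$rec) {
        $xml .= sprintf "<%s>%s</%s>\n", $name, xml_escape($rec->{$name}), $name;
    }
    return $xml . "</ROOT>\n";
}

# Write each record to "<TAG value>.xml" and return the file names written.
# Records without a TAG value are skipped; the TAG value is not sanitized.
sub write_records {
    my (@records) = @_;
    my @written;
    for my $rec (@records) {
        my $name = $rec->{TAG};
        next unless defined $name && length $name;
        my $file = "$name.xml";
        open my $out, '>', $file or die "$file: $!";
        print {$out} record_to_xml($rec);
        close $out or die "$file: $!";
        push @written, $file;
    }
    return @written;
}

# Usage: feed the parser any filehandle, here an in-memory sample.
my $sample = "{IT}\nR\n{TAG}\n0412270403\n{BODY}\nFirst record.\n"
           . "{IT}\nR\n{TAG}\n0412270405\n{BODY}\nSecond record.\n";
open my $in, '<', \$sample or die "in-memory open failed: $!";
my @records = parse_records($in);
close $in;
print "$_\n" for write_records(@records);
```

    Run as is, this should leave 0412270403.xml and 0412270405.xml in the current directory. The file name can only be chosen once {TAG} has been read, which is why the record is collected completely before anything is written. The original script instead opens the file as soon as it sees {IT}, so it always ends up with R.xml.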


    holli

    You can lead your users to water, but alas, you cannot drown them.
      sub xml_output {
          my ($output, $tag, $fh) = @_;
          if ($output) {
              if ($output =~ m/<IT>(.*)/) {
                  if ($fh) {
                      print $fh "</ROOT>\n";
                      close($fh);
                  }
                  open($fh, '>', "$1.xml") or die "$1.xml: $!";
                  print $fh "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ROOT>\n";
              }
              $output =~ s/\s*(?=(<.+>|<.+\/>|<\/.+>|<\/.+><.+>))//g;
              print $fh "$output</$tag>\n";
          }
          return ($fh);
      }    # End of subroutine
      Please tell me how I can change this subroutine so that it starts a new file at each {IT} tag and uses the {TAG} value as the filename. Please help me with this.
        Go and buy yourself a working brain.


        holli

Re: Split file based on tag
by Anonymous Monk on Nov 27, 2009 at 05:00 UTC
    What was wrong with all the previous answers you got? See your previous question, "file split".