Split file based on tag

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
use strict;
use warnings;

my $tag;
my $output;
my $fh;
my $flag ='';
my $output_text;

while (<DATA>) {
    chomp;
    s/[\cA-\cZ]//g;
    s/\^[A-Z]//g;

    if (/^\{(.*)\}$/) {      # match {METATAG} line
        $fh = xml_output($output, $tag, $fh);
        $output = "";
        $tag = $1;
    }
    else {                   # not a {TAG} line
        next unless($tag);
        next if(/^\s*$/);
        $output .= ($output) ? " $_" : "<$tag>$_";
    }
} # End of while loop


$fh = xml_output($output, $tag, $fh);


if($fh) {
    print $fh "</ROOT>\n";
    close($fh);
}
exit(0);
# Subroutine to open the file with the filename as {TAG}


sub xml_output {
    my ($output, $tag, $fh) = @_;
    if($output) {
        if($output =~ m/<IT>(.*)/) {
            if($fh) {
                print $fh "</ROOT>\n";
                close($fh);
            }
            open($fh, '>', "$1.xml") or die "$1.xml: $!";
            print $fh "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ROOT>\n";
        }


        $output =~ s/\s*(?=(<.+>|<.+\/>|<\/.+>|<\/.+><.+>))//g;
        print $fh "$output</$tag>\n";
    }
    return($fh);
} # End of subroutine
__DATA__
{IT}
R
{DATE}
050102
{TDATE}
Sunday, January 02, 2005
{EDITION}
6
{TAG}
0412270403
{BODY}
Certified Financial Planner for DiStefano Finacial  Group in Westfield,
MA.
{IT}
R
{DATE}
050102
{TDATE}
Sunday, January 02, 2005
{EDITION}
6
{PAGE}
H5
{TAG}
0412270405
{BODY}
Amdur - Rosenberg < Gabriela Rosenberg, the daughter of Anita and Samuel Rosenberg of Buenos

{IT} marks the start of each file.
With this code only one file, R.xml, is created. How can I create two separate files, each named after its {TAG} value? Please tell me how I can do this.


Replies are listed 'Best First'.
Re: Split file based on tag by holli (Abbot) on Nov 27, 2009 at 07:54 UTC
I could simply tell you how to quick-fix your script. Instead, I will show you how to separate your problem into smaller reusable chunks.

First we define a class that holds the data of a single record in the input file. Next we write a Parser class that can read our file and return an array of hashes.

Now we have a piece of code that we can use in every script we want, without having to think about the internals of the input file anymore. A script for the original problem can then be built on top of it.

Note the use of templating for the XML output. This does us two favours here: it makes the XML easier to maintain, and the Template module does the work of writing our output files for us.

holli
You can lead your users to water, but alas, you cannot drown them.
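The code listings from this reply are not part of this copy of the thread. What follows is only a minimal sketch of the same idea, not holli's actual code. A hypothetical parse_records returns an array of hashes, record_to_xml builds the XML with plain string concatenation (swapped in for the Template module mentioned in the reply), and write_records writes one file per record, named after its {TAG} value. The sub names, the record layout, the rule that every {IT} starts a new record, and the temporary directory used for the demo are all assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical parser: turn {TAG}/value text into an array of hashes.
# Each record keeps its values in {value} and the tag order in {order}.
sub parse_records {
    my ($text) = @_;
    my (@records, $tag);
    for my $line (split /\n/, $text) {
        if ($line =~ /^\{(.*)\}$/) {                 # a {TAG} line
            $tag = $1;
            # Assumption from the question: every {IT} starts a new record.
            push @records, { order => [], value => {} } if $tag eq 'IT';
            next unless @records;                    # skip tags before the first {IT}
            my $rec = $records[-1];
            push @{ $rec->{order} }, $tag unless exists $rec->{value}{$tag};
            $rec->{value}{$tag} = '' unless defined $rec->{value}{$tag};
        }
        elsif (@records && defined($tag) && $line =~ /\S/) {
            # Continuation lines are joined with a single space.
            my $rec = $records[-1];
            $rec->{value}{$tag} .= length($rec->{value}{$tag}) ? " $line" : $line;
        }
    }
    return @records;
}

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Render one record as an XML document string.
sub record_to_xml {
    my ($rec) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n<ROOT>\n};
    for my $tag (@{ $rec->{order} }) {
        $xml .= "<$tag>" . xml_escape($rec->{value}{$tag}) . "</$tag>\n";
    }
    return $xml . "</ROOT>\n";
}

# Write each record to "<TAG value>.xml" inside $dir; return the paths written.
sub write_records {
    my ($dir, @records) = @_;
    my @written;
    for my $rec (@records) {
        my $name = $rec->{value}{TAG};
        die "record without a usable {TAG} value\n"
            unless defined($name) && $name =~ /^\w+$/;
        my $path = File::Spec->catfile($dir, "$name.xml");
        open my $out, '>', $path or die "$path: $!";
        print {$out} record_to_xml($rec);
        close $out or die "$path: $!";
        push @written, $path;
    }
    return @written;
}

# Demo input, shortened from the data in the question.
my $input = <<'END';
{IT}
R
{TAG}
0412270403
{BODY}
Certified Financial Planner in Westfield,
MA.
{IT}
R
{PAGE}
H5
{TAG}
0412270405
{BODY}
Amdur - Rosenberg < Gabriela Rosenberg
END

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
print "$_\n" for write_records($dir, parse_records($input));
```

Because the whole record is parsed before anything is written, the file name can come from {TAG} even though {TAG} appears after {IT} in the input.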
Re^2: Split file based on tag by Anonymous Monk on Nov 28, 2009 at 03:29 UTC
sub xml_output {
    my ($output, $tag, $fh) = @_;
    if($output) {
        if($output =~ m/<IT>(.*)/) {
            if($fh) {
                print $fh "</ROOT>\n";
                close($fh);
            }
            open($fh, '>', "$1.xml") or die "$1.xml: $!";
            print $fh "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ROOT>\n";
        }
        $output =~ s/\s*(?=(<.+>|<.+\/>|<\/.+>|<\/.+><.+>))//g;
        print $fh "$output</$tag>\n";
    }
    return($fh);
} # End of subroutine

Please tell me how I can change this subroutine so that it splits the file at each {IT} tag and uses the {TAG} value as the filename. Please help me with this.
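One possible way to adapt the streaming loop, offered as a sketch under assumptions rather than a definitive fix: the file name depends on {TAG}, which only appears after {IT}, so nothing can be written at the moment {IT} is seen. Here the elements of the current record are buffered and the file is opened once the record is complete, at the next {IT} or at end of input. The names flush_record and split_file, the $dir argument and the in-memory demo input are hypothetical, and the input is assumed to begin with {IT}.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Hypothetical replacement for xml_output(): write one buffered record.
# $elements holds finished "<X>text</X>" strings; $name is the {TAG} value.
sub flush_record {
    my ($elements, $name, $dir) = @_;
    return unless @$elements;                    # nothing buffered yet
    die "record has no {TAG} value\n" unless defined $name;
    my $path = "$dir/$name.xml";
    open my $fh, '>', $path or die "$path: $!";
    print {$fh} qq{<?xml version="1.0" encoding="UTF-8"?>\n<ROOT>\n};
    print {$fh} "$_\n" for @$elements;
    print {$fh} "</ROOT>\n";
    close $fh or die "$path: $!";
    return $path;
}

# Read {TAG}/value lines from $in and write one file per {IT} record.
sub split_file {
    my ($in, $dir) = @_;
    my (@elements, @files, $tag, $text, $name);
    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /^\{(.*)\}$/) {             # a {TAG} line
            my $next = $1;
            # Close the element that was being collected, if it has text.
            push @elements, "<$tag>" . xml_escape($text) . "</$tag>"
                if defined($tag) && length($text);
            if ($next eq 'IT') {                 # the previous record is complete
                push @files, flush_record(\@elements, $name, $dir);
                @elements = ();
                $name     = undef;
            }
            ($tag, $text) = ($next, '');
        }
        elsif (defined($tag) && $line =~ /\S/) {
            $text .= length($text) ? " $line" : $line;
            $name = $text if $tag eq 'TAG';      # remember the file name
        }
    }
    # Flush the last element and the last record at end of input.
    push @elements, "<$tag>" . xml_escape($text) . "</$tag>"
        if defined($tag) && length($text);
    push @files, flush_record(\@elements, $name, $dir);
    return @files;
}

# Demo with an in-memory filehandle and a scratch directory.
my $sample = "{IT}\nR\n{TAG}\n0412270403\n{BODY}\nfirst\n"
           . "{IT}\nR\n{TAG}\n0412270405\n{BODY}\nsecond\n";
open my $in, '<', \$sample or die "in-memory open: $!";
my $dir = tempdir(CLEANUP => 1);
print "$_\n" for split_file($in, $dir);
```

To run this against a real input file, open that file for reading, pass the handle as $in, and pass '.' (or any existing directory) as $dir.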
Re^3: Split file based on tag by holli (Abbot) on Nov 28, 2009 at 13:02 UTC
Go and buy yourself a working brain.

holli
You can lead your users to water, but alas, you cannot drown them.
Re: Split file based on tag by Anonymous Monk on Nov 27, 2009 at 05:00 UTC
What was wrong with all the previous answers you got? See your previous question, "file split".