inspirio has asked for the wisdom of the Perl Monks concerning the following question:

OK guys, I'm really new to Perl programming. With the script below, I just want to write each block of the input data to one of two output files, depending on which tag the block contains. The input data format basically looks like this:
<DIVIDER>file.txt</DIVIDER>
<STANDARD>....
....
...
<DIVIDER>file.txt</DIVIDER>
>JN....
....
....
Basically, I need to print only the data with the <STANDARD> tag to $outfile and the data with >JN to $fixfile. In the process I'm also trying to strip the <DIVIDER> tag, which doesn't seem to work. Please help me out; here's the snippet:
#!/bin/perl
$infile=ARGV[0];
$outfile=ARGV[1];
$fixfile="fix.txt"
open(INFILE,”$infile”) or die “can’t open $fixfile: $!” ;
open(OUTFILE,”>$outfile”) or die “can’t open $outfile: $!”;
open(FIXFILE,”>$fixfile”) or die “can’t open $fixfile: $!”;
$/= “<DIVIDER>*</DIVIDER>”;
$temp="";
while(<INFILE>) {
    $*=1;
    $temp=$_;
    if ( $temp=~ /^<STANDARD>(.|\n)+/ ){
        $temp=~s/<(DIVIDER)>.+<\/\1>//;
        print OUTFILE $temp;
    }
    else {
        print FIXFILE $temp;
    }
}
close (INFILE);
close (FIXFILE);
close (OUTFILE);
Thanks!

Re: Writing on different output files.
by idle (Friar) on Feb 22, 2006 at 07:59 UTC
    Hi. I think this code will guide you. Note the '-w' switch and the 'use strict' line: they should always be there.
    #!/usr/bin/perl -w
    use strict;
    use warnings;
    while(<DATA>){
        if (/<STANDARD>.*/) {
            print;
        }
        if (/>JN.*/) {
            print;
        }
    }
    __DATA__
    <DIVIDER>file.txt</DIVIDER>
    <STANDARD>....
    ....
    ...
    <DIVIDER>file.txt</DIVIDER>
    >JN....
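A line-by-line variant of the same idea that sends each block to one of two handles might look like the sketch below. It assumes the first line after a <DIVIDER> line decides where the whole block goes, and it uses in-memory filehandles plus made-up sample data so it runs without touching any real files. The name route_lines is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy lines from $in to one of two handles. The first line of a block
# decides where the whole block goes; <DIVIDER> lines are dropped.
sub route_lines {
    my ($in, $out_fh, $fix_fh) = @_;
    my $dest;                              # handle for the current block
    while (my $line = <$in>) {
        if ($line =~ /^<DIVIDER>.*<\/DIVIDER>/) {
            undef $dest;                   # the next line starts a new block
            next;
        }
        unless (defined $dest) {
            $dest = ($line =~ /^<STANDARD>/) ? $out_fh : $fix_fh;
        }
        print {$dest} $line;
    }
    return;
}

# Hypothetical sample input in the format described in the question.
my $input = "<DIVIDER>file.txt</DIVIDER>\n<STANDARD>....\n....\n"
          . "<DIVIDER>file.txt</DIVIDER>\n>JN....\n....\n";

# In-memory filehandles stand in for the real input and output files.
my ($out, $fix) = ('', '');
open(my $in,     '<', \$input) or die "can't open input: $!";
open(my $out_fh, '>', \$out)   or die "can't open out buffer: $!";
open(my $fix_fh, '>', \$fix)   or die "can't open fix buffer: $!";

route_lines($in, $out_fh, $fix_fh);
close($_) for $in, $out_fh, $fix_fh;

print "OUT:\n$out", "FIX:\n$fix";
```

To work on real files, the three open calls would take file names instead of scalar references, for example open(my $in, '<', $infile).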
Re: Writing on different output files.
by spiritway (Vicar) on Feb 23, 2006 at 04:32 UTC

    Not quite sure what exactly you're looking for... but in your if statement, you will wind up printing everything to FIXFILE that doesn't meet the if condition. That may not be what you want.

    I note that you also used $*=1; this is deprecated. I am wondering why you are setting the input record separator: $/ = "<DIVIDER>*</DIVIDER>"; that is almost certainly not doing what you want, because $/ is treated as a literal string, not a pattern, so the * is not a wildcard. You would be better off just using regexen to find <DIVIDER> and </DIVIDER>.
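One way to follow the regex suggestion is to read the whole input as a single string and split it wherever a <DIVIDER>...</DIVIDER> tag appears. The sketch below assumes the dividers sit on one line each and that a block's type is decided by how it starts. The name route_blocks and the sample data are made up for illustration. Blocks that match neither tag go into a third list instead of silently landing in the fix list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split the text at each <DIVIDER>...</DIVIDER> tag and sort the blocks.
# split discards the divider itself, so no separate stripping is needed.
sub route_blocks {
    my ($text) = @_;
    my (@standard, @fix, @other);
    for my $block (split /<DIVIDER>.*?<\/DIVIDER>\s*/, $text) {
        next unless $block =~ /\S/;        # skip the empty leading chunk
        if    ($block =~ /^<STANDARD>/) { push @standard, $block }
        elsif ($block =~ /^>JN/)        { push @fix,      $block }
        else                            { push @other,    $block }
    }
    return (\@standard, \@fix, \@other);
}

# Hypothetical sample input in the format described in the question.
my $sample = <<'END';
<DIVIDER>file.txt</DIVIDER>
<STANDARD>....
....
<DIVIDER>other.txt</DIVIDER>
>JN....
....
END

my ($standard, $fix, $other) = route_blocks($sample);
print "standard blocks: ", scalar(@$standard), "\n";
print "fix blocks: ",      scalar(@$fix),      "\n";
print "unmatched blocks: ", scalar(@$other),   "\n";
```

For a real file, the text could be read in one go with something like: my $text = do { local $/; <$fh> }; and each list then printed to its own filehandle.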

      The <DIVIDER> tag's content varies; I set the separator this way so that the data could be read block by block. My main problem is that the data isn't parsed properly. Data are being placed in the outfile. I previously used this regexp as the record separator, but it didn't work:
      $/=~/<(LN_FILE_DIVIDER_TAG)>.+<\/\1>/;
      Am I using the print command properly?
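On the two points above: print FILEHANDLE LIST, as in print OUTFILE $temp; is the documented form of print, so the print calls themselves look fine. The separator line is a different matter, since =~ is a match operator and assigns nothing to $/. The sketch below, using made-up sample data and in-memory filehandles, shows the difference between =~ and = on $/, and that an assigned separator is matched literally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "=~" only tests the current value of $/ against the pattern.
# Nothing is assigned, so the separator stays at its default ("\n").
my $before  = $/;
my $matched = ($/ =~ /<(DIVIDER)>.+<\/\1>/) ? 1 : 0;
my $changed = ($/ eq $before) ? 0 : 1;
print "matched=$matched changed=$changed\n";

# Hypothetical sample data, read through an in-memory filehandle.
my $data = "<DIVIDER>a.txt</DIVIDER>\n<STANDARD>one\n"
         . "<DIVIDER>b.txt</DIVIDER>\n>JN two\n";
open(my $in, '<', \$data) or die "can't open in-memory input: $!";

# "=" assigns, but the value is matched literally, never as a pattern.
my @records;
{
    local $/ = "</DIVIDER>\n";    # the literal closing tag ends each record
    @records = <$in>;
}
close($in);
print scalar(@records), " records read\n";

# print FILEHANDLE LIST: no comma between the handle and the data.
my $captured = '';
open(my $out_fh, '>', \$captured) or die "can't open in-memory output: $!";
print {$out_fh} $records[-1];
close($out_fh);
print "captured: $captured";
```

With a literal separator each record ends at the closing tag of the next divider, so a block arrives with the following divider line attached. That is one reason splitting on a regex tends to be easier for data like this.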