<ROOT>
<FILE>sourcetag1</FILE>
<NUMBER>00000 11111</NUMBER>
<SOURCE>source1</SOURCE>
<AUTHOR>author1 staff1</AUTHOR>
<HEADLINE>DISPOSABLE DECOR: THE CUTTING EDGE DULLS FASTTYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS</HEADLINE>
</ROOT>
Sorry, it is not only for lines that have \ at the end.
It should remove the newline character from every line that does not match {sometext} at the start.
I merged the code with the sample you provided, but it works only for the {HEADLINE} line.
The merged code is below:
#!/usr/bin/perl
use strict;
use warnings;
my $output = '';
my $tag;
my $fh;
LINE: while ( my $line = <DATA>) {
    chomp $line;
    # if line ended with a '\', remove '\' and save line for later output
    if ( $line =~ s/\\$// || $line =~ s/\s$//) {
        # clean indentation
        $line =~ s/^\s+/ /;
        # save line
        $output .= $line;
        next LINE;
    }
    # we have previous line(s) to consider?
    if ( length $output ) {
        # clean indentation
        $line =~ s/^\s+/ /;
        $line = $output . $line;
        $output = '';
    }
    if($line =~ /^{(.*)}/) {
        $tag = $1;
    }else {
        if($tag eq 'FILE') {
            if(defined($fh)){
                print $fh "</ROOT>";
                close($fh);
            }
            my $filename = $line;
            open($fh, '>', "$filename.xml") or die "$filename: $!";
            print $fh '<?xml version="1.0"?>',"\n";
            print $fh "<ROOT>\n";
            print $fh "<FILE>$filename</FILE>\n";
        } elsif(defined($fh)) {
            if($line ne ''){
                $line =~ s/\\//gi;
                print "<$tag>$line</$tag>\n";
            }
        }
    }
    print $line, $/;
}
__DATA__
{FILE}
sourcetag1
{NUMBER}
00000
11111
{SOURCE}
source1
{KEYWORD}
{AUTHOR}
author1
staff1
{HEADLINE}
DISPOSABLE DECOR: THE CUTTING EDGE DULLS FAST\
TYLE AT A SPEED USUALLY ASSOCIATED WITH WARDROBE ITEMS
{FILE}
sourcetag2
{NUMBER}
00002
{SOURCE}
sourcenam2
{KEYWORD}
{AUTHOR}
author2
staff2
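For comparison, here is a minimal sketch of the joining behaviour described above, written under some assumptions: every line that is not a {sometext} marker belongs to the most recent marker, such lines are joined with a single space, a trailing \ joins the next line with no space, and markers with no text (like {KEYWORD}) are dropped. The helper name join_records and the in-memory sample list are hypothetical, not part of the merged code, and the per-file output handling is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group the lines that follow each {TAG} marker
# into one string and return a list of [ tag, joined text ] pairs.
sub join_records {
    my @lines = @_;
    my @records;          # one [ tag, text ] pair per marker seen
    my $continued = 0;    # true if the previous line ended with '\'

    for my $line (@lines) {
        chomp $line;

        # A marker line such as {HEADLINE} starts a new record.
        if ( $line =~ /^\{([^}]+)\}\s*$/ ) {
            push @records, [ $1, '' ];
            $continued = 0;
            next;
        }
        next unless @records;    # skip text before the first marker

        # Remember whether this line asks to be glued to the next one.
        my $ends_with_backslash = ( $line =~ s/\\$// ) ? 1 : 0;
        $line =~ s/^\s+|\s+$//g; # trim leading/trailing whitespace

        my $text = \$records[-1][1];
        if ( $continued || $$text eq '' ) {
            $$text .= $line;     # first piece, or glued after a '\'
        }
        elsif ( length $line ) {
            $$text .= " $line";  # ordinary line: join with one space
        }
        $continued = $ends_with_backslash;
    }

    # Drop markers that never received any text.
    return grep { length $_->[1] } @records;
}

my @sample = (
    "{NUMBER}\n", "00000\n", "11111\n",
    "{KEYWORD}\n",
    "{HEADLINE}\n", "CUTTING EDGE DULLS FAST\\\n", "TYLE AT A SPEED\n",
);
print "<$_->[0]>$_->[1]</$_->[0]>\n" for join_records(@sample);
```

Collecting the pairs first and printing afterwards keeps the joining rule separate from the file handling, so the {FILE} case could be layered on top by opening a new handle whenever the tag is FILE.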