in reply to Re: create an xml file from column file
in thread create an xml file from column file

How would it work if my input was:

how o B-NP
are o o
you o I-NP

Now I need annotation tags only for those words that have B-NP or I-NP in the third column.

Re^3: create an xml file from column file
by hdb (Monsignor) on Jul 24, 2013 at 13:21 UTC

    I pressed the wrong button and accidentally shrunk the code. It should still do what you want.

    use strict;
    use warnings;

    # $a collects the annotation lines while map returns the bare words
    print "<text>",
        join( " ", map {
            /^(\w+).*?(B-NP|I-NP)?$/;
            $a .= "<annotation><type>NP</type><text>$1</text></annotation>\n" if $2;
            $1
        } <DATA> ),
        "</text>\n$a";

    __DATA__
    how B-NP
    are
    you I-NP
    really
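The one-liner above packs everything into a single map. For readers who prefer it spelled out, here is a minimal sketch of the same idea for the three-column input from the question. The helper name `annotate`, the whitespace splitting, and taking the tag from the last column are assumptions made for illustration, not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: takes column-file lines, returns the XML-like text.
sub annotate {
    my @lines = @_;
    my ( @words, @annotations );
    for my $line (@lines) {
        my @cols = split ' ', $line;    # split on any run of whitespace
        next unless @cols;              # skip blank lines
        my ( $word, $tag ) = @cols[ 0, -1 ];    # first and last column
        push @words, $word;
        push @annotations,
            "<annotation><type>NP</type><text>$word</text></annotation>"
            if $tag =~ /^[BI]-NP$/;     # only B-NP and I-NP get a tag
    }
    return join "\n", "<text>@words</text>", @annotations;
}

my @input = ( "how o B-NP", "are o o", "you o I-NP" );
print annotate(@input), "\n";
```

Lines read from a file handle can be passed in unchanged, because `split ' '` also discards the trailing newline. Words containing `&` or `<` would still need escaping before they go into real XML.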

      What if my input was this:

      how o B-NP
      are o I-NP
      you o I-NP
      some o o
      really o B-GP

      and I want the output as <key="type">NP</key><text>how are you</text><key="type">GP</key><text>really</text>? That is, I want the whole text from B-NP through the last I-NP to appear between the opening and closing text tags.

        Another change in requirements? Wow. Full rewrite needed for this one.

        use strict;
        use warnings;
        use Text::CSV;

        my $csv = Text::CSV->new( { sep_char => "\t" } ); # assuming tab separated input

        open my $words, "<", "words.txt" or die "Cannot open words.txt: $!\n";

        # the following shows what's in words.txt; it is NOT words.txt itself!
        =head words.txt

        how     o       B-NP
        are     o       I-NP
        you     o       I-NP
        some    o       o
        really  o       B-GP

        =cut

        my $lasttype;
        my @text;
        while( my $row = $csv->getline( $words ) ) {
            my $text = $$row[0];
            my $type = ( $$row[-1] =~ /\w-(\w+)/ ) ? $1 : "";
            $lasttype = $type unless @text; # special treatment for first row
            if( $type eq $lasttype ) {
                push @text, $text;
            }
            else {
                print '<key="type">'."$lasttype</key><text>@text</text>\n" if $lasttype;
                $lasttype = $type;
                @text = ( $text );
            }
        }
        # print what's left over when all input read
        print '<key="type">'."$lasttype</key><text>@text</text>\n" if $lasttype;
        close $words;
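The code above starts a new group whenever the type changes, so two adjacent chunks of the same type (a B-NP row directly followed by another B-NP row) would be merged. If that matters, one possible variation is to let every B- tag open a new chunk and let I- tags only continue an open chunk of the same type. The sketch below uses core Perl only and splits on whitespace instead of using Text::CSV. The helper name `chunks` and the choice to ignore a stray I- tag with no preceding B- are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: groups B-/I- tagged rows into [ type, text ] pairs.
sub chunks {
    my @lines = @_;
    my @chunks;
    my $open = 0;    # true while the previous row belonged to a chunk
    for my $line (@lines) {
        my @cols = split ' ', $line;    # split on any run of whitespace
        next unless @cols;              # skip blank lines
        my ( $word, $tag ) = @cols[ 0, -1 ];
        if ( $tag =~ /^B-(\w+)$/ ) {
            push @chunks, [ $1, $word ];    # B- always starts a new chunk
            $open = 1;
        }
        elsif ( $tag =~ /^I-(\w+)$/ && $open && $chunks[-1][0] eq $1 ) {
            $chunks[-1][1] .= " $word";     # I- extends the open chunk
        }
        else {
            $open = 0;    # "o", a stray I-, or a type mismatch closes it
        }
    }
    return @chunks;
}

my @input = (
    "how o B-NP", "are o I-NP", "you o I-NP", "some o o", "really o B-GP",
);
print qq{<key="type">$_->[0]</key><text>$_->[1]</text>\n} for chunks(@input);
```

The print line keeps the `<key="type">` form because that is the output that was asked for. It is not well-formed XML, so a real XML consumer would need something like `<key type="NP">` instead.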