analys has asked for the wisdom of the Perl Monks concerning the following question:

I'm reading from a text file and extracting the data into CSV format. I have a problem when a comment continues on a new line: it creates another row in the CSV. I want the whole comment to be in the same row. Is there any way to check whether the comment spans a newline and, if so, join the continuation line onto the comment?

Input file
NAME {
    ABC {
        @NUM = 1;
        Aaa {
            @COMMENT = "Another way";
            @TYPE = "AC";
        } [0:0];
    }
    DEF {
        @NUM = 84;
        Bbb {
            @COMMENT = "This way";
            @TYPE = "DC";
        } [1:0];
        Ccc {
            @COMMENT = "This is zero (0 length)
                        Check the SP file 4 details";
            @TYPE = "AC";
        } [34:28];
    }
}
Output
NAME  NUM  S_NAME  COMMENT                                          TYPE  SIZE
ABC   1    Aaa     Another way                                      AC    [0:0]
DEF   84   Bbb     This way                                         DC    [1:0]
           Ccc     This is zero (0 li) Check the SP file 4 details  AC    [34:28]
This is my code :
while($line[0]) {
    #Comment
    if ($line[0] =~ m/\s*\@COMMENT\s*=\s*.*/i ) {
        do {
            chomp $line[0];
            $line[0] =~ s/^\s+//;
            $line[0] =~ s/\s+$//;
            $co = $line[0];
            print "$co";
        } until ($line[0] =~ m/\s*\@TYPE\s*=\s*.*;/i);
    }
    shift(@line);
}

Replies are listed 'Best First'.
Re: looping with conditions and print newline
by betterworld (Curate) on Jan 12, 2016 at 14:24 UTC

    You didn't show us the parser code that produces $line[0]. The best place for this would be in the parser.

    However, here is a quick fix to preprocess the text file while reading it. (I am assuming you read it line by line):

    while (my $line = <$fh>) {
        if (($line =~ tr/"//) % 2 == 1) {
            # This block is executed if the line contains an odd number of
            # double quotes, i.e. a quoted string is still open
            while (1) {
                my $nextline = <$fh>;
                die "Unterminated string" unless defined $nextline;
                chomp $line;               # drop the newline inside the string
                $nextline =~ s/^\s+/ /;    # replace the indent with one space
                $line .= $nextline;        # join lines
                last if ($nextline =~ tr/"//) % 2 == 1;
            }
        }
        # now parse the $line just like you would normally do using the code
        # that you did not show us ;-)
        parse_line($line);
    }

      Hi betterworld, thanks for the reply. I read the file as follows (line by line). I don't parse it while reading; instead I check whether each line matches a pattern and put it in a scalar.

      open (INPUT, "<", $file) || die "File read error";
      @line=<INPUT>;
      close INPUT;

      Then I run the while loop from the code above.
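
Since the file is already slurped into @line here, the same quote-counting idea can be applied to the array before the while loop runs. The following is only a minimal sketch under that assumption: join_continued_lines is a hypothetical helper, not something from the thread, and the sample data stands in for @line = <INPUT>. It also assumes, as the replies do, that comments never contain a double quote.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): merge physical lines until
# every double-quoted string is closed, so each element is one logical line.
sub join_continued_lines {
    my @in = @_;
    my @out;
    while (defined(my $line = shift @in)) {
        chomp $line;
        # An odd number of double quotes means a string is still open.
        while (($line =~ tr/"//) % 2 == 1 && @in) {
            my $next = shift @in;
            chomp $next;
            $next =~ s/^\s+//;      # drop the continuation indent
            $line .= " $next";      # join with a single space
        }
        push @out, $line;
    }
    return @out;
}

# Sample data standing in for: @line = <INPUT>;
my @line = (
    qq|Ccc {\n|,
    qq|    \@COMMENT = "This is zero (0 length)\n|,
    qq|        Check the SP file 4 details";\n|,
    qq|    \@TYPE = "AC";\n|,
    qq|} [34:28];\n|,
);

my @joined = join_continued_lines(@line);
print "$_\n" for @joined;
```

After this step each @COMMENT sits on a single element, so a line-oriented match such as the one in the original loop no longer has to look ahead. The returned lines are chomped, so the existing chomp calls become harmless no-ops.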
Re: looping with conditions and print newline
by choroba (Cardinal) on Jan 12, 2016 at 18:36 UTC
    Write a proper parser for the format.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use Marpa::R2;

    my $dsl = << '__DSL__';

    lexeme default = latm => 1
    :default ::= action => list

    Top       ::= AttrName ('{') Structs ('}')             action => top
    Structs   ::= Struct+
    Struct    ::= AttrName ('{') Attr S_structs ('}')      action => struct
    S_structs ::= S_struct+
    S_struct  ::= StructName ('{') Attrs ('}') Size (';')  action => s_struct
    Attrs     ::= Attr+
    Attr      ::= ('@') AttrName ('=') AttrVal (';')       action => attr
    AttrVal   ::= Num                                      action => First
                | Quoted                                   action => First
    Quoted    ::= ('"') String ('"')                       action => remove_newlines
    Size      ::= ('[') Num (':') Num (']')                action => size

    Num        ~ [0-9]+
    StructName ~ [[:alpha:]]+
    AttrName   ~ [[:upper:]]+
    String     ~ [^"]*                                     # No " in comments.

    :discard   ~ Whitespace
    Whitespace ~ [\s]+

    __DSL__

    sub list            { shift; [ @_ ] }
    sub top             { +{ $_[1] => $_[2] } }
    sub struct          { +{ $_[1] => [ $_[2]{NUM}, { map %$_, @{ $_[3] } } ] } }
    sub s_struct        { +{ $_[1] => { size => $_[3], map %$_, @{ $_[2] } } } }
    sub attr            { +{ $_[1] => $_[2] } }
    sub First           { $_[1] }
    sub remove_newlines { $_[1] =~ s/\s+/ /gr }
    sub size            { "[$_[1]:$_[2]]" }

    my $grammar = 'Marpa::R2::Scanless::G'->new({ source => \$dsl });
    my $parsed = $grammar->parse(\ << '__INPUT__', 'main');
    NAME {
        ABC {
            @NUM = 1;
            Aaa {
                @COMMENT = "Another way";
                @TYPE = "AC";
            } [0:0];
        }
        DEF {
            @NUM = 84;
            Bbb {
                @COMMENT = "This way";
                @TYPE = "DC";
            } [1:0];
            Ccc {
                @COMMENT = "This is zero (0 length)
                            Check the SP file 4 details";
                @TYPE = "AC";
            } [34:28];
        }
    }
    __INPUT__

    say join "\t", qw( NAME NUM S_NAME COMMENT TYPE SIZE );
    for my $struct (@{ $$parsed->{NAME} }) {
        my $name     = (keys %$struct)[0];
        my $num      = $struct->{$name}[0];
        my $s_struct = $struct->{$name}[1];
        for my $s_name (keys %{ $struct->{$name}[1] }) {
            say join "\t", $name, $num, $s_name,
                @{ $s_struct->{$s_name} }{qw{ COMMENT TYPE size }};
        }
    }

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,