kshitij has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys, I am working on a Perl script that processes my input file and writes an output file in a different format.

Input file

###########################################
pat1
U_TOP_LOGIC/i_reg_2_/Q : # {111111}111
pat2
U_TOP_LOGIC/i_reg_2_/Q : # {110111}111
pat3
U_TOP_LOGIC/i_reg_2_/Q : # {110111}011
pat4
U_TOP_LOGIC/i_reg_2_/Q : # {110111}101
pat5
U_TOP_LOGIC/i_reg_2_/Q : # {110111}110
pat1
U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
pat2
U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
pat3
U_TOP_LOGIC/i_reg_3_/Q : # {111111}101
pat4
U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
pat5
U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
pat1
U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
pat2
U_TOP_LOGIC/i_reg_4_/Q : # {111111}011
pat3
U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
pat4
U_TOP_LOGIC/i_reg_4_/Q : # {111111}011
pat5
U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
################################################################

As you can see in the input file, there are 5 patterns for each register.

Below is the output file I am looking for. In this output file, the first column is the name of the register (which should appear only once), followed by the pattern names pat1, pat2, pat3, pat4 and pat5. Each pattern should have one value, which is the first split from its data field.

For example, take {111111}111.

I am only looking for the first value after the closing curly brace.

{111111} - don't want this

111 - I want the 1st value from this part, i.e. 1

Example: for U_TOP_LOGIC/i_reg_2_/Q, the first value after the curly braces for each pattern is 1 1 0 1 1.

Output file (First split)

########################################################
Reg                      pat1 pat2 pat3 pat4 pat5
########################################################
U_TOP_LOGIC/i_reg_2_/Q   1    1    0    1    1
U_TOP_LOGIC/i_reg_3_/Q   1    1    1    1    1
U_TOP_LOGIC/i_reg_4_/Q   1    0    1    0    1
########################################################

I have not quite managed to get this working in Perl. Could you help me out?

Thanks and Regards

Kshitij Kulshreshtha

Replies are listed 'Best First'.
Re: Formatted output file with data processing
by choroba (Cardinal) on Sep 24, 2019 at 11:36 UTC
    Read the input line by line. If the line contains patX, just verify that you have stored X-1 values so far. Otherwise, store the value into an array; once the array has 5 elements, print the register name and the elements.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    my @values;
    while (<>) {
        if (my ($size) = /^pat([0-9])/) {
            $size == 1 + @values or die "Unexpected $_";
            next
        }

        my ($register, $value) = /(\S+).*\}([0-9])/;
        say join "\t", $register, splice @values
            if 5 == push @values, $value;
    }
    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      choroba, I like how you get at the values without even considering a hash here. Also the uppercase \S+ to capture the register name, with \}([0-9]) picking out the first number after the closing brace.
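
      As a small illustration of the two captures in that regex (a sketch only; the helper name parse_register_line is made up here and is not part of choroba's script): (\S+) takes the leading run of non-whitespace characters, i.e. the register name, and \}([0-9]) takes the single digit right after the closing brace.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: in list context it returns
# (register, first digit) for a register line, or an empty list if the
# line does not match.
sub parse_register_line {
    my ($line) = @_;
    return $line =~ /(\S+).*\}([0-9])/;
}

my ($register, $value) =
    parse_register_line('U_TOP_LOGIC/i_reg_2_/Q : # {110111}011');
print "$register => $value\n";   # prints: U_TOP_LOGIC/i_reg_2_/Q => 0
```

      A line such as "pat1" has no closing brace, so the match fails and nothing is captured.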

      Thanks a lot !

      Could you help me produce the output in the format shown for the output file?

      Reg                      pat1 pat2 pat3 pat4 pat5
      U_TOP_LOGIC/i_reg_2_/Q   1    1    0    1    1
      U_TOP_LOGIC/i_reg_3_/Q   1    1    1    1    1
      U_TOP_LOGIC/i_reg_4_/Q   1    0    1    0    1
      ########################################################

      Thanks

      Kshitij

        This is an extension of choroba's code here with more tolerance for the blank and comment lines apparently (?) present in the OPed example input file (k.dat below), and reproducing exactly the output format shown here. Add oddball output formats for individual registers as needed.

        Script format_multiline_for_output_1.pl:

        use 5.010;  # needs // operator

        use warnings;
        use strict;

        my $default_format = "%-25s%d    %d    %d    %d    %d\n";

        my %odd_format = (
            'U_TOP_LOGIC/i_reg_2_/Q' => "%-25s%d    %d    %d    %d    %d\n",
        );

        print "Reg                      pat1 pat2 pat3 pat4 pat5\n\n";

        my @values;
        while (<>) {
            if (/^\s*$/ or /^\s*#/) {  # blank or comment line
                # do nothing
            }
            elsif (my ($size) = /^pat([0-9])/) {
                $size == 1 + @values or die "Unexpected '$_'";
            }
            else {
                my ($register, $value) = /(\S+).*\}([0-9])/;
                my $fmt = $odd_format{$register} // $default_format;
                printf $fmt, $register, splice @values if 5 == push @values, $value;
            }
        }

        print "########################################################\n";
        Output:
        c:\@Work\Perl\monks\kshitij>perl format_multiline_for_output_1.pl < k.dat
        Reg                      pat1 pat2 pat3 pat4 pat5

        U_TOP_LOGIC/i_reg_2_/Q   1    1    0    1    1
        U_TOP_LOGIC/i_reg_3_/Q   1    1    1    1    1
        U_TOP_LOGIC/i_reg_4_/Q   1    0    1    0    1
        ########################################################
        (k.dat taken from the OPed example input file.)


        Give a man a fish:  <%-{-{-{-<

Re: Formatted output file with data processing
by tybalt89 (Monsignor) on Sep 24, 2019 at 15:43 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11106636
    use warnings;

    local $_ = do { local $/; <DATA> };
    print "########################################################\n";
    print "Reg                    pat1 pat2 pat3 pat4 pat5\n";
    print "########################################################\n";
    print "$1 ", join(' ' x 4, $& =~ /\}(\d)/g), "\n"
      while /^(.*?) :.*\n(?:.*\n\1.*\n)*/gm;
    print "########################################################\n";

    __DATA__
    ###########################################
    pat1
    U_TOP_LOGIC/i_reg_2_/Q : # {111111}111
    pat2
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}111
    pat3
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}011
    pat4
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}101
    pat5
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}110
    pat1
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
    pat2
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
    pat3
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}101
    pat4
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
    pat5
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
    pat1
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
    pat2
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}011
    pat3
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
    pat4
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}011
    pat5
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
    ################################################################

    Outputs:

    ########################################################
    Reg                    pat1 pat2 pat3 pat4 pat5
    ########################################################
    U_TOP_LOGIC/i_reg_2_/Q 1    1    0    1    1
    U_TOP_LOGIC/i_reg_3_/Q 1    1    1    1    1
    U_TOP_LOGIC/i_reg_4_/Q 1    0    1    0    1
    ########################################################

    Column spacing slightly tweaked.

Re: Formatted output file with data processing
by Don Coyote (Hermit) on Sep 24, 2019 at 13:18 UTC

    Hello kshitij

    I process this by splitting the lines up through a while loop. The logic starts to get confusing as you are essentially dealing with 3 dimensions of data here.

    Stepping through the loop in the debugger also helped greatly to see what was being tracked and where it ended up.

    I have also added an extraction routine to make a table. There are multiple patterns for each register, so in a sense this is three-dimensional data: there is a two-dimensional 3 x 5 pattern table for each register.

    Using sprintf would be the next step for a nice layout. It would also take a little more effort to use just the register as a single key to the table. Forgoing those, I hope this is the kind of logic you are looking for.

    #!perl

    use strict;
    use warnings;

    my $patflip = 1;
    my $pat;
    my $reppat = 1;
    my $reg;
    my @abc;
    my %regpat = ();

    while( defined ( my $line = <DATA> ) ){
        chomp $line;
        #print "line: $line";

        if($line =~ /\Apat(.)/ ){
            $patflip++ <= 5 or $patflip = 0;
            $pat = $1;
            # print "pat: $pat ";
            next;
        }
        if($line =~ s/\A.*?reg_([0-7]).*\}//){
            $reg = $1;
            # print "reg: $1 ";
            # print "line in ifline $line\n";
            @abc = split( '', $line, 3 );
            # print Data::Dumper->Dump( [\@abc], ['*abc'] );

            for my $hi (1..3){
                push @{ $regpat{'r'.$reg}{'pat'.$hi} }, shift @abc;
            }
        }
    }

    use Data::Dumper;
    #print Data::Dumper->Dump ([\%regpat],['*regpat']);

    print "Reg p1 p2 p3 p4 p5\n";
    #my $reppat = 1;
    foreach my $regkey ( sort keys %regpat ){
        foreach my $patkey ( sort keys %{ $regpat{$regkey} } ){
            print $regkey." p" . $reppat++ .' ';
            for my $ind (0..4){
                print $regpat{$regkey}{$patkey}[$ind] . ' ';
            }
            print "\n";
        }
        $reppat = 1;
        print "\n";
    }

    __DATA__
    pat1
    U_TOP_LOGIC/i_reg_2_/Q : # {111111}111
    pat2
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}111
    pat3
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}011
    pat4
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}101
    pat5
    U_TOP_LOGIC/i_reg_2_/Q : # {110111}110
    pat1
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
    pat2
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
    pat3
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}101
    pat4
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
    pat5
    U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
    pat1
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
    pat2
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}011
    pat3
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
    pat4
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}011
    pat5
    U_TOP_LOGIC/i_reg_4_/Q : # {111111}111
    ---
    Reg p1 p2 p3 p4 p5
    r2 p1 1 1 0 1 1
    r2 p2 1 1 1 0 1
    r2 p3 1 1 1 1 0

    r3 p1 1 1 1 1 1
    r3 p2 1 1 0 1 1
    r3 p3 0 1 1 0 1

    r4 p1 1 0 1 0 1
    r4 p2 1 1 1 1 1
    r4 p3 1 1 1 1 1
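
    The two follow-ups mentioned above (the register as a single key, and sprintf for the layout) might look roughly like this. It is only a sketch under assumptions: the sub names collect_first_digits and format_table, the 25-character name column and the in-memory sample are illustrative choices, not part of the script above, and only the first digit after the brace is kept.

```perl
use strict;
use warnings;

# Hypothetical helper: map each register name to the list of first digits
# found after the closing brace, in the order the lines appear.
sub collect_first_digits {
    my @lines = @_;
    my (%digits, @order);
    for my $line (@lines) {
        # Skip comment lines, patN lines and blank lines.
        next if $line =~ /^\s*(?:#|pat\d|$)/;
        my ($reg, $digit) = $line =~ /^\s*(\S+)\s*:.*\}(\d)/ or next;
        push @order, $reg unless exists $digits{$reg};   # first-seen order
        push @{ $digits{$reg} }, $digit;
    }
    return (\%digits, \@order);
}

# Hypothetical helper: build a header plus one sprintf-formatted row
# per register.
sub format_table {
    my ($digits, $order) = @_;
    my $out = sprintf("%-25s%s\n", 'Reg', join(' ', map("pat$_", 1 .. 5)));
    for my $reg (@$order) {
        $out .= sprintf("%-25s%s\n", $reg, join(' ' x 4, @{ $digits->{$reg} }));
    }
    return $out;
}

# Sample lines taken from the first two registers of the input file.
my @sample = split /\n/, <<'END';
###########################################
pat1
U_TOP_LOGIC/i_reg_2_/Q : # {111111}111
pat2
U_TOP_LOGIC/i_reg_2_/Q : # {110111}111
pat3
U_TOP_LOGIC/i_reg_2_/Q : # {110111}011
pat4
U_TOP_LOGIC/i_reg_2_/Q : # {110111}101
pat5
U_TOP_LOGIC/i_reg_2_/Q : # {110111}110
pat1
U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
pat2
U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
pat3
U_TOP_LOGIC/i_reg_3_/Q : # {111111}101
pat4
U_TOP_LOGIC/i_reg_3_/Q : # {111111}110
pat5
U_TOP_LOGIC/i_reg_3_/Q : # {111111}111
END

print format_table(collect_first_digits(@sample));
```

    For this sample it prints the header row followed by one row per register, with the values 1 1 0 1 1 for i_reg_2_ and 1 1 1 1 1 for i_reg_3_. Keeping a separate @order array preserves the order in which registers first appear, which a plain sort of the hash keys would not guarantee for other register names.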