in reply to grep the data out of a text file.

Hello DespacitoPerl,

Welcome to the Monastery. Your question can be solved in many ways. Since I am not good with perlre, I came up with a different approach.

First, I split each line of the input data (e.g. in.txt) on white space. Second, I keep only the first character of the first field (e.g. A from A_01); for alternative ways of doing exactly the same thing, see What's the best way to get first character of the string?. Then I put the pieces together into one string with the help of join, and finally push each string into an array to have them all together.
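As a small illustration of the second step, here is a minimal sketch of three ways to take the first character of a field. The helper name first_char_variants is made up for this example; unpack is what the script in this post uses, while substr and the regex capture are just common alternatives.

```perl
use strict;
use warnings;

# Hypothetical helper: three equivalent ways to take the first character
# of a field such as 'A_01:'.
sub first_char_variants {
    my ($field) = @_;
    my $by_unpack  = unpack 'a', $field;     # the approach used in this post
    my $by_substr  = substr($field, 0, 1);   # offset 0, length 1
    my ($by_regex) = $field =~ /^(.)/;       # capture the first character
    return ($by_unpack, $by_substr, $by_regex);
}

my $line    = 'A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation';
my @fields  = split / /, $line;              # 'A_01:', 'xxx...', '1', 'violation'
my ($first) = first_char_variants($fields[0]);

print join(' ', $first, @fields[2, 3]), "\n";   # prints "A 1 violation"
```

All three variants return the same character here, so the choice is mostly a matter of taste.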

I am using Data::Dumper to view the array since it helps me a lot to debug arrays and hashes.

Sample solution with output and input (see the updates below):

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my @final;
while (<>) { # Read all files provided through ARGV
    chomp;
    my @tmp = split / /, $_;
    # print Dumper \@tmp;
    my $first = unpack 'a', $tmp[0];
    push @final, join(' ', $first, $tmp[2], $tmp[3]);
} continue {
    close ARGV if eof; # reset $. each file
}

print Dumper \@final;

__END__

$ perl test.pl in.txt
$VAR1 = [
          'A 1 violation',
          'A 4 violations',
          'B 3 violations'
        ];

__DATA__
A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation
A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations
B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations
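For comparison, the same extraction can probably be done with a single regular expression. This is only a sketch: the helper name parse_line and the exact pattern are my own assumptions about the line layout (an identifier, one filler field, a number and a trailing word), not something taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the leading letter, the count and the trailing
# word out of one line. Returns undef (in scalar context) if the line does
# not look like the sample data.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ /^([A-Za-z])\S*\s+\S+\s+(\d+)\s+(\S+)\s*$/;
    return join ' ', $1, $2, $3;
}

my @final;
for my $line (
    'A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation',
    'A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations',
    'B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations',
) {
    my $parsed = parse_line($line);
    push @final, $parsed if defined $parsed;   # skip lines that do not match
}

print "$_\n" for @final;
```

Unlike the split version, lines that do not match the pattern are skipped instead of producing undefined fields.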

Update: I read your question again and understood that you want to sum the violations, grouped by the first character of the key (e.g. A_01 and A_02 both count towards A). I have posted updated code below using hashes.

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $num_viol = 'number of violation(s)';

my %hash;
while (<>) { # Read all files provided through ARGV
    chomp;
    my @tmp = split / /, $_;
    my $first = unpack 'a', $tmp[0];
    if (exists $hash{$first}) {
        $hash{$first}{$num_viol} += $tmp[2];
        next;
    }
    $hash{$first}{$num_viol} = $tmp[2];
} continue {
    close ARGV if eof; # reset $. each file
}

my @final;
foreach my $violation (sort keys %hash) {
    push @final, join(' ', $violation, $num_viol, $hash{$violation}{'number of violation(s)'});
}

print Dumper \@final;

__END__

$ perl test.pl in.txt
$VAR1 = [
          'A number of violation(s) 5',
          'B number of violation(s) 3'
        ];

__DATA__
A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation
A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations
B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations
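As a side note, the exists check is not strictly required, because += on a hash element that does not exist yet creates it and starts from zero without a warning. A minimal sketch of that idea follows; the helper name sum_violations and the flat %total hash are my own simplification, not the structure used above.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the third whitespace-separated field per
# leading letter of the first field.
sub sum_violations {
    my (@lines) = @_;
    my %total;
    for my $line (@lines) {
        my ($id, undef, $count) = split ' ', $line;
        next unless defined $count && $count =~ /^\d+$/;   # skip malformed lines
        $total{ substr($id, 0, 1) } += $count;             # element is created on first use
    }
    return \%total;
}

my $total = sum_violations(
    'A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation',
    'A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations',
    'B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations',
);

print "$_ $total->{$_}\n" for sort keys %$total;   # prints "A 5" then "B 3"
```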

Update 2: I was thinking that maybe in my previous update I did not use the correct input data, so I created another script that does what you asked for. The solution is provided below:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $num_viol = 'number of violation(s)';

sub extract_data {
    my (@lines) = @_;
    my %hash;
    foreach my $line (@lines) {
        my @tmp = split / /, $line;
        my $first = unpack 'a', $tmp[0];
        if (exists $hash{$first}) {
            $hash{$first}{$num_viol} += $tmp[2];
            next;
        }
        $hash{$first}{$num_viol} = int($tmp[2]);
    }
    return \%hash;
}

my @lines;
my $concatenated;
while (<>) { # Read all files provided through ARGV
    chomp;
    if ($_ =~ /^\s*$/) { # blank line: restart the line counter
        $. = 0;
        next;
    }
    elsif ($. == 1) { # first line of a record
        $concatenated .= $_;
    }
    elsif ($. == 3) { # third line of a record completes it
        $concatenated .= $_;
        push @lines, $concatenated;
        $. = 0;
        $concatenated = '';
    }
} continue {
    close ARGV if eof; # reset $. each file
}

my $hash = extract_data(@lines);

my @final;
foreach my $violation (sort keys %$hash) {
    push @final, join(' ', $violation, $num_viol, $$hash{$violation}{'number of violation(s)'});
}

print Dumper \@final;

__END__

$ perl test.pl in.txt
$VAR1 = [
          'A number of violation(s) 5',
          'B number of violation(s) 3'
        ];

__DATA__
A_01: xxxxxxx
xxxxxxxxxx
xxx......... 1 violation
A_02: xxxxxxx
xxxxxxxxxx
xxx......... 4 violations
B_02: xxxxxxx
xxxxxxxxxx
xxx......... 3 violations

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!