Re: grep the data out of a text file. (Update 2)

Hello DespacitoPerl,

Welcome to the monastery. Your question can be solved in many ways. Since I am not good with perlre, I came up with another solution.

First I split each line of the given input data (e.g. in.txt) on white space. As a second step I keep only the first character of the first field (e.g. A from A_01); for alternative ways of doing exactly the same thing, see "What's the best way to get first character of the string?". Then I simply put the pieces together in a string with the help of join, and finally push all strings into an array to have them all together.
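
To illustrate the second step, here is a minimal sketch of three common ways to take the first character of a field. The variable names and the sample field are only assumptions for the example; pick whichever form you find most readable.

```perl
use strict;
use warnings;

my $field = 'A_01:';                     # sample first field, as split would return it

my $by_substr  = substr $field, 0, 1;    # offset 0, length 1
my $by_unpack  = unpack 'a', $field;     # 'a' with no count extracts a single character
my ($by_regex) = $field =~ /^(.)/;       # a match in list context returns the capture

print "$by_substr $by_unpack $by_regex\n";   # prints "A A A"
```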

I am using Data::Dumper to view the array since it helps me a lot to debug arrays and hashes.

Sample solution with output and input (see also the updates below):

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my @final;
while (<>) { # Read every file named on the command line (@ARGV)
    chomp;
    my @tmp = split / /, $_;
    # print Dumper \@tmp;
    my $first = unpack 'a', $tmp[0];
    push @final, join(' ', $first, $tmp[2], $tmp[3]);
} continue {
    close ARGV if eof; # reset $. each file
}

print Dumper \@final;

__END__

$ perl test.pl in.txt
$VAR1 = [
          'A 1 violation',
          'A 4 violations',
          'B 3 violations'
        ];

__DATA__

A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation
A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations
B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations

Update: I read your question again and understood that you want to sum all violations, grouped by key (e.g. A_01 and A_02 both count towards A). I have posted updated code below that uses a hash.

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $num_viol = 'number of violation(s)';

my %hash;
while (<>) { # Read every file named on the command line (@ARGV)
    chomp;
    my @tmp = split / /, $_;
    my $first = unpack 'a', $tmp[0];   # first character of the key, e.g. 'A'
    if (exists $hash{$first}) {
        # Key seen before: add to the running total
        $hash{$first}{$num_viol} += $tmp[2];
        next;
    }
    # First time we see this key: start the total
    $hash{$first}{$num_viol} = $tmp[2];
} continue {
    close ARGV if eof; # reset $. each file
}

my @final;
foreach my $violation (sort keys %hash) {
    push @final, join(' ',
              $violation,
              $num_viol,
              $hash{$violation}{$num_viol});
}

print Dumper \@final;

__END__

$ perl test.pl in.txt
$VAR1 = [
          'A number of violation(s) 5',
          'B number of violation(s) 3'
        ];

__DATA__

A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation
A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations
B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations
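
As a side note, the exists check in the loop above is optional, because += creates a missing hash entry and treats its undefined value as 0 without raising a warning. A minimal sketch of that shorter form follows; the flat %total hash and the hard-coded @lines are assumptions made only to keep the example self-contained.

```perl
use strict;
use warnings;

# Sample lines in the same shape as in.txt (hard-coded for this sketch).
my @lines = (
    'A_01: xxxxxxxxxxxxxxxxxxxx......... 1 violation',
    'A_02: xxxxxxxxxxxxxxxxxxxx......... 4 violations',
    'B_02: xxxxxxxxxxxxxxxxxxxx......... 3 violations',
);

my %total;
for my $line (@lines) {
    my @tmp   = split / /, $line;
    my $first = substr $tmp[0], 0, 1;   # 'A' or 'B'
    $total{$first} += $tmp[2];          # a missing entry counts as 0 here
}

print "$_ $total{$_}\n" for sort keys %total;   # prints "A 5" then "B 3"
```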

Update 2: I was thinking that maybe in my previous update I did not use the correct input data, where each record is spread over three lines. So I created another script that handles that format. The solution is provided below:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my $num_viol = 'number of violation(s)';

sub extract_data {
    my (@lines) = @_;
    my %hash;
    foreach my $line (@lines) {
        my @tmp = split / /, $line;
        my $first = unpack 'a', $tmp[0];   # first character of the key
        if (exists $hash{$first}) {
            $hash{$first}{$num_viol} += $tmp[2];
            next;
        }
        $hash{$first}{$num_viol} = int($tmp[2]);
    }
    return \%hash;
}

my @lines;
my $concatenated;
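while (<>) { # Read every file named on the command line (@ARGV)
    chomp;
    if ($_ =~ /^\s*$/) {
        $. = 0;   # blank line: restart the line count for the next record
        next;
    }
    elsif ($. == 1) {
        $concatenated .= $_;   # first line of the record holds the key
    }
    elsif ($. == 3) {
        $concatenated .= $_;   # third line holds the number of violations
        push @lines, $concatenated;
        $. = 0;                # record complete: restart the line count
        $concatenated = '';
    }
} continue {
    close ARGV if eof; # reset $. each file
}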

my $hash = extract_data(@lines);

my @final;
foreach my $violation (sort keys %$hash) {
    push @final, join(' ',
              $violation,
              $num_viol,
              $hash->{$violation}{$num_viol});
}
print Dumper \@final;

__END__

$ perl test.pl in.txt
$VAR1 = [
          'A number of violation(s) 5',
          'B number of violation(s) 3'
        ];

__DATA__

A_01: xxxxxxx
xxxxxxxxxx
xxx......... 1 violation

A_02: xxxxxxx
xxxxxxxxxx
xxx......... 4 violations
B_02: xxxxxxx
xxxxxxxxxx
xxx......... 3 violations
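
The script above counts lines by assigning to $. directly. If you prefer not to touch that variable, the same grouping can be sketched with a small buffer. This is only a variation under the assumption that every record is exactly three non-blank lines; the @input, @buffer and @records names are mine, and the input is hard-coded to keep the example self-contained.

```perl
use strict;
use warnings;

# Sample input in the same shape as the data above (hard-coded for this sketch).
my @input = (
    '',
    'A_01: xxxxxxx', 'xxxxxxxxxx', 'xxx......... 1 violation',
    '',
    'A_02: xxxxxxx', 'xxxxxxxxxx', 'xxx......... 4 violations',
    'B_02: xxxxxxx', 'xxxxxxxxxx', 'xxx......... 3 violations',
);

my (@records, @buffer);
for my $line (@input) {
    next if $line =~ /^\s*$/;     # skip blank separator lines
    push @buffer, $line;
    if (@buffer == 3) {           # assumption: three lines per record
        push @records, $buffer[0] . $buffer[2];   # key line plus violation line
        @buffer = ();
    }
}

print "$_\n" for @records;
```

Each element of @records then has the same shape as the strings passed to extract_data, so the rest of the script can stay unchanged.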

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!
