grep the data out of a text file.

DespacitoPerl has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am writing a Perl script to extract the number of errors for each code as a summary. The input looks like this:

A_01: xxxxxxx
xxxxxxxxxx
xxx......... 1 violation

A_02: xxxxxxx
xxxxxxxxxx
xxx......... 4 violations
B_02: xxxxxxxx
xxxxxxxxxx
xxx......... 3 violations

So, from the input, I basically have to grep from A_01 down to "1 violation", which spans about 3 lines, and the output should look like this:

A number of violations = 5;
B number of violations = 3;

But, my code is only able to read the violations all though, and sum it out. My code is like this:

open(DATA, "<abc.txt, $!";

$num = 0;

while (<DATA>){

    while (<DATA>){
        if ( (($_ =~ /violation/) || ($_ =~ /violations/)))  {
            if ($_ =~ /(\d+)/) {
                $num = $num + $1;
            }
        }

    }

    if ($num == 0) {
        print "DM0$x ....... CLEANED\n"
    } else {
        print "DM0$x ....... Total = $num violations\n"
    }
}

Replies are listed 'Best First'.
Re: grep the data out of a text file. by choroba (Cardinal) on Jul 05, 2017 at 08:40 UTC
Store the counts in a hash keyed by the category (i.e. A, B in your example). You also need a variable to remember the current category; its scope must be wider than the filehandle reading loop, because the number comes on a different line than the category, i.e. in a different iteration of the loop.

#!/usr/bin/perl
use warnings;
use strict;

my %violations_by_category;
my $category;
while (<DATA>) {
    if (/^([A-Z])_[0-9]+:/) {
        $category = $1;
    } elsif (/([0-9]+) violations?/) {
        $violations_by_category{$category} += $1;
    }
}

for my $category (keys %violations_by_category) {
    print $category, " ", $violations_by_category{$category}, "\n";
}

__DATA__
A_01: xxxxxxx
xxxxxxxxxx
xxx......... 1 violation

A_02: xxxxxxx
xxxxxxxxxx
xxx......... 4 violations
B_02: xxxxxxxx
xxxxxxxxxx
xxx......... 3 violations
Re: grep the data out of a text file. by tybalt89 (Monsignor) on Jul 05, 2017 at 08:46 UTC
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1194202
use strict;
use warnings;

my %totals;
local $/ = 'violation';
/^([A-Z])(?:_\d+).* (\d+) violation/sm and $totals{$1} += $2 while <DATA>;
print "$_ number of violations = $totals{$_};\n" for sort keys %totals;

__DATA__
A_01: xxxxxxx
xxxxxxxxxx
xxx......... 1 violation

A_02: xxxxxxx
xxxxxxxxxx
xxx......... 4 violations
B_02: xxxxxxxx
xxxxxxxxxx
xxx......... 3 violations
Re: grep the data out of a text file. by Discipulus (Canon) on Jul 05, 2017 at 08:13 UTC
Hello DespacitoPerl, and welcome to the monastery and to the wonderful world of Perl!

I must admit that I do not fully understand the question. Is it the following?

> But, my code is only able to read the violations all though, and sum it out.

Please specify a bit more what you want to implement in your program. Anyway, I have some general hints.

First: always, and that means ALWAYS, `use strict; use warnings;`. This forces you to write more robust programs. `strict` forces you to declare variables with `my`, which in turn pushes you to limit the scope of those variables to the minimum required. This is good.

Second: use a more modern and safer way to open your filehandles. `open(DATA, "<abc.txt, $!";` is nonsense. Always use lexical filehandles (not bareword ones), like:

my $file_path = '/some/path/file.txt';
open my $fh, '<', $file_path or die "Impossible read from [$file_path]! $!";
# read the file and close it as soon as you do not need it anymore
close $fh or die "error closing filehandle! $!";

Moreover, your bareword filehandle `DATA` is very risky: in fact, `DATA` is a special filehandle that refers to the very program you are running, left open for reading and positioned just after the special token `__DATA__`. So do not use it for your own files (use a lexical filehandle instead). The `DATA` special filehandle is used this way:

use strict;
use warnings;
while (<DATA>){
    chomp;
    print "$_ ";
}
print "\n";
__DATA__
welcome
to
Perl
Re: grep the data out of a text file. (Update 2) by thanos1983 (Parson) on Jul 05, 2017 at 08:38 UTC
Hello DespacitoPerl,

Welcome to the monastery. Your question can be solved in many ways. Since I am not good with perlre, I came up with another solution.

First, I split each line of the given input data (e.g. in.txt) on whitespace. Second, I keep only the first character of the string (e.g. A_01); for alternative ways of doing exactly the same thing, see "What's the best way to get first character of the string?". Then I simply put all the data together in a string with the help of join, and finally push all the strings into an array to have them all together. I am using Data::Dumper to view the array, since it helps me a lot when debugging arrays and hashes.

Sample of solution with output and input: see the update below.

Read more... (944 Bytes)

Update: I read your question again and understood that you want to sum all violations together and also per key (e.g. A_01, A_02, etc.). I have posted updated code below using hashes.

Read more... (1233 Bytes)

Update 2: I was thinking that maybe in my previous update I did not use the correct input data, so I created another script that does what you asked for. The solution is provided below:

Read more... (2 kB)

Hope this helps, BR.
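
For readers who cannot expand the collapsed code, here is a minimal sketch of a split-and-substr approach to the same problem. It is not thanos1983's actual script, only an illustration under stated assumptions: the sub name `count_violations` is hypothetical, the input is passed as a list of lines rather than read from in.txt, and a header line is assumed to be any line whose first whitespace-separated field ends in a colon.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): sums violations per category
# letter using split and substr rather than capturing regexes.
sub count_violations {
    my @lines = @_;
    my %totals;
    my $category;    # must outlive a single line, since the count comes later

    for my $line (@lines) {
        my @fields = split ' ', $line;    # split on whitespace
        next unless @fields;              # skip blank lines

        if (substr($fields[0], -1) eq ':') {
            # Assumed header format "A_01: ...": keep only the first character.
            $category = substr($fields[0], 0, 1);
        }
        elsif (defined $category
            && @fields >= 2
            && ($fields[-1] eq 'violation' || $fields[-1] eq 'violations')
            && $fields[-2] =~ /^[0-9]+$/)
        {
            # Count line such as "xxx......... 4 violations".
            $totals{$category} += $fields[-2];
        }
    }
    return %totals;
}

# Sample input copied from the question.
my @input = (
    'A_01: xxxxxxx', 'xxxxxxxxxx', 'xxx......... 1 violation',
    '',
    'A_02: xxxxxxx', 'xxxxxxxxxx', 'xxx......... 4 violations',
    'B_02: xxxxxxxx', 'xxxxxxxxxx', 'xxx......... 3 violations',
);

my %totals = count_violations(@input);
print "$_ number of violations = $totals{$_};\n" for sort keys %totals;
```

With the sample input from the question this prints "A number of violations = 5;" and "B number of violations = 3;". Sorting the keys keeps the output order stable, which a plain `keys` loop does not guarantee.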