PerlMonks  

Re: Turning regex capture group variables into arrays, then counting the number of objects in the array

by markong (Pilgrim)
on Dec 04, 2018 at 12:50 UTC


in reply to Turning regex capture group variables into arrays, then counting the number of objects in the array

Assuming that the big regex you posted is *actually* tested against one line of command output and is bug-free, I'm gonna give you a skeletal example of one simple way to proceed.

But first, note that what a Perl expression returns depends on something called context (scalar or list):

$output = `program args`;   # collect output into one multiline string
@output = `program args`;   # collect output into array, one line per element
So your code probably ends up in an infinite loop, because $output is a non-empty string, which always evaluates as true inside the while() test.
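The difference is easy to see with a small experiment. The sketch below is illustrative only: the echo command is a hypothetical stand-in for the real program, and it assumes a Unix-like shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the real command; assumes a Unix-like shell.
my $cmd = 'echo one; echo two; echo three';

# Scalar context: everything comes back as a single multiline string.
my $output = `$cmd`;

# List context: one element per line of output (newlines still attached).
my @output = `$cmd`;

printf "scalar: %d character(s)\n", length $output;
printf "list:   %d line(s)\n",      scalar @output;

# while ($output) { ... } would never end: $output is a non-empty string,
# hence always true, and nothing in the loop body consumes it.
# Iterating over the list visits each line exactly once.
for my $line (@output) {
    chomp $line;
    print "line: $line\n";
}
```

Looping over the array terminates on its own, which is why the code further down switches to a for loop.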

Try to start with something like this instead:
#!/usr/bin/perl
#
use strict;
use warnings;

my @output = `bpdbjobs`;

for my $line (@output) {
    chomp $line;
    my @matches = $line =~ /(\d+)?\s+((\b[^\d\W]+\b)|(\b[^\d\W]+\b\s+\b[^\d\W]+\b))?\s+((Done)|(Active)|(\w+\w+\-\w\-+))?\s+(\d+)?\s+((\w+)|(\w+\_\w+)|(\w+\_\w+\_\w+))?\s+((\b[^\d\W]+\b\-\b[^\d\W]+\b)|(\-)|(\b[^\d\W]+\b))?\s+((\w+\.\w+\.\w+)|(\w+))?\s+((\w+\.\w+\.\w+)|(\w+))?\s+(\d+)?/g;  ## <<--- Beware of the global modifier g here!

    if (@matches) {
        # @matches now is an array containing the captured matches
        # Pretty printing time ?
    }
}
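To make the warning about the g modifier concrete, here is a minimal sketch using a deliberately tiny regex and a made-up input line (neither comes from the original question). In list context a match returns its captured groups, so the number of captures is simply the size of the resulting array; with /g the captures of every match are flattened into one list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input line and a deliberately tiny two-group regex.
my $line = '41735 Backup 41734 Catalog';

# Without /g: the captures of the first match only.
my @first = $line =~ /(\d+)\s+(\w+)/;

# With /g: the captures of every match, flattened into one list.
my @all = $line =~ /(\d+)\s+(\w+)/g;

printf "without /g: %d element(s): %s\n", scalar @first, join ',', @first;
printf "with /g:    %d element(s): %s\n", scalar @all,   join ',', @all;
```

If each line holds exactly one record, dropping /g keeps the array index of each capture group predictable.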

It should now be a matter of double-checking the regex and maybe using a module to help with printing, e.g. Text::Table.

Good luck!

Replies are listed 'Best First'.
Re^2: Turning regex capture group variables into arrays, then counting the number of objects in the array
by Djay (Novice) on Dec 04, 2018 at 15:05 UTC
    It's funny, this is actually very close to the first portion of my PowerShell script. Does this allow me to call and count the matches in an individual capture group? To do this in PowerShell (sorry to bring this here, but I find it easier to explain code in code form):
    $output = ./bpdbjobs
    $Results = @()
    $ColumnName = @()
    foreach ($match in $OUTPUT) {
        $matches = $null
        $match -match "(?<jobID>\d+)?\s+(?<Type>(\b[^\d\W]+\b)|(\b[^\d\W]+\b\s+\b[^\d\W]+\b))?\s+(?<State>(Done)|(Active)|(\w+\w+`-\w`-+))?\s+(?<Status>\d+)?\s+(?<Policy>(\w+)|(\w+`_\w+)|(\w+`_\w+`_\w+))?\s+(?<Schedule>(\b[^\d\W]+\b\-\b[^\d\W]+\b)|(\-)|(\b[^\d\W]+\b))?\s+(?<Client>(\w+\.\w+\.\w+)|(\w+))?\s+(?<Dest_Media_Svr>(\w+\.\w+\.\w+)|(\w+))?\s+(?<Active_PID>\d+)?\s+(?<FATPipe>\b[^\d\W]+\b)?"
        $Results += $matches
    }
    foreach ($result in $results) {
        $Object = New-Object psobject -Property @{
            JobID          = $Result.jobID
            Type           = $Result.Type
            State          = $Result.State
            Status         = $Result.Status
            Policy         = $Result.Policy
            Schedule       = $Result.Schedule
            Client         = $Result.Client
            Dest_media_svr = $Result.dest_media_svr
            Active_PID     = $Result.Active_PID
            FATPipe        = $Result.FATPipe
        }
        $ColumnName += $Object
    }
    PowerShell already understands that $Result.jobID refers to the jobID capture group. All this does is put each capture group and its captured value into an object, which I can store in a variable and use at any time, for example:
    $Successful = ($ColumnName | where {$_.Status -eq "0"}).count
    This creates a variable from the previous code's variable: it pulls out only the entries in the Status column ($_.Status) and counts the ones whose value is 0.

      If you have fixed-format records, then perhaps using unpack is a simpler option. For example:

      #!/usr/bin/perl
      use strict;
      use Data::Dumper;

      my $fmt = 'A13 A8 A8 A10 A6 A20 A6 A12 A16 A5';
      my %counts = ();
      my @col = ('JobID','Col2','Type','State',
                 'Status','Policy','Schedule','Client',
                 'Dest Media Svr','Active PID'); # 10 cols

      while (<DATA>){
        next unless /\S/;       # skip blank lines
        next if /^\s+JobID/;    # skip header
        chomp;
        my @f = unpack $fmt,$_;
        s/^\s+|\s+$//g for @f;  # trim spaces

        # count each column
        for my $n (0..$#col){
          ++$counts{$col[$n]}{$f[$n]};
        }
        print join "\|",@f,"\n"; # check
      }
      print Dumper \%counts;
      printf "Successful = %d\n",$counts{'Status'}{'0'};

      __DATA__
             JobID Type            State    Status Policy           Schedule Client      Dest Media Svr Active PID
             41735         Backup  Done        0   Policy_name_here    daily hostname001 MediaSvr1       8100
             41734         Backup  Done        0   Policy_name_here    daily hostname002 MediaSvr1       7803
             41733         Backup  Done        0   Policy_name_here    daily hostname004 MediaSvr1       7785
             41732         Backup  Done        0   Policy_name_here    daily hostname005 MediaSvr1       27697
             41731         Backup  Done        0   Folicy_name_here    daily hostname006 MediaSvr1       27523
             41730         Backup  Done        0   Policy_name_here    daily hostname007 MediaSvr1       27834
             41729         Backup  Done        0   Policy_name_here    -     hostname008 MediaSvr1       27681
             41728         Backup  Done        0   Policy_name_here    -     hostname009 MediaSvr1       27496
             41727 Catalog Backup  Done        0   catalog             full  hostname010 MediaSvr1       27347
             41712 Catalog Backup  Done        0   catalog             -     hostname004                 30564
      poj

        I've only just had a chance to take a look at this, and it is almost certainly what I've been looking for. The output is perfect, IF I can pass the command output through to __DATA__.

        The command's output ALWAYS has the same formatting, just with different data, so in theory this should work? If so, can you explain a way for me to pass the command's output over to __DATA__ for unpack?
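One possible way to do this, sketched under assumptions: rather than putting the output into __DATA__, the same unpack loop can read from a filehandle opened on the command. The helper name count_columns and the two-column format below are hypothetical, and an in-memory filehandle with made-up lines stands in for the command so the sketch runs anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads fixed-width lines from any filehandle and
# counts the distinct values seen in each named column.
sub count_columns {
    my ($fh, $fmt, @col) = @_;
    my %counts;
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;      # skip blank lines
        chomp $line;
        my @f = unpack $fmt, $line;
        s/^\s+|\s+$//g for @f;          # trim spaces
        ++$counts{ $col[$_] }{ $f[$_] } for 0 .. $#col;
    }
    return \%counts;
}

# Stand-in input: an in-memory filehandle with a made-up two-column layout.
my $sample = "41735 0\n41734 0\n41733 1\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";

# With the real command, a pipe would replace the in-memory handle:
#   open my $fh, '-|', 'bpdbjobs' or die "Cannot run bpdbjobs: $!";

my $counts = count_columns($fh, 'A6 A1', 'JobID', 'Status');
close $fh;

printf "Successful = %d\n", $counts->{Status}{0} // 0;
```

For the real report, the format string and column list from the unpack example above would be passed in place of the two-column stand-ins, and the header line would need to be skipped as it is there.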

      I have zero PowerShell knowledge, but from what you describe you probably want to use the named capture groups feature (?<NAME>...) (the original regex you posted didn't contain any...).
      This is still a capture group just like a regular parenthesized grouping, but its name is NAME.
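As a rough Perl counterpart to the PowerShell objects, here is a minimal sketch. After a successful match, the named captures are available in the special hash %+, which can be copied into a list of records and then filtered with grep. The regex is a simplified, hypothetical one covering only the first five columns, and the sample lines are made up (the third has a non-zero status purely for illustration).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample lines modeled on the data posted in the thread.
my @output = (
    "41735 Backup Done 0 Policy_name_here daily hostname001 MediaSvr1 8100\n",
    "41734 Backup Done 0 Policy_name_here daily hostname002 MediaSvr1 7803\n",
    "41733 Backup Done 1 Policy_name_here daily hostname004 MediaSvr1 7785\n",
);

# Simplified, hypothetical regex: only the first five columns are named.
my $re = qr/^\s*(?<JobID>\d+)\s+(?<Type>\S+)\s+(?<State>\S+)\s+(?<Status>\d+)\s+(?<Policy>\S+)/;

my @rows;
for my $line (@output) {
    chomp $line;
    next unless $line =~ $re;
    push @rows, +{ %+ };    # copy the named captures before the next match
}

# Count the rows whose Status capture is 0.
my $successful = grep { $_->{Status} eq '0' } @rows;
print "Successful = $successful\n";
```

Each element of @rows is a hash reference keyed by group name, so $rows[0]{JobID} plays the role of $Result.jobID.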

      I'd recommend you look at some examples, and while you're at it you should also read part 1 of that wonderful tutorial to understand how to access the various bits of information from a successful match!

      It's all there!
