in reply to Array of strings search

When approaching a task like this, my first inclination would be to write a fairly generic parser for lines that use this format.

The parseline() subroutine below takes a line of text, builds a hash of key/value pairs, and returns a reference to that hash to the "main program". I simply called the stuff at the beginning of the line "tag". Of course, "tag" could be broken down further into keys for "date", "time" and "whatever the stuff after the date/time means".
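
As a rough sketch of that further breakdown, the tag could be split with one more regex. Note that parse_tag() and the key names "host", "program" and "pid" are my own hypothetical choices, based on the assumption that the tag follows the usual syslog layout; they are not part of the original code.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: break the leading "tag" into smaller pieces.
# Assumes a syslog-like layout: "Mon DD HH:MM:SS host program[pid]".
sub parse_tag {
    my $tag = shift;
    my %parts;
    if ($tag =~ /^(\w{3}\s+\d+)\s+(\d\d:\d\d:\d\d)\s+(\S+)\s+([^\[\s]+)(?:\[(\d+)\])?$/) {
        @parts{qw(date time host program)} = ($1, $2, $3, $4);
        $parts{pid} = $5 if defined $5;    # the [pid] part is optional
    }
    return \%parts;    # empty hash if the tag did not match
}

my $parts = parse_tag('Nov 19 06:31:17 proxy postgrey[2439]');
print "$_=$parts->{$_}\n" for sort keys %$parts;
```

If the tag does not fit the assumed layout, the returned hash is simply empty, so the caller can fall back to keeping the tag as one string.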

The main workhorses for parsing textual data are split() and a global regex match (m//g). The code below uses both techniques.
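
To show the two techniques in isolation, here is a minimal sketch on a made-up sample line (the string and the variable names are only for illustration):

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $line = 'header: a=1, b=two words, c=3';    # made-up sample line

# split() with a limit of 2 splits only at the first ': '
my ($head, $rest) = split /: /, $line, 2;

# In list context, a /g match returns all captured groups in order,
# so the flat list (key, value, key, value, ...) can initialize a hash.
my %pairs = $rest =~ /(\w+)=([^,]+)/g;

print "head=$head\n";
print "$_ => $pairs{$_}\n" for sort keys %pairs;
```

The limit on split() matters whenever the separator could also appear later in the line; without it, the rest of the line would be cut into more pieces.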

I tried to be straightforward, but I am quite sure that understanding the code below will require some study.

I will note that it is unusual to parse an array of input lines. It is more usual to parse lines as they come in, save what is needed from each line and move on. That approach is more efficient and scalable, because only the current line has to be held in memory.

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my @lines = ('Nov 19 06:31:17 proxy postgrey[2439]: action=pass, reason=triplet found, client_name=r41.newsletter.otto.de, client_address=185.15.51.41, sender=otto@newsletter.otto.de, recipient=some.one@some.domain'
,'Nov 19 06:37:45 proxy postgrey[2439]: action=pass, reason=triplet found, client_name=uspmta194080.emarsys.net, client_address=217.175.194.80, sender=suite17@xpressus.emarsys.net, recipient=other.one@some.domain');

foreach my $line (@lines)
{
    my $hash_ref = parseline ($line);
    my %tokens = %$hash_ref;
    print "line=$line\n";
    foreach my $key (keys %tokens)
    {
        print "key=$key \t value=$tokens{$key}\n";
    }
    print "\n"; #blank line spacer
}

# parse line creates a hash of keys and values representing
# the contents of the line
sub parseline
{
    my $line = shift;
    my %tokens;
    my ($beginning_tag, $rest) = split (': ', $line,2); #space after the : required
    %tokens = ($rest =~ /(\S+)=(.+?)(?:,|$)/g);
    $tokens{tag} = $beginning_tag;
    return \%tokens;
}
__END__
Prints:
line=Nov 19 06:31:17 proxy postgrey[2439]: action=pass, reason=triplet found, client_name=r41.newsletter.otto.de, client_address=185.15.51.41, sender=otto@newsletter.otto.de, recipient=some.one@some.domain
key=client_address value=185.15.51.41
key=action value=pass
key=reason value=triplet found
key=recipient value=some.one@some.domain
key=tag value=Nov 19 06:31:17 proxy postgrey[2439]
key=client_name value=r41.newsletter.otto.de
key=sender value=otto@newsletter.otto.de

line=Nov 19 06:37:45 proxy postgrey[2439]: action=pass, reason=triplet found, client_name=uspmta194080.emarsys.net, client_address=217.175.194.80, sender=suite17@xpressus.emarsys.net, recipient=other.one@some.domain
key=reason value=triplet found
key=recipient value=other.one@some.domain
key=action value=pass
key=client_address value=217.175.194.80
key=client_name value=uspmta194080.emarsys.net
key=tag value=Nov 19 06:37:45 proxy postgrey[2439]
key=sender value=suite17@xpressus.emarsys.net
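
For comparison, here is a sketch of the line-at-a-time approach mentioned earlier. It is only an illustration under my own assumptions: the sample text is read through an in-memory filehandle so that the example runs on its own (a real script would read a log file or STDIN instead), parseline() is repeated with a small guard for lines without ': ', and counting the "action" values is just a stand-in for "save what is needed".

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Same parsing idea as parseline() in the listing above,
# repeated here so that this sketch is self-contained.
sub parseline {
    my $line = shift;
    my ($tag, $rest) = split /: /, $line, 2;
    return { tag => $tag } unless defined $rest;    # no ': ' in the line
    my %tokens = $rest =~ /(\S+)=(.+?)(?:,|$)/g;
    $tokens{tag} = $tag;
    return \%tokens;
}

# Sample input; the single-quoted heredoc keeps the @ signs literal.
my $log = <<'END_LOG';
Nov 19 06:31:17 proxy postgrey[2439]: action=pass, reason=triplet found, client_name=r41.newsletter.otto.de, client_address=185.15.51.41, sender=otto@newsletter.otto.de, recipient=some.one@some.domain
Nov 19 06:37:45 proxy postgrey[2439]: action=pass, reason=triplet found, client_name=uspmta194080.emarsys.net, client_address=217.175.194.80, sender=suite17@xpressus.emarsys.net, recipient=other.one@some.domain
END_LOG

open my $fh, '<', \$log or die "cannot open in-memory log: $!";

my %count_by_action;
while (my $line = <$fh>) {
    chomp $line;
    next unless $line =~ /\S/;            # skip blank lines
    my $tokens = parseline($line);
    # keep only what is needed from this line, then move on
    $count_by_action{ $tokens->{action} }++ if defined $tokens->{action};
}
close $fh;

print "$_: $count_by_action{$_}\n" for sort keys %count_by_action;
```

Only the small summary hash survives between lines, so memory use stays flat no matter how long the log is.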