File Manipulation

Mark.Allan has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: File Manipulation by hdb (Monsignor) on Aug 23, 2013 at 11:04 UTC
I would use a simple array of arrays where the first element of each sub-array is the server, like this:

```perl
use strict;
use warnings;

my @logs;
while(<DATA>){
  # A [server] line starts a new sub-array
  push @logs, [ $1 ] if /\[(.*)\]/;
  # A path line is appended to the most recent sub-array
  push @{$logs[-1]}, $1 if /^(\/.*)/;
}
print shift @$_, ":", join( ",", @$_ ), "\n" for @logs;

__DATA__
[server1]
/tmp/location1/file.log
/tmp/location2/file.log
[server2]
/usr/loc1/file.log
/usr/loc2/file.log
[server3]
/citrix/dir3/file.log
```
Re: File Manipulation by kcott (Archbishop) on Aug 23, 2013 at 11:23 UTC
G'day Mark.Allan,

Assuming your input file isn't so large that reading all its data at once causes memory issues, you can do something like this:

```
$ perl -Mstrict -Mwarnings -Mautodie -e '
    open my $fh, "<", "./pm_1050631_in.txt";
    my $data = do { local $/; <$fh> };
    close $fh;

    my %server;
    my $re = qr{\[(\w+)\]\s+([^[]*)};

    while ($data =~ /$re/g) {
        $server{$1} = join "," => split /\s+/ => $2;
    }

    for (sort keys %server) {
        print "$_:$server{$_}\n";
    }
'
server1:/tmp/location1/file.log,/tmp/location2/file.log
server2:/usr/loc1/file.log,/usr/loc2/file.log
server3:/citrix/dir3/file.log
```

-- Ken
Re: File Manipulation by 2teez (Vicar) on Aug 23, 2013 at 12:46 UTC
Hi,

You have been given great solutions, but in the spirit of TIMTOWTDI ("tim toady"), you could also check this (a slight modification of the solutions already given):

```perl
use strict;
use warnings;

my %logger;
my $key;
while(<DATA>){
    s/\s+$//;
    if(/\[(.*)\]/){
        $key = $1;
    }else{push @{$logger{$key}}, $_;}
}

print $_,":", join ("," => @{$logger{$_}}),$/ for sort {$a cmp $b} keys %logger;

__DATA__
[server1]
/tmp/location1/file.log
/tmp/location2/file.log
[server2]
/usr/loc1/file.log
/usr/loc2/file.log
[server3]
/citrix/dir3/file.log
```

NOTE: The key changes only when a server name is seen and stays the same until the next one, so every path line is filed under the right server. You could also see perldsc.

If you tell me, I'll forget. If you show me, I'll remember. If you involve me, I'll understand. --- Author unknown to me
Re: File Manipulation by ww (Archbishop) on Aug 23, 2013 at 10:58 UTC
Try reading about the input record separator, `$/`, which is explained in http://www.perl.com/pub/2004/06/18/variables.html under the second subhead, "The Field Record Separators." Take note of the fact, however, that you can't use a regex to set the value.

Update: added sample code (below)

```perl
#!/usr/bin/perl
use 5.016;
use warnings;
use Data::Dumper;

my ($para, @para, $val, @val);
my $serverid = '';

local $/ = "[server";

while ($para = <DATA>) {
    chomp $para;
    if ( $para =~ /(\d+\])(.*)(?:\[server)*/s ) {
        chomp $1;
        $serverid = $1;
        push (@para, "server" . "$serverid: ");
        $val = $2;
        if ( $val =~ /^\n(.*)/s ) {
            $val = $1;
        }
        $val =~ s/\n/, /gs;
        push (@val, $val);
    }
}

my $i;
for $i( 0 .. $#para ) {
    $para[$i] =~ s/[\]]//;    # get rid of square brackets (if you must)
    say "$para[$i]$val[$i]";
}

=head OUTPUT:
C:\>1050631.pl
server1: /tmp/location1/file.log, /tmp/location2/file.log,
server2: /usr/loc1/file.log, /usr/loc2/file.log,
server3: /citrix/dir3/file.log, ,
server17: /etc/bin/dat/file.log, /etc/misc/logs/files3.log, ,
=cut

__DATA__
[server1]
/tmp/location1/file.log
/tmp/location2/file.log
[server2]
/usr/loc1/file.log
/usr/loc2/file.log
[server3]
/citrix/dir3/file.log

[server17]
/etc/bin/dat/file.log
/etc/misc/logs/files3.log

[server0Xff]
/won't be seen/file.log
/nor/this/file.log
/because/server_name/does not match regex in Ln13
```

:-(* ... and now, even more belatedly, I see johngg beat me to it, in time and elegance! ++

My apologies to all those electrons which were inconvenienced by the creation of this post.
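To illustrate the point that `$/` is taken as a literal string and never as a pattern, here is a minimal sketch. The helper name `read_records` and the sample strings are made up for this example; the in-memory filehandle is only a convenience so the snippet runs without an input file.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into records on a literal separator.
sub read_records {
    my ($text, $separator) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = $separator;    # a literal string; regex metacharacters have no special meaning
    my @records;
    while (my $record = <$fh>) {
        chomp $record;        # chomp strips the current value of $/, not just "\n"
        push @records, $record if length $record;
    }
    close $fh;
    return @records;
}

# "." is an ordinary full stop here, so "axb" stays in one piece.
print "<$_>\n" for read_records("axb.c", ".");
```

If `$/` were treated as a regex, the `.` would match every character; as a literal it splits only at the real full stop.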
Re: File Manipulation by johngg (Canon) on Aug 23, 2013 at 13:26 UTC
Just to bring another bottle to the party and to take up ww's suggestion on the input record separator.

```
$ perl -Mstrict -Mwarnings -MData::Dumper -e '
open my $inFH, q{<}, \ <<EOD or die $!;
[server1]
/tmp/location1/file.log
/tmp/location2/file.log
[server2]
/usr/loc1/file.log
/usr/loc2/file.log
[server3]
/citrix/dir3/file.log
EOD

my %assoc;
{
    local $/ = q{[};
    scalar <$inFH>;    # Get rid of first '['
    while ( <$inFH> )
    {
        chomp;
        my( $server, $fileStr ) = split m{]\n};
        $assoc{ $server } = [ split m{\n}, $fileStr ];
    }
}

print Data::Dumper->Dumpxs( [ \ %assoc ], [ qw{ *assoc } ] );'
%assoc = (
           'server3' => [
                          '/citrix/dir3/file.log'
                        ],
           'server2' => [
                          '/usr/loc1/file.log',
                          '/usr/loc2/file.log'
                        ],
           'server1' => [
                          '/tmp/location1/file.log',
                          '/tmp/location2/file.log'
                        ]
         );
$
```

I hope this is helpful.

Cheers,

JohnGG
Re: File Manipulation by locked_user sundialsvc4 (Abbot) on Aug 23, 2013 at 13:12 UTC
Adding my personal “tim toady” to this, I view such problems in an `awk`-like sort of way. There are two “kinds of” lines here: “those that look like `[servername]`,” and, in the simplest case, “those that don’t.” There is one thing to be done in each case. The data structure of choice is a hashref, whose elements are arrayrefs containing the file names. Perl’s “auto-vivification” feature does, as intended, most of the work, viz:

(extemporaneous coding follows ... your syntax may vary ... stripped to the bare parts for clarity)

```perl
my $server_name;
my $results;

while (my $line = <$file>) {
    if ($line =~ /\[(.*)\]/) {
        $server_name = $1;
    }
    elsif ($line =~ /(\/.*)/) {
        die "file doesn't begin with servername line!"
            unless defined($server_name);
        push @{ $results->{$server_name} }, $1;
    }
}

# keys needs a hash, so dereference the hashref explicitly
foreach my $k (keys %$results) {
    print "$k: "
        . join(" ", @{ $results->{$k} } )
        . "\n";
}
```

Notice how, in the `push` statement, we simply rely upon Perl to create a new hash bucket, if one does not yet exist, and to treat the whole thing as an arrayref upon which we can push things. This is the “auto-vivification” of which I was speaking. Notice also that the program checks for a file-name line that appears before any server-name record, and will `die` if it finds one. The other bits of writing things on multiple source lines and so forth are just my personal style.
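The auto-vivification described above can be tried in isolation with a small sketch. The sub name `group_by_server` and the sample lines are hypothetical, chosen only for illustration; lines are passed in as a list so the snippet runs without an input file.

```perl
use strict;
use warnings;

# Hypothetical helper: file each path line under the most recent [server] line.
sub group_by_server {
    my @lines = @_;
    my ($server, %files);
    for my $line (@lines) {
        if ($line =~ /^\[(.+)\]/) {
            $server = $1;
        }
        elsif (defined $server and $line =~ m{^(/.*)}) {
            # $files{$server} does not exist the first time a server is seen;
            # dereferencing it as an array makes Perl create the arrayref.
            push @{ $files{$server} }, $1;
        }
    }
    return \%files;
}

my $files = group_by_server(
    '[server1]', '/tmp/location1/file.log', '/tmp/location2/file.log',
    '[server2]', '/usr/loc1/file.log',
);
print "$_: @{ $files->{$_} }\n" for sort keys %$files;
```

No key is ever initialised with an empty arrayref; the `push` alone is enough to bring each entry into existence.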
Re: File Manipulation by Laurent_R (Canon) on Aug 23, 2013 at 18:10 UTC
If your file is as nicely ordered as the sample you have shown, you probably don't even need any data structure but can print as you read the lines. Something like this:

```perl
use strict;
use warnings;

my $line;
while (<DATA>) {
    chomp;
    if (/\[(server\d+)\]/) {
        print $line, "\n" if defined $line;
        $line = $1 . ": ";
    } else {
        $line .= $_;
        $line .= ',';
    }
}
print $line, "\n";

__DATA__
[server1]
/tmp/location1/file.log
/tmp/location2/file.log
[server2]
/usr/loc1/file.log
/usr/loc2/file.log
[server3]
/citrix/dir3/file.log
```

Output:

```
$ perl serv.pl
server1: /tmp/location1/file.log,/tmp/location2/file.log,
server2: /usr/loc1/file.log,/usr/loc2/file.log,
server3: /citrix/dir3/file.log,
```