Is this possible

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Is this possible by pc88mxer (Vicar) on Mar 24, 2008 at 02:12 UTC
Untested, but this should work: `my @files = @ARGV; my %table; while (<>) { my ($name, $value) = split(' ', $_); $table{$name}->{$ARGV} = $value; } print join("\t", "NAME", @files), "\n"; for my $n (keys %table) { print $n; for my $f (@files) { print "\t", $table{$name}->{$f}; } print "\n"; }` [download] Then invoke it with: `perl script-name RESULTS_FILE*` [download]	[reply] [d/l] [select]
Re^2: Is this possible by Anonymous Monk on Mar 24, 2008 at 06:22 UTC
It seems to work but it gets a bit messy with the filenames (it prints one over the other, not correctly separated with tabs)... The numbers seem to print ok though...	[reply]
Re^3: Is this possible by pc88mxer (Vicar) on Mar 24, 2008 at 16:02 UTC
Are you sure you have this line entered correctly? `print join("\t", "NAME", @files), "\n";` [download] That's the code responsible for printing the header line, so that's where I'd look. This will print tab characters in between each file name. What you are seeing might be an artifact of your text editor or the way your are viewing the file.	[reply] [d/l]
Re: Is this possible by davido (Cardinal) on Mar 24, 2008 at 03:20 UTC
There's not really any "correct" way. ...or maybe it's more accurate to say, there's no wrong way as long as it works and doesn't gobble all your memory in the process. ;) Here is yet another way to do it: `use strict; use warnings; my @files = @ARGV; my %table; foreach my $filename (@files) { open my $infile, '<', $filename or die $!; while( <$infile> ) { chomp; next unless $_; my( $name, $val ) = split; push( @{$table{ $name }}, $val ); } } my @names = sort keys %table; my $output = join "\t", "NAME\t", @files; foreach my $entry ( @names ) { $output .= join "\t\t", "\n$entry", @{$table{$entry}}; } $output .= map { local $" = "\t\t"; "@{$table{$_}}\n"; } @names; print $output, "\n";` [download] Usage: perl scriptname filename1 filename2 It doesn't bother to check to ensure that all files contain the proper number of entries. That could cause your columns to become misaligned, if for example, the second file doesn't contain an entry for Georgia. Dave	[reply] [d/l]
Re: Is this possible by NetWallah (Canon) on Mar 24, 2008 at 03:22 UTC
MOre untested: `use strict; my %info; for (1..50){ process_file($_); } print "NAME\t"; print "$_\t" for (1..50); print "\n"; for my $k (sort keys %info){ print "$k\t"; print "$info{$k}->[$_]\t" for (1..50); print "\n"; } sub process_file{ my $filenbr = shift; open my $f, "<" , "RESULT_FILE$filenbr" or die "Cannot read file $_\ +n$!"; while(<$f>){ my ($name,$val) = split; $info{$name}->[$filenbr] = $val; } close $f; }` [download] "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom	[reply] [d/l]
Re: Is this possible by jwkrahn (Abbot) on Mar 24, 2008 at 06:28 UTC
The main problem with the first two solutions presented is that they rely on the glob sort order so that `'RESULTS_FILE2'` will be between `'RESULTS_FILE19'` and `'RESULTS_FILE20'`, etc. This will glob the files in the correct order: `#!/usr/bin/perl use warnings; use strict; @ARGV = map glob( 'RESULTS_FILE' . ( '[0-9]' x $_ ) ), 1 .. 4; my @headers = ( 'NAME', @ARGV ); my %data; while ( <> ) { my ( $name, $value ) = split; push @{ $data{ $name } }, $value; } print join( "\t", @headers ), "\n"; for my $name ( keys %data ) { print join( "\t", $name, @{ $data{ $name } } ), "\n"; }` [download]	[reply] [d/l] [select]
Re^2: Is this possible by pc88mxer (Vicar) on Mar 24, 2008 at 16:29 UTC
Good point - depending on the OP's needs I can see where this would be useful. One issue I have with the solutions which use `push` is that if a file doesn't contain a name (or contains a name not found in other files), then the outputted data is not going to be lined up correctly. When implemented with a hash, the script can be used on non-uniform data files, and it makes it easy to spot data files which don't have the same names as the others (i.e. missing data.)	[reply] [d/l]
Re: Is this possible by BrowserUk (Patriarch) on Mar 24, 2008 at 08:20 UTC
Golf? `perl -e"BEGIN{@ARGV=map qq[RESULTS_FILE.$_],1..50}" -nle"$h{$_->[0]}.=qq[\t$_->[1]]for[split]}{print qq[@$_]while@{$_=[eac +h%h]}"` [download] Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply] [d/l]
Re^2: Is this possible by ikegami (Patriarch) on Mar 24, 2008 at 08:30 UTC
You don't need a seperate -e for the BEGIN (saves 5 chars)	[reply]
Re: Is this possible by locked_user sundialsvc4 (Abbot) on Mar 24, 2008 at 15:54 UTC
For the record, if you do have enormous amounts of information that you need to, say, “cross-tabulate” in this way, you can accomplish the task for arbitrary amounts of data by: Dump the data into a temporary flat-file, as tuples like `{'Kostas', 'RESULTS_FILE2', 3.1415926}.` Sort the file, using an external disk sort such as the venerable `sort` command or an equivalent CPAN module. Sort it by two keys: first by the 'Kostas' field and then (i.e. within 'Kostas') by 'RESULTS_FILE2.' Now process the sorted file. Everything for 'Kostas' will be together. Everything for Kostas' columns will be already-arranged in ascending column order. Anything that's not in the right place does not exist, period. Virtually no memory is required. Sorting is an “unexpectedly fast” algorithm that, for large amounts of data, consistently beats-the-pants off of any random-access algorithm. When you saw all those tapes spinning back and forth (actually, if you watched very closely, the tapes never changed direction...) that's what the computers were doing. Back before computers existed, that's also how it was done ... with punched cards. The same principles still apply. They still work like nothing else does.
Re: Is this possible by radiantmatrix (Parson) on Mar 24, 2008 at 15:11 UTC
Well, you can get a lot of this done using File::Find and Text::CSV_XS. Some example code: use strict; use warnings; use Text::CSV_XS; use IO::File; use File::Find; my (%data, @file); my $in_csv = Text::CSV_XS->new({ sep_char => " " }); #space-sep in my $out_csv = Text::CSV_XS->new({ sep_char => "\x09" }); #tab-sep out my $out = IO::File->new('report.tab','>') or die "Can't open report.tab for writing: $!\n"; find(\&parse_file, '.'); #find files, running parse_file() for each $out_csv->print($out, [ 'Name', @file ]); ## write out report foreach my $name (sort keys %data) { my @row = ($name); foreach (@file) { push @row, $data{$name}{$_}; } $out_csv->print($out, \@row); } sub parse_file { # only deal with files that match naming convention return unless -f $_ && m/RESULTS_FILE\.(\d+)/; my $file_number = $1; # captured number from match my $filename = $_; push @file, $filename; my $io = IO::File->new($filename,'<') or do { warn "Can't open $filename: $!\n"; return; }; while (my $row = $in_csv->getline($io)) { # e.g. first row in your example results in: # $data{Kostas}{RESULT_FILE.1} = Number1 $data{ $row->[0] }{ $filename } = $row->[1]; } } [download] I think you'll find that very fast and maintainable. <–radiant.matrix–> Ramblings and references The Code that can be seen is not the true Code I haven't found a problem yet that can't be solved by a well-placed trebuchet	[reply] [d/l]