steamerboy has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'd be VERY interested in any ideas for how to reorder the data from some results files I have into something useful.

I have results files that look like this:

"S (00exp)", "2/16/105", "19:59:6", "131.111.249.177", "18 (04Age)", "M (05sex)", "british (07Cn)", " (17Com)", "8 (SvAA)", "9 (SvAE)", "10 (SvAH)", "10 (SvAO)", "9 (SvAW)", "10 (SvAY)", "5 (SvB)", "3 (SvCH)", "4 (SvD)", "2 (SvDH)", "9 (SvEH)", "9 (SvER)", "8 (SvEY)", "2 (SvF)", "8 (SvG)", " (SvHH)", "2 (SvIH)", "9 (SvIY)", "3 (SvJH)", "5 (SvK)", "8 (SvL)", "7 (SvM)", "8 (SvN)", "9 (SvNG)", "10 (SvOW)", "9 (SvOY)", "4 (SvP)", "8 (SvR)", "2 (SvSH)", "8 (SvT)", "2 (SvTH)", "10 (SvUH)", "9 (SvUW)", "6 (SvV)", "7 (SvW)", "9 (SvY)", "4 (SvZ)", "6 (SvZH)", "complete"

And this:

"DH (00exp)", "2/21/105", "16:43:11", "62.6.139.12", "42 (04Age)", "F (05sex)", "uk (07Cn)", " (17Com)", "8 (DHvAA)", "9 (DHvAE)", "8 (DHvAH)", "9 (DHvAO)", "9 (DHvAW)", "9 (DHvAY)", "3 (DHvB)", "3 (DHvCH)", "2 (DHvD)", "8 (DHvEH)", "4 (DHvER)", "8 (DHvEY)", "4 (DHvF)", "4 (DHvG)", "3 (DHvHH)", "8 (DHvIH)", "8 (DHvIY)", "2 (DHvJH)", "5 (DHvK)", "2 (DHvL)", "5 (DHvM)", "3 (DHvN)", "5 (DHvNG)", "9 (DHvOW)", "9 (DHvOY)", "4 (DHvP)", "4 (DHvR)", "3 (DHvS)", "3 (DHvSH)", "2 (DHvT)", "1 (DHvTH)", "8 (DHvUH)", "4 (DHvUW)", "2 (DHvV)", "3 (DHvW)", "3 (DHvY)", "3 (DHvZ)", "3 (DHvZH)", "complete"

----

I have 39 such files.

Each score corresponds to the name in brackets after it, e.g. (DHvT).

For DHvT there will be, in another file, a value which represents the corresponding point in the matrix; this would be TvDH, in the file for all T values.

Each file has 38 values, and there are 39 files. I want to convert this data into something I can paste into a spreadsheet (maybe not all at once), resulting in the matrix.

The matrix will be 39 by 39.

39 by 39 = 1521 cells.

39 files with 38 values each = 1482 values.

The difference is 39, corresponding to the already known values for AvA, DvD, etc.

A basic solution for me would be a Perl script that, for one of the data files, returns the values in the same order but with each value on a new line (accompanied by the name, so I know what the result corresponds to; I will delete this eventually, though!).

I could mass-replace all non-required items with a blank to give me the necessary data, but I need each name-value pair on a new line.

Also if anyone has any other approaches to suggest, please do so! Thank you for any help!

Replies are listed 'Best First'.
Re: Any help for this problem. Parsing/constructing results
by saintmike (Vicar) on Feb 26, 2005 at 16:52 UTC
    What you need is a matrix, as a data structure to store the mappings in. In Perl, this can be done easily with a hash of hashes.

    Here's a code sample that should get you started: It reads in a single file from the DATA section, parses the mappings with a regular expression and stores them in a matrix.

    The for loops that follow iterate through the matrix, printing out the mappings.

    Just repeat this for all of your files and add some code to produce a comma-separated format suitable for your spreadsheet program.

    use strict;
    use warnings;

    # Slurp the whole DATA section into one string.
    my $data = join '', <DATA>;
    my $matrix = {};

    # Match fields such as "8 (SvAA)": $1 = score, $2 = row, $3 = column.
    # Fields with an empty score, such as " (SvHH)", are skipped.
    while ( $data =~ /"(\d+) \s+ \( (\w+)v(\w+) \)"/gx ) {
        $matrix->{$2}->{$3} = $1;
    }

    # Print every row/column pair together with its score.
    for my $x ( keys %$matrix ) {
        for my $y ( keys %{ $matrix->{$x} } ) {
            print "$x / $y => $matrix->{$x}->{$y}\n";
        }
    }

    __DATA__
    "S (00exp)", "2/16/105", "19:59:6", "131.111.249.177", "18 (04Age)", "M (05sex)", "british (07Cn)", " (17Com)", "8 (SvAA)", "9 (SvAE)", "10 (SvAH)", "10 (SvAO)", "9 (SvAW)", "10 (SvAY)", "5 (SvB)", "3 (SvCH)", "4 (SvD)", "2 (SvDH)", "9 (SvEH)", "9 (SvER)", "8 (SvEY)", "2 (SvF)", "8 (SvG)", " (SvHH)", "2 (SvIH)", "9 (SvIY)", "3 (SvJH)", "5 (SvK)", "8 (SvL)", "7 (SvM)", "8 (SvN)", "9 (SvNG)", "10 (SvOW)", "9 (SvOY)", "4 (SvP)", "8 (SvR)", "2 (SvSH)", "8 (SvT)", "2 (SvTH)", "10 (SvUH)", "9 (SvUW)", "6 (SvV)", "7 (SvW)", "9 (SvY)", "4 (SvZ)", "6 (SvZH)", "complete"

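The "comma separated format" step mentioned in this reply could look roughly like the sketch below. It assumes the scores already sit in a hash of hashes keyed as row, then column. The name matrix_to_csv and the sample values are made up for illustration and are not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a hash of hashes ($matrix->{row}{col} = score)
# into CSV text with a header row. Missing cells are left empty.
sub matrix_to_csv {
    my ($matrix) = @_;

    # Collect every label that appears as a row or as a column.
    my %seen;
    for my $row ( keys %$matrix ) {
        $seen{$row} = 1;
        $seen{$_} = 1 for keys %{ $matrix->{$row} };
    }
    my @labels = sort keys %seen;

    # Header line: an empty corner cell followed by the column labels.
    my @lines = ( join ',', '', @labels );

    for my $row (@labels) {
        # Use an empty hash for labels that never occur as a row,
        # so the caller's data is not modified by autovivification.
        my $cols  = $matrix->{$row} || {};
        my @cells = map { defined $cols->{$_} ? $cols->{$_} : '' } @labels;
        push @lines, join ',', $row, @cells;
    }
    return join( "\n", @lines ) . "\n";
}

# Tiny made-up matrix, only to show the shape of the output.
my %sample = (
    S  => { DH => 2, T => 8 },
    DH => { S  => 3, T => 2 },
);
print matrix_to_csv( \%sample );
```

Sorting the labels keeps rows and columns in the same order, so the diagonal (SvS, DHvDH and so on) lines up and simply stays blank.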
      Is there a way I could get this to work if I just pasted in the data from all (38) files, and could it also write to a file? Some of the files contain more than one set of the data; I'm not sure if that makes a difference. Thanks for the help! (steamerboy, not logged in)
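A rough sketch of how the follow-up question might be handled: all records are pasted into one string, and the result is written to a file. The names collect_scores, write_scores and scores.csv are hypothetical. Keeping every score for a repeated pair in a list is only an assumption about what should happen when a file holds more than one set of data.

```perl
use strict;
use warnings;

# Hypothetical helper: scan text holding any number of pasted records and
# return $matrix->{row}{col} = [ every score seen for that pair ].
sub collect_scores {
    my ($text) = @_;
    my %matrix;
    while ( $text =~ /"(\d+) \s+ \( (\w+)v(\w+) \)"/gx ) {
        push @{ $matrix{$2}{$3} }, $1;
    }
    return \%matrix;
}

# Hypothetical helper: write one "ROWvCOL,score,score,..." line per pair.
sub write_scores {
    my ( $matrix, $path ) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    for my $row ( sort keys %$matrix ) {
        for my $col ( sort keys %{ $matrix->{$row} } ) {
            my @scores = @{ $matrix->{$row}{$col} };
            print {$out} join( ',', "${row}v$col", @scores ), "\n";
        }
    }
    close $out or die "Cannot close $path: $!";
}

# Two made-up records pasted one after the other; both contain SvT and SvDH.
my $pasted = <<'END';
"S (00exp)", "8 (SvT)", "2 (SvDH)", "complete"
"S (00exp)", "6 (SvT)", "4 (SvDH)", "complete"
END

write_scores( collect_scores($pasted), 'scores.csv' );
```

With real data, the here-document could be replaced by slurping the pasted text from a file or from standard input. Pairs that occur more than once then show up as lines with several scores, which makes repeated sets easy to spot.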
Re: Any help for this problem. Parsing/constructing results
by sh1tn (Priest) on Feb 26, 2005 at 17:14 UTC
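    use Data::Dumper;

    my $data;
    # \d* captures the whole score, including multi-digit values such as 10,
    # and also allows an empty score.
    for (<DATA>) {
        $data->{$1} = [ $2, $3 ] while /((\d*)\s+\((\w+)\))/g;
    }
    print Dumper($data);
    __END__
    ...
    '3 (DHvY)' => [ '3', 'DHvY' ],
    ...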
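For the "basic solution" in the original question (each name-value pair on its own line, in the original order), a minimal sketch follows. The name pairs_in_order and the shortened sample record are illustrative only. Fields without a bracketed name, such as the date or "complete", are skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: return "name<TAB>value" strings in the order in which
# the fields occur in the record.
sub pairs_in_order {
    my ($record) = @_;
    my @pairs;
    # Each quoted field looks like "value (name)"; the value may be empty.
    while ( $record =~ /"([^"(]*?)\s*\((\w+)\)"/g ) {
        push @pairs, "$2\t$1";
    }
    return @pairs;
}

# Shortened record based on the sample data in the question.
my $record = '"S (00exp)", "2/16/105", "18 (04Age)", "8 (SvAA)", "10 (SvAH)", " (SvHH)", "complete"';
print "$_\n" for pairs_in_order($record);
```

Because the pairs are pushed onto an array rather than stored in a hash, the file order is preserved, and tab-separated lines paste directly into two spreadsheet columns.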