aash08 has asked for the wisdom of the Perl Monks concerning the following question:

My first time posting something here.

I am facing a little problem. I have written some code that assembles data from different files into an output file. The problem is that I now want to write another script that puts the data in this output file in sequence with respect to one field.
For instance, my data looks like


......
2.225:0:1248266065752:Y:282
2.232:0:1248266069770:Y:500
2.225:1:1248266072861:Y:438
2.232:1:1248266075785:Y:328
2.225:1:1248266081283:Y:297
2.232:1:1248266082035:Y:328
2.232:1:1248266087410:Y:281
2.225:1:1248266088768:Y:296
2.232:1:1248266091426:Y:281
....

What I would want, keeping the first field in mind, is all the lines belonging to the ID 2.225 in sequence, and when those are finished, the lines for 2.232, and so on. In short, I want to sort the lines according to the first field, which you could call the ID.

I had some ideas, but I'm afraid I was not able to make them work; can anyone out there provide an effective solution? (: I would be so grateful!!

Replies are listed 'Best First'.
Re: a little problem with sorting my data
by ikegami (Patriarch) on Jul 27, 2009 at 17:17 UTC
    From the prompt,
    sort data > sorted_data

    Alternatively, the following would be an efficient Perl solution:

    my %grouped;
    while (...) {
        my @fields = ...;
        push @{ $grouped{$fields[0]} }, \@fields;
    }

    for my $group (values %grouped) {
        ...
    }

      On your second one: That will group by ID, but I don't believe it will sort by ID. That is to say: you will get all the records with a certain ID together, but you won't get the IDs themselves in any particular order. (Unless values does some subtle sorting I'm not aware of.)

        I had written a line about that, but I must have deleted it by accident.

        Indeed, it groups but it doesn't sort. I had already provided a solution for sorting. If the OP also wants to sort the ids and then do something with them, he can do:

        for ( sort <$fh> ) {
            my @fields = ...;
            ...
        }
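
        Putting the two snippets together, here is a minimal runnable sketch; the file name data.txt is a placeholder and the field layout is assumed from the sample data in the question. It groups the records by the first field (keeping each ID's original line order) and then walks the IDs numerically:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Hypothetical input file holding records like "2.225:0:1248266065752:Y:282"
        open my $fh, '<', 'data.txt' or die "Cannot open data.txt: $!";

        my %grouped;
        while (my $line = <$fh>) {
            chomp $line;
            my @fields = split /:/, $line;
            push @{ $grouped{$fields[0]} }, \@fields;   # group records by the first field (the ID)
        }
        close $fh;

        # Walk the IDs in sorted (numeric) order, then each record within an ID
        for my $id (sort { $a <=> $b } keys %grouped) {
            for my $record (@{ $grouped{$id} }) {
                print join(':', @$record), "\n";
            }
        }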
Re: a little problem with sorting my data
by moritz (Cardinal) on Jul 27, 2009 at 17:18 UTC
    It would be helpful to know what exactly you have tried, and how it failed. Just feeding the data as-is line by line to sort groups them by ID (although it doesn't sort them numerically if the ID is of variable width).
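
    If the IDs ever do become variable width, a hedged sketch of a numeric sort on the first field (the file name 'output.txt' is only a stand-in for whatever file holds the records shown above):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Sort numerically on the first colon-separated field of each line.
    open my $fh, '<', 'output.txt' or die "Cannot open: $!";
    print sort { (split /:/, $a)[0] <=> (split /:/, $b)[0] } <$fh>;
    close $fh;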
      OK, here is what makes my output file:
      #!usr/bin/perl
      $k = 1;
      $file_name = "QoEWeb_DB_Log.txt"; #An input LOG file which holds information about different users.
      open(SW,$file_name) or die "ERROR OPENING IN FILE";
      open FILE, ">output.txt" or die "ERROR..unable to write" ; #Will write the result into an OUTPUT file.
      while (<SW>)
      {
          chomp($eachline);
          @file_name1 = ("Carlo_Arnold_2.232_Final.txt", "Sohaib_Ahmad_2.225_Final.txt"); #Input LOG files; Each file holds information about Individual user. I will be adding about 30 files here.
          @logarray = split(/:/,$_); # Taking required fields from the first input file.
          $field1 = @logarray[2];
          $field2 = @logarray[4];
          $key1 = @logarray[8];
          $field5 = @logarray[6];
          $x = scalar(@file_name1);
          for($j=0;$j<$x;$j++)
          {
              open(RW,@file_name1[$j]) or die "ERROR OPENING IN FILE";
              while (<RW>)
              {
                  chomp($eachline);
                  @ff_array = split(/:/,$_); #Taking required fields from the second set of input files.
                  $key2 = @ff_array[0];
                  $field3 = @ff_array[1];
                  if( $key1 == $key2) # Finding a match between the two input files
                  {
                      print FILE "$field1:$field2:$key1:$field5:$field3"; #Printing the desired result from both batch of input files.
                  }
              }
          }
      }
      close FILE;
      close SW or die "Cannot close";
      close RW or die "Cannot close";

      The problem is:
      - I want the data in the output file to be in sequence according to the field $field1.
      - Let me assure you that $field1 is not of variable length; it always lies in the range 2.221 to 2.252, i.e. it is 2.XXX where only the XXX part changes.

      - It would be so much better if these lines were arranged in order in my OUTPUT file. Any views?
        Instead of while (<RW>) you can write for (sort <RW>) and be done.

        But I strongly recommend that you use strict; use warnings; and declare your variables with my.
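
        A minimal sketch of that change, with strict/warnings and lexical variables; the file name user_file.txt is a hypothetical stand-in for the files in the script above:

        #!/usr/bin/perl
        use strict;
        use warnings;

        open my $rw, '<', 'user_file.txt' or die "Cannot open user_file.txt: $!";
        for my $line (sort <$rw>) {            # read every line, sorted, before the loop body runs
            chomp $line;
            my ($key2, $field3) = (split /:/, $line)[0, 1];
            print "$key2:$field3\n";           # placeholder for the matching/printing from the original script
        }
        close $rw;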

Re: a little problem with sorting my data
by bichonfrise74 (Vicar) on Jul 27, 2009 at 21:41 UTC
    Based on your question, I thought of using the Schwartzian Transform to solve the problem. I'm not sure if this is overkill.

    Here's the code.
    #!/usr/bin/perl
    use strict;

    my $string;
    while( <DATA> ) {
        $string = $string . join " ", split( /\:/ );
    }

    my $data = join "\n",
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }
        map  { [$_, (split)[0]] }
        split( /\n/, $string);

    print $data;

    __DATA__
    2.225:0:1248266065752:Y:282
    2.232:0:1248266069770:Y:500
    2.225:1:1248266072861:Y:438
    2.232:1:1248266075785:Y:328
    2.225:1:1248266081283:Y:297
    2.232:1:1248266082035:Y:328
    2.232:1:1248266087410:Y:281
    2.225:1:1248266088768:Y:296
    2.232:1:1248266091426:Y:281
Re: a little problem with sorting my data
by i-blis (Novice) on Jul 28, 2009 at 00:03 UTC

    The use of a Schwartzian Transform does indeed give you more flexibility: you can perform the sort on any field, handle cases where the sort is not trivial, sort on many fields, etc. It is certainly an idiom you won't regret having learnt.

    A common way to read a whole file into a scalar is to "slurp" it by locally undefining the input record separator $/ (a newline by default).
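
    In isolation, the slurp idiom looks like this (the file name is just a placeholder):

    open my $fh, '<', 'file.txt' or die "$!\n";
    my $raw = do { local $/; <$fh> };   # with $/ undefined inside the do block, one read returns the whole file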

    I rewrote it with clean file opening, a slurp, and a sort on the first and fourth fields, to help you follow the logic in case you haven't already.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    open my $fh, '<', 'file.txt' or die "$!\n";
    my $raw = do { local $/; <$fh> };    # slurp the whole file we just opened

    my $sorted = join "\n",
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] }     # outer sort: field at index 0 (relies on sort being stable)
        sort { $a->[2] <=> $b->[2] }     # inner sort: field at index 4
        map  { [$_, (split /:/)[0,4]] }
        split( /\n/, $raw);

    print $sorted;
Re: a little problem with sorting my data
by ig (Vicar) on Jul 28, 2009 at 18:39 UTC

    Re-reading all your user data files once for each record in the log file is inefficient. It would be more efficient to build a hash of the data on individual users first, then process the log file, pulling user data from the hash.
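
    A hedged sketch of that approach, reusing the file names and field positions from the code posted above (they are assumptions about the real data layout):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @user_files = ('Carlo_Arnold_2.232_Final.txt', 'Sohaib_Ahmad_2.225_Final.txt');

    # Pass 1: read each user file once and index its records by the key field.
    my %user_data;
    for my $file (@user_files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            chomp $line;
            my ($key, $value) = (split /:/, $line)[0, 1];
            push @{ $user_data{$key} }, $value;   # keep every record for this key
        }
        close $fh;
    }

    # Pass 2: a single pass over the log file, looking matches up in the hash.
    open my $log, '<', 'QoEWeb_DB_Log.txt' or die "Cannot open log: $!";
    open my $out, '>', 'output.txt'        or die "Cannot write output: $!";
    while (my $line = <$log>) {
        chomp $line;
        my @f   = split /:/, $line;
        my $key = $f[8];                          # key position as in the original script
        next unless exists $user_data{$key};
        print {$out} "$f[2]:$f[4]:$key:$f[6]:$_\n" for @{ $user_data{$key} };
    }
    close $log;
    close $out;

    The resulting output file can then be sorted with any of the approaches shown earlier in the thread.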