Re: a little problem with sorting my data
by ikegami (Patriarch) on Jul 27, 2009 at 17:17 UTC
sort data > sorted_data
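If you need the sort to be numeric on a particular field, GNU sort's field switches can do that directly; for example (assuming the ID is the first colon-separated field):
sort -t: -k1,1n data > sorted_data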
Alternatively, the following would be an efficient Perl solution:
my %grouped;
while (...) {
    my @fields = ...;
    push @{ $grouped{$fields[0]} }, \@fields;
}

for my $group (values %grouped) {
    ...
}
On your second one: that will group by ID, but I don't believe it will sort by ID. That is to say, you will get all the records with a certain ID together, but you won't get the IDs themselves in any particular order. (Unless values does some subtle sorting I'm not aware of.)
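For completeness, iterating over sorted keys instead of values would also put the IDs in order; a minimal sketch, assuming the IDs compare numerically:
# Visit the groups in ascending ID order rather than hash order.
for my $id (sort { $a <=> $b } keys %grouped) {
    for my $fields (@{ $grouped{$id} }) {
        ...;  # process each record for this ID
    }
}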
I had written a line about that, but I must have deleted it by accident.
Indeed, it groups but it doesn't sort. I had already provided a solution for sorting. If the OP also wants to sort the IDs and then do something with them, he can do:
for ( sort <$fh> ) {
    my @fields = ...;
    ...
}
Re: a little problem with sorting my data
by moritz (Cardinal) on Jul 27, 2009 at 17:18 UTC
It would be helpful to know what exactly you have tried, and how it failed. Just feeding the data as-is, line by line, to sort already groups the records by ID (although it doesn't sort them numerically if the ID is of variable width).
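For instance, a numeric comparison on the first colon-separated field would handle the variable-width case; a one-line sketch, assuming the records arrive on STDIN:
print sort { (split /:/, $a)[0] <=> (split /:/, $b)[0] } <STDIN>;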
OK, here is what makes my output file:
#!/usr/bin/perl
$k = 1; # (unused)
$file_name = "QoEWeb_DB_Log.txt"; # An input LOG file which holds information about different users.
open(SW, $file_name) or die "ERROR OPENING IN FILE";
open FILE, ">output.txt" or die "ERROR..unable to write"; # Will write the result into an OUTPUT file.
while (<SW>) {
    chomp; # chomp $_ directly; chomp($eachline) did nothing, since $eachline was never assigned
    @file_name1 = ("Carlo_Arnold_2.232_Final.txt", "Sohaib_Ahmad_2.225_Final.txt"); # Input LOG files; each file holds information about an individual user. I will be adding about 30 files here.
    @logarray = split(/:/, $_); # Taking required fields from the first input file.
    $field1 = $logarray[2]; # single elements take the $ sigil, not @
    $field2 = $logarray[4];
    $key1   = $logarray[8];
    $field5 = $logarray[6];
    $x = scalar(@file_name1);
    for ($j = 0; $j < $x; $j++) {
        open(RW, $file_name1[$j]) or die "ERROR OPENING IN FILE";
        while (<RW>) {
            chomp;
            @ff_array = split(/:/, $_); # Taking required fields from the second set of input files.
            $key2   = $ff_array[0];
            $field3 = $ff_array[1];
            if ($key1 == $key2) { # Finding a match between the two input files
                print FILE "$field1:$field2:$key1:$field5:$field3\n"; # Printing the desired result from both batches of input files; "\n" added now that the lines are chomped.
            }
        }
        close RW or die "Cannot close";
    }
}
close FILE;
close SW or die "Cannot close";
Problem is:
- I want the data in the output file to be in sequence according to the field $field1.
- Let me assure you that $field1 is not of variable length ... its range starts at 2.221 and ends at 2.252 ... which means it is always 2.XXX, where only the XXX changes.
- It would be so much better if these lines were arranged in order in my OUTPUT file.
Any views?
Re: a little problem with sorting my data
by bichonfrise74 (Vicar) on Jul 27, 2009 at 21:41 UTC
Based on your question, I thought of using the Schwartzian Transform to solve the problem. I'm not sure if this is overkill.
Here's the code.
#!/usr/bin/perl
use strict;
use warnings;

my $string = '';
while (<DATA>) {
    # Each line's trailing newline survives on the last field,
    # so the records stay newline-separated inside $string.
    $string .= join " ", split /:/;
}

my $data = join "\n",
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map  { [$_, (split)[0]] }
    split /\n/, $string;
print $data;
__DATA__
2.225:0:1248266065752:Y:282
2.232:0:1248266069770:Y:500
2.225:1:1248266072861:Y:438
2.232:1:1248266075785:Y:328
2.225:1:1248266081283:Y:297
2.232:1:1248266082035:Y:328
2.232:1:1248266087410:Y:281
2.225:1:1248266088768:Y:296
2.232:1:1248266091426:Y:281
Re: a little problem with sorting my data
by i-blis (Novice) on Jul 28, 2009 at 00:03 UTC
The use of a Schwartzian Transform indeed gives you more flexibility: you can perform the sort on any field, handle cases where the sort is not trivial, sort on several fields, etc. It is certainly an idiom you won't regret having learnt.
A common way to read a whole file into a scalar is to "slurp" it by locally undefining the input record separator $/ (a newline by default).
I rewrote it with clean file opening, a slurp, and a sort on the first and fifth fields (indices 0 and 4), to help you get the logic, in case you did not already.
#!/usr/bin/env perl
use strict; use warnings;

open my $fh, '<', 'file.txt'
    or die "$!\n";
my $raw = do { local $/; <$fh> };  # $/ is undef here, so <$fh> slurps the whole file

my $sorted = join "\n",
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }  # field 0 first, field 4 breaks ties
    map  { [$_, (split /:/)[0, 4]] }
    split /\n/, $raw;
print $sorted;
Re: a little problem with sorting my data
by ig (Vicar) on Jul 28, 2009 at 18:39 UTC
Re-reading all your user data files once for each record in the log file is inefficient. It would be more efficient to build a hash of the data on individual users first, then process the log file, pulling user data from the hash.
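A rough sketch of that two-pass approach, reusing the file names and field positions from the code above (the key:value layout of the user files is my assumption; adjust the split indices if they differ):
#!/usr/bin/perl
use strict; use warnings;

# Pass 1: read each user file once, building a key => value lookup.
my %user_data;
my @user_files = ("Carlo_Arnold_2.232_Final.txt", "Sohaib_Ahmad_2.225_Final.txt");
for my $file (@user_files) {
    open my $rw, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$rw>) {
        chomp $line;
        my ($key, $value) = (split /:/, $line)[0, 1];
        $user_data{$key} = $value;  # a duplicate key keeps the last value seen
    }
    close $rw;
}

# Pass 2: read the log file once, pulling matches from the hash.
open my $sw,  '<', 'QoEWeb_DB_Log.txt' or die "Cannot open log: $!";
open my $out, '>', 'output.txt'        or die "Cannot open output: $!";
while (my $line = <$sw>) {
    chomp $line;
    my @f = split /:/, $line;
    print $out "$f[2]:$f[4]:$f[8]:$f[6]:$user_data{$f[8]}\n"
        if exists $user_data{$f[8]};
}
close $out;
close $sw;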