in reply to Capturing Non-Zero Elements, Counts and Indexes of Sparse Matrix
I'm guessing the array is held in rows. I suggest reading your the non-zeros of the matrix into an array of arrays and sorting by column & row number, along these lines (untried, untested):
The array of arrays can now be scanned to get the cumulative column count Ap, and is in the order you require for Ai and Ax.my @sparse = () ; while (...whatever...) { my $value = ... ; # next value my $col = ... ; # in this column my $row = ... ; # and this row if ($value) { # keep only the non-zeros push @sparse, [$col, $row, $value] ; } ; } ; @sparse = sort { ($$a[0] <=> $$b[0]) || ($$a[1] <=> $$b[1]) || die } + @sparse ;
If there are too many non-zero values to be held in memory, then you have a serious problem -- but you could partition it by reading the input 'n' times, processing a batch of columns each time.
If 'n' is big, then you'll need to consider ways to reduce the size of the array of arrays... but that's going to require a deeper understanding of the original data.
Update: corrected the sort line and added proof of concept code:
output:use strict ; use warnings ; my @sparse = () ; my $col = 0 ; # emerges from loop set to # of columns my $row = 0 ; # emerges from loop set to # of rows while (<DATA>) { s/\s+/ /g ; s/^ // ; s/ $// ; $col = 0 ; foreach my $val (split / /) { if ($val) { push @sparse, [$col, $row, $val] ; } ; ++$col ; } ; ++$row ; } ; @sparse = sort { ($$a[0] <=> $$b[0]) || ($$a[1] <=> $$b[1]) || die } @ +sparse ; my $Ap = 0 ; print "Ap = $Ap" ; my $c = 0 ; foreach my $a (@sparse) { while ($c < $$a[0]) { print " $Ap" ; ++$c ; } ; ++$Ap ; } ; while ($c++ < $col) { print " $Ap" ; } ; print "\n" ; print "Ai =" ; foreach my $a (@sparse) { print " $$a[1]" ; } ; print "\n" ; print "Ax =" ; foreach my $a (@sparse) { print " $$a[2]" ; } ; print "\n" ; __DATA__ 2 3 0 0 0 3 0 4 0 6 0 -1 -3 2 0 0 0 1 0 0 0 4 2 0 1
Ap = 0 2 5 9 10 12 Ai = 0 1 0 2 4 1 2 3 4 2 1 4 Ax = 2 3 3 -1 4 4 -3 1 2 2 6 1
|
|---|