Greetings, Monks. I hope you all had a very pleasant Kwanzaa. I am seeking your advice on the most (or at least, a very) efficient method to perform the following transformation:
Given a rank-2 matrix, apply an individual consolidation function to each column, producing one "row" of data in which each field is the result of applying its column's function.
    F1 F2 F3 F4 F5 F6
    :
    a1 b1 c1 d1 e1 g1    ->    A1 B1 C1 D1 E1 G1
    a2 b2 c2 d2 e2 g2
    a3 b3 c3 d3 e3 g3
    a4 b4 c4 d4 e4 g4
    a5 b5 c5 d5 e5 g5
    a6 b6 c6 d6 e6 g6
    a7 b7 c7 d7 e7 g7
    a8 b8 c8 d8 e8 g8
Parameters: The source data always has the same number of columns, and the same reduction functions are always applied to the same columns. The data arrives in "chunks"; as each chunk arrives, the consolidation functions must be applied to it and one "record" emitted. Examples of consolidation functions are min(), max(), avg(), sum(), etc.
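To make that contract concrete, here is a minimal sketch (purely illustrative: the four-column width, the particular functions, and the consolidate_chunk name are invented for the example, not taken from my actual code):

use strict;
use warnings;
use List::Util qw(min max sum);

# One consolidation function per column (four columns here just for brevity);
# each one receives the full list of values from its column.
my @consolidators = (
    \&min,                   # column 0: minimum
    \&max,                   # column 1: maximum
    sub { sum(@_) / @_ },    # column 2: average
    \&sum,                   # column 3: sum
);

# Reduce one chunk (an array ref of row array refs) to a single record.
sub consolidate_chunk {
    my ($chunk) = @_;
    my @record;
    for my $col ( 0 .. $#consolidators ) {
        my @column = map { $_->[$col] } @$chunk;         # slice out the column
        push @record, $consolidators[$col]->(@column);   # apply its function
    }
    return \@record;
}

# Example: one chunk of three rows.
my $chunk  = [ [ 1, 5, 2, 10 ], [ 3, 7, 4, 20 ], [ 2, 6, 6, 30 ] ];
my $record = consolidate_chunk($chunk);
print "@$record\n";    # prints: 1 7 4 60

Each call to consolidate_chunk is independent, so chunks can be processed as they arrive.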
(This is not homework; it's just a potentially computationally expensive problem with a nearly infinite number of Bad™ solutions, and I'd like to do the Right Thing™ and perhaps learn and share something new in the process.)
Update: Code sample in this reply: Re^2: An efficient, scalable matrix transformation algorithm
My current, already-implemented and workable solution roughly follows these steps:

1. Buffer one "chunk" of incoming rows.
2. Transpose the chunk so that each column becomes a row.
3. Apply each consolidation function to its corresponding transposed row.
4. Emit the results as one output "record."
It works "fair to middling," but the matrix transposition step itself adds significant overhead (I used a decent algorithm, though admittedly it is in pure Perl, as I couldn't find an XS implementation).
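For reference, a stripped-down sketch of that transpose-then-reduce shape (the real code is in the reply linked above; this naive pure-Perl transpose and the sample functions are only illustrative):

use strict;
use warnings;
use List::Util qw(min max sum);

my @consolidators = ( \&min, \&max, sub { sum(@_) / @_ }, \&sum );

# Naive pure-Perl transpose: array of rows -> array of columns.
sub transpose {
    my ($rows) = @_;
    my @cols;
    for my $r ( 0 .. $#$rows ) {
        for my $c ( 0 .. $#{ $rows->[$r] } ) {
            $cols[$c][$r] = $rows->[$r][$c];
        }
    }
    return \@cols;
}

# Transpose the whole chunk up front, then reduce each transposed row.
sub consolidate_chunk {
    my ($chunk) = @_;
    my $cols = transpose($chunk);
    return [ map { $consolidators[$_]->( @{ $cols->[$_] } ) } 0 .. $#consolidators ];
}

my $record = consolidate_chunk( [ [ 1, 5, 2, 10 ], [ 3, 7, 4, 20 ], [ 2, 6, 6, 30 ] ] );
print "@$record\n";    # prints: 1 7 4 60

The only real difference from the sketch above is that the entire transposed matrix is materialized before any reduction runs, which is exactly where the extra overhead shows up.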
The issue is that this is an O(n*m) solution. This should be well-covered territory, as I can hardly be the first person in the history of computing, mathematics, or stream/signal processing to need this type of transformation, so I can't help thinking that there must be a more efficient solution.
NOTE: Technically, the transposition step, while adding significant overhead, may actually result in a net reduction in total time, because some of the reduction functions do not operate over the full set of supplied data and may return immediately or after only partial processing. So it ends up being not quite as bad as a full O(n*m) pass. Without the transposition, that early exit would not be possible.
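To illustrate what I mean by "return immediately or after partial processing" (these function names are hypothetical, not from my code): once a consolidation function is handed its whole column, e.g. as an array ref produced by the transposition, some functions can stop without scanning all of it.

use strict;
use warnings;

# Hypothetical consolidation functions that take a whole column as an array ref.

# Needs only the first value of the column; returns immediately.
sub first_value {
    my ($col) = @_;
    return $col->[0];
}

# Scans only until it finds the first defined value, then stops.
sub first_defined {
    my ($col) = @_;
    for my $v (@$col) {
        return $v if defined $v;
    }
    return undef;
}

print first_value( [ 42, 1, 2, 3 ] ), "\n";             # 42, after touching one element
print first_defined( [ undef, undef, 7, 8 ] ), "\n";    # 7, after touching three elements

With the flat-list calling convention from the earlier sketches this saving disappears, since the whole column has to be gathered before the function is even called.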
I thus humbly throw myself upon the mercy of the monks to discuss possible alternative solutions to this problem. TBH, it's the algorithm that I am mostly interested in: brute force can always be sped up, but within the algorithm lies the beauty.
------
FYI, here is a list of some other things I have considered, tried, or looked into:
(You can see from these that I am more an engineer than a mathematician, but I am not stupid, and I am willing to learn.)