The shape of the data in the file is of almost no interest at all in determining the data structure you need. That is dictated by how you want to use the data. When you know how you want to use the data (and therefore have some idea of the data structure required) then you can figure out how to transform the file representation of the data into the internal data structure.
DWIM is Perl's answer to Gödel
First, I will traverse the sorted array of parent lines. This could be derived from the keys of a hash, or from an array.
When a parent has children, I'll need to process/aggregate each of those elements, and associate the result with the parent.
I'm leaning towards this structure:
$a->[n]->[0] = arrayref to split parent lines
$a->[n]->[1] = arrayref to arrayref of split children lines
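For illustration only, here is a minimal sketch of that layout using hand-built records. The field values are made up, the "aggregate" (a count of child fields) is just a placeholder for whatever processing is really needed, and the container is named $recs rather than $a to avoid clashing with sort's special $a and $b.

```perl
use strict;
use warnings;

# Hypothetical records: slot 0 holds the split parent line,
# slot 1 holds an arrayref of split child lines (possibly empty).
my $recs = [
    [ [qw( a3 b3 c3 d3 )], [ [qw( p3 q3 r3 )], [qw( s3 t3 u3 )] ] ],
    [ [qw( a1 b1 c1 d1 )], [] ],
];

# Traverse the parents sorted on their 4th field (index 3).
my @summary;
for my $rec ( sort { $a->[0][3] cmp $b->[0][3] } @$recs ) {
    my ( $parent, $children ) = @$rec;

    # Placeholder aggregate: total number of fields across all child lines.
    my $n = 0;
    $n += @$_ for @$children;

    push @summary, "$parent->[0]: $n child fields";
}
print "$_\n" for @summary;
```

Keeping the children in their own slot means the parent fields keep fixed indexes for sorting, whether or not a parent has children.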
Where do you want *them* to go today?
You could load an AoA from your data file like this, which will retain the original ordering (if that's important) and allow you to sort the data pretty much any way you need to. Quite what you mean by
Data will be sorted on the d(n) field
is far from clear to me. Nor is it clear why you think a "hashref" is relevant.
#! perl -slw
use strict;

my @data;
while( <DATA> ) {
    if( /^\s/ ) {
        ## Indented (child) line: append its fields to the last parent row
        push @{ $data[ -1 ] }, split;
    }
    else {
        ## Parent line: start a new row
        push @data, [ split ];
    }
}

print "@$_" for @data;

__DATA__
a1 b1 c1 d1 e1 f1
a2 b2 c2 d2 e2 f2
a3 b3 c3 d3 e3 f3
    p3 q3 r3
    s3 t3 u3
a4 b4 c4 d4 e4 f4
a5 b5 c5 d5 e5 f5
    p5 q5 r5
    s5 t5 u5
a6 b6 c6 d6 e6 f6
Output:
C:\test>junk4
a1 b1 c1 d1 e1 f1
a2 b2 c2 d2 e2 f2
a3 b3 c3 d3 e3 f3 p3 q3 r3 s3 t3 u3
a4 b4 c4 d4 e4 f4
a5 b5 c5 d5 e5 f5 p5 q5 r5 s5 t5 u5
a6 b6 c6 d6 e6 f6
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
- As for the d(n) field, let's say this represents a date on each parent line. I want to sort all of the parent lines by date.
- If lines are not immediately split, then perhaps the entire line could be used as a hash key?
Sorry for not explaining this better...
BTW,
You have concatenated the children lines to the parent line -- this is not what I'm trying to do.
The children lines should be a sub-array of their parent.
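To make the intent concrete, here is a rough sketch of loading the lines so that each parent carries its children as a sub-array. It assumes child lines begin with whitespace and that the first line is always a parent. The inline sample text stands in for the real file.

```perl
use strict;
use warnings;

# Sample input mirroring the dummy data; child lines are assumed
# to begin with whitespace, and the first line is assumed to be a parent.
my $text = <<'END';
a3 b3 c3 d3
  p3 q3 r3
  s3 t3 u3
a4 b4 c4 d4
END

my @data;
for my $line ( split /\n/, $text ) {
    if ( $line =~ /^\s/ ) {
        # Child line: add its fields as a new sub-array under the last parent.
        push @{ $data[-1][1] }, [ split ' ', $line ];
    }
    else {
        # Parent line: slot 0 is the split parent, slot 1 collects children.
        push @data, [ [ split ' ', $line ], [] ];
    }
}

for my $rec (@data) {
    print "@{ $rec->[0] }\n";
    print "    @$_\n" for @{ $rec->[1] };
}
```

Each record is then [ parent_fields, [ child_fields, ... ] ], so the parent fields are not mixed with the child fields.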
- If lines are not immediately split, then perhaps the entire line could be used as a hash key?
If you were to use the whole string as the key to the hash, what would it buy you?
It would make for problems in building the data structure because for the compound lines, you wouldn't have the entire key when you read the first line. You'd either have to employ some readahead, or delete and re-store compound lines under new keys each time you found an extension line.
- As for the d(n) field, let's say this represents a date on each parent line. I want to sort all of the parent lines by date.
It would be possible to sort the data prior to splitting it, but if the fields are complex (like dates) then it's much easier to do the sort after the split.
Without making any attempt to be efficient, sorting by field n (a more normal term for your d(n) nomenclature) can be very simple. This sorts the data by the (additional) last character of the 4th field:
#! perl -slw
use strict;

my @data;
while( <DATA> ) {
    if( /^\s/ ) {
        ## Indented (child) line: append its fields to the last parent row
        push @{ $data[ -1 ] }, split;
    }
    else {
        ## Parent line: start a new row
        push @data, [ split ];
    }
}

## Sort rows on the last character of the 4th field (index 3)
print "@$_" for sort {
    substr( $a->[ 3 ], -1 ) cmp substr( $b->[ 3 ], -1 )
} @data;

__DATA__
a1q b1w c1e d1r e1t f1y
a2u b2i c2o d2p e2a f2s
a3d b3f c3g d3h e3j f3k
    p3 q3 r3
    s3 t3 u3
a4l b4z c4x d4c e4v f4b
a5n b5m c5q d5w e5e f5r
    p5 q5 r5
    s5 t5 u5
a6t b6y c6u d6i e6o f6p
Sorting by a date field is slightly more complex, but not much. I'd give an example, but as you've given dummy data, I'd have to make up the dates and I've no idea what format your data is in.
If you posted an example of your real data and explained what you are actually trying to achieve, rather than all this abstract stuff, you'd doubtless get much better answers.
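Purely as a sketch of the date case: suppose, hypothetically, that the 4th field held a DD/MM/YYYY date. That format, the sample rows, and the date_key helper are all invented for illustration, since the real format is unknown. Rearranging the date into YYYYMMDD lets a plain string comparison sort chronologically:

```perl
use strict;
use warnings;

# Hypothetical rows: the 4th field (index 3) is assumed to be DD/MM/YYYY.
my @data = (
    [qw( a1 b1 c1 25/12/2006 e1 )],
    [qw( a2 b2 c2 03/01/2007 e2 )],
    [qw( a3 b3 c3 14/02/2006 e3 )],
);

# Illustrative helper: turn DD/MM/YYYY into YYYYMMDD so that
# a plain string compare orders the dates chronologically.
sub date_key {
    my ( $d, $m, $y ) = split m{/}, shift;
    return "$y$m$d";
}

my @sorted = sort { date_key( $a->[3] ) cmp date_key( $b->[3] ) } @data;
print "@$_\n" for @sorted;
```

This recomputes the key on every comparison, which is fine for small files. For larger ones the keys could be computed once up front.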