...what you're doing is assigning each file it's category name and carrying that forward right?...

I'm not really assigning anything. In your example, each row from ID's corresponds to exactly one column in Attributes, so I used that to keep the code simple:

1.file.ext Square --> corresponds to column 2 in Attributes
2.file.ext Triangle --> corresponds to column 3 in Attributes
...
16.file.ext Square --> corresponds to column 17 in Attributes

...Also, is $j an arbitrary variable, or is it special? And $i is a special variable right?...

There is nothing 'special' about $i and $j. They are ordinary loop counters used to traverse the data array and the multi-dimensional attrs array. In this case I used $j to address each attribute set in attrs, and $i to address both each element in data and each individual attribute within a set inside attrs.
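To illustrate the traversal, here is a minimal sketch. The contents of $data and $attrs are invented for illustration, the sub name count_by_category is hypothetical, and I am assuming the row ID has already been removed from each attribute set, so index $i in a set lines up with index $i in data:

```perl
use strict;
use warnings;

# Hypothetical input: one category name per file, in file order.
my $data = [ 'Square', 'Triangle', 'Square' ];

# Hypothetical attribute sets: one row per attribute, one 0/1 flag per file.
# Assumption: the row ID has already been stripped from each row.
my $attrs = [
    [ 1, 0, 1 ],
    [ 0, 1, 1 ],
];

# For each attribute set, count how many files of each category have the flag set.
sub count_by_category {
    my ( $data, $attrs ) = @_;
    my @result;
    for ( my $j = 0 ; $j < @{$attrs} ; ++$j ) {    # $j addresses an attribute set
        my %subres;
        for ( my $i = 0 ; $i < @{ $attrs->[$j] } ; ++$i ) {    # $i addresses a column / file
            next unless $attrs->[$j][$i];    # skip flags that are 0
            ++$subres{ $data->[$i] };        # same $i indexes into data
        }
        push @result, \%subres;
    }
    return \@result;
}

my $result = count_by_category( $data, $attrs );
```

Any other variable names would work just as well; $i and $j are only a convention for loop counters.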

...I was hoping to shoehorn the attribute ID into the data structure in order to use it in an output at the end of this...

To get the ID in the data set, you can make these changes. I'm just adding it to the final result set with the key 'ID' in this case. (Line number followed by: < = remove and > = add):

18 < shift @attrs ;
35 > $subres{ID} = $attrs->[0] ;
36 < for( my $i = 0 ; $i < @{$attrs->[$j]} ; ++$i ) {
36 > for( my $i = 1 ; $i < @{$attrs->[$j]} ; ++$i ) {
38 < ++$subres{ $data->[$i]} ;
38 > ++$subres{ $data->[$i-1]} ;

Line 18 used to remove the row ID from the attribute set; we no longer do that. As a result, the for loop has to start at index 1 instead of 0 (line 36). The indexing in data has not changed, however, so we have to subtract 1 and use $i-1 there (line 38).
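The index shift can be seen in a small standalone sketch. The data, the IDs 'attr_1' and 'attr_2', and the sub name count_with_id are all hypothetical. I am also assuming that $attrs is an array of rows, which is why the ID is read from $attrs->[$j][0] here; that layout may differ from the original script:

```perl
use strict;
use warnings;

# Hypothetical input: one category name per file, in file order.
my $data = [ 'Square', 'Triangle', 'Square' ];

# Hypothetical rows: the row ID now stays at index 0, the 0/1 flags follow.
my $attrs = [
    [ 'attr_1', 1, 0, 1 ],
    [ 'attr_2', 0, 1, 1 ],
];

sub count_with_id {
    my ( $data, $attrs ) = @_;
    my @result;
    for ( my $j = 0 ; $j < @{$attrs} ; ++$j ) {
        my %subres;
        $subres{ID} = $attrs->[$j][0];    # keep the row ID in the result
        # Start at 1 because index 0 holds the ID, not a flag.
        for ( my $i = 1 ; $i < @{ $attrs->[$j] } ; ++$i ) {
            next unless $attrs->[$j][$i];
            ++$subres{ $data->[ $i - 1 ] };    # data is still 0-based
        }
        push @result, \%subres;
    }
    return \@result;
}

my $result = count_with_id( $data, $attrs );

# Print each result set, keys sorted for a stable order.
for my $r ( @{$result} ) {
    print join( ', ', map { "$_=$r->{$_}" } sort keys %{$r} ), "\n";
}
```

One caveat with this approach: the key 'ID' shares the hash with the category names, so it only works as long as no category is itself called 'ID'.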

...If I get rid of the first line in the second file, I'll lose the file name associated with the binary...

I'm not sure what you mean by this association. Is it the order of appearance inside data that changes? If so, I suggest a small piece of code that alters that order based on the column order inside Attributes.

...Which would make them not be able to be grouped by category?...

What needs to be grouped? Do you have examples?

...And it's probably also important to point out that the attribute numbers aren't arbitrary, they are defined....

What do you mean by defined? In your example you show attributes that are binary: they are either 0 or 1. If there is something specific that needs to be done, can you try to visualize that?


In reply to Re^5: Best way to store/access large dataset? by Veltro
in thread Best way to store/access large dataset? by Speed_Freak
