Does 17 really come before 12, or is that just a typo?
hist:x
1 ##
4 ####
5 ####
17 ##########
12 #
I cannot use any known database engine
Really? Why not?
If true, the first thing I would do is write a short program to do a single pass over your 2e9 silly format files and output a single file formatted like so:
1: 1(3) 3(4) 5(7) 17(1) 21(1)
2: 1(2) 3(2) 17(5) 20(1) 22(2)
3: 3(1) 10(3) 12(1) ...
Then I could dump all those silly format files.
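For what it's worth, that single pass could look something like the following. This is only a minimal sketch in Python, and it assumes each input file holds a `hist:` header line followed by one `label ####` line per column, with one `#` per data point. That layout is my guess from your example, and the names `parse_histogram`, `format_record` and `consolidated_lines` are made up for illustration.

```python
import re

# Assumed line layout: a numeric column label, whitespace, then a run of '#'.
# Anything else (e.g. the "hist:x" header) is skipped.
LINE_RE = re.compile(r"\s*(\d+)\s+(#+)\s*")


def parse_histogram(text):
    """Return [(label, count), ...] in the order the columns appear."""
    pairs = []
    for line in text.splitlines():
        match = LINE_RE.fullmatch(line)
        if match:
            pairs.append((int(match.group(1)), len(match.group(2))))
    return pairs


def format_record(file_id, pairs):
    """Render one file as 'id: label(count) label(count) ...'."""
    columns = " ".join(f"{label}({count})" for label, count in pairs)
    return f"{file_id}: {columns}"


def consolidated_lines(histograms):
    """Yield one record line per (file_id, file_text) pair, in a single pass."""
    for file_id, text in histograms:
        yield format_record(file_id, parse_histogram(text))
```

Feed `consolidated_lines` an iterator that reads the files one at a time and write each yielded line straight to the output file, so that no more than one histogram is held in memory at once.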
Then I'd look to reformat that single file into some kind of consistent record format, but then I read this bit of your description:
each in real case scenario containing approx 300 columns and there is a maximum of 8000 possible column labels(values)) i thought i should create a consensus histogram from all subject ones. such that this histogram has all 25 columns (now i am again talking about my example) and each column having the maximum number of data points (this is computed from the subject set- if the max number of data points for column 1 is 100 then this how large column 1 in my consensus hist will be.)
And got completely lost in the number of columns and ranges of values for each column...
In reply to Re: Similarity searching
by BrowserUk
in thread Similarity searching
by baxy77bax