baxy77bax has asked for the wisdom of the Perl Monks concerning the following question:
Given strings such as:
aaaaaaaaabababacbbbbbbbaccaabc
aaaaaaaaaaaaaaaaacbbbbbbbaaaca
aaaaaaaaabaaaaabbaaaaabaaccccc
what transforms can I apply to homogenize them by either:
a) grouping same characters together within each string
b) grouping same characters together across all strings (e.g. have most of the a's in one string, most of the b's in another, etc.; of course, sorting does not count, as the number of character occurrences does not change per string)
c) transforming (mapping) characters, e.g. a -> b
d) something else
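One way to make "homogenized" measurable is to count the maximal runs of identical characters in a string: the fewer runs, the better the grouping. The sketch below is only an illustration of that idea; the name `count_runs` and the choice of run count as the metric are assumptions, not something stated in the question.

```python
from itertools import groupby


def count_runs(s):
    """Return the number of maximal runs of identical characters in s.

    Hypothetical helper: a lower count means same characters are
    grouped together more tightly.
    """
    # groupby yields one group per run of consecutive equal characters
    return sum(1 for _ in groupby(s))


strings = [
    "aaaaaaaaabababacbbbbbbbaccaabc",
    "aaaaaaaaaaaaaaaaacbbbbbbbaaaca",
    "aaaaaaaaabaaaaabbaaaaabaaccccc",
]

for s in strings:
    print(s, count_runs(s))
```

Comparing this number before and after a candidate transform gives a rough indication of whether the transform helped.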
However, conditions are:
a) the transform cannot be larger than the original input data, meaning the index needs to be implicitly built into the data (in the same way BWT does it).
b) Types of queries that need to be supported:
What is the character at the i-th position of the original string, where 0<=i<=l?
Some of the obvious solutions are:
a) BWT
b) BWT using all strings
c) replace a with b and record coordinates using either bitstrings or ints (but this violates the size condition unless there is a smart way to record positions within the strings itself somehow)
d) something else
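For reference, option a) and the position query can be sketched as follows. This is a naive illustration, not a tuned implementation: it builds the BWT by sorting all rotations, and it assumes a sentinel `$` that does not occur in the input and sorts before every other character. Note that the sentinel adds one character to the output; storing the row index of the original string instead would be an alternative. The function names `bwt`, `inverse_bwt`, and `char_at` are hypothetical.

```python
SENTINEL = "$"  # assumption: absent from the input and sorts before a, b, c


def bwt(text):
    """Naive Burrows-Wheeler transform via sorted rotations."""
    s = text + SENTINEL
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def _order(last):
    # A stable sort of row indices by character links each entry of the
    # first column to the same character occurrence in the last column.
    return sorted(range(len(last)), key=lambda i: last[i])


def inverse_bwt(last):
    """Recover the original text (without the sentinel) from its BWT."""
    order = _order(last)
    row = order[0]  # row whose rotation starts at position 0 of the text
    out = []
    for _ in range(len(last) - 1):
        row = order[row]
        out.append(last[row])
    return "".join(out)


def char_at(last, i):
    """Return the character at 0-based position i of the original text.

    Walks i + 1 steps from the start, so it is linear in i.
    """
    if not 0 <= i < len(last) - 1:
        raise IndexError("position out of range")
    order = _order(last)
    row = order[0]
    for _ in range(i + 1):
        row = order[row]
    return last[row]


original = "aaaaaaaaabababacbbbbbbbaccaabc"
transformed = bwt(original)
print(transformed)
print(inverse_bwt(transformed) == original)
print(char_at(transformed, 15))
```

The query here recomputes the mapping on every call and walks from the start of the text, which is fine for a demonstration but slow for long strings; a practical index would precompute the mapping and sample positions so that a query does not have to start from position 0.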
Has anyone encountered this problem before? How did you solve it?
Thanks
Re: Problems with strings
by QM (Parson) on May 20, 2019 at 10:55 UTC
by baxy77bax (Deacon) on May 20, 2019 at 12:39 UTC
by LanX (Saint) on May 20, 2019 at 12:52 UTC
by QM (Parson) on May 20, 2019 at 14:20 UTC

Re: Problems with strings
by AnomalousMonk (Archbishop) on May 20, 2019 at 16:06 UTC
by LanX (Saint) on May 20, 2019 at 18:51 UTC

Re: Problems with strings
by thanos1983 (Parson) on May 20, 2019 at 13:26 UTC

Re: Problems with strings
by johngg (Canon) on May 20, 2019 at 15:53 UTC