Thank you for your patience.
I have a large table relating "synonym_ids", "concept_ids", and "synonym_strings". The table is ordered by synonym_id. One concept can have many synonyms spread throughout the table. I am interested in 400 of the 1 million synonym_strings. I loop through the table and find one of the strings I want:
while (<>) {
    if (/pattern_I_want/) {
        # capture the synonym_id (first field) and the concept_id (third field)
        ($syn_id, $con_id) = /^(\d+)\t\d\t(\d+)/;
        # now that I'm here, pull out pieces of the strings
        while (/(pattern_I_want)/g) {
            ++$chars{$1};
            push @line, $1;
...
I want to make a hash of concept_id -> synonym_id. The concept_ids are scattered throughout the table. That hash is the smallest piece of the problem I can describe right now. Eventually all of the elements I have described will be in a nested structure; the top level of that nested structure is the hash of distinct con_ids to syn_ids.
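
To make the goal concrete, here is a minimal sketch of one way such a hash could be built. It assumes that a concept may map to several synonym_ids, so each hash value is an array reference rather than a single id. The sample rows, the column layout (inferred from the regex above), the name %syn_ids_for, and the pattern qr/foo/ standing in for pattern_I_want are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample rows: synonym_id, a one-digit field, concept_id,
# synonym_string, separated by tabs. The layout is an assumption based
# on the regex in the question.
my @rows = (
    "1\t0\t100\tfoo alpha",
    "2\t0\t200\tbar",
    "3\t0\t100\tfoo beta",
    "4\t0\t300\tfoo gamma",
);

my $wanted = qr/foo/;    # stand-in for pattern_I_want

my %syn_ids_for;         # concept_id => [ synonym_id, ... ]

for my $row (@rows) {
    # only keep rows whose text matches the pattern of interest
    next unless $row =~ $wanted;

    # skip rows that do not match the expected layout
    my ($syn_id, $con_id) = $row =~ /^(\d+)\t\d\t(\d+)/
        or next;

    # autovivification creates the array ref the first time a concept is seen
    push @{ $syn_ids_for{$con_id} }, $syn_id;
}

# print each concept with the synonym_ids collected for it
for my $con_id (sort { $a <=> $b } keys %syn_ids_for) {
    print "$con_id => @{ $syn_ids_for{$con_id} }\n";
}
```

Because the ids are appended as they are encountered, it does not matter that the concept_ids are scattered through the table. Other per-concept data could later hang off the same top-level key if each value were changed to a hash reference.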