in reply to Re^3: Tag protein names in sentences
in thread Tag protein names in sentences
@parts holds the 'words' of one protein name to be matched. The code builds %proteinLU as chains of nested hash keys, one level per word. The value for each key is another hash, except for _name_ keys, whose value is the complete protein name.
$parent = $parent->{$part} ||= {}; creates an empty hash for $part if that key doesn't have one yet, then moves $parent down to that hash. Using ||= in that way avoids an explicit if ! exists $parent->{$part} test.
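To make the structure concrete, here is a minimal sketch of the build step. The names in @names are made up for illustration, and splitting on whitespace is my assumption about how @parts is produced; the real code may tokenise differently.

```perl
use strict;
use warnings;

# Hypothetical sample names, for illustration only.
my @names = ('protein kinase C', 'protein kinase', 'heat shock protein 70');

my %proteinLU;
for my $name (@names) {
    my @parts  = split ' ', $name;    # assumption: words are whitespace separated
    my $parent = \%proteinLU;

    for my $part (@parts) {
        # Create an empty hash for this word if needed, then descend into it.
        $parent = $parent->{$part} ||= {};
    }

    # The end of the chain records the complete name.
    $parent->{_name_} = $name;
}
```

Note that 'protein kinase' and 'protein kinase C' share the first two levels: $proteinLU{protein}{kinase} holds both a _name_ key and a C key.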
The match code works by 'walking' down a chain of nested hash keys. Each time a new key is matched its value becomes the next 'parent'. The assignment to @best 'remembers' the last protein name that matched. @best is an array because two values need to be remembered for the match: the protein name ($parent->{_name_}) and the number of words to remove ($wIndex).
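The walk can be sketched roughly like this. It is not the code from the earlier node: longest_match is a name I made up, the lookup hash is written out by hand, and the loop index here is 0-based, so the sketch stores $wIndex + 1 as the word count.

```perl
use strict;
use warnings;

# Hypothetical lookup, hand-built in the nested shape described above.
my %proteinLU = (
    protein => {
        kinase => {
            _name_ => 'protein kinase',
            C      => { _name_ => 'protein kinase C' },
        },
    },
);

# Return (name, word count) for the longest protein name that starts at
# $words->[0], or an empty list when nothing matches.
sub longest_match {
    my ($words) = @_;
    my $parent = \%proteinLU;
    my @best;

    for my $wIndex (0 .. $#$words) {
        my $next = $parent->{$words->[$wIndex]};
        last if !ref $next;    # no nested hash for this word: stop walking
        $parent = $next;       # the matched key's value becomes the next parent

        # Remember the latest complete name and how many words it covers.
        @best = ($parent->{_name_}, $wIndex + 1) if exists $parent->{_name_};
    }

    return @best;
}

my @words = split ' ', 'protein kinase C phosphorylates its substrate';
my ($name, $count) = longest_match(\@words);
print "$name ($count words)\n";    # prints "protein kinase C (3 words)"
```

Because @best is only overwritten when a deeper _name_ is found, a sentence starting 'protein kinase B ...' still returns 'protein kinase' with a count of 2.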
Re^5: Tag protein names in sentences
by sinlam (Novice) on Feb 18, 2010 at 19:34 UTC
by GrandFather (Saint) on Feb 18, 2010 at 19:53 UTC