in reply to Homologene BioPerl
There is a BioPerl module that knows how to talk to NCBI's E-Utilities: see http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook (it mentions homologene; I suppose it works, but I haven't tried it). You can also use the E-Utilities directly. Both approaches have a slight learning curve.
A third approach is to download homologene into a local database. The NCBI E-Utilities work well, but when working with homologene I find it handier (and faster) to have all the data locally, using the files provided by NCBI in:
ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/
The file 'homologene.data' there, when stored in a database, looks like this (just showing 10 random rows):
 homologene_group_id | tax_id | geneid  | symbol          | protein_gi | protein_accession
---------------------+--------+---------+-----------------+------------+-------------------
                   3 |   9606 |      34 | ACADM           |  187960098 | NP_001120800.1
                   3 |   9598 |  469356 | ACADM           |  114557331 | XP_524741.2
                   3 |   9615 |  490207 | ACADM           |   73960161 | XP_547328.2
                   3 |   9913 |  505968 | ACADM           |  115497690 | NP_001068703.1
                   3 |  10090 |   11364 | Acadm           |    6680618 | NP_031408.1
                   3 |  10116 |   24158 | Acadm           |    8392833 | NP_058682.1
                   3 |   7955 |  406283 | acadm           |   47085823 | NP_998254.1
                   3 |   7227 |   38864 | CG12262         |   24660351 | NP_648149.1
                   3 |   7165 | 1276346 | AgaP_AGAP005662 |   58387602 | XP_315683.2
                   3 |   6239 |  181757 | acdh-10         |   17569725 | NP_510788.1
What you want is to look up your human gene or accession (human: tax_id=9606), take the group_id, and see if there is a Drosophila melanogaster (fly: tax_id=7227) record within the same group id.
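If you just want a quick answer without a database, the lookup described above can be sketched directly on the downloaded file with awk. This is only a minimal sketch: the function name fly_homologs is made up for illustration, and it assumes the file is tab-separated with the six columns in the order shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the fly (tax_id 7227) rows that share a
# HomoloGene group with the given human (tax_id 9606) gene symbol.
# Assumes tab-separated columns: group id, tax_id, geneid, symbol,
# protein_gi, protein_accession.
fly_homologs() {
    symbol=$1
    file=${2:-homologene.data}
    # The file is read twice: pass 1 collects the group ids of the human
    # gene, pass 2 prints the fly rows that belong to one of those groups.
    awk -F '\t' -v sym="$symbol" '
        NR == FNR { if ($2 == 9606 && $4 == sym) groups[$1] = 1; next }
        $2 == 7227 && ($1 in groups)
    ' "$file" "$file"
}
```

Called as `fly_homologs ACADM homologene.data`, it prints the fly records in the same group as human ACADM, or nothing if the group has no fly member.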
If you have basic database skills, here is a way to load that file into a PostgreSQL database:
#!/bin/sh
wget ftp://ftp.ncbi.nih.gov/pub/HomoloGene/current/homologene.data

< homologene.data psql -c "
  drop table if exists my_homologene_data;
  create table my_homologene_data (
      homologene_group_id integer
    , tax_id              integer
    , geneid              integer
    , symbol              text
    , protein_gi          integer
    , protein_accession   text
  );
  copy my_homologene_data from stdin csv delimiter E'\t';
"

echo "select count(*) from my_homologene_data" | psql
The records that have the same group id are homologs.
select * from my_homologene_data where homologene_group_id = 31015
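To go from a human gene symbol straight to its fly homologs, a self-join on the group id is one option. The sketch below only prints the query so that it can be piped into psql in the same way as the count above; the function name fly_homolog_sql is hypothetical, and it assumes the my_homologene_data table has been loaded.

```shell
#!/bin/sh
# Hypothetical helper: print a query that lists the fly (tax_id 7227)
# homologs of a human (tax_id 9606) gene symbol, using the
# my_homologene_data table.
fly_homolog_sql() {
    # Double any single quote so the symbol cannot end the SQL string literal.
    sym=$(printf '%s' "$1" | sed "s/'/''/g")
    cat <<EOF
select fly.*
from my_homologene_data as human
join my_homologene_data as fly
  on fly.homologene_group_id = human.homologene_group_id
where human.tax_id = 9606
  and human.symbol = '$sym'
  and fly.tax_id = 7227;
EOF
}

# Usage (needs a running PostgreSQL with the table loaded):
#   fly_homolog_sql ACADM | psql
```

An empty result means the group of that human gene has no fly record.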
With that group id, you can easily construct links into specific NCBI homologene pages too:
http://www.ncbi.nlm.nih.gov/homologene/?term=31015
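If you want such links for many genes, they can be generated from the data file itself. A rough sketch, assuming the tab-separated column order shown earlier; the function name homologene_links is invented for this example:

```shell
#!/bin/sh
# Hypothetical helper: print one HomoloGene link per group that contains the
# given gene symbol for the given taxon (e.g. 9606 for human).
homologene_links() {
    tax_id=$1
    symbol=$2
    file=${3:-homologene.data}
    awk -F '\t' -v tax="$tax_id" -v sym="$symbol" '
        # Column 1 is the group id; seen[] suppresses duplicate links.
        $2 == tax && $4 == sym && !seen[$1]++ {
            print "http://www.ncbi.nlm.nih.gov/homologene/?term=" $1
        }
    ' "$file"
}
```

For example, `homologene_links 9606 ACADM homologene.data` prints the link for the group that human ACADM belongs to.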
hth
P.S. Re zoological nomenclature: in the binomial name Drosophila melanogaster, 'melanogaster' is the specific epithet and must *always* be lower case; only the genus name is capitalised.
Re^2: Homologene BioPerl
by ZWcarp (Beadle) on Dec 02, 2011 at 19:53 UTC
by erix (Prior) on Dec 02, 2011 at 20:36 UTC