ammalu89 has asked for the wisdom of the Perl Monks concerning the following question:

I have a file with about 10,000 SNP IDs. I need to compare this file in my local directory to an Illumina array file on the FTP server of the UCSC database, to check for overlaps. The Illumina SNP array file is in .txt.gz format. I do not want to download the file from UCSC. Is there any other way to do this in Perl? I started writing some code, but I am not sure how to compare a local file to a file on the remote server without downloading it. Any ideas would be helpful. Thank you.

#!/usr/bin/perl
use strict;
use warnings;
use Net::FTP;

my $ucsc = "hgdownload.cse.ucsc.edu";
my $ucscPathPrefix = "/goldenPath//hg19/database/";

# Connect anonymously to the UCSC download server
my $ftp = Net::FTP->new($ucsc, Debug => 0) or die "Cannot connect to $ucsc: $@";
$ftp->login("anonymous",'-anonymous@') or die "Cannot login", $ftp->message;
$ftp->binary;
$ftp->cwd($ucscPathPrefix) or die "Cannot change working directory to $ucscPathPrefix", $ftp->message;

Replies are listed 'Best First'.
Re: Retrieve SNP information from UCSC genome browser
by marto (Cardinal) on Jan 10, 2014 at 14:27 UTC
      "FTP is for transfering files, it has no ability to compare them."

      Correction;

      ftp perlmonks.com
      Connected to perlmonks.com.
      220 perlmonks.com FTP Server (Version ...) ready.
      Name (perlmonks.com:joebloe): root
      331 Password required for root.
      Password:
      230 User root logged in.
      ftp> siz
      (remote-file)
      Usage: siz remote-file
      ^D
      221 Goodbye.

      "but I am not sure how to compare a local file to a file in the remote server without downloading."

      Please see above -- the ftp command is siz, which reports the size of a remote file that you can then "compare" against your local file, as you requested. However, if you wish to compare file contents, that is possible, but it is more elaborate and depends on the FTP server's capabilities.
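
For what it's worth, Net::FTP exposes the same query through its size method, so the check can be done from the script rather than from an interactive session. This is only a minimal sketch: same_size is a made-up helper name, and it assumes $ftp is an already logged-in Net::FTP handle on a server that answers the SIZE command. It compares byte counts only and says nothing about whether the contents match.

```perl
use strict;
use warnings;

# Hypothetical helper: compare the byte size the server reports for
# $remote_path with the size of $local_path on disk.
# $ftp can be any object with a size($path) method, for example a
# logged-in Net::FTP handle.
# Returns 1 if the sizes match, 0 if they differ, and nothing (undef in
# scalar context) if either size could not be determined.
sub same_size {
    my ($ftp, $remote_path, $local_path) = @_;

    my $remote = $ftp->size($remote_path);   # undef if the server refuses
    return unless defined $remote;

    my $local = -s $local_path;               # undef if the file is missing
    return unless defined $local;

    return $remote == $local ? 1 : 0;
}
```

With the connection from the original post this would be called as same_size($ftp, $remote_name, $local_name), where both names are whatever files you are checking. Equal sizes are at best a hint; different sizes at least prove the files differ.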

      --Chris

      ¡λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH

        The documentation you link to provides no usage for, or in fact any mention of, a command called siz. If this simply compares the size of files, your assertion that this is a file comparison tool is wrong.

Re: Retrieve SNP information from UCSC genome browser
by bioinformatics (Friar) on Jan 11, 2014 at 01:43 UTC

    Are you trying to see if there is an updated file in the UCSC database? Or are you looking to see whether your file from another source contains similar data? Is there any reason you don't want to download the file from UCSC?

    Bioinformatics
      I am trying to see whether my file has similar data to the UCSC file. The files are too large to download, so I would like to avoid that if there is some other way around it.
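
One possible way around storing the file, offered as a sketch and not as a tested solution: FTP cannot compare contents on the server side, so the bytes still have to cross the network, but they can be decompressed and scanned as they arrive without ever writing the .txt.gz to disk. The code below assumes the local file holds one SNP id per line, that the remote table is tab-separated, and that the id sits in a column you pass in. The helper names (load_ids, find_overlaps, open_remote_gz) are invented for this example. It relies on Net::FTP's retr method, which returns a readable data connection, and on IO::Uncompress::Gunzip, which accepts a filehandle.

```perl
use strict;
use warnings;
use Net::FTP;
use IO::Uncompress::Gunzip qw($GunzipError);

# Read one SNP id per line from a local file into a hash for fast lookups.
sub load_ids {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    my %ids;
    while (my $line = <$in>) {
        $line =~ s/\s+\z//;                 # strip newline and trailing blanks
        $ids{$line} = 1 if length $line;
    }
    close $in;
    return \%ids;
}

# Scan a tab-separated table line by line and return the sorted, unique
# ids from column $col (0-based) that also appear in %$wanted.
sub find_overlaps {
    my ($fh, $wanted, $col) = @_;
    my %hit;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        my @field = split /\t/, $line;
        next unless defined $field[$col];
        $hit{ $field[$col] } = 1 if exists $wanted->{ $field[$col] };
    }
    return [ sort keys %hit ];
}

# Open a gzipped file on an FTP server as a decompressing read handle.
# Nothing is saved locally; the data is gunzipped as it streams in.
sub open_remote_gz {
    my ($host, $dir, $file) = @_;
    my $ftp = Net::FTP->new($host) or die "Cannot connect to $host: $@";
    $ftp->login('anonymous', '-anonymous@')
        or die "Cannot login: ", $ftp->message;
    $ftp->binary;                            # gzip data must not be translated
    $ftp->cwd($dir) or die "Cannot cwd to $dir: ", $ftp->message;
    my $conn = $ftp->retr($file)
        or die "Cannot start transfer of $file: ", $ftp->message;
    my $z = IO::Uncompress::Gunzip->new($conn)
        or die "gunzip failed: $GunzipError";
    return ($z, $conn, $ftp);
}

# Example wiring (not run here). The table name and the column index 4 are
# assumptions; check them against the actual file you want to compare.
#   my $wanted = load_ids('my_snps.txt');
#   my ($z, $conn, $ftp) = open_remote_gz('hgdownload.cse.ucsc.edu',
#       '/goldenPath/hg19/database/', 'snpArrayIllumina1M.txt.gz');
#   my $hits = find_overlaps($z, $wanted, 4);
#   print "$_\n" for @$hits;
#   $conn->close;
#   $ftp->quit;
```

Memory use stays proportional to the 10,000 local ids, not to the size of the remote table, since only the local list is held in the hash and the remote rows are discarded as soon as they are checked.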