gurpreetsingh13 has asked for the wisdom of the Perl Monks concerning the following question:
After reading Randal's articles on minicpan, I decided to try it myself. But because of proxy issues on my office machines, I couldn't use the given modules directly.
So I wrote my own script, which works in the following steps:
1. Download all three index files (01mailrc.txt.gz, 02packages.details.txt.gz and 03modlist.data.gz) and save them in the required directories.
2. Read the packages.details file line by line.
3. Check whether each package's tar file exists at the specified location. If not, download it.
4. Finally, remove any old files that are no longer listed in the packages file.
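Steps 2 and 3 depend on reducing the packages file to one entry per distribution, because many package lines point to the same tarball. Here is a minimal pure-Perl sketch of that reduction, assuming the index has already been uncompressed and that its header ends at the first blank line. The helper name `unique_dist_paths` and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: given the lines of an uncompressed
# 02packages.details.txt, return the sorted, de-duplicated list of
# distribution paths (the last column of each data line).
sub unique_dist_paths {
    my @lines = @_;
    my %seen;
    my $in_header = 1;
    for my $line (@lines) {
        chomp $line;
        if ($in_header) {
            # Assumption: the header block ends at the first blank line.
            $in_header = 0 if $line =~ /^\s*$/;
            next;
        }
        # Each data line holds: package name, version, distribution path.
        my ($package, $version, $path) = split ' ', $line;
        next unless defined $path;
        $seen{$path} = 1;
    }
    my @sorted = sort keys %seen;
    return @sorted;
}

# Made-up sample: two packages share one tarball, so only two
# distributions need to be downloaded.
my @sample = (
    "File:         02packages.details.txt\n",
    "Line-Count:   3\n",
    "\n",
    "Foo::Bar   1.00  A/AB/ABC/Foo-Bar-1.00.tar.gz\n",
    "Foo::Baz   1.00  A/AB/ABC/Foo-Bar-1.00.tar.gz\n",
    "Qux        0.02  X/XY/XYZ/Qux-0.02.tar.gz\n",
);
my @paths = unique_dist_paths(@sample);
print scalar(@paths), " unique distributions\n";
print "$_\n" for @paths;
```

Finding the blank line, rather than skipping a fixed number of lines, avoids relying on the header always having the same length.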
The code is posted below. My problem is that the CPAN mirror directory is approaching 2.5 GB. I need to know whether that size is correct, or whether a problem in my code is making me download files multiple times or something similar. As mentioned in Randal's 2002 article, the main purpose of minicpan is to fit everything onto a single CD or other portable device. Please help me with this.

P.S. I am using Cygwin on a Windows machine.
use strict;
use warnings;
use utf8;
use bigint;
use Array::Utils qw(:all);

my $cpanPath     = "/cygdrive/d/Softwares/cpanmirror";
my $remoteMirror = "http://mirrors.neusoft.edu.cn/cpan";

## Get package file
`cd /home/Gurpreet && rm -rf 01mailrc.txt*`;
`cd /home/Gurpreet && rm -rf 02packages.details.txt*`;
`cd /home/Gurpreet && rm -rf 03modlist.data*`;
print "Deleted old package files\n";
print "=" x 30, "\n";

`cd /home/Gurpreet/ && wget $remoteMirror/authors/01mailrc.txt.gz`;
`cp -f /home/Gurpreet/01mailrc.txt.gz $cpanPath/authors/`;
`cd /home/Gurpreet/ && wget $remoteMirror/modules/02packages.details.txt.gz`;
`cp -f /home/Gurpreet/02packages.details.txt.gz $cpanPath/modules/`;
`cd /home/Gurpreet/ && wget $remoteMirror/modules/03modlist.data.gz`;
`cp -f /home/Gurpreet/03modlist.data.gz $cpanPath/modules/`;
print "Updated package files \n";
print "=" x 30, "\n";

#`cd /home/Gurpreet && gunzip 02packages.details.txt.gz`;
print "Extracted package file \n";
print "=" x 30, "\n";

# Get total files excluding top lines in package file
my $totalFiles = `cat 02packages.details.txt|wc -l`;
chomp($totalFiles);
$totalFiles = $totalFiles - 9;
print "Total files = $totalFiles\n";
print "=" x 30, "\n";

# Get all package names
my @packageNames =
  `cat 02packages.details.txt|tail -$totalFiles|rev|cut -d " " -f1|rev|sort|uniq`;
chomp($_) foreach (@packageNames);
print "Total unique packages = ", scalar(@packageNames), "\n";
print "=" x 30, "\n";

# Start with numbers
print "Enter starting point of download\n";
chomp( my $startPoint = <STDIN> );
print "Enter ending point of download\n";
chomp( my $endPoint = <STDIN> );

# Start getting package files
print "Starting update of cpanmirror. Press enter\n";
print "=" x 30, "\n";
<STDIN>;
my $ctr = $startPoint;
foreach my $val ( @packageNames[ $startPoint .. $endPoint ] ) {
    my @vals        = split /\//, $val;
    my $packageName = $vals[ scalar(@vals) - 1 ];
    my $dirName     = join "/", @vals[ 0 .. scalar(@vals) - 2 ];
    print "$ctr)Dir=$dirName Package=$packageName\n";
    `mkdir -p $cpanPath/authors/id/$dirName` unless -d $dirName;
    `cd $cpanPath/authors/id/$dirName && wget $remoteMirror/authors/id/$dirName/$packageName`
      unless -e "$cpanPath/authors/id/$dirName/$packageName";
    $ctr++;
}

# Now get the list of old files and delete them
print "Print Y to delete all extra files\n";
chomp( my $todo = <STDIN> );
my @allFileNames = `cd $cpanPath/authors/id && find -type f|grep -v CHECKSUMS`;
foreach my $existingFile (@allFileNames) {
    chomp($existingFile);
    my $exactName = substr $existingFile, 2;
    unless ( $exactName ~~ @packageNames ) {
        print "Deleting $cpanPath/authors/id/$exactName\n";
        `rm -rf $cpanPath/authors/id/$exactName` if $todo eq "Y" || $todo eq "y";
    }
}
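
One way to check whether the directory size comes from indexed distributions or from leftover and duplicate files is to total the bytes on disk separately for files that appear in the index and files that do not. The sketch below uses only core modules (File::Find, File::Temp, File::Path). The helper name `mirror_report` and the temporary directory with made-up files are illustrative assumptions; against a real mirror you would pass something like "$cpanPath/authors/id" (without a trailing slash) and the unique paths from the packages file.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical helper: walk $root and compare the files on disk with the
# distribution paths listed in the index. Returns the bytes used by indexed
# files, the bytes used by extra files, and the sorted list of extra files.
# CHECKSUMS files are ignored, as in the script above.
sub mirror_report {
    my ($root, @expected) = @_;
    my %wanted = map { $_ => 1 } @expected;
    my ($indexed_bytes, $extra_bytes, @extras) = (0, 0);
    find({
        no_chdir => 1,
        wanted   => sub {
            my $full = $File::Find::name;
            return unless -f $full;
            return if $full =~ m{/CHECKSUMS$};
            # Path relative to $root, e.g. "A/AB/ABC/Foo-1.00.tar.gz".
            my $rel  = substr $full, length($root) + 1;
            my $size = (-s $full) || 0;
            if ($wanted{$rel}) {
                $indexed_bytes += $size;
            }
            else {
                $extra_bytes += $size;
                push @extras, $rel;
            }
        },
    }, $root);
    my @sorted = sort @extras;
    return ($indexed_bytes, $extra_bytes, @sorted);
}

# Demo on a throwaway directory: one indexed tarball, one stale tarball,
# and one CHECKSUMS file (all sizes are made up).
my $root = tempdir(CLEANUP => 1);
make_path("$root/A/AB/ABC");
for my $spec (['Foo-1.00.tar.gz', 10], ['Foo-0.90.tar.gz', 5], ['CHECKSUMS', 3]) {
    my ($name, $bytes) = @$spec;
    open my $fh, '>', "$root/A/AB/ABC/$name" or die "Cannot write $name: $!";
    print {$fh} 'x' x $bytes;
    close $fh or die "Cannot close $name: $!";
}
my ($indexed, $extra, @extras) = mirror_report($root, 'A/AB/ABC/Foo-1.00.tar.gz');
print "Indexed: $indexed bytes, extra: $extra bytes\n";
print "Extra file: $_\n" for @extras;
```

If the extra-bytes figure is close to zero, the mirror holds only what the index lists, and the total size reflects the index itself rather than a bug in the download loop.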

Replies are listed 'Best First'.

Re: Minicpan size issue
by marto (Cardinal) on Nov 02, 2013 at 12:01 UTC

Re: Minicpan size issue
by keszler (Priest) on Nov 02, 2013 at 16:49 UTC

Re: Minicpan size issue (2gb)
by Anonymous Monk on Nov 02, 2013 at 10:25 UTC

Re: Minicpan size issue
by glasswalk3r (Friar) on Nov 21, 2017 at 14:54 UTC