Hmmm ... you're complaining that it gets very slow. But a straightforward implementation shouldn't take much time at all except for the actual creation of the appended files. Since you're not presenting any code, I can't tell if you've got an error in your program that causes the low performance or not.
Try commenting out the part of your program that actually creates the appended data files and measure how long it takes to run with 10000 file numbers. If it runs quickly, then the performance issue is due to the amount of data you're reading and writing. In that case, you may want to use tricks like putting your output files on a different disk drive than your input files to reduce your I/O time.
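One way to "comment out" the file creation without editing the program each time is a dry-run switch. The following is only a minimal sketch, since your code isn't posted: the name `append_files` and its `$dry_run` parameter are hypothetical, not taken from your program.

```perl
use strict;
use warnings;

# Append the contents of every file in @$inputs to $out_path.
# With $dry_run set, skip all file I/O so that only the bookkeeping
# (directory scan, grouping, looping) shows up in the timing.
# Returns the number of input files handled.
sub append_files {
    my ($out_path, $inputs, $dry_run) = @_;
    return scalar @$inputs if $dry_run;

    open(my $out, '>>', $out_path) or die "Can't open $out_path: $!";
    for my $in_path (@$inputs) {
        open(my $in, '<', $in_path) or die "Can't open $in_path: $!";
        print {$out} $_ while <$in>;    # copy line by line
        close $in;
    }
    close $out or die "Can't close $out_path: $!";
    return scalar @$inputs;
}
```

Timing one run with `$dry_run` set and one without shows how much of the runtime is pure I/O.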
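If, on the other hand, it still takes a long time to run with the file creation commented out, then I'd expect either a problem with the algorithm that splits up your filenames, or a filesystem that is slow at listing a directory holding that many files.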
You can differentiate between these two cases with a script that simply reads all the filenames from the directory and does nothing with them. If that script runs quickly, then you have a problem with your algorithm that splits up your filenames. If it runs very slowly, then you may want to change the filesystem you're using or perhaps partition files up into different subdirectories.
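A script of the "read the filenames and do nothing" kind could look something like this. It is a sketch only: `count_dir_entries` is a name I made up for illustration, and the directory defaults to the current one if you don't pass an argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes ();

# Read every entry in a directory and do nothing with it except count it.
# Returns the number of entries, not counting "." and "..".
sub count_dir_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    my $count = 0;
    while (defined(my $name = readdir $dh)) {
        next if $name eq '.' or $name eq '..';
        $count++;
    }
    closedir $dh;
    return $count;
}

my $dir     = @ARGV ? $ARGV[0] : '.';
my $start   = Time::HiRes::time();
my $entries = count_dir_entries($dir);
printf "%d entries read in %.3f seconds\n",
    $entries, Time::HiRes::time() - $start;
```

If this finishes in well under a second on the directory that gives you trouble, the directory scan isn't your bottleneck.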
I did a quick test: I generated roughly 50,000 files and then grouped them by their numeric prefix, then deleted all of the files, like so:
    roboticus@Boink:~/funkytest$ ls
    genfiles.pl  groupfiles.pl
    roboticus@Boink:~/funkytest$ time ./genfiles.pl

    real    0m4.937s
    user    0m1.644s
    sys     0m3.292s
    roboticus@Boink:~/funkytest$ ls | wc -l
    49993
    roboticus@Boink:~/funkytest$ time ./groupfiles.pl
    6712: 6712_WRW, 6712_DIK, 6712_FRB
    8563: 8563_FHL, 8563_AAE, 8563_TSL, 8563_LCU
    5006: 5006_SLA, 5006_PZK, 5006_PUB, 5006_FCK
    8434: 8434_HNX, 8434_HPB, 8434_YED, 8434_SCB, 8434_KBS, 8434_CEH, 8434_JCH, 8434_NVN, 8434_VPN, 8434_GFM, 8434_BNJ
    3509: 3509_EAY, 3509_WNU, 3509_MUI, 3509_NPX, 3509_LHX
    7652: 7652_IMC, 7652_GMN
    4863: 4863_MTN, 4863_RGD, 4863_BFT, 4863_LSF, 4863_KNJ, 4863_JGE
    Files: 49991, Groups: 9922

    real    0m0.736s
    user    0m0.664s
    sys     0m0.080s
    roboticus@Boink:~/funkytest$ time rm {0,1,2,3,4,5,6,7,8,9}*

    real    0m6.451s
    user    0m3.044s
    sys     0m3.212s
    roboticus@Boink:~/funkytest$
As you can see, it takes little time to split the files into groups (on my machine, anyway). If the files had data in them and I did the concatenation you mention, then the runtime would be totally dominated by the act of making the concatenated files.
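For reference, grouping by numeric prefix needs little more than a hash of arrays. The sketch below is not the groupfiles.pl from the test run, which isn't shown in this post. It assumes filenames of the form digits, underscore, suffix (like 6712_WRW), and `group_by_prefix` is a hypothetical name. Unlike the run above, it prints the groups sorted by prefix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group filenames of the form "<digits>_<suffix>" by their numeric prefix.
# Returns a hash reference: prefix => array reference of filenames.
sub group_by_prefix {
    my @names = @_;
    my %groups;
    for my $name (@names) {
        # Skip anything that doesn't look like "1234_ABC" (the scripts, ".", "..").
        next unless $name =~ /^(\d+)_/;
        push @{ $groups{$1} }, $name;
    }
    return \%groups;
}

my $dir = @ARGV ? $ARGV[0] : '.';
opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
my @names = readdir $dh;
closedir $dh;

my $groups = group_by_prefix(@names);
my $files  = 0;
for my $prefix (sort { $a <=> $b } keys %$groups) {
    my @members = @{ $groups->{$prefix} };
    $files += @members;
    print "$prefix: ", join(', ', @members), "\n";
}
printf "Files: %d, Groups: %d\n", $files, scalar keys %$groups;
```

Each filename is touched once and the hash lookup is constant time on average, so the grouping step should scale roughly linearly with the number of files.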
...roboticus