If I understand correctly, you have a file with a number of records, each one beginning with a line:

INPUT SEQUENCE=XXX

and ending with:

_________________________________________________________________________________________________________

You want to split the whole file into smaller files, each containing N of these records.

The easiest way I can imagine doing this is using Tie::File. Try this script:

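#!/usr/bin/perl

use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

die "Usage: $0 <input_file> <recs x file>\n" if @ARGV != 2;
my ($file, $recs_X_file) = @ARGV;
die "<recs x file> must be a positive integer\n"
    unless $recs_X_file =~ /^[1-9]\d*$/;

# Each array element is one record: everything up to and including
# the line of underscores that closes it.
my $recsep = "_________________________________________________________________________________________________________";

# Opened read-only, because the script never needs to modify the input.
tie(my @arr, 'Tie::File', $file,
    recsep => $recsep, autochomp => 0, mode => O_RDONLY)
    or die "Cannot tie $file: $!";

my $from = 0;
while ($from <= $#arr) {
  # The last batch may hold fewer than $recs_X_file records.
  my $to = $from + $recs_X_file - 1;
  $to = $#arr if $to > $#arr;
  my $ofile = "file.$from-$to";
  open my $out, ">", $ofile or die "Cannot open $ofile: $!";
  print "printing records $from to $to in $ofile\n";
  print {$out} @arr[$from..$to];
  close $out or die "Cannot close $ofile: $!";
  $from = $to + 1;
}
untie @arr;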

Tie::File presents the file as an array, and the custom recsep makes each array element one logical record instead of one line. The script then walks the array N records at a time and writes each batch to its own sub-file.
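
To see the record-per-element mapping in isolation, here is a minimal sketch. It assumes a short stand-in separator ("----\n") and a throwaway temp file with three made-up records; neither comes from the original data.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp 'tempfile';
use Tie::File;

# Hypothetical sample data: three records, each closed by a short
# stand-in separator instead of the long line of underscores.
my $sep = "----\n";
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "INPUT SEQUENCE=$_\nsome data\n$sep" for 1 .. 3;
close $fh or die "Cannot close $path: $!";

# With recsep set, one array element is one whole record, not one line.
# autochomp => 0 keeps the separator at the end of each element.
tie(my @records, 'Tie::File', $path,
    recsep => $sep, autochomp => 0, mode => O_RDONLY)
    or die "Cannot tie $path: $!";

my $count = @records;      # number of records, not number of lines
my $first = $records[0];   # first record, separator included
print "$count records\n";
print $first;
untie @records;
```

Run as is, it should report 3 records and then print the first one, separator included.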

For example, if you call the script "split_records.pl" you can invoke it with:

perl split_records.pl inputfile 10

Outputs:

printing records 0 to 9 in file.0-9
printing records 10 to 19 in file.10-19
printing records 20 to 29 in file.20-29
... and so on (depending on the number of records in the original file)

The files containing the records are, of course, created as well.
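
If the number of records is not a multiple of N, the last file holds fewer records. Here is the range arithmetic on its own, as a rough sketch: chunk_ranges is a hypothetical helper (not part of Tie::File), and it cuts the final range at the last record.

```perl
use strict;
use warnings;

# Hypothetical helper: return the [from, to] index pairs needed to split
# $total records into batches of $per_file, cutting the last batch short.
sub chunk_ranges {
    my ($total, $per_file) = @_;
    die "batch size must be positive\n" if $per_file < 1;
    my @ranges;
    for (my $from = 0; $from < $total; $from += $per_file) {
        my $to = $from + $per_file - 1;
        $to = $total - 1 if $to > $total - 1;   # clamp the final batch
        push @ranges, [$from, $to];
    }
    return @ranges;
}

# Made-up example: 25 records, 10 per file.
my @ranges = chunk_ranges(25, 10);
print "file.$_->[0]-$_->[1]\n" for @ranges;
```

For 25 records and N=10 this prints file.0-9, file.10-19 and file.20-24.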

Hope this helps!

citromatik

In reply to Re: split of files by citromatik
in thread split of files by boby
