Win has asked for the wisdom of the Perl Monks concerning the following question:

I remember that there was a Linux command that split files on a given expression. I am now working with MS-DOS and would like to split files. My guess is that I need a Perl solution. Is there a one-liner for this job?

Re: Splitting files on regex
by davidrw (Prior) on Nov 02, 2005 at 14:52 UTC
    What was the Linux command? There is split, but that's for splitting into chunks of a certain line or byte size. Can you give a sample expression and file? For one-liners, look at perlrun, and specifically at the -e, -n, -p, and -a flags. Here's a (WARNING -- very inefficient) quick example:
    perl -ne "/foo(\d+)/ && do { open FILE, '>>', 'split'.$1.'.txt'; print FILE $_; close FILE }" blah.txt

      {rummages through dusty portions of memory} I think the command was csplit, which would split on a regex. There is probably a Windows version of it at the GnuWin32 site.

      emc

      Hi davidrw,

      I'm afraid I cannot see how this would work. If the delimiter can appear anywhere in a line, and the text between delimiters can span several lines, it does not work for me. Say I want to split a file like the following at 'abc':

      1111111111111abc2222222222 33333abc333333333 444444444444abc444_ 5555abc5555555555555555 6666666666666666666 777777777abc77777777 8888888888888888888abc00000000000000

      or am I doing something completely wrong here?
      Regards, svenXY

        Can you define "split a file"? My example above would take any line with "fooN" in it and append it to the file 'splitN.txt'. Since the OP didn't specify, I assumed that by "split" he meant "send certain lines to certain files". From your sample data it seems that by "split" you mean "create separate files for each 'column' of data". A quick & dirty (and again, WARNING, very inefficient) way to do that would be:
        perl -lne "chomp; @x = split /abc/, $_; do { open FILE, '>>', 'col'.$_.'.txt'; print FILE $x[$_]; close FILE } for 0 .. $#x" blah.txt
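
Both one-liners above reopen the output file for every line, which is the inefficiency they warn about. Here is a minimal sketch of a line-oriented variant that opens each output file only once and starts a new file whenever a line matches the pattern, roughly in the spirit of csplit. The helper names chunk_lines and write_chunks, the script name and the 'part' prefix are made up for illustration; nothing in the thread specifies them.

```perl
use strict;
use warnings;

# Hypothetical helper: start a new chunk at every line matching $re.
# Returns one array ref per chunk; a matching line opens its chunk.
sub chunk_lines {
    my ($re, @lines) = @_;
    my @chunks = ([]);
    for my $line (@lines) {
        # Begin a new chunk on a match, unless the current one is still empty.
        push @chunks, [] if $line =~ $re && @{ $chunks[-1] };
        push @{ $chunks[-1] }, $line;
    }
    return grep { @$_ } @chunks;    # drop the empty chunk left by empty input
}

# Hypothetical helper: write chunk N to "<prefix>NN.txt", one open per file.
sub write_chunks {
    my ($prefix, @chunks) = @_;
    my @names;
    for my $n (0 .. $#chunks) {
        my $name = sprintf '%s%02d.txt', $prefix, $n;
        open my $out, '>', $name or die "Cannot write $name: $!";
        print {$out} @{ $chunks[$n] };
        close $out or die "Cannot close $name: $!";
        push @names, $name;
    }
    return @names;
}

# Assumed usage: perl csplit_sketch.pl PATTERN FILE
if (@ARGV == 2) {
    my ($pattern, $file) = @ARGV;
    open my $in, '<', $file or die "Cannot read $file: $!";
    my @chunks = chunk_lines(qr/$pattern/, <$in>);
    close $in;
    print "$_\n" for write_chunks('part', @chunks);
}
```

This only covers the "split at matching lines" reading of the question. It reads the whole input into memory, which should be fine for modest files but is a simplification.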
Re: Splitting files on regex
by jesuashok (Curate) on Nov 02, 2005 at 14:47 UTC
Re: Splitting files on regex
by svenXY (Deacon) on Nov 02, 2005 at 14:55 UTC
    Hi,
    this would do the job (splitme is the filename and abc is, here, the delimiter):
    perl -e'open(IN, $ARGV[0]);my $file = do {local $/;<IN>};my @parts=split(/$ARGV[1]/, $file);for (0..$#parts){open(OUT, ">$ARGV[0].$_");print OUT $parts[$_];close OUT}' splitme abc

    Regards,
    svenXY
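
For anyone who prefers a script file over a one-liner, the slurp-and-split approach above can be sketched as follows, with lexical filehandles and error checks added. The helper name split_text and the script name are hypothetical. A script file also sidesteps shell quoting, which matters because cmd.exe does not treat single quotes as quoting characters.

```perl
use strict;
use warnings;

# Hypothetical helper: split the whole text on a regex.
# The delimiter itself is dropped, and split discards trailing empty parts.
sub split_text {
    my ($re, $text) = @_;
    return split /$re/, $text;
}

# Assumed usage: perl split_sketch.pl FILE PATTERN
if (@ARGV == 2) {
    my ($file, $pattern) = @ARGV;

    open my $in, '<', $file or die "Cannot read $file: $!";
    my $text = do { local $/; <$in> };    # undef $/ reads the file in one go
    close $in;
    $text = '' unless defined $text;

    my @parts = split_text(qr/$pattern/, $text);
    for my $n (0 .. $#parts) {
        my $name = "$file.$n";            # same naming as the one-liner: FILE.0, FILE.1, ...
        open my $out, '>', $name or die "Cannot write $name: $!";
        print {$out} $parts[$n];
        close $out or die "Cannot close $name: $!";
    }
}
```

Because the pattern is compiled as a regex, a literal delimiter containing metacharacters would need escaping (for example with quotemeta) before being passed in.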