Re: how to split a file.txt in multiple text files

You can use $/ :)

$ ls -l zzz
-rw-rw-rw- 1 tux users 60892 Feb 12 15:54 zzz
$ perl -CS -Mautodie -wE'$/=\3000;my$i="0000";while(<>){open my $fh, ">:encoding(utf-8)", "zz".$i++;print $fh $_}' < zzz
$ ls -l zz0*
-rw-rw-rw- 1 tux users 3624 Feb 12 15:58 zz0000
-rw-rw-rw- 1 tux users 3681 Feb 12 15:58 zz0001
-rw-rw-rw- 1 tux users 3661 Feb 12 15:58 zz0002
-rw-rw-rw- 1 tux users 3655 Feb 12 15:58 zz0003
-rw-rw-rw- 1 tux users 3652 Feb 12 15:58 zz0004
-rw-rw-rw- 1 tux users 3634 Feb 12 15:58 zz0005
-rw-rw-rw- 1 tux users 3640 Feb 12 15:58 zz0006
-rw-rw-rw- 1 tux users 3646 Feb 12 15:58 zz0007
-rw-rw-rw- 1 tux users 3631 Feb 12 15:58 zz0008
-rw-rw-rw- 1 tux users 3631 Feb 12 15:58 zz0009
-rw-rw-rw- 1 tux users 3692 Feb 12 15:58 zz0010
-rw-rw-rw- 1 tux users 3659 Feb 12 15:58 zz0011
-rw-rw-rw- 1 tux users 3647 Feb 12 15:58 zz0012
-rw-rw-rw- 1 tux users 3648 Feb 12 15:58 zz0013
-rw-rw-rw- 1 tux users 3634 Feb 12 15:58 zz0014
-rw-rw-rw- 1 tux users 3643 Feb 12 15:58 zz0015
-rw-rw-rw- 1 tux users 2514 Feb 12 15:58 zz0016
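For readers who have not met this use of `$/` before: assigning a reference to an integer to `$/` makes the readline operator return fixed-size records of up to that size instead of lines. The following is only a minimal sketch of that behaviour, not the command used above; the in-memory sample string and the record size of 4 are assumptions chosen to keep the output short.

```perl
use strict;
use warnings;

# Hypothetical sample data: ten ASCII characters held in memory.
my $data = "abcdefghij";
open my $in, "<", \$data or die "open: $!";

my @chunks;
{
    # A reference to an integer switches readline from line mode to
    # fixed-size record mode; "local" restores the old value afterwards.
    local $/ = \4;
    while (my $rec = <$in>) {
        push @chunks, $rec;
    }
}
close $in;

print join("|", @chunks), "\n";    # prints abcd|efgh|ij
```

The last record is simply whatever is left over, which is why the final file in the listing above is smaller than the others.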

Enjoy, Have FUN! H.Merijn


Re^2: how to split a file.txt in multiple text files by saulnier (Initiate) on Feb 14, 2019 at 14:51 UTC
Thank you tux. Your script works well, but I also get a series of warnings such as:

utf8 "\xCE" does not map to Unicode at split2.pl line 9, <> chunk 3.
utf8 "\x94" does not map to Unicode at split2.pl line 9, <> chunk 4.
Wide character in print at split2.pl line 9, <> chunk 2.
...

Above all, many of the files created are filled with unintelligible characters instead of fragments of my Greek text. Any idea?
Re^3: how to split a file.txt in multiple text files by Tux (Canon) on Feb 14, 2019 at 16:24 UTC
What is your OS?
What is your perl version? (`perl -v`)
Did you invoke the script with the required `-CS` command-line option?

`$ perl -CS split2.pl < inputfile`

My example was run on UTF-8 encoded files that contained quite a few characters outside of the `iso-8859-1` range, so I would have seen the same warnings if my example were seriously flawed.

Is your data secret, or can it be shared? If it can, some of us might want to download it (in a zip) to check.

As you converted my command-line example to a script, maybe it would be a good idea to show what the script looks like. You might have missed a crucial issue. It might look a bit like this:

use strict;
use warnings;
use autodie;

local $/ = \3000;
my $i = "0000";
while (<>) {
    my $fn = "zz" . $i++;
    open my $fh, ">:encoding(utf-8)", $fn or die "$fn: $!";
    print $fh $_;
    close $fh;
}

Enjoy, Have FUN! H.Merijn
Re^4: how to split a file.txt in multiple text files by saulnier (Initiate) on Feb 14, 2019 at 21:18 UTC
OS: Windows 10 Home
Perl: perl 5, version 14, subversion 2 (v5.14.2) built for MSWin32-x86-multi-thread

This is my script split2.pl:

use strict;
use warnings;
use autodie;
$/=\3000;
my$i="000";
while(<>){open my $fh, ">:encoding(utf-8)", "input".$i++.".txt"; print $fh $_; close $fh;}

If I invoke the script this way:

`perl -CS split2.pl <input.txt`

I get this message:

utf8 "\xE1" does not map to Unicode at split2.pl line 11, <> chunk 2.
Close with partial character at (eval 21) line 67, <> chunk 2.

and only the first fragment, "input000.txt", is created.

If I run the script without `-CS`, there is no warning message and all the files are created, but they contain unintelligible characters and not my split Greek text.

I can share my Greek text (346 kB), but I do not know exactly how to do that from here.
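A possible explanation, not confirmed in the posts shown here: the perl 5.18 release notes (perl5180delta) describe `readline` with `$/ = \N` as reading N characters rather than N bytes from that release on. If that applies, a 3000-byte record on v5.14.2 can end in the middle of a multi-byte Greek letter, which would fit the "does not map to Unicode" and "partial character" messages. One way to sidestep the question, sketched below under the assumption that the input is UTF-8 and fits in memory, is to decode the whole file first and cut the decoded string with `substr`, which counts characters. The temporary directory, the ten-letter sample text and the chunk size of 4 are hypothetical stand-ins for the real 346 kB file and the size of 3000.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical sample input: ten Greek letters (alpha .. kappa) stored as UTF-8.
my $input = "$dir/input.txt";
open my $out, ">:encoding(UTF-8)", $input or die "$input: $!";
print {$out} join "", map { chr } 0x3b1 .. 0x3ba;
close $out or die "$input: $!";

# Slurp the file through a decoding layer, so $text holds characters, not bytes.
open my $in, "<:encoding(UTF-8)", $input or die "$input: $!";
my $text = do { local $/; <$in> };
close $in;

# Cut the decoded text into pieces of at most $size characters each.
my $size = 4;        # the thread uses 3000
my $i    = "000";    # string increment keeps the leading zeros
my @files;
for (my $pos = 0; $pos < length($text); $pos += $size) {
    my $fn = "$dir/input" . $i++ . ".txt";
    open my $fh, ">:encoding(UTF-8)", $fn or die "$fn: $!";
    print {$fh} substr($text, $pos, $size);
    close $fh or die "$fn: $!";
    push @files, $fn;
}

printf "%s: %d bytes\n", $_, -s $_ for @files;
```

Because the layers are set explicitly on each handle, this sketch does not depend on `-CS`. Each Greek letter here takes two bytes in UTF-8, so the three files come out at 8, 8 and 4 bytes while holding 4, 4 and 2 characters.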
Re^5: how to split a file.txt in multiple text files by choroba (Cardinal) on Feb 14, 2019 at 21:52 UTC
Re^5: how to split a file.txt in multiple text files by saulnier (Initiate) on Feb 15, 2019 at 07:34 UTC
Re^6: how to split a file.txt in multiple text files by Tux (Canon) on Feb 15, 2019 at 09:51 UTC