create separate output files based on the matched values

tariqahsan has asked for the wisdom of the Perl Monks concerning the following question:

Re: create separate output files based on the matched values by ikegami (Patriarch) on Sep 27, 2005 at 21:31 UTC
unpack is great for fixed-width fields.

```perl
# @fields = unpack('a5 a6 a7 a5', $_);
# @fields = unpack('a4 x1 a5 x1 a6 x1 a5', $_);

open(my $fh_in, '<', ...)
   or die("Can't open input file: $!\n");

while (<$fh_in>) {
   chomp;
   my ($file_name, $rest) = unpack('a4 x1 a*', $_);
   $file_name .= '.out';
   open(my $fh_out, '>>', $file_name)
      or die("Can't open $file_name for append: $!\n");
   print $fh_out ("$rest\n");
}
```

By the way, your data is not fixed width. The 5th record is shorter.
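To see what those templates do, here is a minimal sketch run against one record from the sample data posted elsewhere in this thread. The variable names `$line`, `$key` and `@fields` are only for illustration, and the field widths are assumed from that one record.

```perl
use strict;
use warnings;

# One record from the sample data in this thread.
my $line = 'T001 Test1 012354 Abcde';

# 'a4' reads 4 bytes, 'x1' skips 1 byte, 'a*' takes whatever is left.
my ($key, $rest) = unpack('a4 x1 a*', $line);
print "$key|$rest\n";

# Every field on its own; the widths fit this record only.
my @fields = unpack('a4 x1 a5 x1 a6 x1 a5', $line);
print join('|', @fields), "\n";
```

The first template is enough for the task, since only the leading key decides the output file. The second one breaks on any record whose third field is not six characters wide, which is the point made about the 5th record.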
Re: create separate output files based on the matched values by GrandFather (Saint) on Sep 27, 2005 at 21:35 UTC
The following does the trick. Note that the created files use Windows line endings rather than Mac or *nix style.

```perl
use warnings;
use strict;

my %files;   # output filehandles, keyed by the first field

while (<DATA>) {
    # First word names the file; the rest of the line is the data.
    my ($name, $data) = /^(\w+)\s+(.*)/;
    last if ! defined $data || ! length $data;
    open $files{$name}, '>', "$name.out" if (! defined $files{$name});
    syswrite $files{$name}, $data . "\r\n";
}

close $files{$_} for (keys %files);

__DATA__
T001 Test1 012354 Abcde
T001 Test1 013456 bcdef
T002 Test2 024567 xxxxx
T001 Test1 012354 yyyyy
T003 Test3 02345 cdefg
T002 Test2 000000 56789
```

Perl is Huffman encoded by design.
Re^2: create separate output files based on the matched values by Skeeve (Parson) on Sep 27, 2005 at 23:12 UTC
Are you sure you will get CR LF at the line end on every system? Read `perldoc perlport`:

> In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses "\012", one type of DOSish I/O uses "\015\012", and Mac OS uses "\015".
>
> Perl uses "\n" to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, "\n" always means "\015". In DOSish perls, "\n" usually means "\012", but when accessing a file in "text" mode, STDIO translates it to (or from) "\015\012", depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. "\015\012" is commonly referred to as CRLF.

`$\=~s;s.;q^|D9JYJ^^qq^\//\\\///^;ex;print`
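If exact CR LF bytes really are wanted on every platform, one option is to switch off newline translation with the `:raw` layer and write the octal escapes from the perlport quote explicitly. The sketch below is an assumption about what such a check could look like, not something proposed in this thread. The temporary file from File::Temp is only a stand-in for a real output file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration (removed at exit).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# ':raw' disables newline translation, so the bytes written are
# exactly the bytes given, whatever "\n" means locally.
open(my $out, '>:raw', $path) or die "Can't open $path: $!";
print $out "T001 data", "\015\012";
close $out or die "Can't close $path: $!";

# Read the file back untranslated and dump the last two bytes in hex.
open(my $in, '<:raw', $path) or die "Can't open $path: $!";
my $bytes = do { local $/; <$in> };
close $in;

print join(' ', map { sprintf '%02X', ord } split //, substr($bytes, -2)), "\n";
```

Writing a plain "\n" to a handle opened without `:raw` leaves the choice of line ending to the platform, which is usually what you want for text files meant to be read on the same machine.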
Re^3: create separate output files based on the matched values by GrandFather (Saint) on Sep 27, 2005 at 23:44 UTC
I don't think that the line endings are a particular problem for the OP. The trick is generating the various output files. However, replacing the `syswrite` with:

```perl
my $fh = $files{$name};
print $fh "$data\n";
```

fixes the problem. What do you expect from a reply written before morning coffee :).
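The copy into `$fh` is needed because `print` only accepts a bare handle or a simple scalar variable in the filehandle slot. A hash element has to be wrapped in a block instead. A minimal sketch of both forms, using an in-memory file (`$buffer`, an assumption for the demo) so that nothing is written to disk:

```perl
use strict;
use warnings;

my $buffer = '';
my %files;
open($files{T001}, '>', \$buffer)
    or die "Can't open in-memory file: $!";

# Form used in the reply: copy the handle into a plain scalar first.
my $fh = $files{T001};
print $fh "first\n";

# Equivalent form: a block that returns the handle.
print { $files{T001} } "second\n";

close $files{T001};
print $buffer;
```

Both lines end up in the same file in the order they were printed.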
Re^2: create separate output files based on the matched values by tariqahsan (Beadle) on Nov 10, 2005 at 19:02 UTC
What if I want to put a header line for each of the generated files? What's the best way to do this using this script? Thanks for the help!
Re^3: create separate output files based on the matched values by GrandFather (Saint) on Nov 10, 2005 at 19:22 UTC
Change:

```perl
open $files{$name}, '>', "$name.out" if (! defined $files{$name});
```

to

```perl
if (! defined $files{$name}) {
    open $files{$name}, '>', "$name.out";
    syswrite $files{$name}, "This is a header line for file $name.out\r\n";
}
```
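For anyone who has already switched the data lines to `print`, here is a sketch of the same idea with `print` used for the header as well, since the `syswrite` documentation warns that mixing it with `print` on one handle may cause confusion. The scratch directory, the `@records` list and the error messages are assumptions made for the demonstration; the header text is the one from the reply.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);   # scratch directory for the demo
my @records = (
    'T001 Test1 012354 Abcde',
    'T002 Test2 024567 xxxxx',
    'T001 Test1 012354 yyyyy',
);

my %files;
for my $record (@records) {
    my ($name, $data) = $record =~ /^(\w+)\s+(.*)/ or next;

    # First time this key is seen: create the file and write its header.
    if (!defined $files{$name}) {
        open($files{$name}, '>', "$dir/$name.out")
            or die "Can't open $dir/$name.out: $!";
        print { $files{$name} } "This is a header line for file $name.out\n";
    }
    print { $files{$name} } "$data\n";
}

for my $name (keys %files) {
    close $files{$name} or die "Can't close $name.out: $!";
}
```

Each file gets its header exactly once, because the header is written in the same branch that opens the file.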
Re: create separate output files based on the matched values by izut (Chaplain) on Sep 27, 2005 at 21:43 UTC
If I got your specs correctly, and your data is separated by spaces, you can split the line and use a hash to store the opened filehandles. This code should work:

```perl
open my $fh_input, "<", "input.txt" or die "$!";
my %fh = ();
while (<$fh_input>) {
    chomp;
    my ($filename, $content) = split /\s+/, $_, 2;
    my $fh = undef;
    $fh = $fh{$filename} if defined $fh{$filename};
    if (not defined $fh{$filename}) {
        open $fh, ">", "$filename.out" or die "$!";
        $fh{$filename} = $fh;
    }
    else {
        $fh = $fh{$filename};
    }
    print $fh $content, "\n";
}
foreach (keys %fh) {
    close $fh{$_};
}
```

Update: Updated `split`. Thanks, Skeeve.

Igor S. Lopes - izut
surrender to perl. your code, your rules.
Re^2: create separate output files based on the matched values by Skeeve (Parson) on Sep 27, 2005 at 23:16 UTC
Haven't looked at all of your code, but it fails here:

```perl
my ($filename, $content) = split /\s+/;
```

You will lose everything but the first 2 columns. You should have used:

```perl
my ($filename, $content) = split /\s+/,$_,2;
```
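A quick way to see the difference, using one record from the sample data in this thread (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $line = 'T001 Test1 012354 Abcde';

# No limit: the line is cut at every run of whitespace, and the
# two-variable assignment keeps only the first two pieces.
my ($name_a, $content_a) = split /\s+/, $line;

# Limit of 2: at most two pieces, so the second holds the whole rest.
my ($name_b, $content_b) = split /\s+/, $line, 2;

print "without limit: $content_a\n";
print "with limit 2:  $content_b\n";
```

With the limit, `$content_b` still contains the inner spaces, so the output line is written exactly as it appeared in the input.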