G'day perl_boy,
[A couple of notes on presentation: It's good that you've put code within <code>...</code> tags; please also do the same for data and program output (e.g. error messages) so that we can see a verbatim copy of what you're seeing — HTML can modify what you write, e.g. by collapsing whitespace into a single space, which can make a huge difference in many cases. Please also "linkify" URLs; in this case, changing URL to [URL] would have sufficed; again, this helps us to help you (see "What shortcuts can I use for linking to other information?" for more details about that).]
You've shown your input data as having the same number of characters in each column (columns 1, 2 & 3 have 3 characters: foo, bar, baz; columns 4 & 5 have 4 characters: booz & qaaz; and so on). This could be a realistic representation of your data; for instance, order numbers, product codes, client IDs, and so on, are likely to have the same lengths. If this is the case, the following is a much simpler solution.
#!/usr/bin/env perl

use 5.014;
use warnings;
use autodie;

my $infile = 'pm_11140114_tab_align_even.dat';
my $outfile = 'pm_11140114_tab_align_even.out';

{
    open my $in_fh, '<', $infile;
    open my $out_fh, '>', $outfile;

    while (<$in_fh>) {
        print $out_fh $_ =~ y/\t/\t/rs;
    }
}
pm_11140114_tab_align_even.dat:
foo bar baz booz qaaz abc
foo bar baz booz qaaz abc 123
foo bar baz booz qaaz abc
pm_11140114_tab_align_even.out:
foo bar baz booz qaaz abc
foo bar baz booz qaaz abc 123
foo bar baz booz qaaz abc
Note that this uses the /r option which was introduced in Perl 5.14: "perl5140delta: Non-destructive substitution". If you're using an older version of Perl, change use 5.014; to use strict; and the print statement will need to be split into two statements:
y/\t/\t/s;
print $out_fh $_;
This gives exactly the same result.
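If you want to see what the /s (squeeze) modifier is doing without touching any files, here is a minimal sketch that runs both forms on an in-memory string. The sub names squeeze_tabs_r and squeeze_tabs_inplace, and the sample string, are made up for illustration; they're not from your code or mine above.

```perl
#!/usr/bin/env perl

use 5.014;      # needed only because squeeze_tabs_r uses /r
use warnings;

# Hypothetical helper: non-destructive form (Perl 5.14+).
# y/\t/\t/ maps tab to tab; /s squeezes each run of tabs to one;
# /r returns the changed copy instead of modifying $line.
sub squeeze_tabs_r {
    my ($line) = @_;
    return $line =~ y/\t/\t/rs;
}

# Hypothetical helper: in-place form, as you'd write it for older Perls.
sub squeeze_tabs_inplace {
    my ($line) = @_;    # a copy, so the caller's string is untouched
    $line =~ y/\t/\t/s;
    return $line;
}

# Made-up sample: runs of 3, 1 and 2 tabs between columns.
my $sample = "foo\t\t\tbar\tbaz\t\tbooz";

# Show tabs as a visible \t so the result is easy to check by eye.
say squeeze_tabs_r($sample)       =~ s/\t/\\t/gr;   # foo\tbar\tbaz\tbooz
say squeeze_tabs_inplace($sample) =~ s/\t/\\t/gr;   # foo\tbar\tbaz\tbooz
```

Both lines of output should be identical, which is all "exactly the same result" means here.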
Your "SHOULD output to" shows two tabs between columns (except for "abc\t123" which I'm going to assume is just a typo). Because y///r and s///r can be chained, you can change
print $out_fh $_ =~ y/\t/\t/rs;
to
print $out_fh $_ =~ y/\t/\t/rs =~ s/\t/\t\t/gr;
Now, pm_11140114_tab_align_even.out will be:
foo bar baz booz qaaz abc
foo bar baz booz qaaz abc 123
foo bar baz booz qaaz abc
For older Perls, you'll need to split the print statement into three statements:
y/\t/\t/s;
s/\t/\t\t/g;
print $out_fh $_;
Again, this gives exactly the same result.
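Here's a small sketch of the chained form applied to a string rather than a file, in case you want to experiment with it in isolation. The sub name double_tab_columns and the sample data are my own inventions for this example.

```perl
#!/usr/bin/env perl

use 5.014;      # for /r
use warnings;

# Hypothetical helper: squeeze every run of tabs to a single tab,
# then replace each remaining tab with exactly two tabs.
sub double_tab_columns {
    my ($line) = @_;
    return $line =~ y/\t/\t/rs =~ s/\t/\t\t/gr;
}

# Made-up sample with uneven runs of tabs (3, 1 and 2).
my $sample = "foo\t\t\tbar\tbaz\t\tbooz";

# Show tabs as a visible \t so the separators can be counted.
say double_tab_columns($sample) =~ s/\t/\\t/gr;
# foo\t\tbar\t\tbaz\t\tbooz
```

The order matters: squeezing first means every column gap starts as one tab, so the substitution always produces two, however many tabs the input had.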
Please either confirm that the input data in your OP is representative or, if it isn't, provide something more realistic so that we can give better help.
It would also be useful to know what you intend to do with the output; e.g. print to screen, write to a plain text file, use for CSV, generate an HTML table, etc. With this information, we may be able to provide different (better) advice.
— Ken
In reply to Re: misalined TABs using substr,LAST_MATCH_START/END,regex
by kcott
in thread misalined TABs using substr,LAST_MATCH_START/END,regex
by perl_boy