G'day perl_boy,

[A couple of notes on presentation: It's good that you've put code within <code>...</code> tags; please also do the same for data and program output (e.g. error messages) so that we can see a verbatim copy of what you're seeing — HTML can modify what you write, e.g. by collapsing whitespace into a single space, which can make a huge difference in many cases. Please also "linkify" URLs; in this case, changing URL to [URL] would have sufficed; again, this helps us to help you (see "What shortcuts can I use for linking to other information?" for more details about that).]

You've shown your input data as having the same number of characters in each column (columns 1, 2 & 3 have 3 characters: foo, bar, baz; columns 4 & 5 have 4 characters: booz & qaaz; and so on). This could be a realistic representation of your data; for instance, order numbers, product codes, client IDs, and so on, are likely to have the same lengths. If this is the case, the following is a much simpler solution.

#!/usr/bin/env perl

use 5.014;
use warnings;
use autodie;

my $infile  = 'pm_11140114_tab_align_even.dat';
my $outfile = 'pm_11140114_tab_align_even.out';

{
    open my $in_fh,  '<', $infile;
    open my $out_fh, '>', $outfile;

    while (<$in_fh>) {
        # y/\t/\t/s squeezes each run of tabs into one tab;
        # /r returns the changed copy instead of modifying $_.
        print $out_fh $_ =~ y/\t/\t/rs;
    }
}

pm_11140114_tab_align_even.dat:

foo bar baz booz qaaz abc
foo bar baz booz qaaz abc 123
foo bar baz booz qaaz abc

pm_11140114_tab_align_even.out:

foo	bar	baz	booz	qaaz	abc
foo	bar	baz	booz	qaaz	abc	123
foo	bar	baz	booz	qaaz	abc

Note that this uses the /r option which was introduced in Perl 5.14: "perl5140delta: Non-destructive substitution". If you're using an older version of Perl, change use 5.014; to use strict; and the print statement will need to be split into two statements:

y/\t/\t/s;
print $out_fh $_;

This gives exactly the same result.
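Here's a minimal sketch that compares the two forms on an in-memory string rather than on files. The sample record and the sub names (squeeze_tabs_r and squeeze_tabs_inplace) are made up for illustration; they're not from your code.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical sample record: fields separated by runs of one or more tabs.
my $line = "foo\t\tbar\tbaz\t\t\tbooz\n";

# Perl 5.14+ form: /r returns the squeezed copy; the argument is untouched.
sub squeeze_tabs_r {
    my ($text) = @_;
    return $text =~ y/\t/\t/rs;
}

# Pre-5.14 form: y///s changes its target in place, so work on a copy.
sub squeeze_tabs_inplace {
    my ($text) = @_;
    $text =~ y/\t/\t/s;
    return $text;
}

print squeeze_tabs_r($line);
print squeeze_tabs_inplace($line);
```

Both print statements should emit the same record with a single tab between fields. Only the first sub needs Perl 5.14 or later.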

Your "SHOULD output to" shows two tabs between columns (except for "abc\t123" which I'm going to assume is just a typo). Because y///r and s///r can be chained, you can change

print $out_fh $_ =~ y/\t/\t/rs;

to

print $out_fh $_ =~ y/\t/\t/rs =~ s/\t/\t\t/gr;

Now, pm_11140114_tab_align_even.out will be:

foo		bar		baz		booz		qaaz		abc
foo		bar		baz		booz		qaaz		abc		123
foo		bar		baz		booz		qaaz		abc

For older Perls, you'll need to split the print statement into three statements:

y/\t/\t/s;
s/\t/\t\t/g;
print $out_fh $_;

Again, this gives exactly the same result.
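As a quick check of that equivalence, here's a small sketch that runs the chained and the stepwise versions on the same string. The sub names (retab_chained and retab_stepwise) and the sample record are hypothetical, chosen only for this example.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Perl 5.14+ form: squeeze tab runs, then double each remaining tab.
# =~ is left-associative, so the s///r acts on the result of the y///r.
sub retab_chained {
    my ($text) = @_;
    return $text =~ y/\t/\t/rs =~ s/\t/\t\t/gr;
}

# Older Perls: the same two steps, applied in place to a copy.
sub retab_stepwise {
    my ($text) = @_;
    $text =~ y/\t/\t/s;
    $text =~ s/\t/\t\t/g;
    return $text;
}

# Hypothetical record with uneven tab runs between fields.
my $line = "foo\t\t\tbar\tbaz\t\tabc\n";

print retab_chained($line);
print retab_stepwise($line);
```

Each version should print the record with exactly two tabs between fields, however many tabs separated them in the input.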

Please either confirm that the input data in your OP is representative or, if it isn't, post something more realistic so that we can provide better help.

It would also be useful to know what you intend to do with the output; e.g. print to screen, write to a plain text file, use for CSV, generate an HTML table, etc. With this information, we may be able to provide different (better) advice.

— Ken


In reply to Re: misalined TABs using substr,LAST_MATCH_START/END,regex by kcott
in thread misalined TABs using substr,LAST_MATCH_START/END,regex by perl_boy
