in reply to making a single column out of a two-column text file

As a one-liner:

perl -ne '/(\S*)\s+(\S*)/ or next; push @a,$1 if length $1; push @b,$2 if length $2; END{ print "First column output:\n\n@a\n\nSecond column output:\n\n@b\n\n" }'

And more readably:

while (<>) {
    # Added a conditional per merlyn's advice
    /(\S*)\s+(\S*)/ or next;
    # /(\S*)\s+(\S*)/;
    push @a, $1 if length $1;
    push @b, $2 if length $2;
}
print "First column output:\n\n@a\n\n",
      "Second column output:\n\n@b\n\n"

Seeking Green geeks in Minnesota

Replies are listed 'Best First'.
•Re: Re: making a single column out of a two-column text file
by merlyn (Sage) on Feb 26, 2003 at 01:54 UTC
    while (<>) {
        /(\S*)\s+(\S*)/;
        push @a, $1 if length $1;
        push @b, $2 if length $2;
    }

    Broken if the regex ever fails to match. Please don't use $1 except inside the conditional that tests the regex you think might match.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Better as next unless /(\S*)\s+(\S*)/?
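A minimal sketch of the problem merlyn describes, using two made-up strings (they are not from the original thread). After a failed match, $1 still holds the capture from the last successful match in the same scope, so the unchecked version records a stale value:

```perl
use strict;
use warnings;

my @seen;
{
    my $ok  = "foo bar";   # hypothetical line with two fields
    my $bad = "nospace";   # hypothetical line with no whitespace at all

    $ok =~ /(\S*)\s+(\S*)/;    # succeeds, $1 is "foo"
    push @seen, $1;
    $bad =~ /(\S*)\s+(\S*)/;   # fails, $1 is left untouched
    push @seen, $1;            # stale "foo" from the previous match
}

# Checked version: only read $1 when the match succeeded.
my @safe;
for my $line ("foo bar", "nospace") {
    if ($line =~ /(\S*)\s+(\S*)/) {
        push @safe, $1;
    }
}

print "unchecked: @seen\n";   # unchecked: foo foo
print "checked:   @safe\n";   # checked:   foo
```

Either "or next" or "next unless" gives the same protection inside the read loop.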

Re: Re: making a single column out of a two-column text file
by hv (Prior) on Feb 26, 2003 at 02:36 UTC
            /(\S*)\s+(\S*)/ or next;

    The OP was specifically asking about the problem when one of the columns is empty. It seems this solution will just skip any such line, which I don't think was the OP's intention.

    Update: as jasonk points out, this is rubbish. I'd misread the * as +.

    Hugo

      No, it won't. It matches zero or more non-whitespace characters, followed by one or more whitespace characters, followed by zero or more non-whitespace characters. If the first column is empty, that counts as zero non-whitespace characters and $1 will contain the empty string. If the second column is empty, the same applies and $2 will contain the empty string. The only lines that get skipped are lines that don't contain at least one whitespace character.
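The behaviour described above can be checked with a small sketch. The helper name split_columns and the test strings are made up for illustration; only the regex comes from the thread:

```perl
use strict;
use warnings;

# Return ($1, $2) on a match, or an empty list when the line
# contains no whitespace at all.
sub split_columns {
    my ($line) = @_;
    my @fields = $line =~ /(\S*)\s+(\S*)/;
    return @fields;
}

for my $line ("left right", "   right", "left   ", "nospace") {
    my @fields = split_columns($line);
    if (@fields) {
        print "[$line] -> [$fields[0]] [$fields[1]]\n";
    }
    else {
        print "[$line] -> skipped\n";
    }
}
# [left right] -> [left] [right]
# [   right] -> [] [right]
# [left   ] -> [left] []
# [nospace] -> skipped
```

An empty column still produces a match with an empty capture, which is why the "length $1" and "length $2" tests in the original code are needed to keep empty strings out of the arrays.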

Re: Re: making a single column out of a two-column text file
by allolex (Curate) on Feb 26, 2003 at 03:41 UTC

    Hi Diotalevi,

    Unfortunately, this one didn't quite do the trick. It matches the first word of each column on the same line and prints out the first word/unit in each line, like this:

    Indice 1. KETER 4. HESED 131 1. Quando la luce dell'infinito 2. Abbiamo diversi e curiosi orologi 23. L'analogia dei contrari 133 24. Sauvez la faible Aischa 136 2. HOKMAH 25. Questi misteriosi iniziati 139 26. Tutte le tradizioni della terra 141 3. In hanc utilitatem clementes angeli

    Output

    First column output:

    Indice 1. 2. 3.

    Second column output:

    1. 4. Quando Abbiamo 24. 2. 26. In

    The output reminds me of one of William Burroughs' ideas. Really cool, actually, but not what I had in mind. Obviously better than I could come up with, though. Plus you didn't have the input file to test it. And of course, the big AND... you came up with your code in about five minutes, well 16 minutes, but you were answering other questions, too.

    --
    Allolex

      Oh... yeah, that data is completely different than I expected. Ah well. That unpack solution someone else posted was nice for fixed length fields.
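For readers who have not seen the unpack approach, here is a rough sketch of the idea, not the code that was actually posted. It assumes the left column occupies a fixed number of characters; the width of 20, the helper name split_fixed, and the sample layout are all hypothetical:

```perl
use strict;
use warnings;

my $LEFT_WIDTH = 20;   # hypothetical width of the left column

# Split one line into (left, right) by character position.
# The "A" format strips trailing spaces, so an all-blank left
# column comes back as the empty string.
sub split_fixed {
    my ($line) = @_;
    chomp $line;
    my ($left, $right) = unpack "A$LEFT_WIDTH A*", $line;
    $left  = '' unless defined $left;
    $right = '' unless defined $right;
    return ($left, $right);
}

# Made-up two-column layout built from entries in the sample data.
my @lines = (
    sprintf("%-20s%s", "1. KETER", "4. HESED"),
    sprintf("%-20s%s", "",         "1. Quando la luce"),
);

my (@left, @right);
for my $line (@lines) {
    my ($l, $r) = split_fixed($line);
    push @left,  $l if length $l;
    push @right, $r if length $r;
}

# Left column first, then the right column, one entry per line.
print join("\n", @left, @right), "\n";
```

Splitting by position keeps multi-word entries intact, which the whitespace regex cannot do, but it only works if the real file pads the left column to a constant width.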