I found the code a bit confusing, and this tab-delimited format is a bummer to work with. One problem is that when you add "null" (empty) fields without quoting them, it is really hard to see those fields in file3.

Anyway, I would suggest using some version of Text::CSV to parse the data; you can adjust the quoting characters to your liking (the module helps on both input and output). Tabs will probably get mangled in the example data below, but I think you'll get the idea.

#!/usr/bin/perl -w
use strict;
use Text::CSV_XS;

open (my $FILE1, '<', "file1.csv") or die "cannot open file1 $!\n";
open (my $FILE2, '<', "file2.csv") or die "cannot open file2 $!\n";
open (my $FILE3, '>', "file3.csv") or die "cannot open file3 $!\n";

my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/,
                               sep_char => "\t", always_quote => 1 });

print $FILE3 "Match\t".<$FILE1>;   # header for file3
<$FILE2>;                          # skip header line of file2

my %file1;
my %file2;

# file1 rows are keyed on the ID column (5th field);
# slot 0 is reserved for the Match column
while (my $row = $csv->getline($FILE1))
{
    my @fields = @$row;
    my $id = $fields[4];
    $file1{$id} = ["", @fields];
}

# file2 rows are keyed on ID (1st field) and padded with empty
# fields so that they line up with the file1 columns
while (my $row = $csv->getline($FILE2))
{
    my @fields = @$row;
    my $id = $fields[0];
    $file2{$id} = ["","","","","", @fields, "","","","",""];
}

foreach my $id1 (keys %file1)
{
    if (exists $file2{$id1})
    {
        $file1{$id1}[0] = "both";   # both files
        $csv->print ($FILE3, $file1{$id1});
    }
    else
    {
        $file1{$id1}[0] = "1";      # file1 only
        $csv->print ($FILE3, $file1{$id1});
    }
}

foreach my $id2 (keys %file2)
{
    if (!exists $file1{$id2})
    {
        $file2{$id2}[0] = "2";      # file2 only
        $csv->print ($FILE3, $file2{$id2});
    }
}

file1:
Group  Functional_Category  LT  IA  ID  Symbol  Description       TID  GID  LO  S1   S2   HL  Status  V1  V2  IN
q      w                    a   a   1   AA      some description  10   11   1   s1a  s2a  1   1       1   1   1
q      w                    b   b   2   BB      another descrp    11   12   1   s1b  s2b  1   1       1   1   1
q      w                    c   c   5   CC      despript A        12   13   1   s1c  s2c  1   1       1   1   1

file2:
ID  Symbol  Description      TID  GID  LO  S1   S2
2   BB      another descrp   11   12   1   s1b  s2b
4   DD      some more stuff  14   14   4   s1d  s2d

file3:
Match   Group  Functional_Category  LT   IA   ID   Symbol  Description         TID   GID   LO   S1     S2     HL   Status  V1   V2   IN
"1"     "q"    "w"                  "a"  "a"  "1"  "AA"    "some description"  "10"  "11"  "1"  "s1a"  "s2a"  "1"  "1"     "1"  "1"  "1"
"both"  "q"    "w"                  "b"  "b"  "2"  "BB"    "another descrp"    "11"  "12"  "1"  "s1b"  "s2b"  "1"  "1"     "1"  "1"  "1"
"1"     "q"    "w"                  "c"  "c"  "5"  "CC"    "despript A"        "12"  "13"  "1"  "s1c"  "s2c"  "1"  "1"     "1"  "1"  "1"
"2"     ""     ""                   ""   ""   "4"  "DD"    "some more stuff"   "14"  "14"  "4"  "s1d"  "s2d"  ""   ""      ""   ""   ""
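
To see the quoting point in isolation, here is a minimal sketch that formats the same row twice, once with always_quote off and once with it on. The format_row helper and the sample row are made up for illustration and are not part of the script above; the constructor options (binary, sep_char, always_quote) and the combine/string calls are from Text::CSV_XS.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical helper (not from the script above): format one row as a
# tab-separated line, optionally forcing quotes around every field.
sub format_row {
    my ($always_quote, @fields) = @_;
    my $csv = Text::CSV_XS->new({
        binary       => 1,
        sep_char     => "\t",
        always_quote => $always_quote,
    }) or die "cannot create CSV object: " . Text::CSV_XS->error_diag();
    $csv->combine(@fields) or die "combine failed";
    return $csv->string();
}

# Made-up sample row with two empty ("null") fields in the middle.
my @row = ("2", "", "", "4", "DD");

# Replace each tab with a visible marker so the empty fields can be counted.
for my $always_quote (0, 1) {
    (my $line = format_row($always_quote, @row)) =~ s/\t/<TAB>/g;
    print "always_quote=$always_quote: $line\n";
}
```

With always_quote off, the two empty fields show up only as consecutive tabs; with it on, each one appears as an explicit "" pair, which is what makes the "2" row in file3 readable.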

In reply to Re: Issues with Column headings by Marshall
in thread Issues with Column headings by bluray
