Hello Monks! It looks like today is my day for problems I cannot solve. Two years without coding, plus trouble understanding complex data structures, is apparently a recipe for a lot of them!
So, here's my problem. I have this kind of input:

frog-n as novelty-n 5.8504
frog-n be yellow-n 6.1961
frog-n be-1 Asia-n 5.0937
frog-n coord zebra-n 5.9279
frog-n coord-1 Canuck-n 6.3363
frog-n nmod-1 mule-n 4.2881
amphibian-n success-1 surprising-j 14.6340
amphibian-n such_as alligator-n 11.5265
amphibian-n than work-n 5.9948
amphibian-n though stalk-n 13.2228

and my output should be a kind of "matrix", like the following:

frog-n as_novelty-n,5.8504 be_yellow-n,6.1961 be-1_Asia-n,5.0937 coord_zebra-n,5.9279 coord-1_Canuck-n,6.3363 nmod-1_mule-n,4.2881
amphibian-n success-1_surprising-j,14.6340 such_as_alligator-n,11.5265 than_work-n,5.9948 though_stalk-n,13.2228

Basically, the element in the first column of the input file is the key, and each value is the elements of the 2nd and 3rd columns joined with an underscore, followed by the corresponding score.

I managed to do the following:

my $prefix = shift;
my $input = shift;
my $file = $prefix . ".txt";
if (-e $file) {
    print STDERR "$file already exists, deleting previous version\n";
    `rm -f $file`;
}
my $debug=0; # Debug variable: set to 1 while debugging
my %seen = ();
my @global_els = ();
my @row_els = ();
my %score_of = ();
my $row_el;
my $gram;
my $col_el;
my $score_of;
my $score;
my $global_el;
open INPUT,$input;
while(<INPUT>){
    chomp;
    ($row_el,$gram,$col_el,$score) = split "[\t ]+",$_;
    $global_el=$gram."_".$col_el;
    if (!($seen{"glob"}{$global_el}++)) {
        push @global_els,$global_el;
    }
    if (!$seen{"row"}{$row_el}++) {
        push @row_els,$row_el;
    }
    $score_of{$row_el}{$global_el} = $score;
    if($debug){
        print "Check:".$row_el."=>".$global_el."=>".$score;
    }
}
close INPUT;
#@global_els = ();
#@row_els = ();
open MATRIX,">$file";
#my $score_b=$score_of{$row_el}{$global_el};
foreach $row_el (@row_els) {
    print MATRIX "\t",$row_el;
    foreach $global_el (@global_els) {
        print MATRIX "\t",$global_el;
        print MATRIX ",",$score_of{$row_el}{$global_el};
    }
    print MATRIX "\n";
}
close MATRIX;

But my output is wrong: all the joined elements appear on both lines, even when they are not related to the element on that line. For example, the output I get using the data above is:

frog-n as_novelty-n,5.8504 be_yellow-n,6.1961 be-1_Asia-n,5.0937 coord_zebra-n,5.9279 coord-1_Canuck-n,6.3363 nmod-1_mule-n,4.2881 success-1_surprising-j, such_as_alligator-n, than_work-n, though_stalk-n,
amphibian-n success-1_surprising-j,14.6340 such_as_alligator-n,11.5265 than_work-n,5.9948 though_stalk-n,13.2228 as_novelty-n, be_yellow-n, be-1_Asia-n, coord_zebra-n, coord-1_Canuck-n, nmod-1_mule-n,

What did I get wrong? How can I improve it?
Thanks everyone,
Giulia
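
A likely cause, going by the code above: the inner loop walks @global_els, which holds every joined element from the whole file, so each row prints all of them and leaves the score empty wherever $score_of{$row_el}{$global_el} was never set. Below is a minimal sketch of one way around it, keeping a separate list of pairs per row element. The sub name build_rows and the hash %pairs_of are made up for illustration, and it assumes a given pair appears at most once per row element (a repeated pair would be listed twice here, whereas the original hash would keep only the last score).

```perl
use strict;
use warnings;

# Build one output line per row element, listing only the gram_col pairs
# that were actually seen together with that row element.
# (build_rows and %pairs_of are hypothetical names, not from the original script.)
sub build_rows {
    my @lines = @_;
    my (%seen, @row_els, %pairs_of);

    for my $line (@lines) {
        my ($row_el, $gram, $col_el, $score) = split ' ', $line;
        next unless defined $score;    # skip blank or malformed lines

        # Remember row elements in first-seen order.
        push @row_els, $row_el unless $seen{$row_el}++;

        # Store the pair under its own row only, in input order.
        push @{ $pairs_of{$row_el} }, "${gram}_$col_el,$score";
    }

    return map { join "\t", $_, @{ $pairs_of{$_} } } @row_els;
}

my @sample = (
    'frog-n as novelty-n 5.8504',
    'frog-n be yellow-n 6.1961',
    'amphibian-n such_as alligator-n 11.5265',
    'amphibian-n than work-n 5.9948',
);
print "$_\n" for build_rows(@sample);
```

If the goal is to change the original script as little as possible, adding `next unless exists $score_of{$row_el}{$global_el};` as the first statement of the inner foreach should have a similar effect, since it skips the pairs that were never recorded for the current row.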


In reply to Problems with complex structure and array of arrays by remluvr
