I'm not at all clear about what you are trying to do: do you want to insert the SUBJECT NODE into the QUERY NODE or vice versa? And if so, why? - it appears it is already done.

Based on your column descriptions, you have everything you need to know in the table, and in fact the strings are already "joined". Columns 5-8 tell you where to find the overlaps. For example, in row 1 of your sample, positions 7653-7682 of the QUERY NODE correspond to positions 32-61 of the SUBJECT NODE.
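If you want to check that reading of the columns, you can pull both ranges out with substr and compare them. This is only a sketch on made-up strings: it assumes 0-based, inclusive positions (if your file counts from 1, subtract 1 from each start), and the helper name overlap_part is purely illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: return the inclusive range $iBegin..$iEnd of $s.
# Assumes 0-based positions.
sub overlap_part {
    my ($s, $iBegin, $iEnd) = @_;
    return substr($s, $iBegin, $iEnd - $iBegin + 1);
}

# Made-up sample: positions 3-5 of the query should match
# positions 0-2 of the subject.
my ($sQuery, $sSubject) = ('abc123def', '123xxxx');
my $sInQ = overlap_part($sQuery,   3, 5);
my $sInS = overlap_part($sSubject, 0, 2);

print "query part: $sInQ, subject part: $sInS, ",
      ($sInQ eq $sInS ? "match" : "no match"), "\n";
```

If the two parts do not match on your real data, the positions are probably 1-based or the columns are in a different order than assumed here.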

But I doubt you are asking about something that is already done for you. What exactly are you trying to do? Insert all of the SUBJECT NODE at the location where only a part of it exists now?

Assuming that you are trying to insert the whole string where only a part of it exists, then hashes have nothing to do with this. Instead you need to use substr to get the part before and after the overlapping portion, like this:

use strict;
use warnings;

while (my $line = <DATA>) {
  chomp $line;
  my ($sQuery, $sSubject, $iLengthQ, $iLengthS, $iEndSinQ, $iBeginSinQ)
    = split(/\s+/, $line);
  #print "<$sQuery> <$sSubject> <$iEndSinQ> <$iBeginSinQ>\n";

  my $sBefore = substr($sQuery, 0, $iBeginSinQ);
  my $sAfter  = substr($sQuery, $iEndSinQ+1);

  # pipes added around inserted portion to make insertion point
  # a bit clearer in sample output.
  print "$sBefore|$sSubject|$sAfter\n";
}

__DATA__
abc123def 123xxxx 9 7 5 3 0 2
1234xxxx YYYxxYZ0 8 8 5 4 3 4

The above only illustrates merging two strings. If you intend to do several QUERY-SUBJECT pairs (as I imagine you do), then each insertion shifts the positions within the string, and the offsets will no longer match the positions recorded in your dataset. The easiest way to avoid this problem is to build a graph of QUERY-SUBJECT relations and then traverse it depth first, so that you are guaranteed never to insert into a string that has already been modified by an insertion. Building and traversing such a graph is less about hashes and more about navigation and recursion (or loops and stacks if you want a non-recursive solution).
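Here is a rough sketch of the depth-first idea, not a finished solution. Everything in it is made up for illustration: the toy strings, the %hSeq and %hInserts layouts, and the expand sub. It assumes 0-based, inclusive positions that refer to the ORIGINAL string of each node, and that the overlaps within one node do not overlap each other. Working from the rightmost overlap to the leftmost is my own addition, so that a node with several subjects keeps valid offsets.

```perl
use strict;
use warnings;

# Hypothetical toy data: node name => original string.
my %hSeq = (
    Q  => 'abc123def',
    S1 => '123xxxx',
    S2 => 'xxYY',
);

# Hypothetical relations: node => list of [subject, begin, end], where
# begin..end (0-based, inclusive) is the overlap within the ORIGINAL node.
my %hInserts = (
    Q  => [ [ 'S1', 3, 5 ] ],
    S1 => [ [ 'S2', 5, 6 ] ],
);

my %hDone;      # memo: node => fully expanded string
my %hActive;    # nodes on the current recursion path (cycle guard)

sub expand {
    my ($sNode) = @_;
    return $hDone{$sNode} if exists $hDone{$sNode};
    die "cycle detected at $sNode\n" if $hActive{$sNode}++;

    my $sResult = $hSeq{$sNode};

    # Go from the rightmost overlap to the leftmost, so the recorded
    # offsets of the overlaps still to be processed stay valid.
    for my $aIns (sort { $b->[1] <=> $a->[1] } @{ $hInserts{$sNode} || [] }) {
        my ($sSubject, $iBegin, $iEnd) = @$aIns;

        # Depth first: the subject is fully expanded before it is used.
        my $sExpanded = expand($sSubject);

        # Replace the overlapping portion with the whole expanded subject.
        substr($sResult, $iBegin, $iEnd - $iBegin + 1, $sExpanded);
    }

    delete $hActive{$sNode};
    return $hDone{$sNode} = $sResult;
}

print expand('Q'), "\n";    # prints abc123xxxxYYdef
```

The memo in %hDone means a SUBJECT that appears under several QUERY nodes is only expanded once, and the %hActive check stops the recursion if your data happens to contain a loop (A overlaps B overlaps A).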

Best, beth

Update: added note about expanding the solution to a large number of records.


In reply to Re^3: Will hash of hashes work? by ELISHEVA
in thread Will hash of hashes work? by Anonymous Monk
