Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

It's my first post here. Please help me.

I have a few text strings that match one another at certain locations.

For example:
NODE1 at 600-630 (of 630 length) maps to NODE2 (of 630 length) at 1-30. NODE1 at 1-30 maps to NODE3 (of 630 length) at 600-630.
This means I need to connect NODE3->NODE1->NODE2 at the positions given.

Is it possible to do this with a hash of hashes? If not, how?

All I have is a tab-delimited table with the NODES and the positions.

Please help me.

Thanks,

Replies are listed 'Best First'.
Re: Will hash of hashes work?
by CountZero (Bishop) on Mar 30, 2009 at 15:49 UTC
    It would help all of us if you:
    • Show us a relevant part of your input CSV-file;
    • Let us know what the intended output should look like (based upon the example input file you provide).
    Otherwise we can keep on guessing what you have and what you need and none would be the wiser.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Will hash of hashes work?
by locked_user sundialsvc4 (Abbot) on Mar 30, 2009 at 15:00 UTC

    I agree: you should step back now and review your entire approach here. (Follow that link for “XY problem...”)

    The bucket of any hash or list can contain one value... but that “value” can be a reference to anything. Therefore, while you do not properly have “hashes of hashes,” hashes can contain hashrefs.
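    As a rough sketch of what "a hash whose values are hashrefs" could look like for this problem: the helper name build_overlaps and the keys q_from, q_to, s_from and s_to are made up for illustration, and the positions come from the NODE1/NODE2/NODE3 example in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash of hashes laid out as
# $overlap->{QUERY}{SUBJECT} = { coordinates of the overlap }.
# Every value in the outer hash is a reference to an inner hash.
sub build_overlaps {
    my %overlap;
    # Positions taken from the NODE1/NODE2/NODE3 example in the question.
    $overlap{NODE1}{NODE2} = { q_from => 600, q_to => 630, s_from => 1,   s_to => 30  };
    $overlap{NODE1}{NODE3} = { q_from => 1,   q_to => 30,  s_from => 600, s_to => 630 };
    return \%overlap;
}

my $overlap = build_overlaps();
for my $query (sort keys %$overlap) {
    for my $subject (sort keys %{ $overlap->{$query} }) {
        my $o = $overlap->{$query}{$subject};
        print "$query $o->{q_from}-$o->{q_to} maps to $subject $o->{s_from}-$o->{s_to}\n";
    }
}
```

    Whether this layout is actually the right one depends on what has to be done with the overlaps afterwards.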

    Nevertheless, if you are now contemplating “funky data structures,” as a rule of thumb it is wise to step back and reconsider just where you are intending to arrive, and whether the path that you have selected is really the right one.

      Hi, I want to merge those nodes that are related into a single string. That's all I wanted, but I am not sure how to approach it. Please send me some suggestions. Thanks in advance.
      Hi, what I want to do is merge all three nodes into one string. Please give me some suggestions.
Re: Will hash of hashes work?
by Anonymous Monk on Mar 30, 2009 at 14:02 UTC
    It's possible. What do you intend to do with said hashes? Maybe it's an XY Problem :)
Re: Will hash of hashes work?
by missingthepoint (Friar) on Mar 31, 2009 at 08:50 UTC
    It's my first post here

    Nonsense! *runs*

    Seriously, is the example you posted part of the tab delimited table you mention?


    "Half of all adults in the United States say they have registered as an organ donor, although only some have purchased a motorcycle to show that they're really serious about it."
      Alright folks,

      Here is a small chunk from my table:

      NODE_104   NODE_2541  7682  61   7682  7653  32   61
      NODE_2541  NODE_2313  61    189  1     30    160  189
      NODE_2313  NODE_2855  189   61   160   189   1    30

      col1: Query NODE
      col2: Subject NODE
      col3: length of Query string
      col4: length of Subj string
      col5: Query node from (maps onto subj node "from" what char)
      col6: Query node to (maps onto subj node "to" what char)
      col7: Subj node from (maps onto query node "from" what char)
      col8: Subj node to (maps onto query node "to" what char)

      ------- NODE 104 -------- NODE 2541 ------- NODE 2313 -------- NODE 2855

      RESULTING IN |--------------------------| (merge of all the above nodes)

        I'm not at all clear about what you are trying to do: do you want to insert the SUBJECT NODE into the QUERY NODE or vice versa? And if so, why? - it appears it is already done.

        Based on your column descriptions, you have everything you need to know in the table and in fact the strings are already "joined". Columns 5-8 tell you where to find the overlaps. For example, in row 1 of your sample, positions 7653-7682 of the QUERY NODE correspond to positions 32-61 in the SUBJECT NODE.
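        To make the column reading concrete, here is a minimal sketch that splits each sample row into named fields and prints the overlap. The helper parse_row and the field names are hypothetical. It also assumes the "from" and "to" values may appear in either order (row 1 lists 7682 before 7653), so it normalises them with min and max from List::Util.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: split one table row into named fields.
# Column order follows the col1..col8 description given in the thread.
sub parse_row {
    my ($line) = @_;
    my ($query, $subject, $q_len, $s_len, $q_from, $q_to, $s_from, $s_to)
        = split ' ', $line;    # ' ' splits on any run of tabs or spaces
    return {
        query   => $query,
        subject => $subject,
        q_len   => $q_len,
        s_len   => $s_len,
        # Normalise so that start <= end.
        q_start => min($q_from, $q_to),
        q_end   => max($q_from, $q_to),
        s_start => min($s_from, $s_to),
        s_end   => max($s_from, $s_to),
    };
}

# The three sample rows posted above.
my @rows = (
    "NODE_104\tNODE_2541\t7682\t61\t7682\t7653\t32\t61",
    "NODE_2541\tNODE_2313\t61\t189\t1\t30\t160\t189",
    "NODE_2313\tNODE_2855\t189\t61\t160\t189\t1\t30",
);

for my $line (@rows) {
    my $r = parse_row($line);
    print "$r->{query} $r->{q_start}-$r->{q_end} overlaps ",
          "$r->{subject} $r->{s_start}-$r->{s_end}\n";
}
```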

        But I doubt you are asking about something that is already done for you. What exactly are you trying to do? Insert all of the SUBJECT node in the same location where now only a part of it exists?

        Assuming that you are trying to insert the whole string where only a part of it exists, then hashes have nothing to do with this. Instead you need to use substr to get the part before and after the overlapping portion, like this:

        use strict;
        use warnings;

        while (my $line = <DATA>) {
            chomp $line;
            my ($sQuery, $sSubject, $iLengthQ, $iLengthS, $iEndSinQ, $iBeginSinQ)
                = split(/\s+/, $line);
            #print "<$sQuery> <$sSubject> <$iEndSinQ> <$iBeginSinQ>\n";

            # part of the query before and after the overlapping portion
            my $sBefore = substr($sQuery, 0, $iBeginSinQ);
            my $sAfter  = substr($sQuery, $iEndSinQ + 1);

            # pipes added around inserted portion to make insertion point
            # a bit clearer in sample output.
            print "$sBefore|$sSubject|$sAfter\n";
        }

        __DATA__
        abc123def 123xxxx 9 7 5 3 0 2
        1234xxxx YYYxxYZ0 8 8 5 4 3 4

        The above only illustrates merging two strings. If you intend to do several QUERY-SUBJECT pairs (as I imagine you do), then each insertion changes the positions within the string, and the offsets will no longer match the positions recorded in your dataset. The easiest way to avoid this problem is to build a graph of QUERY-SUBJECT relations and then traverse it depth first, so that you are guaranteed not to need to insert a string into any string that has already been modified by insertion. Building and traversing such a graph is less about hashes and more about navigation and recursion (or loops and stacks if you want a non-recursive solution).
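        One possible reading of that suggestion, as a sketch only: store the QUERY-SUBJECT relations as an adjacency list and walk it recursively in post-order, so that a node is emitted only after everything reachable from it. The names build_graph and visit_order are hypothetical, and the merge step itself is left out; this only computes an order in which the nodes could be processed.

```perl
use strict;
use warnings;

# Hypothetical helper: adjacency list mapping each QUERY to its SUBJECTs.
sub build_graph {
    my (@pairs) = @_;
    my %graph;
    for my $pair (@pairs) {
        my ($query, $subject) = @$pair;
        push @{ $graph{$query} }, $subject;
        $graph{$subject} ||= [];    # make sure leaf nodes exist as keys
    }
    return \%graph;
}

# Recursive depth-first walk in post-order: a node is appended to
# @$order only after all nodes reachable from it have been appended.
sub visit_order {
    my ($graph, $node, $seen, $order) = @_;
    return $order if $seen->{$node}++;    # skip visited nodes (guards against cycles)
    visit_order($graph, $_, $seen, $order) for @{ $graph->{$node} };
    push @$order, $node;
    return $order;
}

# QUERY/SUBJECT pairs from the three sample rows in the thread.
my $graph = build_graph(
    [ NODE_104  => 'NODE_2541' ],
    [ NODE_2541 => 'NODE_2313' ],
    [ NODE_2313 => 'NODE_2855' ],
);

my $order = visit_order($graph, 'NODE_104', {}, []);
print join(' -> ', @$order), "\n";
```

        With the sample rows this lists NODE_2855 first and NODE_104 last, i.e. the innermost SUBJECT is handled before any QUERY that refers to it. A non-recursive version would replace the recursion with an explicit stack.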

        Best, beth

        Update: added note about expanding the solution to a large number of records.