in reply to Re: Replace table values from text database
in thread Replace table values from text database

(And, on the face of it, processing the whole file completely to perform each substitution is a nuts way to approach the problem. Horribly inefficient, when the whole process can be done in a single pass.)

I totally agree, but I do not know how to achieve such a thing. Would you please show or link to an example I can use to figure out how to do it?

Thanks in advance

Re^3: Replace table values from text database
by BrowserUk (Patriarch) on Mar 14, 2016 at 14:40 UTC

    Sure, if you answer my question.

    And, as requested elsewhere, post some real data: inputs and expected output. It need only be a dozen lines of each file, preferably lines that connect with each other.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Sorry for not answering your question.

      Does the set of replacement names overlap with the set of original names?

      I think the two sets of names do not overlap. You can look at the example data I posted and tell me if you think differently.

      Thanks again.

      *Update*

      You were right: after a few records, names may overlap (e.g. A_fumigatus_1 overlaps A_fumigatus_10 or A_fumigatus_17).

      So I guess that is the main source of error during translation.
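To illustrate why such an overlap can corrupt the translation, here is a minimal sketch. The replacement values `PROT_A` and `PROT_B` and the sample line are made up for illustration and are not taken from the real data. It compares a substitution applied across the whole line with an exact lookup of each tab-separated field.

```perl
use strict;
use warnings;

# Hypothetical replacement table; names and values are illustrative only.
my %reps = (
    A_fumigatus_1  => 'PROT_A',
    A_fumigatus_10 => 'PROT_B',
);

my $line = "A_fumigatus_10\tA_fumigatus_1";

# Naive approach: one substitution per name over the whole line.
# Names are applied shortest first so the failure is deterministic.
my $naive = $line;
for my $name ( sort { length $a <=> length $b } keys %reps ) {
    $naive =~ s/\Q$name\E/$reps{$name}/g;
}

# Whole-field approach: split on tabs and look each field up exactly.
my $exact = join "\t", map { $reps{$_} // $_ } split /\t/, $line;

print "naive: $naive\n";
print "exact: $exact\n";
```

In the naive version the shorter name matches inside the longer one, so the first field becomes `PROT_A0` instead of `PROT_B`. The exact lookup treats each field as a whole key and is not affected by the shared prefix.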

        Try this:

        #! perl -sw
        use strict;
        use Inline::Files;

        # Build the lookup table from "name<TAB>replacement" lines.
        my %reps = map{ split "\t", $_, 2 } <REPLACEMENTS>;
        chomp %reps;

        # Single pass over the data: replace each tab-separated field that has an entry.
        while( <DATAFILE> ) {
            my @items = split "\t", $_;
            $_ = $reps{ $_ } // $_ for @items;
            print join "\t", @items;
        }

        __REPLACEMENTS__
        Aspergillus_clavatus_1 XP_001276684.1 pectate lyase, putative [Aspergillus clavatus NRRL 1]
        Aspergillus_fumigatus_2 XP_001276694.1 conserved hypothetical protein [Aspergillus fumigatus NRRL 1]
        Aspergillus_flavus_3 XP_001276726.1 tyrosinase central domain protein [Aspergillus flavus NRRL 1]
        Aspergillus_terreus_4 XP_001276738.1 endoglucanase, putative [Aspergillus terreus NRRL 1]
        __DATAFILE__
        Aspergillus_clavatus_1 Aspergillus_flavus_198 Aspergillus_terreus_166 Aspergillus_fumigatus_2
        Aspergillus_clavatus_1 Aspergillus_flavus_3 Aspergillus_terreus_4 Aspergillus_fumigatus_2
        Aspergillus_clavatus_3 Aspergillus_flavus_198 Aspergillus_terreu_166 Aspergillus_fumigatus_16

        Output:

        C:\test>1157643
        XP_001276684.1 pectate lyase, putative [Aspergillus clavatus NRRL 1] Aspergillus_flavus_198 Aspergillus_terreus_166 Aspergillus_fumigatus_2
        XP_001276684.1 pectate lyase, putative [Aspergillus clavatus NRRL 1] Aspergillus_flavus_3 Aspergillus_terreus_4 Aspergillus_fumigatus_2
        Aspergillus_clavatus_3 Aspergillus_flavus_198 Aspergillus_terreu_166 Aspergillus_fumigatus_16

        Note: the use of Inline::Files is for testing only; you'll need to open your data files in the normal way.
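For reference, opening the files "in the normal way" might look something like the sketch below. It is only an outline under some assumptions: the helper names `load_replacements`, `translate_line` and `main` are hypothetical, both files are assumed to be tab-separated, and the two paths are assumed to be passed on the command line. Unlike the test script, it chomps each data line before splitting, so the last field on a line can also match; that is a choice made for this sketch.

```perl
use strict;
use warnings;

# Read "name<TAB>replacement" lines from a file into a hash (hypothetical helper).
sub load_replacements {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    my %reps;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $name, $replacement ) = split /\t/, $line, 2;
        next unless defined $replacement;    # skip lines without a tab
        $reps{$name} = $replacement;
    }
    close $fh;
    return \%reps;
}

# Replace every tab-separated field that has an entry in the table.
sub translate_line {
    my ( $reps, $line ) = @_;
    chomp $line;
    return join "\t", map { $reps->{$_} // $_ } split /\t/, $line, -1;
}

# Single pass over the data file, printing each translated line.
sub main {
    my ( $reps_path, $data_path ) = @_;
    my $reps = load_replacements($reps_path);
    open my $in, '<', $data_path or die "Cannot open '$data_path': $!";
    print translate_line( $reps, $_ ), "\n" while <$in>;
    close $in;
}

# Runs only when exactly two paths are given on the command line.
main(@ARGV) if @ARGV == 2;
```

Assuming the script is saved as `translate.pl` (a made-up name), it could be run as `perl translate.pl replacements.txt data.txt > translated.txt`. The replacement table is loaded once and the data file is read only once, which is the single-pass approach described earlier in the thread.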

