in reply to Perl regex

You will need to:

  1. Create a full, detailed specification of the problem
  2. Design an algorithm to match the spec
  3. Code up the algorithm
  4. Test the code

If the tests fail, work out whether the algorithm itself is flawed or only its implementation, then revisit step 2 or step 3 as appropriate and repeat until all the tests pass.
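As a rough illustration of steps 3 and 4, here is a minimal sketch using the core Test::More module. The one-line spec, the function name strip_parens and the sample strings are all made up for the example; none of them come from your file.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical one-line spec: "remove every parenthesis character from a string".
sub strip_parens {
    my ($text) = @_;
    $text =~ tr/()//d;    # delete all '(' and ')' characters
    return $text;
}

# Step 4: each test restates one clause of the spec as a checkable example.
is( strip_parens('abc(def)'),  'abcdef',    'parentheses removed, contents kept' );
is( strip_parens('no parens'), 'no parens', 'plain text left unchanged' );
is( strip_parens('(())'),      '',          'parentheses only gives an empty string' );
```

A failing test then points you at either the spec clause it encodes (step 2) or the line of code that handles it (step 3).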

Since your statement of the problem is woolly, you probably need to start right at point 1. After that, someone here might be able to help you with the other parts, but only you know the initial problem to be solved.

Re^2: Perl regex
by Nicpetbio23! (Acolyte) on Jul 12, 2017 at 14:31 UTC
    1. Create a full, detailed specification of the problem: There is a ton of superfluous information in this file.
    a. File has multiple close relatives for a gene of interest. I only need one close relative in the output file.
    b. Some lines have a repeat of same sequence. For example: Metac1_3189(Metac1_3189). I want to exclude these from output file.
    c. File has parentheses which I want to exclude from output file.
    d. Close relative and gene of interest are not separated into two distinct columns. I want to include these in output file.

      Great, so now you have a full, detailed spec (or so we think). On to step 2 -> create an algorithm to match the spec.
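
      One possible starting point for step 2, offered as a sketch and not a finished answer. It assumes every record looks like "relative(gene)", which is a guess based on the single example Metac1_3189(Metac1_3189); the real file layout was not shown. It also reads the last point of the spec as "write gene and relative as two separate columns". The function name filter_records and the Hypo_* sample IDs are hypothetical.

```perl
use strict;
use warnings;

# ASSUMPTION: each record looks like "relative(gene)", e.g. "Metac1_3189(Metac1_3189)".
# The real file layout was not shown, so adjust the regex to match it.
sub filter_records {
    my (@lines) = @_;
    my ( %seen, @out );
    for my $line (@lines) {
        # Capture the text before the parentheses and the text inside them.
        my ( $relative, $gene ) = $line =~ /^\s*([^\s()]+)\s*\(([^()]+)\)/
            or next;                      # skip lines that do not fit the assumed layout
        next if $relative eq $gene;       # drop self-repeats such as X(X)
        next if $seen{$gene}++;           # keep only the first relative per gene
        push @out, "$gene\t$relative";    # two tab-separated columns, no parentheses
    }
    return @out;
}

# Hypothetical sample data; only the first ID is taken from the post.
my @sample = (
    'Metac1_3189(Metac1_3189)',
    'Hypo_0001(Metac1_3189)',
    'Hypo_0002(Metac1_3189)',
    'Hypo_0003(Metac1_4000)',
);
print "$_\n" for filter_records(@sample);
```

      With this sample the self-repeat and the second relative of Metac1_3189 are dropped, leaving one gene/relative pair per gene. If "one close relative" should mean the best match and not simply the first one seen, the spec needs to say how to rank them.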