in reply to Re^2: Remove text between two Start and End Tags (Regex)
in thread Remove text between two Start and End Tags (Regex)

Not greedy (although it is), global (meaning find all of them).
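
A minimal sketch of the distinction, assuming the remark is about the /g modifier on a substitution that removes text between tags. The helper name strip_gene_spans and the sample string are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every <GENE>...</GENE> span from a string.
# .*? is the non-greedy quantifier; /g (global) repeats the substitution
# for every match instead of stopping after the first one.
sub strip_gene_spans {
    my ($text) = @_;
    (my $copy = $text) =~ s{<GENE>.*?</GENE>}{}g;
    return $copy;
}

my $s = 'a <GENE> x </GENE> b <GENE> y </GENE> c';

# Greedy, no /g: one match running from the first <GENE> to the last </GENE>.
(my $greedy = $s) =~ s{<GENE>.*</GENE>}{};
print "$greedy\n";                  # prints "a  c"

# Non-greedy, no /g: only the first span is removed.
(my $first_only = $s) =~ s{<GENE>.*?</GENE>}{};
print "$first_only\n";              # prints "a  b <GENE> y </GENE> c"

# Non-greedy with /g: every span is removed.
print strip_gene_spans($s), "\n";   # prints "a  b  c"
```

Greediness belongs to the quantifier (.* versus .*?), while /g only controls how many matches are processed, so the two can be combined freely.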

Update: Serves me right for reading too quickly.

--MidLifeXis

Re^4: Remove text between two Start and End Tags (Regex)
by danj35 (Sexton) on Apr 19, 2011 at 16:07 UTC
    One more question related to this. Is there a way to find the value of a variable outside of the start and end tags and replace it with something new? So the starting sentence would look like this:

    "The increase in sensitivity of HIV - infected cells to <GENE> Fas </GENE> killing mapped to <GENE> vpu </GENE> , while nef , <GENE> vif </GENE> , <GENE> vpr </GENE> , and second exon of <GENE> tat</GENE> did not appear to contribute"

    And end up looking like this:

    "The increase in sensitivity of HIV - infected cells to <GENE> Fas </GENE> killing mapped to <GENE> vpu </GENE> , while <PGENE> nef </PGENE> , <GENE> vif </GENE> , <GENE> vpr </GENE> , and second exon of <GENE> tat</GENE> did not appear to contribute"

    Notice the PGENE tags added around nef, which matched the variable and sits outside the start and end GENE tags.

    I appreciate this might be slightly confusing, but at the moment I'm racking my brain trying to figure out a way to join it all back together correctly if I do this using lots of splits.

    Thanks

      Lots of splits? Use just one:
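      my $s = 'vif The increase in sensitivity of HIV - infected cells to <GENE> vif Fas </GENE> killing mapped to <GENE> vpu </GENE> , while nef , vif, <GENE> vif </GENE> , <GENE> vpr </GENE> , and second exon of <GENE> tat</GENE> did not appear to contribute';
      # The capturing group keeps the tags in the list, so elements
      # 0, 4, 8, ... hold the text outside <GENE>...</GENE>.
      my @ar = split m%(</?GENE>)%, $s;
      # $#ar/4 (not @ar/4) avoids indexing past the end when the string ends with a tag.
      for my $i (0 .. $#ar/4) {
          # /g replaces every occurrence in the chunk, not only the first.
          $ar[4*$i] =~ s%vif%<PGENE>vif</PGENE>%g;
      }
      $s = join q[], @ar;
      print "$s\n";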
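
The single-split idea above can also be wrapped in a small sub that takes the word to tag as a parameter. This is only a sketch: the name tag_outside_gene is hypothetical, it tracks an inside/outside flag instead of relying on every fourth element, and it assumes the GENE tags are well formed and not nested.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each whole-word occurrence of $word in
# <PGENE> tags, but only in text outside <GENE>...</GENE> pairs.
# Assumes the tags are well formed and never nested.
sub tag_outside_gene {
    my ($text, $word) = @_;

    # The capturing group makes split return the tags as list elements too.
    my @parts = split m{(</?GENE>)}, $text;

    my $inside = 0;
    for my $part (@parts) {          # $part aliases the array element
        if    ($part eq '<GENE>')  { $inside = 1 }
        elsif ($part eq '</GENE>') { $inside = 0 }
        elsif (!$inside) {
            # \Q...\E quotes regex metacharacters in $word;
            # \b keeps "nef" from matching inside a longer word.
            $part =~ s{\b\Q$word\E\b}{<PGENE> $word </PGENE>}g;
        }
    }
    return join '', @parts;
}

my $s = 'mapped to <GENE> vpu </GENE> , while nef , <GENE> vif </GENE> , did not';
print tag_outside_gene($s, 'nef'), "\n";
# prints "mapped to <GENE> vpu </GENE> , while <PGENE> nef </PGENE> , <GENE> vif </GENE> , did not"
```

The flag costs a few more lines than the index arithmetic, but it does not depend on the list having exactly four elements per tag pair.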