KalaMonkey has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I need to remove the text between a set of apostrophes in a string.

for example

"Can you please remove 'me' as that me is no longer required, but this 'X' should go as well"

would become

"Can you please remove as that me is no longer required, but this should go as well"

I was looking at using an s/// regex or substr, but the fact that I don't know what will be between the apostrophes (') is causing me issues.

Thanks in advance

Re: strip text from a string
by Athanasius (Archbishop) on Sep 11, 2014 at 14:58 UTC

    Hello KalaMonkey, and welcome to the Monastery!

    Here is one way:

    0:55 >perl -wE "my $s= qq[Can you please remove 'me' as that me is no longer required, but this 'X' should go as well]; $s =~ s/(\s*'[^']*?')//g; say qq[\n$s];"

    Can you please remove as that me is no longer required, but this should go as well

    0:55 >

    The character class [^'] matches any character other than an apostrophe, and the question mark added to the quantifier *? makes the match non-greedy.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      s/(\s*'[^']*?')//g;

      I'm not sure why you enclose the pattern in parentheses, as I don't think a capture is required. Also, you don't need the non-greedy quantifier when using a negated character class; that's the whole point of using one, as it means you can avoid the commonly seen .*? pattern.
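
To make the point about negated character classes concrete, here is a small sketch that runs three patterns against the example string from the question. The variable names are made up for illustration. It suggests that the greedy and non-greedy forms of [^']* behave the same here, while a greedy .* runs on to the last apostrophe and removes too much.

```perl
use strict;
use warnings;

my $s = "Can you please remove 'me' as that me is no longer required, "
      . "but this 'X' should go as well";

# Negated class, greedy: [^']* cannot run past the closing apostrophe.
(my $negated = $s) =~ s/\s*'[^']*'//g;

# Negated class plus non-greedy: same result here, so the ? is redundant.
(my $negated_lazy = $s) =~ s/\s*'[^']*?'//g;

# Greedy dot: .* extends to the LAST apostrophe and removes too much.
(my $greedy_dot = $s) =~ s/\s*'.*'//g;

print "$negated\n";
print "$negated_lazy\n";
print "$greedy_dot\n";
```

The first two lines printed are identical; the third loses everything between the first and last apostrophe.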

      I hope this is of interest.

      Cheers,

      JohnGG

        Hello johngg,

        Both excellent points! Thanks for the corrections:

        1:41 >perl -wE "my $s= qq[Can you please remove 'me' as that me is no longer required, but this 'X' should go as well]; $s =~ s/\s*'[^']*'//g; say qq[\n$s];"

        Can you please remove as that me is no longer required, but this should go as well

        1:41 >

        Cheers,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        That did the trick thanks!
      Hi,

      Thanks for that, it's put me on the correct track. However, when I run it inside a Perl script using Strawberry Perl:

      #!/usr/bin/perl
      #use strict;
      use warnings;

      $str = "select Assigned, null, null, count(*) from wftask where Assigned in ('15000847','20005966','20005965','20004711','15120173','15000437','15023846','15022744','15062553','15149541','15000245','15000217','15000803','15000636','15000690','15000437','15001069','15119338','15022744','20009455','20001463','15179195','20008500','20004675','15000988','15179195','15119270') and RefClass is not null and RefKey is not null group by Assigned union select null, null, DistGroup, count(*) from wftask where DistGroup in ('SLT','Budget Signatory/310','Budget Signatory/544','Budget Signatory/539','Budget Signatory/518','Budget Signatory/321','Budget Signatory/543','Budget Signatory/338','Budget Signatory/513','Budget Signatory/6','Budget Signatory/359','Budget Signatory/358','Budget Signatory/530','Budget Signatory/300','Budget Signatory/550','Budget Signatory/490','Budget Signatory/491','Budget Signatory/376','Budget Signatory/377','Budget Signatory/401','Budget Signatory/178','Budget Signatory/18','Budget Signatory/236','Budget Signatory/40','Budget Signatory/246','Budget Signatory/507','Budget Signatory/475','Budget Signatory/464','Budget Signatory/459','Budget Signatory/18','Budget Signatory/310','Budget Signatory/544','Budget Signatory/539','Budget Signatory/518','Budget Signatory/321','Budget Signatory/543','Budget Signatory/338','Budget Signatory/513','Budget Signatory/6','Budget Signatory/359','Budget Signatory/358','Budget Signatory/530','Budget Signatory/300','Budget Signatory/550','Budget Signatory/490','Budget Signatory/491','Budget Signatory/376','Budget Signatory/377','Budget Signatory/401','Budget Signatory/178','Budget Signatory/18','Budget Signatory/236','Budget Signatory/40','Budget Signatory/246','Budget Signatory/507','Budget Signatory/475','Budget Signatory/464','Budget Signatory/459','North Long Term Allocation','North Long Term Approvals','North Long Term Review','North Long Term Safeguarding','Placement Review','RAS Surgery','SLT Awaiting Allocation','SLT Duty','SLT Duty Senior','SLT OT Equipment','SLT Pending','SLT Review','SLT Single Service Reviews','KT26099','Budget Signatory/18') and RefClass is not null and RefKey is not null group by DistGroup union select null, DistDept, null, count(*) from wftask where DistDept in ('D3347') and RefClass is not null and RefKey is not null group by DistDept";
      $str =~ s/(\s*'[^']*+?')//g;
      print $str

      I get the message Nested quantifiers in regex; marked by <-- HERE in m/(\s*'[^']*+? <-- HERE ')/ at C:\Perl\Perl_tests\SQLTest.pl line 6.

      When I run it on the command line it works fine. As I have said in another reply, I only started to use Perl on Monday, so please excuse my ignorance.

      Thanks
        Nested quantifiers in regex; marked by <-- HERE in m/(\s*'[^']*+? <-- HERE ')/ at ...

        The quantifier sequence  *+? is not valid and is termed 'nested'. See johngg's reply.

        Update: See also Quantifiers in perlre. See in particular the paragraph beginning "Note that the possessive quantifier modifier can not be be [sic] combined with the non-greedy modifier."
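
A short sketch of the difference, under the assumption that the extra + in the script was picked up from the line-wrap marker shown in the one-liner above rather than typed on purpose. The patterns are kept as strings and compiled inside eval so that the invalid one can be reported instead of stopping the script; the hash keys are made-up labels.

```perl
use strict;
use warnings;

# Patterns held as strings so that a bad one fails at run time, inside eval.
my %patterns = (
    copied_with_plus => q{\s*'[^']*+?'},   # *+? : the sequence Perl rejects
    intended         => q{\s*'[^']*?'},    # non-greedy, as first posted
    simplified       => q{\s*'[^']*'},     # johngg's suggestion
);

my %status;
for my $name (sort keys %patterns) {
    my $re = eval { qr/$patterns{$name}/ };
    $status{$name} = defined $re ? 'ok' : 'error';
    my $msg = defined $re ? "compiles\n" : "fails: $@";
    print "$name: $msg";
}
```

Only the pattern containing *+? should fail to compile; removing the stray + from the script's regex gives one of the two working forms.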

        I did not care to read the whole SQL statement. Most likely there are characters in the query that are significant to the regex engine (metacharacters). So try to escape the string, either via quotemeta or \Q and \E.
Re: strip text from a string
by kennethk (Abbot) on Sep 11, 2014 at 15:01 UTC
    Welcome to the monastery, KalaMonkey. I assume this is a homework assignment. While we are happy to help with homework, it is considered good form to identify it as such, and even better to show us how your code has failed you in previous attempts. See How do I post a question effectively?.

    In this case, you should read Using character classes in perlretut. In particular, you are interested in the set of all characters that are not apostrophes. This is usually expressed as [^']. You will also need to use Quantifiers to specify a character count: in this case *, which gives you zero or more occurrences.


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Hi,

      This isn't a homework assignment. I'm trying to strip out column values from various SQL statements that have been output to a log file. The example given was a trivial one to simplify the request. I have a Perl script that pulls the statements into a .csv file, so that I can then group the statements together and write indexes based on the SQL.

      The SQL is produced from code, so it will always call the same columns in the same order. As long as I strip out the data held in the columns, I can work out what indexes are needed.

      I only started to use perl on Monday, I'll have a look at the perldocs. It's a case of on the job learning.

      Thanks
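
The grouping step described above could look roughly like the following sketch. It is a variation on the thread's approach, not what was posted: quoted literals are replaced with a ? placeholder instead of being removed, and IN lists of any length are collapsed to one placeholder so that otherwise identical statements group together. The helper name sql_shape and the sample statements are hypothetical, and the pattern assumes literals contain no escaped apostrophes.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a statement to its "shape" by blanking literals.
sub sql_shape {
    my ($sql) = @_;
    $sql =~ s/'[^']*'/?/g;                        # each quoted literal becomes ?
    $sql =~ s/\(\s*\?(?:\s*,\s*\?)*\s*\)/(?)/g;   # collapse (?,?,?) to (?)
    return $sql;
}

# Made-up log entries, loosely modelled on the statement in the thread.
my @logged = (
    "select * from wftask where DistDept in ('D3347') and RefKey is not null",
    "select * from wftask where DistDept in ('D1','D2','D3') and RefKey is not null",
    "select * from wftask where Assigned in ('15000847','20005966')",
);

# Count how many logged statements share each shape.
my %count;
$count{ sql_shape($_) }++ for @logged;

print "$count{$_}  $_\n" for sort keys %count;
```

Here the two DistDept statements reduce to the same shape and are counted together, which is the property needed for deciding on indexes.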

        In addition to kennethk's list of resources I strongly recommend getting a copy of the Perl Pocket Reference.*

        * It must be good, this is the second time I've made the recommendation in as many days (and I'm not the author or on commission).

        Perl is the programming world's equivalent of English