in reply to how to find what's not there with a regex?

This works with your data:
while (<>) {
    chomp;

    while (
        /
            (\w+)        # Name ($1)
            \s*          # Spaces (optional)
            =            # Equal sign
            \s*          # Spaces (optional)
            (
                '        # Quote
                [^']*    # Non-quotes
                '        # Quote
            |            # -or-
                [^'\s]+  # Non-spaces|quotes
            )
        /xg
    ) {
        my ($name, $expr) = ($1, $2);
        $expr = substr($expr, 1, -1)
            if substr($expr, 0, 1) eq "'";
        print("var: $name, expr: $expr\n");
    }
}

Updated to catch unquoted expressions.

Output:

var: drsubc, expr: agauss(0,    <--- Doesn't work :(
var: delm1, expr: 0 + 0.045u*distm1
var: delm2, expr: 0 + 0.07u*distm2
var: delm3, expr: 0 + 0.07u*distm3
var: delm4, expr: 0 + 0.07u*distm4
var: delmt, expr: 0 + 0.07u*distmt
var: delml, expr: 0.16u + 0.43u*distml
var: delam, expr: 0.32u + 0.86u*distam
var: dele1, expr: 0 + 0.25u*diste1
var: dele2, expr: 0 + 0.25u*diste2
var: delma, expr: 0.16u + 0.6u*distma
var: pmsxt, expr: npmsxt + 12.5u*dpmsxt
var: tih, expr: 0.35u    <--- Works :)
var: capct, expr: 0.50u + 0.13u*xdcapct
var: capcti, expr: 0.55u + 0.13u*xdcapct
var: m1t, expr: 0.41u + 0.05u*xdm1t
var: m1ti, expr: 0.36u + 0.05u*xdm1t
var: m2t, expr: 0.48u + 0.057u*dm2t
var: m3t, expr: 0.48u + 0.057u*dm3t
var: m4t, expr: 0.48u + 0.057u*dm4t
var: mtt, expr: 0.48u + 0.057u*dmtt
var: qtt, expr: 0.242u + 0.0202u*dqtt
var: htt, expr: 0.242u + 0.0202u*dhtt
var: mlt, expr: 2.0u + 0.2u*dmlt
var: amt, expr: 4.0u + 0.4u*damt
var: e1t, expr: 3.0u + 0.5u*de1t
var: e2t, expr: 4.0u + 0.5u*xde1mat
var: mat, expr: 4.0u + 0.4u*dmat
var: m1m2t, expr: 0.35u + 0.05u*dm1m2t

Re^2: how to find what's not there with a regex?
by samizdat (Vicar) on Aug 24, 2005 at 13:54 UTC
    That works with the quoted variant, ikegami, but not the unquoted variant, like the first function. How do I say 'anything including spaces up to the first occurrence of more than one space in a row'?

      What follows is a solution which requires the minimum knowledge of the format. It works with the two special cases. Sorry, I must be tired today.

      while (<>) {
          chomp;

          while (
              /
                  (\w+)                       # An identifier.
                  \s* = \s*                   # Equal with opt spaces.
                  (
                      (?:
                          (?! \s+ \w+ \s* = ) # Stop if we see the next formula.
                          .                   # A character.
                      )+
                  )
              /xg
          ) {
              my ($name, $expr) = ($1, $2);
              $expr = substr($expr, 1, -1)
                  if substr($expr, 0, 1) eq "'";
              print("var: $name, expr: $expr\n");
          }
      }

      Output:

      var: drsubc, expr: agauss(0, 1, 3)    <- Works
      var: delm1, expr: 0 + 0.045u*distm1
      var: delm2, expr: 0 + 0.07u*distm2
      var: delm3, expr: 0 + 0.07u*distm3
      var: delm4, expr: 0 + 0.07u*distm4
      var: delmt, expr: 0 + 0.07u*distmt
      var: delml, expr: 0.16u + 0.43u*distml
      var: delam, expr: 0.32u + 0.86u*distam
      var: dele1, expr: 0 + 0.25u*diste1
      var: dele2, expr: 0 + 0.25u*diste2
      var: delma, expr: 0.16u + 0.6u*distma
      var: pmsxt, expr: npmsxt + 12.5u*dpmsxt
      var: tih, expr: 0.35u    <- Works
      var: capct, expr: 0.50u + 0.13u*xdcapct
      var: capcti, expr: 0.55u + 0.13u*xdcapct
      var: m1t, expr: 0.41u + 0.05u*xdm1t
      var: m1ti, expr: 0.36u + 0.05u*xdm1t
      var: m2t, expr: 0.48u + 0.057u*dm2t
      var: m3t, expr: 0.48u + 0.057u*dm3t
      var: m4t, expr: 0.48u + 0.057u*dm4t
      var: mtt, expr: 0.48u + 0.057u*dmtt
      var: qtt, expr: 0.242u + 0.0202u*dqtt
      var: htt, expr: 0.242u + 0.0202u*dhtt
      var: mlt, expr: 2.0u + 0.2u*dmlt
      var: amt, expr: 4.0u + 0.4u*damt
      var: e1t, expr: 3.0u + 0.5u*de1t
      var: e2t, expr: 4.0u + 0.5u*xde1mat
      var: mat, expr: 4.0u + 0.4u*dmat
      var: m1m2t, expr: 0.35u + 0.05u*dm1m2t
        hmmm.... (?! xyz ) { what-bang? :) }

        New one on me, but that's an excellent idea.

        Stop when you see xyz...

        Perl++ never ceases to amaze me.
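
        For anyone else meeting (?! ... ) for the first time, here is a minimal sketch of the "match a character only if the stop pattern does not begin here" idiom on its own. The sample line and the @pairs array are made up for illustration and are not taken from the original data.

```perl
use strict;
use warnings;

# Hypothetical sample line; names and values are invented for this sketch.
my $line = "a = f(0, 1)  b = 2 + 3";

my @pairs;
while (
    $line =~ /
        (\w+) \s* = \s*               # identifier, then '=' with optional spaces
        (
            (?:
                (?! \s+ \w+ \s* = )   # refuse to go on if the next formula starts here
                .                     # otherwise take one more character
            )+
        )
    /xg
) {
    push @pairs, [ $1, $2 ];
}

print "var: $_->[0], expr: $_->[1]\n" for @pairs;
# Prints:
#   var: a, expr: f(0, 1)
#   var: b, expr: 2 + 3
```

        The lookahead consumes nothing. It only vetoes the next character, so the single space inside "f(0, 1)" is kept while the spaces in front of "b =" end the capture.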
      How do I say 'anything including spaces up to the first occurrence of more than one space in a row'?
      A literal translation (untested) would be /(?>.*?(?=  ))/s.
        Here's an alternative, in case yours doesn't work:
        # Read these comments from the bottom up.
        /
            (
                (?:
                    (?! \s{2} )  # which aren't the start of 2 spaces.
                    .            # characters
                )*               # zero or more
            )                    # Capture
        /sx
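
        A quick way to try that pattern, as a sketch only: the sample string and the $head variable below are assumptions for illustration. Note that \s{2} means any two whitespace characters, so a space followed by a tab or newline would also stop the match.

```perl
use strict;
use warnings;

# Hypothetical input: an expression, a run of two spaces, then more text.
my $text = "agauss(0, 1, 3)  next = 5";

my ($head) = $text =~ /
    (
        (?:
            (?! \s{2} )   # not at the start of two whitespace characters
            .             # so take this character
        )*
    )
/sx;

print "[$head]\n";   # prints "[agauss(0, 1, 3)]"
```

        The single spaces after the commas survive because each is followed by a non-space character. Only the double space ends the capture.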
Re^2: how to find what's not there with a regex?
by pbeckingham (Parson) on Aug 24, 2005 at 13:53 UTC

    Sorry - this doesn't handle the non-quoted element.



    pbeckingham - typist, perishable vertebrate.
      I fixed it while you were replying :)