in reply to parse for string, then print unique

For one, your parentheses are outside of the regular expression, so you aren't really capturing anything. For two, I don't think your pattern match is quite right anyway. What does "strings that start with dollar sign" mean? Do you want only those lines that begin with a dollar sign, or do you want any dollar sign followed by a sequence of alphanumerics? The former would look more like /^\$(.*)/ and the latter would look like /(\$\w+)/ (the $ has to be escaped, otherwise Perl reads it as an anchor or as the start of a variable). The ^ in your pattern confuses me. Also, since you want the unique entries, you should stick them in a hash. One of the defining properties of a hash is that the keys are unique. So something like ...

my %hash;
while (<>) {
    chomp;
    $hash{$1}++ if /(\$\w+)/;    # \$ matches a literal dollar sign
}
print map { "$_\n" } keys %hash;

But again, I'm still not sure what you're really trying to match.
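
To make the two readings of the question concrete, here is a minimal sketch that applies both patterns to the same input. The sample lines and the variable names (@lines, @whole_lines, @tokens) are made up for illustration; they are not from the original poster's data.

```perl
use strict;
use warnings;

# Hypothetical sample lines, not taken from the original poster's file.
my @lines = ('$FOO = 1', 'x = $BAR + $FOO', 'no dollars here');

# Reading 1: whole lines that begin with a dollar sign.
my @whole_lines = grep { /^\$/ } @lines;

# Reading 2: every "$" followed by word characters, anywhere in a line.
my %seen;
for my $line (@lines) {
    $seen{$_}++ for $line =~ /(\$\w+)/g;
}
my @tokens = sort keys %seen;    # sorted only to make the output stable

print "lines:  @whole_lines\n";
print "tokens: @tokens\n";
```

With this input the first reading keeps one line, while the second collects two distinct tokens, which is why it matters which one is meant.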

Re: Re: parse for string, then print unique
by mhearse (Chaplain) on Dec 16, 2003 at 04:58 UTC
    Slight change to my post. I need to match any alphanumeric string which contains a dollar sign. $ may be anywhere in the string. I would like to grab the entire value, including the $. An example value is "$LEGEND2" (quotes are part of string).
      You mean something like this?
      use strict;
      use warnings;
      use Data::Dumper;

      my $str = '"$a $a $a" ab bc "c$d e$f$" c$d "$a $a $a"';
      my @uniq = keys %{{map {$_ => 1}($str=~m/("[^"]*\$[^"]*")/g)}};
      print Dumper(\@uniq);
      And the output is -
      $VAR1 = [
                '"c$d e$f$"',
                '"$a $a $a"'
              ];
      Note that I kept the regex simple by not trying to handle escaped quotes inside the string in the example above. If you want to allow escaped quotes, the regex becomes more complicated.
      ...
      my $str = '"$a $a $a" ab bc "c$d \"e$f$" c$d "$a $a $a"';
      my @uniq = keys %{{map {$_ => 1} ($str=~m/("(?:\\"|.)*?\$(?:\\"|.)*?")/g)}};
      ...
      And the new output -
      $VAR1 = [
                '"c$d \\"e$f$"',
                '"$a $a $a"'
              ];
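
One caveat with doing it all in a single pattern: if a bare $ sits between two quoted strings, the closing quote of the first can be taken as an opening quote. A possible way around that, offered only as a sketch, is to pull out every quoted string first and then keep the ones containing a $. The sample string and the names @quoted and @uniq below are assumptions for illustration, and the sketch assumes there are no stray quotes outside quoted strings.

```perl
use strict;
use warnings;

# Hypothetical input: one quoted string without "$", a bare $x between
# quoted strings, an escaped quote, and a duplicate.
my $str = '"plain" $x "c$d \"e$f$" y "$a" "$a"';

# Step 1: collect every double-quoted string.
#   \\.     consumes a backslash plus the character it escapes
#   [^"\\]  consumes anything that is neither a quote nor a backslash
my @quoted = $str =~ /("(?:\\.|[^"\\])*")/g;

# Step 2: keep those containing a dollar sign, first occurrence only.
my %seen;
my @uniq = grep { /\$/ && !$seen{$_}++ } @quoted;

print "$_\n" for @uniq;
```

Because the first step consumes each quoted string whole, the bare $x is never mistaken for part of one, and the result keeps the order of first appearance.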
      Since dollar signs are neither alphabetic nor numeric characters, what is an alphanumeric string that contains a dollar sign? Are "$$LEGEND2", "L$E$G$E$N$D$2", "LEG!$!END2" valid strings?

      Abigail

      If the quotes are part of the string, what else would you allow to be part of the string? Is more than one $ ok? Based on the ^ in your original try, is what you really want entire lines of the file? If so, what lines don't you want?

      I note that your original posting kept the strings in their original order (filtering out duplicates on the assumption that they were sorted). If that is what you want, none of the solutions involving keys will do it, because keys returns the hash keys in no particular order. A hash approach will still work, but goes like this:

      my %seen; my @out = grep { !$seen{$_}++ } @in;
      or just keep your original grep (unless the strings aren't in sorted order).
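
For reference, a minimal runnable sketch of order-preserving de-duplication with a %seen hash, using grep as the filter. The contents of @in are invented for illustration and are deliberately unsorted.

```perl
use strict;
use warnings;

# Hypothetical input, unsorted and with repeats.
my @in = ('"$LEGEND2"', '"$AXIS"', '"$LEGEND2"', '"$TITLE"', '"$AXIS"');

my %seen;
# $seen{$_}++ is false the first time a value is seen and true afterwards,
# so the negation lets only first occurrences through.
my @out = grep { !$seen{$_}++ } @in;

print "$_\n" for @out;
```

Unlike the keys-based versions, this keeps each string at the position where it first appeared and does not need the input to be sorted.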