in reply to Interesting Regex Question

The only function of the commas between the entries is to separate the elements of a list. (BTW, you need () rather than {} to define a list.) I assume you meant to present a list, since what you posted will not compile. Once the list is assigned to the @article array the commas are gone: they have done their job of separating the list elements.

Thus we can do what you ask like this:

my @article = (
    'author = {Author, J.P }',
    'title = "A paper about things"',
    'etc...',
);

# iterate over @article array removing all commas
# the commas in the list assignment are gone having
# been used to assign the list elements to the array
# @article so we have no problem with them now.
s/,//g for @article;

# this is the same as
foreach (@article) {
    $_ =~ s/,//g;
}

# in less idiomatic Perl where we do not take advantage of
# the aliasing of array elements to $_ we have to write
for my $i (0..$#article) {
    $article[$i] =~ s/,//g;
}

# to print all the elements of a list
# on newlines I would normally just write:
print "$_\n" for @article;
# this takes advantage of the aliasing to $_ in a for loop

# alternatively in long perl
foreach my $stuff (@article) {
    print "$stuff\n";
}

# to join all the edited elements with commas
my $joined = join ",", @article;
print $joined;

Update - Whoops!

Having solved the wrong problem, here is a solution to the bibtex parsing problem as I understand it (thanks to kschwab for pointing it out). Assuming a record like the one you have shown, this does the trick.

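my $string = <<'STRING';
@article{
author = {Author, J.P } ,
title = "A paper, about things" ,
etc..
}
STRING

my $open = '';
# declare first, then assign: "my $x = ... if COND" is not reliable
my $commaless = '';
$commaless = $1 if $string =~ s/^(\s*\@\w+\s*\{)//;

for (split //, $string) {
    # an opening delimiter: remember which closing delimiter ends it
    if (/\{|"/ and not $open) {
        $open = /\{/ ? '}' : '"';
        $commaless .= $_;
        next;
    }
    # keep everything except commas seen while a delimiter is open
    $commaless .= $_ unless /,/ and $open;
    $open = '' if $open eq $_;
}
print $commaless;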

First we eat up the opening @article{ so we get into the guts of the problem. Then the code splits the string into characters. When we find an opening delimiter we set $open to the matching closing delimiter, '}' or '"', and declare open season on commas until we reach it. Every character goes into $commaless except commas seen while a delimiter is open, so the commas inside the field values are removed as desired while the commas separating the fields survive.
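One thing the loop above does not cover is a brace nested inside a braced value, which bibtex allows. Here is a minimal sketch of the same character-by-character idea with a depth counter added. The sub name strip_field_commas and the variables $depth and $in_quote are my own for illustration, not part of the code above, and I have only thought about records shaped like the example.

```perl
use strict;
use warnings;

# Hypothetical helper: remove commas that sit inside {...} or "..."
# field values, keeping the commas that separate the fields.
sub strip_field_commas {
    my ($record) = @_;

    # Set aside the opening '@type{' so the record's own brace is not
    # mistaken for a field delimiter.
    my $out = '';
    $out = $1 if $record =~ s/^(\s*\@\w+\s*\{)//;

    my $depth    = 0;    # brace nesting level inside a {...} value
    my $in_quote = 0;    # true while inside a "..." value

    for my $char (split //, $record) {
        if ($in_quote) {
            $in_quote = 0 if $char eq '"';
        }
        elsif ($depth) {
            $depth++ if $char eq '{';
            $depth-- if $char eq '}';
        }
        elsif ($char eq '{') { $depth    = 1 }
        elsif ($char eq '"') { $in_quote = 1 }

        # A comma never changes the state, so it is safe to test it here.
        next if $char eq ',' and ($in_quote or $depth);
        $out .= $char;
    }
    return $out;
}

print strip_field_commas(
    '@article{ author = {Author, {J.P}, Jr } , title = "A paper, about things" , etc.. }'
), "\n";
```

The only real change from the single $open variable is that a '{' seen inside a braced value bumps the counter instead of being ignored, so the value ends at the matching '}' and not at the first one.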

Trying to do this with a single regex would be difficult, and certainly harder to understand.
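For comparison, here is a rough sketch of what a substitution-based version might look like. It is not a pure single regex: it leans on the /e modifier to run tr on each matched value, and it still has to strip the @article{ header first, otherwise the outer brace can swallow the whole record. The sub name strip_commas_regex is made up for this example, and it does not handle nested braces, which rather supports the point above.

```perl
use strict;
use warnings;

# Hypothetical helper: same job as the character loop, done with s///ge.
sub strip_commas_regex {
    my ($record) = @_;

    # Set the '@type{' header aside so the outer brace cannot match.
    my $head = '';
    $head = $1 if $record =~ s/^(\s*\@\w+\s*\{)//;

    # Match one un-nested {...} group or one "..." string, then delete
    # the commas inside the matched text.
    $record =~ s!(\{[^{}]*\}|"[^"]*")! (my $value = $1) =~ tr/,//d; $value !ge;

    return $head . $record;
}

print strip_commas_regex(
    '@article{ author = {Author, J.P } , title = "A paper, about things" , etc.. }'
), "\n";
```

It is shorter, but you have to know what /e does and why the header is removed first, so I would still reach for the explicit loop.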

hope this helps

cheers

tachyon

s&&rsenoyhcatreve&&&s&n\w+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Re: Interesting Regex Question
by kschwab (Vicar) on Jul 02, 2001 at 16:51 UTC
    Lots of good info, but I think you may have mistaken the original node's example bibtex source for perl code. The piece:

    @article{ author = {Author, J.P } , title = "A paper about things" , etc.. }

    is bibtex source, and not an attempt by the node creator to make a perl array....

      Thanks for the heads up. As you correctly assume, I've got no idea what bibtex is - strange source though. I've modified the posted code (the second bit, which works on the string) to handle this, assuming you get the entire entry as a single string. Hope that's correct.

      tachyon


        For bibtex/latex info, just look here and here.

        Basically, latex is a markup language, loosely comparable to HTML, although it is built on TeX rather than descended from SGML. Bibtex is used to describe bibliographic references in separate files, to be included in latex sources.

        Cheers,

        Jeroen
        "We are not alone"(FZ)