Re: Interesting Regex Question

The only function of the commas between the entries is to separate the elements of a list (you need () not {} to define a list, BTW). I assume you meant to present a list, since what you posted will not compile as Perl. Once the list is assigned to the @article array, the commas between the elements are gone: they have done their job of separating the list elements and are not part of any element.

Thus we can do what you ask like this:

my @article =  ( 
            'author = {Author, J.P }' ,
            'title  = "A paper about things"' ,
            'etc...'
            );


# iterate over @article array removing all commas
# the commas in the list assignment are gone having
# been used to assign the list elements to the array
# @article so we have no problem with them now.

s/,//g for @article;

# this is the same as
foreach (@article) {
    $_ =~ s/,//g;
}

# in less idiomatic Perl where we do not take advantage of
# the aliasing of array elements to $_ we have to write
for my $i (0..$#article) {
   $article[$i] =~ s/,//g;
}

# to print all the elements of a list 
# on newlines I would normally just write:

print "$_\n" for @article;

# this takes advantage of the aliasing to $_ in a for loop

#  alternatively in long perl
foreach my $stuff (@article) {
    print "$stuff\n";
}

# to join all the edited elements with commas
my $joined = join ",", @article;
print $joined;

Update - Whoops!

Having solved the wrong problem, here is a solution to the bibtex parsing problem as I understand it. Thanks to kschwab. Assuming a record like the one you have shown, this does the trick.

my $string = <<'STRING';
@article{ 
        author = {Author, J.P } ,
        title  = "A paper, about things" ,
        etc..
     }
STRING
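my $open = '';       # closing delimiter we are waiting for, '' if none
my $commaless = '';
# eat the opening "@article{" so its brace is not treated as a delimiter
$commaless = $1 if $string =~ s/^(\s*\@\w+\s*\{)//;
for (split //, $string) {
    if (/[{"]/ and not $open) {
        # opening delimiter: remember which character closes it
        $open = $_ eq '{' ? '}' : '"';
        $commaless .= $_;
        next;
    }
    # keep everything except commas inside an open delimiter
    $commaless .= $_ unless $_ eq ',' and $open;
    $open = '' if $open eq $_;
}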
print $commaless;

First we eat up the opening @article{ so we get into the guts of the problem. Then the code splits the string into characters. If we find an opening delimiter while none is open, we set $open to the matching closing delimiter, '}' or '"', and declare it open season on commas until we find that closing delimiter. Every character is appended to $commaless except commas seen while a delimiter is open, so the commas inside the field values are removed and the commas separating the fields survive.
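The walk-through above can also be packaged as a sub. This is only a sketch: the name strip_inner_commas is made up for illustration, and the brace depth counter is an addition that is not in the posted code. It assumes bibtex values may contain nested braces such as {a {b, c} d}, which the single $open flag would close too early.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove commas that sit
# inside {...} or "..." field values, keeping the commas between fields.
sub strip_inner_commas {
    my ($record) = @_;
    my $out = '';

    # Copy the leading "@type{" untouched so its brace is not counted.
    $out = $1 if $record =~ s/^(\s*\@\w+\s*\{)//;

    my $depth    = 0;    # nesting level of {...} (assumption: braces may nest)
    my $in_quote = 0;    # true while inside "..."

    for my $ch (split //, $record) {
        if ($in_quote) {
            $in_quote = 0 if $ch eq '"';
        }
        elsif ($depth) {
            $depth++ if $ch eq '{';
            $depth-- if $ch eq '}';
        }
        elsif ($ch eq '{') { $depth    = 1 }
        elsif ($ch eq '"') { $in_quote = 1 }

        # Drop the comma only while a delimiter is open.
        next if $ch eq ',' and ($depth or $in_quote);
        $out .= $ch;
    }
    return $out;
}

print strip_inner_commas(
    '@article{ author = {Author, J.P } , title = "A paper, about things" }'
), "\n";
```

For the flat records shown in this thread it should behave like the posted loop; the counter only matters once braces nest.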

Trying to do this with a single regex would be difficult, and certainly harder to understand.
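For comparison, here is roughly what a single substitution looks like when it is limited to the flat case, where no braces nest inside a value. It is a sketch under that assumption, the sub name strip_inner_commas_flat is made up, and it arguably supports the point about readability.

```perl
use strict;
use warnings;

# Hypothetical helper: flat values only, it does NOT handle nested braces.
sub strip_inner_commas_flat {
    my ($record) = @_;

    # Match one {...} chunk with no braces inside, or one "..." chunk,
    # then delete the commas in that chunk with tr///d.
    $record =~ s{ ( \{ [^{}]* \} | " [^"]* " ) }
                { (my $chunk = $1) =~ tr/,//d; $chunk }gex;

    return $record;
}

print strip_inner_commas_flat(
    '@article{ author = {Author, J.P } , title = "A paper, about things" }'
), "\n";
```

The outer @article{ brace is skipped because the text after it contains another {, so the first alternative cannot match there.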

hope this helps

cheers

tachyon

s&&rsenoyhcatreve&&&s&n\w+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print


Replies are listed 'Best First'.
Re: Re: Interesting Regex Question by kschwab (Vicar) on Jul 02, 2001 at 16:51 UTC
Lots of good info, but I think you may have mistaken the original node's example bibtex source for perl code. The piece: `@article{ author = {Author, J.P } , title = "A paper about things" , etc.. }` is bibtex source, and not an attempt by the node creator to make a perl array....
Re: Re: Re: Interesting Regex Question by tachyon (Chancellor) on Jul 02, 2001 at 18:26 UTC
Thanks for the heads up. As you correctly assume, I've got no idea what bibtex is - strange source though. I've modified the posted code (the second bit that handles the string) to deal with this, assuming you get the entire entity as a single string. Hope that's correct.

tachyon

s&&rsenoyhcatreve&&&s&n\w+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
Re:{4} Interesting Regex Question by jeroenes (Priest) on Jul 02, 2001 at 20:03 UTC
For bibtex/latex info, just look here and here. Basically, latex is a markup language, a grandchild of SGML if you will, and a cousin of HTML. Bibtex is used to describe bibliographic references in separate files, to be included in latex sources. Cheers, Jeroen "We are not alone"(FZ)