maha has asked for the wisdom of the Perl Monks concerning the following question:

hello!

I have a problem with string concatenation: I want to concatenate all the authors' names into a single string.

Example input:

<P_contrib-author>Mariel Miller</P_contrib-author>,<P_contrib-author>Allyson Fiona Hadwin</P_contrib-author>,<P_contrib-author>Jenna Cambria</P_contrib-author>

Expected output:

<alt-title alt-title-type="running-head-verso">Mariel Miller, Allyson Fiona Hadwin, Klauda, and Jenna Cambria</alt-title>

I tried the code below, but it doesn't work properly:

while ($line =~ /<P_contrib-author>(.*?)<\/P_contrib-author>/) {
    $auth .= $1;
}
$line = "<alt-title alt-title-type=\"running-head-verso\">$auth</alt-title>";

Please give some suggestions. Thanks in advance.

Re: string concatenation
by moritz (Cardinal) on Dec 28, 2011 at 07:49 UTC
    I tried the code below, but it doesn't work properly

    That's not a proper error description. What happens, what output do you get, and how does it differ from what you expect? Do red and green flames come out of your PC when you try to run the code?

    while($line =~ /<P_contrib-author>(.*?)<\/P_contrib-author>/)

    I guess you need to use the /g modifier at the end of the regex.
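
    A minimal sketch of that suggestion, assuming $line holds the sample input from the question. Without /g, the match restarts at the beginning of the string on every pass, so the loop keeps finding the first author and never ends; with /g in scalar context, each pass resumes where the previous match stopped. The @names array and the join are illustrative additions, not part of the original code.

```perl
use strict;
use warnings;

# Sample input taken from the question.
my $line = '<P_contrib-author>Mariel Miller</P_contrib-author>,'
         . '<P_contrib-author>Allyson Fiona Hadwin</P_contrib-author>,'
         . '<P_contrib-author>Jenna Cambria</P_contrib-author>';

my @names;

# /g in scalar context: each iteration continues after the previous match,
# so the loop visits every author once and then stops.
while ($line =~ /<P_contrib-author>(.*?)<\/P_contrib-author>/g) {
    push @names, $1;
}

# Build the single wrapper tag once, after all names have been collected.
my $auth = join ', ', @names;
$line = qq{<alt-title alt-title-type="running-head-verso">$auth</alt-title>};

print "$line\n";
```

    Collecting the names first and wrapping them once afterwards is what keeps the tag from being repeated for each author.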

      Hi moritz, thanks for your concern. The modifier will not work; I want all the author content in a single tag, but it is repeating the tag n times.

        ... the modifier will not work ...

        With respect, this is not a good response. Please re-read moritz's reply with particular attention to the questions being asked therein.

Re: string concatenation
by AnomalousMonk (Archbishop) on Dec 28, 2011 at 08:54 UTC

    In general, a regex approach is not the best for parsing HTML, but here's a start anyway:

    >perl -wMstrict -le
    "my $s = '<P_contrib-author>Mariel Miller</P_contrib-author>,'
           . '<P_contrib-author>Allyson Fiona Hadwin</P_contrib-author>,'
           . '<P_contrib-author> Jenna Cambria </P_contrib-author>'
           ;
     print qq{[[$s]]};
     ;;
     my $tag       = qr{ P_contrib-author }xms;
     my $open_tag  = qr{ < $tag > }xms;
     my $close_tag = qr{ < / $tag > }xms;
     ;;
     while ($s =~ m{ $open_tag \s* (.*?) \s* $close_tag }xmsg) {
       print qq{name: '$1'};
       }
    "
    [[<P_contrib-author>Mariel Miller</P_contrib-author>,<P_contrib-author>Allyson Fiona Hadwin</P_contrib-author>,<P_contrib-author> Jenna Cambria </P_contrib-author>]]
    name: 'Mariel Miller'
    name: 'Allyson Fiona Hadwin'
    name: 'Jenna Cambria'

    Perhaps some monks better versed in XML parsing than I can suggest a more robust approach.

Re: string concatenation
by thomas895 (Deacon) on Dec 28, 2011 at 07:53 UTC

    I would have done it differently. Personal preference dictates that I code like so:

    my $str = "<P_contrib-author>Mariel Miller</P_contrib-author>,<P_contrib-author>Allyson Fiona Hadwin</P_contrib-author>,<P_contrib-author>Jenna Cambria</P_contrib-author>";
    my @rawauthours = split( ",", $str );
    my @authours = ();
    foreach (@rawauthours) {
        $_ =~ /<P_contrib-author>(.*)<\/P_contrib-author>/;
        push( @authours, $1 );
    }
    print '<alt-title alt-title-type="running-head-verso">', join( ", ", @authours ), '</alt-title>';

    If you want to make your script work, you should probably remove the wildcard (?), making it:

    while($line =~ /<P_contrib-author>(.*)<\/P_contrib-author>/)

    ...but I am not sure about that.
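
    One way to check is to run both forms against a line that holds more than one author. The sketch below is only an illustration (the names $greedy and $lazy are made up for it). It suggests that dropping the ? lets the capture run on to the last closing tag, so the non-greedy form is probably the one to keep when the line has not been split on commas first.

```perl
use strict;
use warnings;

# Two authors on one line, as in the original question.
my $line = '<P_contrib-author>Mariel Miller</P_contrib-author>,'
         . '<P_contrib-author>Jenna Cambria</P_contrib-author>';

# Greedy: (.*) extends to the LAST closing tag on the line.
my ($greedy) = $line =~ /<P_contrib-author>(.*)<\/P_contrib-author>/;

# Non-greedy: (.*?) stops at the FIRST closing tag.
my ($lazy) = $line =~ /<P_contrib-author>(.*?)<\/P_contrib-author>/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

    The greedy capture only behaves in the split-based code because each piece there contains exactly one pair of tags.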

    ~Thomas~

      Hi Thomas, I got some ideas, and your code is working partially in my integrated code. I hope it will come out properly. Thank you so much.

Re: string concatenation
by Marshall (Canon) on Dec 30, 2011 at 00:55 UTC
    With a global match, you can just assign the authors to an array, @authors (no while loop needed).

    There appear to be some special cases for formatting: 0, 1, 2, 3, or more authors. Some simple "if" logic could solve that.

    The more general situation would be names like: Bob Smith, MD; Porterhouse, A. B.; Freddie, Jr. or some such name with commas in it. Often ";" is used instead of commas so that it is possible to re-parse this concatenated line back into the original names. You should consider the ramifications (if any) of creating a string that is difficult for a program to re-parse.

    Update: To the best of my knowledge, using ";" as a separator when one or more names contain commas is proper English grammar. I would suggest scalar grep:

    if (grep { /,/ } @authors) {
        # use ';' instead of ',' for the separator
    }
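
    Spelled out a little further, the separator choice might look like the sketch below. The sample names are hypothetical, borrowed from the examples above, and $has_comma, $sep and $joined are illustrative variable names.

```perl
use strict;
use warnings;

# Hypothetical sample data; the second name contains a comma.
my @authors = ('Mariel Miller', 'Bob Smith, MD', 'Jenna Cambria');

# grep assigned to a scalar returns the number of elements that matched.
my $has_comma = grep { /,/ } @authors;

# Fall back to ';' whenever any name already contains a comma,
# so the joined string can still be split back into the original names.
my $sep = $has_comma ? '; ' : ', ';

my $joined = join $sep, @authors;
print "$joined\n";
```

    With this input the names are joined with "; ", so a later split on ";" recovers the three original entries.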

    #!/usr/bin/perl -w
    use strict;

    my $str = '<P_contrib-author>Mariel Miller</P_contrib-author>,<P_contrib-author>Allyson Fiona Hadwin</P_contrib-author>,<P_contrib-author>Jenna Cambria</P_contrib-author>';

    my @authors = ($str =~ /<P_contrib-author>\s*(.*?)\s*<\/P_contrib-author>/g);

    if (@authors >= 3) {
        print join (", ", @authors[0..@authors-2]), " and $authors[@authors-1]\n";
    }
    elsif (@authors == 2) {
        print "$authors[0] and $authors[1]\n";
    }
    else {
        print "$authors[0]\n";
    }