dmaranan has asked for the wisdom of the Perl Monks concerning the following question:

I thought it would be easy to find a solution to my problem, but for some reason it isn't... I have a plugin that receives XML data that uses semicolons as delimiters. For example, <items>item1;bottle</items>. I'm having a problem parsing this data. Perl interprets this as receiving an end of line whenever I use the data. I was thinking about doing a search for a semicolon and replacing it with a pipe |. Can anyone show me how to do this? TIA

Replies are listed 'Best First'.
Re: Semicolons in webforms
by derby (Abbot) on Dec 30, 2007 at 17:25 UTC

    The semicolon is the lesser-known valid parameter separator. If you dump your CGI object, you'll probably see all the data after the first semicolon split off from the value (take a look at the parse_param method in CGI). Is the XML the only data coming in the submission? If so, your best bet is to see the HANDLING NON-URLENCODED ARGUMENTS section of CGI.
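
    To illustrate the separator behaviour described above, here is a minimal sketch in plain Perl. It does not use CGI.pm; parse_pairs is a hypothetical helper that only imitates the idea of treating both '&' and ';' as pair separators, and the query strings are made-up samples.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT CGI.pm's own code: splits a query string into
# name/value pairs, treating both '&' and ';' as separators.
sub parse_pairs {
    my ($query) = @_;
    my %param;
    for my $pair ( split /[&;]/, $query ) {
        next unless length $pair;
        my ( $name, $value ) = split /=/, $pair, 2;
        $value = '' unless defined $value;

        # Decode '+' and %XX escapes in both the name and the value.
        for ( $name, $value ) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        }
        $param{$name} = $value;
    }
    return %param;
}

# Unencoded semicolon: the value is cut off at the ';'.
my %raw = parse_pairs('strxmltext=<items>item1;bottle</items>');

# Semicolon sent as %3B: the value arrives whole.
my %encoded = parse_pairs('strxmltext=<items>item1%3Bbottle</items>');

print "raw:     $raw{strxmltext}\n";
print "encoded: $encoded{strxmltext}\n";
```

    In this sketch the unencoded request leaves strxmltext holding only the text before the semicolon, which has no closing tag, while the rest turns up as a separate parameter name. If the real package behaves similarly, that would be consistent with an XML parser complaining about a missing close tag.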

    -derby
Re: Semicolons in webforms
by graff (Chancellor) on Dec 30, 2007 at 16:56 UTC
    Perl interprets this as receiving an end of line whenever I use the data.

    What is the evidence that makes you think perl is interpreting a semicolon as an end-of-line? And what does "use the data" actually mean here?

    In other words, show us some code that demonstrates how you are "using the data", show the particular result that indicates "receiving an end of line", and show us what you want (or expect) the result to be so we can see how it differs from the result you actually get.

      You are correct; I think my assumption that it is interpreting a semicolon might be incorrect. When I attempt to use the data (right now, using the data is simply printing it to output), I get the message "No close tag marker." I'll research what that actually means, but if you have any clue I'd definitely be open to help. Thanks again.
        Sounds like derby is likely to be on the right track. The error message is coming from the XML parser, saying that it is reaching an end of string (or end of input) before seeing a closing tag (most likely, it's not seeing </tablefield>). So follow up on the advice below.
Re: Semicolons in webforms
by dsheroh (Monsignor) on Dec 30, 2007 at 16:10 UTC
    That's a rather odd problem to have... How are you doing the parsing? (Names of any CPAN modules and/or example code which demonstrates the problem would be the most useful answer.)
      I'm using CGI and a package that my company has written. When data is received by the perl webservice I assign it to a string variable. For example:
      # $srv is an instantiation of my company's package
      my %args = $srv->Vars();
      # strxmltext is a string variable sent by the webserver and will contain a semicolon
      my $stringvariabe = $args{strxmltext};
        OK, if I understand correctly, the semicolon would have already been changed to a newline within those few lines of code, right? If the client is a normal web browser, I'd say it's probably a bug in $srv's package. If not, then the client is most likely failing to encode its data properly before sending it. Either way, if the semicolons are already changed by the first time you see the data, you're not going to be able to change them to pipe characters to prevent it.

        If you have control of the client, running a tr/;/|/ over the data ($data =~ tr/;/|/;) before sending it will change the semicolons to pipes as you initially requested, but fixing the encoding and/or your company's package would be the better solution.
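
        As a rough sketch of the client-side workaround mentioned above, the snippet below runs tr/;/|/ over a made-up payload and, as an alternative, percent-encodes the semicolons instead. The sample string and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample payload; the real data comes from the client.
my $data = '<items>item1;bottle;glass</items>';

# With an empty replacement list and no /d, tr/// only counts matches
# and leaves the string unchanged.
my $count = ( $data =~ tr/;// );

# Work on a copy so the original stays available for comparison.
( my $piped = $data ) =~ tr/;/|/;

# Alternative: keep the semicolons but percent-encode them before sending.
( my $escaped = $data ) =~ s/;/%3B/g;

# The receiving side can reverse the pipe substitution, provided the
# original data never contains a literal '|'.
( my $restored = $piped ) =~ tr/|/;/;

print "semicolons: $count\n";
print "piped:      $piped\n";
print "escaped:    $escaped\n";
print "restored:   $restored\n";
```

        The pipe substitution is only reversible if a pipe can never appear in the data itself, which is one reason the encoding fix is the more robust route.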

        # strxmltext is a string variable sent by the webserver and will contain a semicolon
        my $stringvariabe = $args{strxmltext};
        So, if "$stringvariabe" contains a value like "item1;bottle", and you want to "parse" that, do you mean that you want to split on semicolon to get a "name, value" pair -- like this?
        my ( $name, $value ) = split /;/, $stringvariabe;
        If you sometimes get three or more strings separated by semicolons, you probably want to assign the result of split to an array:
        my @strings = split /;/, $stringvariabe;
        Is there something more complicated about the task that you haven't mentioned yet?
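
        For reference, a small self-contained sketch of the two split forms above, using made-up sample strings. It also shows one detail worth knowing: by default split drops trailing empty fields, and a negative limit keeps them.

```perl
use strict;
use warnings;

# Sample values are assumptions for illustration.
my $stringvariabe = 'item1;bottle';

# Exactly two fields: assign straight to a pair of scalars.
my ( $name, $value ) = split /;/, $stringvariabe;

# Any number of fields: collect them in an array.
my @strings = split /;/, 'item1;bottle;glass';

# Trailing empty fields are discarded by default...
my @default = split /;/, 'item1;;';

# ...unless a negative limit is given.
my @keep = split /;/, 'item1;;', -1;

print "name=$name value=$value\n";
print "fields: @strings\n";
print "default count: ", scalar(@default), "\n";
print "kept count:    ", scalar(@keep), "\n";
```

        Whether empty fields matter depends on the data format, which is part of what the question above is trying to establish.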