footpad has asked for the wisdom of the Perl Monks concerning the following question:

The apprentice, trying to avoid making too many blunders, requests the consideration of the more experienced...

I've been asked to help set up a PayPal-based signup script for an organization's membership sign-ups. They offer four types of memberships at different prices. In reviewing PayPal's documentation, it looks like I'll need to construct a script that determines the selected membership and then uses LWP::Simple to post a query to PayPal's servers. PayPal even provides a sample script that can be adapted to such a thing.

Here's *THEIR* code, with minor editing, for review and understanding of where I'm going with this:

#!/usr/local/bin/perl

# read the post from PayPal system and add 'cmd'
read (STDIN, $query, $ENV{'CONTENT_LENGTH'});
$query .= '&cmd=_notify-validate';

# post back to PayPal system to validate
use LWP::UserAgent;
$ua = new LWP::UserAgent;
$req = new HTTP::Request 'POST','https://www.paypal.com/cgi-bin/webscr';
$req->content_type('application/x-www-form-urlencoded');
$req->content($query);
$res = $ua->request($req);

# split posted variables into pairs
@pairs = split(/&/, $query);
$count = 0;
foreach $pair (@pairs) {
    ($name, $value) = split(/=/, $pair);
    $value =~ tr/+/ /;
    ### What's this doing? ###
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $variable{$name} = $value;
    $count++;
}

# assign posted variables to local variables
$receiver_email  = $variable{'receiver_email'};
$item_name       = $variable{'item_name'};
$item_number     = $variable{'item_number'};
$custom          = $variable{'custom'};
$payment_status  = $variable{'payment_status'};
$payment_date    = $variable{'payment_date'};
$payment_gross   = $variable{'payment_gross'};
$payment_fee     = $variable{'payment_fee'};
$txn_id          = $variable{'txn_id'};
$first_name      = $variable{'first_name'};
$last_name       = $variable{'last_name'};
$address_street  = $variable{'address_street'};
$address_city    = $variable{'address_city'};
$address_state   = $variable{'address_state'};
$address_zip     = $variable{'address_zip'};
$address_country = $variable{'address_country'};
$payer_email     = $variable{'payer_email'};

if ($res->content eq 'VERIFIED') {
    # check transaction for uniqueness
    # process payment
}
elsif ($res->content eq 'INVALID') {
    # possible fraud
}
else {
    # error
}

With this in mind, here are my thoughts and petitions:

I know it seems like a laundry list, but this is (as I noted earlier) research designed to help me avoid common traps, pitfalls, or other insecurity issues. Since it's my wife who asked me to do this, I'd like to avoid making myself (or her) look like a fool in front of the rest of the organization. I'd also like to keep them (and my host) from getting cracked.

--f

Re: PayPal Advice
by tachyon (Chancellor) on Jul 06, 2001 at 21:44 UTC

    Well can't answer all, but here is an answer to the easy bit. The regex you have flagged is decoding %XX hex strings into ASCII. In HTTP certain chars have special significance. Spaces are encoded as '+' and special chars in strings (like ? & = % etc) are encoded as two hex digits with a leading %. Thus this regex is converting every instance of % plus two hex digits to the original pre-encoding ASCII. As it happens the only characters that don't need encoding are a-zA-Z0-9-_.!~*'() Everything else is encoded. For example the # char is encoded %23 as the # symbol has a hex value of 0x23.
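
    To make the round trip concrete, here is a minimal sketch of the decoding described above, plus the matching encoder. The sub names url_decode and url_encode are made up for illustration, and the code assumes plain single-byte input.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded value:
# '+' becomes a space, then each %XX becomes the byte with that hex value.
sub url_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;
    $value =~ s/%([0-9a-fA-F]{2})/chr hex $1/ge;
    return $value;
}

# Encode the other way, leaving only the characters listed above untouched.
# (Assumes single-byte characters, so every escape is exactly two hex digits.)
sub url_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf '%%%02X', ord $1/ge;
    return $value;
}

print url_decode('Gold+Member+%2310'), "\n";   # Gold Member #10
print url_encode('a&b=c #1'), "\n";            # a%26b%3Dc%20%231
```

    Note the order in url_decode: the '+' translation has to happen before the %XX substitution, otherwise a literal plus sign sent as %2B would be turned into a space.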

    You get soft using CGI.pm all the time and forget all the work it does for you! For a good discussion of this check out Ovid's CGI tutorial which covers query strings, encoding and decoding in some detail.

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: PayPal Advice Sought
by MeowChow (Vicar) on Jul 06, 2001 at 21:53 UTC
    The regex in question is standard issue for converting escaped query params into their actual values, which is something you'll only see in association with hand-rolled CGI processing (as opposed to using CGI.pm).

    The code is ugly and needs serious cleaning up, as you've already noted. Regarding security, I would strictly limit the range of allowable data, and apply all standard tainting practices to this application, as if you were making a system call. By that, I mean you should ignore any extraneous query params, and scrub each POST param to its minimal character set.
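
    A rough sketch of what that scrubbing could look like. The field names come from the sample script, but the patterns, length limits, and the sub name scrub_params are my own assumptions, not anything PayPal specifies.

```perl
use strict;
use warnings;

# Hypothetical allowlist: field name => pattern the whole value must match.
# Anything not named here is silently ignored.
my %allowed = (
    txn_id         => qr/\A([A-Za-z0-9]{1,32})\z/,
    payment_status => qr/\A([A-Za-z_]{1,32})\z/,
    payment_gross  => qr/\A(\d{1,7}(?:\.\d{2})?)\z/,
    item_number    => qr/\A([A-Za-z0-9_-]{1,32})\z/,
);

# Return only known fields whose values match; die on a bad value.
sub scrub_params {
    my (%raw) = @_;
    my %clean;
    for my $name (keys %allowed) {
        next unless defined $raw{$name};
        my ($value) = $raw{$name} =~ $allowed{$name}
            or die "Rejected value for '$name'\n";
        $clean{$name} = $value;   # the captured text, which is untainted under -T
    }
    return %clean;
}

my %clean = scrub_params(
    txn_id        => 'ABC123',
    payment_gross => '25.00',
    evil          => '`rm -rf /`',   # unknown field, dropped
);
print join(', ', map { "$_=$clean{$_}" } sort keys %clean), "\n";
# payment_gross=25.00, txn_id=ABC123
```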

    I would also use HTTP::Request::Common instead of manually stringing together the POST, and of course check the returned page from PayPal for error or success.
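
    Something along these lines, as a sketch only. It needs HTTP::Request::Common (part of the libwww-perl family) installed, and the sub name build_validation_request is made up. It builds the request without sending it.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Build (but do not send) the validation request from decoded name/value
# pairs. POST with an array ref does the form encoding and sets the
# Content-Type header for us.
sub build_validation_request {
    my (@pairs) = @_;
    return POST 'https://www.paypal.com/cgi-bin/webscr',
        [ @pairs, cmd => '_notify-validate' ];
}

my $req = build_validation_request(
    txn_id         => 'ABC123',
    payment_status => 'Completed',
);
print $req->method, ' ', $req->uri, "\n";
print $req->content, "\n";
```

    To send it, hand $req to LWP::UserAgent's request method and check is_success on the response before comparing the content to 'VERIFIED'. One caveat: this re-encodes the pairs, so if PayPal needs the original body echoed back byte for byte, it may be safer to pass the raw string through unchanged and only append the cmd parameter.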

       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
      I have a question about that regex. A look at sub unescape in CGI reveals a regex that's nearly identical to the one in question. The first difference, the {2} quantifier, is trivial. I'm curious about how significant the use of a signed pack (c) in the CGI regex is, in contrast to the unsigned pack (C) in the other one?
      $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg; # cargo
      $todecode =~ s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;       # CGI
      For reference's sake here's sub unescape from CGI.pm version 2.46:
      # unescape URL-encoded data
      sub unescape {
          shift() if ref($_[0]);
          my $todecode = shift;
          return undef unless defined($todecode);
          $todecode =~ tr/+/ /;       # pluses become spaces
          $todecode =~ s/%([0-9a-fA-F]{2})/pack("c",hex($1))/ge;
          return $todecode;
      }
      thanks - epoptai

      --
      Check out my Perlmonks Related Scripts like framechat, reputer, and xNN.

        epoptai's right -- there's no difference between pack 'c',$number and pack 'C',$number, ever. There is a difference when unpacking, of course.

        What the translations 'c' and 'C' do when packing is to translate an integer to a corresponding character value. Your character values are most likely single-byte numbers. Each byte corresponds to one residue class of integers modulo 256. In particular, the two most popular ways to assign representative integer values to the 256 bytes are 0..255 ("unsigned char") and -128..127 ("signed char").

        But of course any integer value is congruent (modulo 256) to exactly one byte value, whichever of the 2 ranges you pick. So any integer has a unique translation to a byte. The reverse direction (unpack 'c',$str) is less single-valued: for instance, unpack 'C',(pack 'C',-1) == 255. Here unpack has to choose a specific range from which to pick an integer representing the byte value, and the two letter codes make a difference.

        The same thing occurs for the other signed/unsigned letters for integer conversions in pack/unpack.
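
        A quick way to see this for yourself (a sketch; the value 200 is just an arbitrary number above 127, and the pack warnings are switched off because out-of-range values trigger a "wrapped" warning):

```perl
use strict;
use warnings;
no warnings 'pack';   # out-of-range values wrap; silence that for the demo

my $from_unsigned = pack 'C', 200;   # 200 is in range for an unsigned char
my $from_signed   = pack 'c', 200;   # 200 wraps to -56 for a signed char

# Both produce the same single byte, 0xC8.
printf "same byte: %s\n", $from_unsigned eq $from_signed ? 'yes' : 'no';

# Unpacking is where the letter matters.
printf "unpack C: %d\n", unpack 'C', $from_unsigned;   # 200
printf "unpack c: %d\n", unpack 'c', $from_unsigned;   # -56
```

        Since hex($1) on two hex digits is always 0..255, either letter yields the same decoded string in the regex above.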

        My version (2.752) of CGI.pm's unescape uses chr instead of pack, by the way:
        $todecode =~ s/%([0-9a-fA-F]{2})/chr hex($1)/ge;
        I wonder if it's faster...
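
        One way to find out would be the core Benchmark module. This is only a sketch: the test string and the sub names decode_pack and decode_chr are made up, and the numbers will depend on your perl and your machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A made-up encoded string, repeated so each call does a fair amount of work.
my $encoded = 'first%20name%3DJohn%26last%3DDoe' x 50;

sub decode_pack {
    my $s = shift;
    $s =~ s/%([0-9a-fA-F]{2})/pack("C", hex($1))/ge;
    return $s;
}

sub decode_chr {
    my $s = shift;
    $s =~ s/%([0-9a-fA-F]{2})/chr hex($1)/ge;
    return $s;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    pack => sub { decode_pack($encoded) },
    chr  => sub { decode_chr($encoded) },
});
```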
           MeowChow                                   
                       s aamecha.s a..a\u$&owag.print
Re: PayPal Advice Sought
by strredwolf (Chaplain) on Jul 06, 2001 at 23:07 UTC
    I'm familiar with Paypal (use it for my sites), but from what you describe, you're throwing a complex solution at a simple problem. That, and I believe Paypal will have to get some more information from the user directly.

    FurNetwork has implemented a different solution: separate buttons for separate types of products. Since you have different levels of membership, this may be the best low-end solution.

    You may want to create a redirection page which has the order ready, and redirects to Paypal to let it gather more information.
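
    A rough sketch of the redirect idea. Everything specific here is an assumption: the four membership names and prices are invented, the sub names are made up, and the parameter names (cmd=_xclick, business, item_name, item_number, amount) are what I understand Paypal's payment buttons to use, so check them against the current documentation.

```perl
use strict;
use warnings;

# Hypothetical membership levels and prices.
my %membership = (
    student    => { name => 'Student Membership',    amount => '10.00' },
    individual => { name => 'Individual Membership', amount => '25.00' },
    family     => { name => 'Family Membership',     amount => '40.00' },
    patron     => { name => 'Patron Membership',     amount => '100.00' },
);

# Percent-encode everything outside a conservative safe set.
sub escape {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf '%%%02X', ord $1/ge;
    return $s;
}

# Build the URL the redirect page would send the browser to.
# The price comes from our own table, never from the submitted form.
sub paypal_redirect_url {
    my ($level, $business) = @_;
    my $m = $membership{$level} or die "Unknown membership level\n";
    my @fields = (
        cmd         => '_xclick',
        business    => $business,
        item_name   => $m->{name},
        item_number => $level,
        amount      => $m->{amount},
    );
    my @parts;
    while (my ($k, $v) = splice @fields, 0, 2) {
        push @parts, escape($k) . '=' . escape($v);
    }
    return 'https://www.paypal.com/cgi-bin/webscr?' . join '&', @parts;
}

print paypal_redirect_url('student', 'treasurer@example.org'), "\n";
```

    In a CGI script the result would go out in a Location header instead of being printed as text. Looking the level up in a fixed table also means an unknown or tampered value is rejected before anything is sent to Paypal.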

    --
    $Stalag99{"URL"}="http://stalag99.keenspace.com";