23skiddoo has asked for the wisdom of the Perl Monks concerning the following question:

Hi! It's been a while! So I have a project where I need to have a server running just to ingest XML files POSTed from a vendor's server. I have something basic working but my responses after ingesting the XML are throwing errors on the vendor's side. Here's what I have:

#!/usr/bin/perl
use strict;
use warnings;

{
    package Four51Listener;

    use HTTP::Server::Simple::CGI;
    use base qw( HTTP::Server::Simple::CGI );
    use EGA::Utils qw( TimeStamp ); # a set of custom functions

    my %dispatch = (
        '/toMFF' => \&toMFF,
    );

    sub handle_request {
        my ( $self, $cgi ) = @_;
        my $path = $cgi->path_info();
        my $handler = $dispatch{ $path };
        if ( ref( $handler ) eq 'CODE' ) {
            $handler->( $cgi );
            print $cgi->header( -type => 'text/xml', -charset => 'ascii' ), response_ok();
        }
    }

    sub toMFF {
        my $cgi = shift;
        my $xml = $cgi->param( 'POSTDATA' );
    }

    sub response_ok {
        my $timestamp = TimeStamp();
        $timestamp =~ s/\s/T/;
        my $response = <<EOF;
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.1.009/cXML.dtd">
<cXML payloadID="foo&commat;ceprinter.com" xml:lang="en-US" timestamp="$timestamp">
<Response>
<Status code="200" text="OK" />
</Response>
</cXML>
EOF
        return $response;
    }
}

my $pid = Four51Listener->new(8888)->background();
print "Kill $pid to stop server.\n";

The server doing the POST expects XML back but I keep getting errors about "protocol violation". I'm not opposed to using a framework like Dancer2 or Mojolicious, but I'm not quite sure how to do so for something like this.

Any ideas?
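
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the HTTP::Server::Simple::CGI documentation has handle_request print the HTTP status line itself (for example "HTTP/1.0 200 OK") before any headers, whereas the code above starts its output with $cgi->header. A response that begins with "Content-Type:" instead of a status line could plausibly be reported as a protocol violation by the client. The header also declares charset ascii while the XML declaration says UTF-8. The sketch below uses only core Perl to show the shape of a complete response; raw_http_response is a hypothetical helper, not something from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete HTTP/1.0 response as one string.
# Order matters: status line, headers, an empty line, then the body.
# Assumes $body is already a byte string (plain ASCII XML here).
sub raw_http_response {
    my ($body) = @_;
    return join "\r\n",
        'HTTP/1.0 200 OK',                        # status line comes first
        'Content-Type: text/xml; charset=UTF-8',  # matches the XML declaration
        'Content-Length: ' . length($body),
        '',                                       # blank line ends the headers
        $body;
}

print raw_http_response('<cXML/>');
```

Inside handle_request the equivalent change would be a single print "HTTP/1.0 200 OK\r\n"; ahead of the existing print $cgi->header(...) call. Comparing the raw bytes against a known-good response is still the way to confirm whether this is what the vendor's client objects to.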

Re: Simple web server to ingest POST
by haukex (Archbishop) on Jun 08, 2022 at 16:22 UTC
    The server doing the POST expects XML back but I keep getting errors about "protocol violation".

    You may want to trace a good request and a failing request with Wireshark and see where the differences lie. I don't know if this is related, but when I run the XML from your post through xmllint, it complains about the entity in the payloadID; if I change that to a regular @, it doesn't complain. Anyway, here's an example with Mojolicious::Lite:

    #!/usr/bin/env perl
    use Mojolicious::Lite -signatures;
    use Mojo::DOM; # in case you need to parse the incoming XML

    sub TimeStamp { scalar gmtime } # dummy

    post '/toMFF' => sub ($c) {
        my $body = $c->req->body;
        my $dom = Mojo::DOM->new->xml(1)->parse($body); # example
        say "[[[$dom]]]"; # just for debugging
        $c->render('test', format => 'xml', timestamp => TimeStamp=~s/\s/T/r);
    };

    app->start;
    __DATA__

    @@ test.xml.ep
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.1.009/cXML.dtd">
    <cXML payloadID="foo@ceprinter.com" xml:lang="en-US" timestamp="<%= $timestamp %>">
    <Response>
    <Status code="200" text="OK" />
    </Response>
    </cXML>

    Run this via morbo test.pl. I sent a test request via curl -X POST http://127.0.0.1:3000/toMFF -H "Content-Type: application/xml" -d "<foo><bar/></foo>" (of course, I don't know whether this is representative of your real-world requests), and I got the following response:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.1.009/cXML.dtd">
    <cXML payloadID="foo@ceprinter.com" xml:lang="en-US" timestamp="WedTJun 8 16:20:00 2022">
    <Response>
    <Status code="200" text="OK" />
    </Response>
    </cXML>
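
    The timestamp in that response comes from the dummy TimeStamp sub, so it is only a placeholder. Assuming the receiving side wants an ISO 8601 timestamp (an assumption; check the vendor's cXML documentation), a minimal sketch using the core POSIX module could look like this. iso8601_utc is a hypothetical name, not part of the original code.

    ```perl
    use strict;
    use warnings;
    use POSIX qw(strftime);

    # Hypothetical helper: format an epoch time (default: now) as a
    # UTC ISO 8601 timestamp such as 2022-06-08T16:20:00Z.
    sub iso8601_utc {
        my ($epoch) = @_;
        $epoch = time unless defined $epoch;
        return strftime('%Y-%m-%dT%H:%M:%SZ', gmtime($epoch));
    }

    print iso8601_utc(1654705200), "\n";   # prints 2022-06-08T16:20:00Z
    ```

    In the Mojolicious::Lite example this would replace the TimeStamp=~s/\s/T/r expression passed to render.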
      Hey, thanks for your response! I gave that a shot, and after a couple of tweaks (due to copy/paste errors), it seems to have worked! Do you have any recommendations for fleshing it out? I can include a sub to parse the $dom after the app->start, yeah? (It would take the contents and push it up into another system's API.) What's best practice for running a Mojo server as a service so I can avoid manually starting it?

        Just for fun (and to stay sharp with Mojo), I went ahead and coded up a little example of calling another HTTP API and getting the response asynchronously:

        #!/usr/bin/env perl
        use Mojolicious::Lite -signatures;
        use Mojo::IOLoop;
        use Mojo::DOM;
        use Mojo::UserAgent;
        use Mojo::Util qw/dumper/;

        my $OTHER_API = 'http://localhost:3000/second_api';

        post '/toMFF' => sub ($c) {
            state $ua = Mojo::UserAgent->new;
            my $body = $c->req->body;
            my $dom = Mojo::DOM->new->xml(1)->parse($body); # example
            my $quz = $dom->at('bar[quz]')->{quz};
            $c->render_later;
            $ua->post_p( $OTHER_API => form => { quz => $quz } )
            ->then(sub ($tx) {
                my $res = $tx->result;
                my %stash = ( code=>$res->code, msg=>$res->message,
                    timestamp => scalar gmtime, debugmessage=>"" );
                if ($res->is_success) {
                    my $j = $res->json;
                    $stash{debugmessage} = "Server response: ".dumper($j);
                    if ( not $j->{response} ) {
                        $stash{code} = 599;
                        $stash{msg} = "Endpoint reported error";
                    }
                }
                else { $stash{debugmessage} = "HTTP error" }
                $c->render('response', format=>'xml', %stash);
            })->catch(sub ($err) {
                # this sends a 500 response to the client:
                $c->reply->exception("The endpoint could not be reached: $err");
            });
        };

        # this API would actually be on a different server
        post '/second_api' => sub ($c) {
            my $quz = $c->param('quz');
            my $resp = $quz && $quz=~/\S/
                ? { response => "Thanks for $quz!" }
                : { error => "You didn't provide a quz" };
            # pretend this API is a bit slow
            $c->render_later;
            Mojo::IOLoop->timer(1 => sub { $c->render(json=>$resp) });
        };

        app->start;
        __DATA__

        @@ response.xml.ep
        <?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.1.009/cXML.dtd">
        <cXML payloadID="foo@ceprinter.com" xml:lang="en-US" timestamp="<%= $timestamp %>">
        <Response>
        <Status code="<%= $code %>" text="<%= $msg %>" />
        <!-- <%= $debugmessage %> -->
        </Response>
        </cXML>

        Test with e.g. curl -X POST http://127.0.0.1:3000/toMFF -H "Content-Type: application/xml" -d '<foo><bar quz="baz"/></foo>'

        Do you have any recommendations for fleshing it out?

        I guess that depends on what else you want to add to it, but I think Mojolicious::Lite is fine for scripts that have up to a couple of pages of code. If it starts growing to more than that, you probably want to split it into a multi-file application, see Mojolicious::Guides::Growing for that. For most things, e.g. authentication, different response types, etc., a read of Mojolicious::Guides::Tutorial is strongly recommended.

        I can include a sub to parse the $dom after the app->start, yeah?

        It's a regular Perl script, so you can include the sub almost anywhere in the script. app->start is just the entry point for the main loop. Of course, code modularity best practices should still be followed, so if the code gets too long you might want to split it out into a separate module and so on.
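
        As a small illustration of that point, here is a sketch in plain Perl where the call appears before the sub definition. Named subs are compiled before the script starts running, so the order in the file does not matter. summarize_order and its regex are hypothetical stand-ins for the real parsing and forwarding code.

        ```perl
        use strict;
        use warnings;

        # The call comes first; this works because named subs are
        # defined at compile time, before any statement is executed.
        my $summary = summarize_order('<Order id="42"/>');
        print "$summary\n";   # prints: order 42

        # Hypothetical helper standing in for the real processing code.
        sub summarize_order {
            my ($xml) = @_;
            my ($id) = $xml =~ /id="([^"]+)"/;
            return defined $id ? "order $id" : 'no order id found';
        }
        ```

        One caveat for a Mojolicious::Lite script with embedded templates: everything after the __DATA__ marker is data, not code, so a sub placed "after app->start" still has to come before __DATA__.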

        It would take the contents and push it up into another system's API.

        See Mojo::UserAgent, or in case it's a database, see Mojo::Pg and several similar modules for other databases. Mojolicious handles the asynchronous nature of these kinds of APIs very well. For example, if you want to delay sending the response to your clients until your other API has given you a response, you can call render_later and then send your response from the Mojo::UserAgent callback or promise (example; I've also posted a bunch of Mojo examples, several are linked from my scratchpad). (Update: I just posted an example of the aforementioned promise handling in my other reply.)

        What's best practice for running a Mojo server as a service so I can avoid manually starting it?

        I'm not sure about other best practices in that case, but I run my Mojo apps either as systemd services based on the information in Mojolicious::Guides::Cookbook, or I run hypnotoad in a Docker container (in that case with hypnotoad -f /path/to/script.pl).
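
        For the systemd route, a unit file along the following lines is a reasonable starting point. It is a sketch modeled on the hypnotoad example in Mojolicious::Guides::Cookbook; the user name, paths, and script name are hypothetical placeholders, and the PID file location assumes hypnotoad's default of hypnotoad.pid next to the application script.

        ```ini
        # /etc/systemd/system/listener.service (hypothetical file name)
        [Unit]
        Description=cXML listener (Mojolicious app under hypnotoad)
        After=network.target

        [Service]
        # hypnotoad daemonizes itself, hence Type=forking plus a PID file
        Type=forking
        User=mojo
        PIDFile=/opt/listener/hypnotoad.pid
        ExecStart=/usr/local/bin/hypnotoad /opt/listener/listener.pl
        # running hypnotoad again against a live app triggers a hot restart
        ExecReload=/usr/local/bin/hypnotoad /opt/listener/listener.pl
        KillMode=process

        [Install]
        WantedBy=multi-user.target
        ```

        After installing the file, systemctl enable --now listener would start the service and have it come up at boot. In the Docker case the -f flag keeps hypnotoad in the foreground instead, so no PID file handling is needed there.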