Errors uncaught by CGI::Carp

by Bod (Parson)
on Oct 13, 2021 at 20:46 UTC

Bod has asked for the wisdom of the Perl Monks concerning the following question:

This is a follow-on from the second part of the question I asked in CGI::Carp fatalsToFile.
The collective wisdom of The Monastery solved the first issue but not the second... it seems they are unrelated!

I am getting 500 errors intermittently from most, if not all, of the pages of a website. The errors seem to be clumped together over time. In other words, once an error happens it seems likely that it will happen again quite soon, whereas at other times we go quite a while without errors.

Every script has use CGI::Carp qw(fatalsToBrowser); at the start of the script.
The bit I am most confused about is that, as I understand it, with fatalsToBrowser the only two ways to get a 500 error are that the script has the wrong permissions or that the script doesn't return valid HTTP headers. As this is an intermittent problem, it is neither of these.
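As a sanity check of that understanding, here is a minimal, hypothetical test script (not from the site - purely for illustration): with fatalsToBrowser loaded, the die below should be trapped and its message sent to the browser rather than the server's bare 500 page.

#!/usr/bin/perl
use strict;
use warnings;
use CGI::Carp qw(fatalsToBrowser);

print "Content-Type: text/html\n\n";    # complete headers, then the blank line
die "Deliberate failure - fatalsToBrowser should show this in the browser";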

This is the code of the modules that are common:

package Site::HTML;

use strict;
use warnings;
use DBI;
#use DBD::mysql;
use Exporter;
use Template;
use MIME::Lite;
use Digest::SHA qw(sha256_base64);
use cPanelUserConfig;

use Site::Variables;
use Bod::CRM;

our @ISA    = qw(Exporter);
our @EXPORT = qw(%data %file %cookie $dbh $current_user);

our $site     = $Site::Variables::site;
our $template = Template->new(
    INCLUDE_PATH => $Site::Variables::template_path,
    PRE_CHOMP    => 0,
    POST_CHOMP   => 1,
);

our (%data, %file, %cookie, $dbh);
our $current_user = 0;

my $crm = Bod::CRM->new($Site::Variables::env_crm_env);

# Untaint ENV{PATH} for MIME::Lite
my $path = $ENV{'PATH'};
$ENV{'PATH'} = undef;
foreach my $p (split /:/, $path) {
    if ($p =~ m!^(/(usr|bin).*)!) {
        $ENV{'PATH'} .= ':' if $ENV{'PATH'};
        $ENV{'PATH'} .= $1;
    }
}
and
package Site::Wayfinder;

use strict;
use warnings;
use cPanelUserConfig;
use Digest::SHA qw(sha512_base64);
use Facebook::Graph;
use GD;

use Bod::CRM;
use Site::HTML;
use Site::Point;

my $crm = Bod::CRM->new(Site::HTML->crm_env);

sub new {
    my $class = shift;
    my %attrs = @_;
    $attrs{'error'} = '';
    $attrs{'fb'}    = Facebook::Graph->new(
        app_id   => '12345678',
        secret   => 'abcdefghijklmnop',
        postback => "https://$ENV{'HTTP_HOST'}/?command=authorize_facebook",
    );
    return bless \%attrs, $class;
}
I have commented out use DBD::mysql; because on one occasion I got some output from CGI::Carp that showed an error on that line. I have not been able to replicate this error. Not specifying a DBD driver forces DBI to use the driver in its config, which is set up to connect to the MariaDB database.
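For what it's worth, the driver can also be pinned explicitly in the DSN rather than relying on the configured default. A sketch with placeholder host, database and credentials (DBD::mysql talks to MariaDB as well):

use DBI;

my $dbh = DBI->connect(
    'DBI:mysql:database=mysite;host=localhost',    # placeholder DSN
    'db_user', 'db_pass',                          # placeholder credentials
    { RaiseError => 1, AutoCommit => 1 },
) or die $DBI::errstr;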

My only hunch is that, because errors seem more concentrated at certain times, it is connected with the number of concurrent users. This in turn makes me wonder if it is Facebook::Graph that is the problem, as it is deprecated. We only use Facebook for authentication. Users can create an account using Facebook rather than creating a password, or can sign into an existing account using Facebook provided the email addresses are the same. This works under normal circumstances.

It isn't just web browsers that are getting the error. Facebook is complaining that our privacy policy page is giving errors and Stripe gets occasional webhook errors.

I am unable to replicate the problem on our test site - it only seems to happen on the live production site. Therefore I am limited in how much I can test or what I can tweak!

Any ideas what I can do to try and get some understanding of what is going on?

UPDATE: to test my hunch that it is to do with multiple users, I have tried to cross-reference 500 errors with the data collected by Google Analytics and Hotjar. But the errors are not written to the Apache error log, so this is proving difficult. I have started having Google Analytics open when I access the site so that I can look at the real-time data when I next see an error.

Replies are listed 'Best First'.
Re: Errors uncaught by CGI::Carp
by hippo (Bishop) on Oct 13, 2021 at 22:51 UTC
    The bit I am most confused about is that my understanding is that, with fatalsToBrowser the only two ways to get a 500 error is that the script has the wrong permissions or that the script doesn't return valid HTTP headers. As this is an intermittent problem, it is neither of these.

    That is a bit of a non-sequitur. There are many ways for a script not to return valid HTTP headers in an intermittent way. The most obvious one is an intermittent restriction of some O/S level resource: open files, available RAM, etc.
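    One way to see those limits from inside a script is a diagnostic sketch like the following - it assumes the BSD::Resource module is available on the host, and the warn lines are purely illustrative:

    use BSD::Resource;    # exports getrlimit and the RLIMIT_* constants

    my ($nofile_soft, $nofile_hard) = getrlimit(RLIMIT_NOFILE);    # open file descriptors
    my ($as_soft,     $as_hard)     = getrlimit(RLIMIT_AS);        # address space (RAM)
    warn "open files: soft=$nofile_soft hard=$nofile_hard\n";
    warn "address space: soft=$as_soft hard=$as_hard\n";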

    But the errors are not written to the Apache error log

    They certainly should be. Are you still running on shared hosting? If so, the real error log is unlikely to be accessible to you as it will (in the case of the event or worker MPM) contain data relevant to any and all of the sites on that shared server. Perhaps have a word with your hoster to see if they can at least give you some of the data which is relevant just to your site. Further, you might consider this as yet another reason to make the move to a VM/VPS instead.


    🦛

      The most obvious one is an intermittent restriction of some O/S level resource: open files, available RAM, etc.

      Once again, hippo, you seem to have identified the issue when everything tells me it cannot be that!
      Note to self - always listen to hippo's wisdom regardless of how I am interpreting the evidence differently.

      Digging deep into cPanel where I've never needed to go before, I have found a graph of CPU usage over the last 24 hours. It shows an error state several times during that time period. The last time roughly corresponds with the last 500 error I got from the site...

      So, perhaps the reason I don't see errors on other sites is because this site is doing something much more CPU intensive than the other sites. Some of the other sites get much more traffic. Therefore I have been thinking about what is different on this site from any of the others:

      1. Perl is in taint mode
      2. The database has tables with spatial data types
      3. The file structure is laid out differently so modules are above the webroot
      4. Scripts are called by the Apache default handler
      5. AJAX is used to deliver (sometimes large) JSON files to the browser for Google/Bing maps APIs
      Other than that, there are no major differences that I can think of. There are fewer scripts than on most other sites and they are laid out better, as they were planned out more thoroughly before any code was written. Nothing has been written that I would expect to be especially resource intensive - there are no big sorts or especially complex maths.

      It has to be something that is common to all the scripts, as the 500 errors are not limited to always the same script. So it must be the modules, or something external like the database or the connection to Facebook.

      you might consider this as yet another reason to make the move to a VM/VPS instead

      Yes indeed. This site is a commercial venture that is starting to bring in revenue. I see it as the generator of the funds to enable me to upgrade web services to use across all the businesses and sites. Everything is a bit crazy right now and having time to properly move forward with a hosting upgrade would be wonderful.

      There are many ways for a script not to return valid HTTP headers in an intermittent way. The most obvious one is an intermittent restriction of some O/S level resource: open files, available RAM, etc.

      Good point as usual, hippo.
      However, the examples you give would cause the other sites hosted on the same machine, including the test site for this domain, to give random errors. They don't - or at least I have not experienced them. It is possible that they do, as it is an intermittent problem, but it seems unlikely that I would only notice it on one site and no others.

      Are you still running on shared hosting? If so, the real error log is unlikely to be accessible to you...

      Yes, it is on shared hosting with cPanel

      According to the cPanel documentation, the error log I can see comes from /usr/local/apache/logs/error_log.

        Not all resource limits are global. At $WORK we impose per-user limits (effectively per-site limits) on our shared hosting customers precisely so that one poorly-written or heavily-hit site doesn't adversely impact upon the rest. I would expect every other shared hoster to implement something similar. This would explain why you only see it in production - that site will (hopefully) receive orders of magnitude more traffic than your test/demo site.


        🦛

Re: Errors uncaught by CGI::Carp
by Bod (Parson) on Oct 14, 2021 at 20:15 UTC

    In an attempt to get to the bottom of this I have moved away from CGI::Carp qw(fatalsToBrowser) and instead I have tried using CGI::Carp qw(carpout) * by putting this in my BEGIN block.

    BEGIN {
        use CGI::Carp qw(carpout);
        if ($RealBin =~ m!^(/home/xxx/yyy/(test|uk)/www)!) {
            $safepath = "$1/../lib";
        }
        else {
            die "Illegal use of software - visit www.way-finder.uk to use this site";
        }
        open LOG, '>>', '/home/xxx/yyy/test_error_log.log'
            or die "Unable to open error log";
        carpout(\*LOG);
    }

    I've tested it on our test site which doesn't give the 500 errors. The script returns the HTML as expected and writes a few lines about uninitialised strings to test_error_log.log. I've then added a die statement in the code and tested again - and again it writes that to the log file as expected.

    This version of the script has been migrated to the production environment and, as expected, the same warnings get written to the log file. But when the 500 error happens, nothing gets written to the log file. So I've deleted the log file, refreshed the page, and kept repeating until I got the error. The logfile is created but nothing is written to it.

    What could cause the BEGIN block to execute but not the rest of the script without carpout writing anything at all to the log file???

    * CGI::Carp qw(carpout) seems to be the fatalsToFile I was looking for in CGI::Carp fatalsToFile

      What could cause the BEGIN block to execute but not the rest of the script without carpout writing anything at all to the log file???

      Are you sure the whole script isn't running? Remember, fatal errors are not the only cause of 500 errors on webservers -- so it might be that the whole script is running without dying, but you are still getting a 500 error. It's sounding to me like your 500 error is not coming from a fatal error in your script -- so not because of a die or croak or similar -- but more likely from a mistake in the output of your script; if so, then your fatalsToBrowser or carpout cannot do anything to help you debug.

      A quick example of a script that never dies and successfully runs to the end, but still causes a 500 error on the server:

      #!/usr/bin/perl
      use 5.012;    # strict, //
      use warnings;

      print "Content-Type: text/plain;\n";    # the missing \n will cause a 500 return code,
                                              # even though your script has no fatal error, and no die
      print "Hello World\n";

      If the problem is in the output of your script (and my guess is in the header section... or that the headers aren't separated by a newline from the content), not in the execution of your script, then you are going to need some way to duplicate the STDOUT output to also go into the logfile of your choice. I know of two ways to do that.

      1: If you can install IO::Tee, or can pipe to the tee command on your server, then you should follow the advice in Copy STDOUT and STDERR to logfile and also display to screen for duplicating your STDOUT into your logfile directly (a short sketch of this option follows at the end of this reply).

      2: Otherwise, you can redirect your script's STDOUT to an in-memory file in a BEGIN block, then at the END, print the contents of that in-memory variable to a logfile and to the original STDOUT handle. Some example code which applies that to my "erroring script" follows:

      #!/usr/bin/perl
      use 5.014;    # strict, //, s//r non-destructive
      use warnings;
      use cPanelUserConfig;

      {
          use autodie;
          my $fh_cgiout;
          my $memory;
          BEGIN {
              # duplicate the real STDOUT to $fh_cgiout: printing to $fh_cgiout
              # will actually go to the webserver
              open $fh_cgiout, '>&', \*STDOUT;
              # must close STDOUT before reassigning it
              close STDOUT;
              # now when the script prints, the text "prints" into memory, for later use
              open STDOUT, '>', \$memory;
          }
          END {
              # at the very end, just before exiting, send the memory contents back
              # to the CGI output filehandle (what was originally STDOUT);
              # start by closing the fake STDOUT so $memory is complete and safe to read
              close STDOUT;

              # next, append the memory contents to the logfile
              open my $fh, '>>', 'dupout.log';    # append to your logfile
              # a prefix in the logfile easily identifies the process ID and time of the run
              my $prefix = sprintf "DEBUG %5d %s: ", $$, scalar gmtime;
              # prepend every line of memory with the prefix, without editing the
              # memory variable's contents (/r regex option, v5.14)
              print {$fh} ($memory =~ s/^/$prefix/gmr);
              close $fh;

              # restore the original CGI STDOUT from $fh_cgiout
              open STDOUT, '>&', $fh_cgiout;

              # finally, print the memory contents to the original STDOUT, so that
              # it goes to the webserver
              print $memory;    # this time print the memory unedited
          }
      }

      print "Content-Type: text/plain;\n";
      print "Hello World\n";

      __END__

      I put that onto my cPanel-based shared-host website, and loaded it in my browser twice: once with the wrong single \n and once with the right \n\n. The first time, there was a 500 error; the second time, it ran just fine. The dupout.log contained:

      DEBUG 329772 Fri Oct 15 14:06:30 2021: Content-Type: text/plain;
      DEBUG 329772 Fri Oct 15 14:06:30 2021: Hello World
      DEBUG 330420 Fri Oct 15 14:06:42 2021: Content-Type: text/plain;
      DEBUG 330420 Fri Oct 15 14:06:42 2021:
      DEBUG 330420 Fri Oct 15 14:06:42 2021: Hello World
      ... which showed that there weren't enough newlines the first time -- though you have to know enough about the rules for headers vs content to be able to derive that from the output.
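      For reference, the corrected output section from the second run - the extra \n is what ends the header block:

      print "Content-Type: text/plain;\n\n";    # headers, then the mandatory blank line
      print "Hello World\n";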

      With that technique, hopefully you can find the difference between a working run and a failing run. (And, like I showed in the example, you should be able to test on your development server, and force a 500 by intentionally creating a bad header, to make sure that the logging is working right and that you've got the printing of the memory block back to the original STDOUT correctly implemented, before trying it live on the server that has the problem.)
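      As a footnote to option 1 above, a minimal sketch of the IO::Tee approach - it assumes IO::Tee is installed, and the logfile path is a placeholder:

      use IO::Tee;

      open my $log, '>>', '/home/xxx/yyy/stdout_copy.log'
          or die "Cannot open logfile: $!";
      my $tee = IO::Tee->new( \*STDOUT, $log );    # whatever is printed to $tee goes to both handles

      print {$tee} "Content-Type: text/plain\n\n";
      print {$tee} "Hello World\n";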

        Are you sure the whole script isn't running?

        I came to the conclusion that the whole script isn't running because the logfile I am writing to with carpout has a few warnings written to it on a successful run but is blank on an error run. Surely the warnings would still be there on an error run if the script was actually running?

        I am also getting some strange behaviour that I suspect is due to the file output buffer.

        use FindBin qw($RealBin);

        my $safepath;
        BEGIN {
            use CGI::Carp qw(carpout);
            if ($RealBin =~ m!^(/home/xxx/yyy/(test|uk)/www)!) {
                $safepath = "$1/../lib";
            }
            else {
                die "Illegal use of software - visit www.way-finder.uk to use this site";
            }
            open LOG, '>>', '/home/xxx/yyy/test_error_log.log'
                or die "Unable to open error log";
            print LOG localtime . " - $ENV{'HTTP_HOST'}\n\n";
            carpout(\*LOG);
        }
        use lib "$safepath";

        During an erroring run of the script, the logfile gets created but nothing is written to it, despite the print in the BEGIN block. During a successful execution, the timestamp from the BEGIN block gets written to the logfile after the warnings, even though it shows an earlier timestamp.
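        If buffering is what is reordering (or swallowing) those lines, unbuffering the log handle straight after opening it should make them appear in print order. A sketch using IO::Handle's autoflush, mirroring the open above:

        use IO::Handle;

        open LOG, '>>', '/home/xxx/yyy/test_error_log.log'
            or die "Unable to open error log";
        LOG->autoflush(1);                       # flush LOG after every print
        # select((select(LOG), $| = 1)[0]);      # ...or the classic select/$| idiom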

        Thank you for your suggestions. They give me some things to try. I had run out of ideas!

        before trying it live on the server that has the problem

        They are one and the same server - just different domains in the same shared hosting. But yes, I will test it well first.
        This is why I don't think it is a resource issue as there are other domains on the same hosting that have a lot more traffic and carry out more IO than this site which, apart from loading Perl modules, authenticating with Facebook and connecting to the database, does no IO at all. But I am open to any suggestions.

        you can redirect your script's STDOUT to an in-memory file in a BEGIN block, then at the END, print the contents of that in-memory variable to a logfile and to the original STDOUT handle

        I have tried this approach. It is working in the test environment and has been applied to one of the production scripts that I use quite a bit and from which I regularly get 500 errors. And of course, I keep testing and it isn't giving errors...but I am sure it will before long!

        Here is what the top of the script looks like...

        There's some magic in there that I don't fully understand...can I clarify?

        The braces around the first BEGIN and END block are to limit the scope of use autodie; - is that right and is that 'all' they are doing?
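        For what it's worth, a tiny sketch of that lexical scoping, with a hypothetical filename:

        {
            use autodie;                           # open/close failures die automatically in here
            open my $fh, '<', '/no/such/file';     # dies with a descriptive message
        }
        # outside the braces autodie no longer applies:
        open my $fh, '<', '/no/such/file' or warn "manual checking again: $!";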

        I can see what print {$fh} ($memory =~ s/^/$prefix/gmr); is doing but could not write it from scratch! The braces are needed because the thing that is printed is being calculated on the fly instead of being a constant or interpolated from a variable - is that right?
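        A sketch of that block form for comparison - the braces actually mark the filehandle expression rather than the printed list, so any expression that yields a handle can go inside them (logfile name is hypothetical):

        my $debug = 1;
        open my $fh, '>>', 'example.log' or die $!;
        print {$fh} "goes to the log\n";                  # simple scalar handle in braces
        print { $debug ? $fh : \*STDERR } "routed\n";     # arbitrary expression picks the handle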

        Thanks for your help thus far...I have a rapidly growing log file capturing all the stuff output to people's browsers. Hopefully somewhere in that lot I will find the clue that unlocks my understanding of the error.

        I was beginning to think that the test code redirecting STDOUT had somehow cured the problem. I have been using the site for longer than usual without an error from the most used script - the one with the test code. I have an occasional error from other scripts.

        But - I have just had a 500 error.
        Looking at the logfile, the HTML belongs to the previously loaded page, not the one it is trying to load, and the timestamp shows it was the previous page.

        So it looks like the 500 error is happening without the script starting up.

        Again, I have checked the errors and there is nothing other than a few warnings. I have checked the RAM usage and it peaked at 664 MB out of the 1 GB allowance over the last 24 hours.
