Bod

by Bod (Parson)
on Nov 15, 2020 at 00:48 UTC

Long time amateur coder since growing up with a ZX Spectrum and BBC Micro...

Introduced to Perl in the early 1990s, which quickly became the language of choice. Built many websites and backend applications using Perl, including the sites for my property business:
Lets Delight - company site
Lets Stay - booking site
Also a few simple Tk-based desktop apps to speed things up.

Guilty of only learning what I need to get the job done - a recipe for propagating bad practice and difficult to maintain code...difficult for me so good luck to anyone else!

Now (Nov 2020) I've decided to improve my coding skills, although I'm not really sure what "improve" means in this context. It seems Perl and best practice have come a long way since I last checked in, and my programming approach is stuck in the last decade.

Onwards and upwards...

20th October 2021 - added to Saint in our Book 😀
2nd October 2022 - promoted to Priest
7th July 2023 - promoted to Vicar
15th December 2023 - promoted to Parson


Find me on LinkedIn, or on Twitter


CPAN Releases

Business::Stripe::WebCheckout
Business::Stripe::Subscription
Business::Stripe::Webhook

AI::Embedding
AI::Chat


Nodes I find helpful

Modules

Re: What do I use to release a module to CPAN for the first time?
Basic Testing Tutorial


Posts by Bod
UK tax system uses Perl in Meditations
1 direct reply
by Bod
on Apr 08, 2024 at 10:47

    I've just found out that HMRC (the UK's taxation department) uses Perl for at least some of its operations...

    The system is having trouble today and I received this error, revealing the language:

    Ref: /home/ewf/MODULES/Common/PaymentApiService.pm Error Code 401 at line 30

    Update: corrected typo in title

Server Time in Seekers of Perl Wisdom
2 direct replies
by Bod
on Apr 07, 2024 at 06:56

    We've started the process of moving our hosting to a cheap VPS - just non-essential (read "hobby") sites for now whilst we get to grips with it and I learn how to manage a server...

    All is going relatively well so far, but I've found an anomaly with time settings. Can anyone explain what is going on?

    The 'old' server observes DST whereas the 'new' server doesn't. If I run timedatectl on the 'new' server it tells me it is set to UTC. I don't have access to run the same command on the 'old' server.

    Here in the UK we are now on BST (GMT+1), so time-dependent things (like Google Calendar feeds) that have been moved over are all 1 hour out. If I get the time from the MariaDB database with SELECT NOW(), I get GMT, as expected.

    However, I have a bit of test code left in a page that only I use. It's a bit of JavaScript, document.write(document.lastModified);, which shows the time in BST. Doesn't that time get passed in the HTTP headers from the Perl-generated web page? Perl is, of course, also reporting GMT.

    The obvious solution seems to be to change the server time from UTC to GMT.
    Will that then observe UK DST?
    Are there any reasons not to change the server to GMT bearing in mind that the entire codebase was written on a server that observes DST?

    Update: the "obvious" option to set would be 'Europe/London', but that is not included in timedatectl list-timezones
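
    For illustration, a minimal sketch of producing UK wall-clock time explicitly rather than relying on the server's zone setting, assuming the DateTime module is available ('Europe/London' is the zone mentioned above):

    use strict;
    use warnings;
    use DateTime;

    # Leave the server clock on UTC and convert explicitly; 'Europe/London'
    # switches between GMT and BST automatically via the Olson database.
    my $now_utc = DateTime->now( time_zone => 'UTC' );
    my $now_uk  = $now_utc->clone->set_time_zone('Europe/London');

    print 'UTC: ', $now_utc->strftime('%Y-%m-%d %H:%M:%S %Z'), "\n";
    print 'UK:  ', $now_uk->strftime('%Y-%m-%d %H:%M:%S %Z'), "\n";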

PERL5LIB not in @INC in Seekers of Perl Wisdom
3 direct replies
by Bod
on Mar 23, 2024 at 10:56

    I have some common modules that are used in several places so I have created a directory for them at /usr/lib/perl_modules. I want to include this location in @INC for every user, including CRON.

    I've added export PERL5LIB=/usr/lib/perl_modules in both /etc/environment and /etc/profile.

    When I list the environment variables, PERL5LIB is there as expected. But when I try to use one of the modules I get an error:

    Can't locate my_module.pm in @INC (you may need to install the my_module module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.36.0 /usr/local/share/perl/5.36.0)
    There are other locations in @INC but /usr/lib/perl_modules is not one of them...

    I suspect the environment variable is set for root, which is the user I'm using to list the variables, but not for whatever process is running the script within Apache.

    How can I properly set PERL5LIB for all users and processes or is there a better way to get an extra entry in @INC for every script without having to use lib in every script?
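
    A small diagnostic sketch, assuming a CGI context, that dumps the PERL5LIB the running process actually received together with the resulting @INC - useful for confirming whether Apache or cron is starting the script with a stripped-down environment:

    #!/usr/bin/perl
    use strict;
    use warnings;

    print "Content-type: text/plain\n\n";
    print 'PERL5LIB = ', ( $ENV{PERL5LIB} // '(unset)' ), "\n";
    print "\@INC:\n";
    print "  $_\n" for @INC;

    If PERL5LIB turns out to be unset there, it typically needs to be passed to Apache explicitly (for example with SetEnv or PassEnv) and declared in the crontab itself, since neither necessarily reads /etc/profile.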

Holding site variables in Seekers of Perl Wisdom
7 direct replies
by Bod
on Mar 21, 2024 at 06:39

    We operate a number of websites, all of which run on the same server.

    Currently, I am the only developer. But that is likely to change over the next 18 months or so. I'm making some changes that present the opportunity to make some improvements to the internal design and security of the sites. I'm looking for some input on the "best" way to do this. Any input welcome but especially around global site variables.

    Currently, we have this directory structure (plus a few others omitted for simplicity):

    site/prod/bin/
    site/prod/lib/
    site/prod/template/
    site/prod/www/
    site/test/bin/
    site/test/lib/
    site/test/template/
    site/test/www/

    Every site has identical code in prod and test (except during development, of course), apart from one file, site/lib/vars.pm, which declares the site variables needed for that site and environment. Things like the DB credentials, the DB instance to connect to, Stripe keys, API keys, etc.

    use strict;
    use warnings;

    our $env_db_user = 'dbusername';
    our $env_db_pass = 'dbpassword';
    our $env_paypal  = 'PP username';
    # etc, etc, etc
    There is no logic code in this module, it just defines variables with our. This module is used by a utility module that is used by every script on the website.

    When we bring another developer onboard, I want to split the site variables into two - those they have access to (test database schema name, test Stripe keys, etc) and those they don't (live Stripe keys, database credentials, etc). I could relocate this file further up the directory structure where they don't have access, but I feel sure there is a better way to handle this as it must be a common problem in multi-developer environments.

    What I have works well and is not in need of imminent change. But I have the opportunity to make it more robust as I am making other changes.

    What advice can you give on this matter, kind and wise Monks?
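
    One hedged sketch of such a split: keep the shareable settings in code and load the restricted ones from a root-readable file that lives outside the developers' tree. The module name, file path, key names and JSON format below are illustrative assumptions, not the existing layout:

    package Site::Config;
    use strict;
    use warnings;
    use JSON::PP ();    # core module

    sub load {
        my ($env) = @_;                      # 'prod' or 'test'

        # Settings every developer may see.
        my %config = (
            db_name   => "site_$env",
            stripe_pk => 'pk_test_placeholder',
        );

        # Secrets kept outside site/, readable only by the deployment user.
        my $secret_file = "/etc/site-secrets/$env.json";
        if ( -r $secret_file ) {
            open my $fh, '<', $secret_file or die "Cannot read $secret_file: $!";
            local $/;
            my $secrets = JSON::PP::decode_json(<$fh>);
            %config = ( %config, %$secrets );    # db_pass, live Stripe keys, ...
        }
        return \%config;
    }

    1;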

MTA for Perl in Seekers of Perl Wisdom
5 direct replies
by Bod
on Mar 18, 2024 at 16:51

    I need to install a Mail Transfer Agent that will be used with MIME::Lite

    I understand that the modern choices are Postfix or Exim. However, MIME::Lite says it uses sendmail, which is pretty old, isn't it?

    I have never looked at MTAs before so any advice would be appreciated. This is being installed on a Raspberry Pi. Does the MTA only deal with outgoing mail or will it manage incoming mail as well? If not, what do I need for that?
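
    For what it's worth, a minimal sketch of how MIME::Lite can hand mail to an installed MTA - both Postfix and Exim provide a sendmail-compatible /usr/sbin/sendmail shim, and MIME::Lite can also speak SMTP to localhost. The addresses are placeholders:

    use strict;
    use warnings;
    use MIME::Lite;

    # Either rely on the sendmail-compatible binary installed by the MTA...
    MIME::Lite->send( 'sendmail', '/usr/sbin/sendmail -t -oi' );
    # ...or uncomment this to talk SMTP to the local Postfix/Exim instead:
    # MIME::Lite->send( 'smtp', 'localhost', Timeout => 30 );

    my $msg = MIME::Lite->new(
        From    => 'pi@example.com',
        To      => 'you@example.com',
        Subject => 'Test from the Pi',
        Data    => "Hello from MIME::Lite\n",
    );
    $msg->send;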

DBI and JSON fields in Seekers of Perl Wisdom
2 direct replies
by Bod
on Mar 12, 2024 at 08:28

    I'm creating user-defined custom fields in MariaDB using the JSON Data Type. I've never had cause to use this data type before, so I've been learning by running queries directly against a test database to get the syntax right. I thought I'd got this to the point where I can use it, and imagined that there could be some issues with getting the DBI placeholders right.

    But I seem to have hit a more tricky issue...

    I started with this DBI query:

    $row->{'inuse'} = $dbh->selectrow_array(
        "SELECT COUNT(*) FROM Person WHERE Account_idAccount = ? AND JSON_EXISTS(custom, ?)",
        undef, $account, "$." . $row->{'name'}
    );
    But this didn't work. I assumed it was a placeholder issue.

    However, I slowly removed the placeholders to test where things were going wrong, to the point where I arrived at this with no placeholders...

    $row->{'inuse'} = $dbh->selectrow_array(
        "SELECT COUNT(*) FROM Person WHERE Account_idAccount = 35 AND JSON_EXISTS(custom, '$.test3')"
    );
    die $dbh->errstr if $dbh->err;
    die $row->{'inuse'} if $row->{'inuse'};
    The code doesn't die.

    But, if I run the same query directly against the same database:

    SELECT COUNT(*) FROM Person WHERE Account_idAccount = 35 AND JSON_EXISTS(custom, '$.test3')
    I get the result 1

    Is there a problem using the MariaDB JSON Data Type with DBI or have I missed something else obvious?
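
    One thing worth checking: inside a double-quoted Perl string, $. is interpolated as the special input-line-number variable, so a path written as "$." . $row->{'name'} or "...'$.test3'" may not reach the server starting with '$.' at all. A hedged sketch with the path built in single quotes and bound through a placeholder (connection details are placeholders, assuming DBD::MariaDB; table and column names are from the post):

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect( 'DBI:MariaDB:database=testdb;host=localhost',
                            'user', 'password', { RaiseError => 1 } );

    my $account = 35;            # values from the example above
    my $name    = 'test3';
    my $path    = '$.' . $name;  # single quotes: $. is not interpolated

    my ($inuse) = $dbh->selectrow_array(
        'SELECT COUNT(*) FROM Person WHERE Account_idAccount = ? AND JSON_EXISTS(custom, ?)',
        undef, $account, $path,
    );
    print "in use: $inuse\n";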

Module to extract text from HTML in Seekers of Perl Wisdom
7 direct replies
by Bod
on Feb 27, 2024 at 06:10

    I've been searching unsuccessfully for a module to extract just the text from an HTML webpage...
    Any suggestions?

    Ideally, I want to feed in a URL and return the page's text as plain text - no formatting, tags, etc.

    Even most of the text would suffice.

    I'm currently using HTML::TreeBuilder and just extracting the p tags which is not quite good enough:

    my $http = HTTP::Tiny->new;
    my $resp = $http->get($url);

    my $tree = HTML::TreeBuilder->new;
    $tree->parse($resp->{'content'});

    my @paragraph = $tree->look_down('_tag', 'p');

    print "Content-type: text/plain\n\n";
    foreach my $line (@paragraph) {
        print $line->as_trimmed_text . "\n";
    }

    I thought I'd found a solution with HTML::Extract. But when the sample code in the documentation didn't compile, I knew I was heading down a dead end!

    Do you know of a module to extract just the text?
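
    One hedged possibility: HTML::FormatText (from the HTML-Format distribution) renders a whole parsed page as plain text, so it picks up headings and list items as well as p tags. The URL below is a placeholder:

    use strict;
    use warnings;
    use HTTP::Tiny;
    use HTML::FormatText;

    my $url  = 'https://example.com/';
    my $resp = HTTP::Tiny->new->get($url);
    die "Fetch failed: $resp->{status}\n" unless $resp->{success};

    my $text = HTML::FormatText->format_string(
        $resp->{content},
        leftmargin  => 0,
        rightmargin => 72,
    );

    print "Content-type: text/plain\n\n";
    print $text;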

Bot vs human User Agent strings in Seekers of Perl Wisdom
2 direct replies
by Bod
on Feb 09, 2024 at 13:42

    We are wanting to supplement Google Analytics for a few reasons. Not least because we want to have site traffic information held in our own database so we can interrogate it automagically. We've created a database table to hold this data.

    Within the common header method, we've added some code that sets a cookie with a max age of 2 hours or refreshes the cookie if it is already set. If the cookie isn't already there, we write a row to the database table with the entry time, entry page, etc. If the cookie exists we update the row with exit page, exit time and bump the page count.

    This approach is working and it's been running for a week.

    But, it is reading about 11 times higher for site traffic than Google Analytics. I'd expect some discrepancy but not that much. Looking at the visits, we are getting quite a few with the same or very close timestamps, so my best guess is that it's a client that isn't accepting the cookie - perhaps a web crawler. To check this out, I've added IP and User Agent to the database table and, sure enough, these have the user agent of a crawler/bot.

    To solve this, I've added a condition to the line that writes the new row to the database:

    $dbh->do("INSERT INTO Site_Visit SET firstVisit = NOW(), lastPage = ?, firstPage = ?, IP = ?, userAgent = ?, orsa = ?, orta = ?, Person_idPerson = ?",
        undef, $ENV{'REQUEST_URI'}, $ENV{'REQUEST_URI'}, $ENV{'REMOTE_ADDR'}, $ENV{'HTTP_USER_AGENT'}, $cookie{'orsa'}, $data{'orta'}, $user)
        unless $ENV{'HTTP_USER_AGENT'} =~ /bot/i or $ENV{'HTTP_USER_AGENT'} =~ /facebook/i or $ENV{'HTTP_USER_AGENT'} =~ /dataprovider/i;
    This seems to be working...but...the list of 'blocked' user agent strings could get quite large.

    Is there a more Perlish way to write this condition?

    I did think of putting them all in a database table and querying the user agent string against this table:

    SELECT ? IN ( SELECT userAgent FROM Blocked_Users )
    untested

    But, that would mean having the full and exact user agent strings instead of using a regexp.

    Note that I don't want to block crawlers, I just don't want them written to the site visit logs. This makes it quite difficult to Google because most articles are about blocking crawlers and bots from a website.
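
    A hedged sketch of a more Perlish shape for that condition: keep the patterns in one list (which could just as easily be loaded from a database table) and match the user agent against a single compiled alternation. The pattern list is only an example:

    use strict;
    use warnings;

    my @blocked_patterns = qw( bot facebook dataprovider crawler spider );
    my $blocked_re = do {
        my $alt = join '|', map { quotemeta } @blocked_patterns;
        qr/$alt/i;
    };

    my $ua = $ENV{'HTTP_USER_AGENT'} // '';
    unless ( $ua =~ $blocked_re ) {
        # ... the INSERT INTO Site_Visit from above goes here ...
    }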

Is require still required? in Seekers of Perl Wisdom
6 direct replies
by Bod
on Jan 31, 2024 at 17:48

    I've been looking at a question I asked 3 years ago in Refactoring webcode to use templates

    How things have changed since then. We've closed down the part of the business that I refactored all the code for, but I certainly learnt a lot in the process.

    One of the things I refactored and now do as standard is to have pretty much all common code in modules. Although require statements were common in my code until a few years back, I now never use the require keyword. This got me thinking...is require ever still required or is it obsolete in the modern world?
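
    For illustration, one place require still earns its keep is deferring a load until run time, which use (being compile time) cannot do:

    use strict;
    use warnings;

    sub dump_debug {
        my ($data) = @_;
        # Data::Dumper is only loaded if this sub is ever called.
        require Data::Dumper;
        return Data::Dumper::Dumper($data);
    }

    print dump_debug( { answer => 42 } );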

Persistent data in Seekers of Perl Wisdom
2 direct replies
by Bod
on Jan 31, 2024 at 17:37

    I'm writing an XML Sitemap generator based around WWW::Crawl

    I want to record the priority to set for each entry in the sitemap. My first thought was to use a CSV or similar text file, but it could become huge and cumbersome. So what are the alternatives?

    I could write this server-side where I have a MariaDB instance running, so storage is no problem. But I'm thinking I want to run it client-side, although I don't really know why. So my choices seem to be: hold the data in a Storable object; run MariaDB, MySQL or similar locally; or use DBD::SQLite from within Perl. No doubt there are other choices...

    Which would you do and why?

    What would you definitely avoid doing and why?
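
    A minimal sketch of the DBD::SQLite option: a single-file database keyed on URL, created on first run. The file name and URL are placeholders:

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect( 'dbi:SQLite:dbname=sitemap.db', '', '',
                            { RaiseError => 1, AutoCommit => 1 } );

    $dbh->do(q{
        CREATE TABLE IF NOT EXISTS priority (
            url      TEXT PRIMARY KEY,
            priority REAL NOT NULL DEFAULT 0.5
        )
    });

    # Record (or update) the priority for a crawled URL.
    $dbh->do( 'INSERT OR REPLACE INTO priority (url, priority) VALUES (?, ?)',
              undef, 'https://example.com/about', 0.8 );

    my ($p) = $dbh->selectrow_array(
        'SELECT priority FROM priority WHERE url = ?',
        undef, 'https://example.com/about',
    );
    print "priority: $p\n";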
