nicku has asked for the wisdom of the Perl Monks concerning the following question:

I have a problem where many people edit a complex Perl hash that is more than two megabytes in size. [I know, it's crazy.] Some of the human editors know Perl better than others. How can I quickly detect that some component has an odd number of hash elements?

I can do that on the main file with:

my $error = qx(perl -we "require qq{$file}" 2>&1);
warn "$error\n" and return if $error;

But this main file also requires many sub files, which use global variables defined in the main file. This trick does not work with these sub files since there are many undefined global variables in these otherwise (normally) syntactically correct files.

Naturally, I am also doing something along the lines of

my $ok = 0 == system '/usr/bin/perl', '-Mstrict', '-c', $file;

but that does not catch hashref components with odd numbers of elements.

Can anyone suggest a way of checking for an odd number of elements in hashrefs and other such syntactical no-nos?

On a related note, can anyone suggest a way of determining the location of the offending problem(s)?


Replies are listed 'Best First'.
Re: Determining presence of odd number of hash elements, other syntax problems
by AnomalousMonk (Archbishop) on Jul 20, 2011 at 07:12 UTC

    This seems like an organizational problem. Maybe:

    • Supply each person or group that is responsible for maintaining a 'sub' file or files a custom Perl script for checking the file(s) for which they are responsible. The script begins with
          use strict;
          use warnings FATAL => 'all';
          use vars qw($all $needed $global $variables);
      (some of the fake 'globals' may need to be initialized) and continues with a
          require 'sub_file_foo.whatever';
      for each sub-file in question. It may be necessary for some sort of fake hash to be built from all the sub-hash-thingies (I don't have a good idea of the organization of all the code in this crazy system) for final verification. Finally, the script prints
          print "sub-file(s) look ok.\n";
      and exits.
    • Establish a 'release date' or 'release number' for each production build. Require each sub-file to have the line
          # checked by Name of dept. XXX on date DDD
      (or "for release NNN", or whatever) somewhere in the file. This means that each sub-file must be modified for each build, but nothing in life is free.
    • Supply the person responsible for receiving all these nuttsy-cuckoo sub-files a script that will scan each sub-file received for a given build for the presence of the checked-by line required in the previous step and check that the 'checker' for each department is authorized and the release date/number/whatever is correct for the build.
    • Supply the person responsible for receiving all these sub-files a baseball (or, in British-influenced areas, cricket) bat. Place this bat in a prominent position in this person's cubicle or office under a sign that says "For Use On Anyone Who &#@%$ Up A Sub-File For Which They Are Responsible" in big red letters.
    • Finally, do all the other checks on the sub-files that you can think of.
    That oughta do it.

      (some of the fake 'globals' may need to be initialized)

      Aye, there's the rub; yes, there is a custom program that checks the files; that's what I'm going to improve. The trouble is that there are many of these global variables, coming and going. I'd love to find a way to identify them all and initialise them appropriately, in some totally automated way.
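One possible way to automate the discovery, offered as a rough sketch rather than a tested solution: compile the sub-file under strict, harvest the names from perl's 'Global symbol "..." requires explicit package name' errors, pre-declare them with -Mvars=..., and repeat until nothing new turns up (perl gives up after a handful of errors per run, hence the loop). The helper name undeclared_globals and the cap of 50 rounds are made up for illustration, and the sketch assumes the sub-file compiles in package main.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3 qw(open3);

# Hypothetical helper: list the package variables a file uses without declaring.
sub undeclared_globals {
    my ($file) = @_;
    my %found;
    for my $round (1 .. 50) {    # arbitrary cap on the number of passes
        my @cmd = ($^X, '-Mstrict');
        # -Mvars=$a,@b behaves like "use vars qw($a @b)" in package main
        push @cmd, '-Mvars=' . join(',', sort keys %found) if %found;
        push @cmd, '-c', $file;

        # A false third argument sends the child's STDERR to the STDOUT handle.
        my $pid = open3(my $in, my $out, undef, @cmd);
        close $in;
        my @lines = <$out>;
        waitpid $pid, 0;

        my $new = 0;
        for my $line (@lines) {
            next unless $line =~ /^Global symbol "([\$\@%]\w+)" requires explicit package name/;
            $new++ unless $found{$1}++;
        }
        last unless $new;        # nothing new: either clean or a different error
    }
    my @names = sort keys %found;
    return @names;
}

if (@ARGV) {
    print "$_\n" for undeclared_globals($ARGV[0]);
}
```

The resulting list could be pasted into the use vars qw(...) line of the checking script. It only says which names exist, not what values they need, so any globals that must be initialised still have to be handled by hand.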

Re: Determining presence of odd number of hash elements, other syntax problems
by cdarke (Prior) on Jul 20, 2011 at 07:29 UTC
    Bad idea letting anyone edit code (one wonders at the source code change control system you have). Don't store the hash in code, store it in a file format such as YAML. YAML is fairly easy to read and edit, but it might be worth considering writing an edit tool (in which case Storable would be an option).

    The code to load YAML into a Perl hash is very easy, and now you can remove everyone's write access (except your own) to your program code.
      Bad idea letting anyone edit code

      I confessed to it being crazy! There used to be six of us, all competent, never a problem. Now there are many more, some who know little Perl.

      (one wonders at the source code change control system you have)

      RCS for these files, CVS for all the code that reads them.

      Don't store the hash in code, store it in...

      Well, a database would be better. The problem is it's complex and large; the code that manipulates it will take a while to write.

Re: Determining presence of odd number of hash elements, other syntax problems
by Khen1950fx (Canon) on Jul 20, 2011 at 06:27 UTC
    Just do:
    perl -w /path/to/script.pl
    It'll warn about an odd number and give the location.
      Just do: perl -w /path/to/script.pl

      The problem is that, with so many uninitialised variables, the warnings Use of uninitialized value in string at... and Use of uninitialized value in concatenation (.) or string at... are quite overwhelming. The reported location is often many kilobytes away from the actual error.

      $ cat test-w
      #! /usr/bin/perl
      
      use strict;
      use warnings;
      
      my %hash = (
          fred => {
      	qw( odd number of elem ents ),
          },
      );
      $ perl -wc test-w
      test-w syntax OK
      $ perl -we "require qq{test-w}"
      Odd number of elements in anonymous hash at test-w line 6.
      

      Is there a way to turn those uninitialised warnings off on the command line?
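The warnings pragma is lexically scoped, so a "no warnings" on the command line would not reach into a required file. One workaround, sketched here under that assumption, is to install a $SIG{__WARN__} handler that drops the "Use of uninitialized value" messages and keeps everything else, including the odd-number warning and its location. The helper name load_quietly is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: load a file and return every warning it raises,
# except the "Use of uninitialized value" ones.
sub load_quietly {
    my ($file) = @_;
    return ("cannot read $file\n") unless -r $file;

    my @kept;
    local $^W = 1;    # like -w, for files that lack "use warnings"
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        push @kept, $msg unless $msg =~ /^Use of uninitialized value/;
    };

    do $file;                 # compile and run the file in this process
    push @kept, $@ if $@;     # compile errors and die() messages
    return @kept;
}

if (@ARGV) {
    print for load_quietly($ARGV[0]);
}
```

Pass a path with a directory part, such as ./test-w, because recent perls no longer search the current directory for do FILE. Each kept message still ends in "at FILE line N.", which helps with locating the problem.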

Re: Determining presence of odd number of hash elements, other syntax problems
by bart (Canon) on Jul 20, 2011 at 11:32 UTC
    Just because the number of elements is even doesn't mean there is no problem.

    If this is a nested hash, I'd also check for use of refs (or objects) as the hash key.
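A check like that could look roughly as follows. This is only a sketch: it walks the loaded structure and flags any key that looks like a stringified reference, for example HASH(0x1234abcd) or My::Class=HASH(0x1234abcd). The name find_ref_keys and the path format in the report are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A reference used as a hash key is stringified, e.g. "HASH(0x55d0c8)"
# or, for an object, "My::Class=HASH(0x55d0c8)".
my $REF_AS_STRING = qr/^(?:[\w:]+=)?(?:SCALAR|ARRAY|HASH|CODE|REF|GLOB)\(0x[0-9a-f]+\)$/;

# Hypothetical helper: return the paths of all suspicious keys.
sub find_ref_keys {
    my ($data, $path) = @_;
    $path = '' unless defined $path;
    my @hits;
    if (ref $data eq 'HASH') {
        for my $key (sort keys %$data) {
            my $where = $path . '{' . $key . '}';
            push @hits, $where if $key =~ $REF_AS_STRING;
            push @hits, find_ref_keys($data->{$key}, $where);
        }
    }
    elsif (ref $data eq 'ARRAY') {
        for my $i (0 .. $#$data) {
            push @hits, find_ref_keys($data->[$i], $path . '[' . $i . ']');
        }
    }
    return @hits;
}
```

Called as find_ref_keys(\%hash), it returns one path per offending key. A legitimate key that merely looks like HASH(0x...) would be a false positive.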

    On a related note, can anyone suggest a way of determining the location of the offending problem(s)?
    Well, the best I can think of is to use diff to compare the source before and after. If it worked before, the error must be in what changed.

    Also: you may want to apply stricter rules to the format of the source code than perl itself enforces. For example, you could demand that new keys for the hash are always on a new line.
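A very crude, line-based way to enforce the one-key-per-line rule is to flag every source line that contains more than one fat comma. This is a heuristic sketch, not a parser: it knows nothing about strings, here-docs or qw() lists, and the name crowded_lines is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the numbers of the lines that hold more
# than one "=>", i.e. more than one key on the same line.
sub crowded_lines {
    my ($source) = @_;
    my @bad;
    my $n = 0;
    for my $line (split /\n/, $source) {
        $n++;
        (my $code = $line) =~ s/#.*//;       # crude: also cuts at a '#' inside a string
        my $arrows = () = $code =~ /=>/g;    # count the matches
        push @bad, $n if $arrows > 1;
    }
    return @bad;
}
```

Under this strict reading, a line such as d => { e => 4 } is flagged too, because the nested key e does not start its own line.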

    Perhaps you can build a validator for rules like these, using PPI? Don't spend too much time on it, or it might be faster to convert this to a database. Which is what I'd really recommend. (Maintain the contents with a generic CRUD tool, maybe build one with a CRUD builder toolkit; and access the DB contents with DBIx::Simple/SQL::Abstract, which is my preferred way to exchange data between Perl and an SQL DB.)

Re: Determining presence of odd number of hash elements, other syntax problems
by Anonymous Monk on Jul 20, 2011 at 05:15 UTC