Re^2: Skript help needed - RegEx & Hashes

Glad to help, that's what we're here for :-)

Like I said, don't worry too much about which formatting style you use. It's much more important that you apply whatever style you choose consistently, because with inconsistent indentation it's much easier to make mistakes, no matter the style. Using one of the more common styles might make your code a little easier for others to read, but inconsistent indentation is much more problematic.

$h->{"3p-tRFs"} . How was it possible to replace the variable call from my version with " -> " ?

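The Arrow Operator is both the method call operator and the dereferencing operator. In $h->{"3p-tRFs"}, it dereferences the hash reference stored in $h to get at one of the hash's values. For example, I can say: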

my %hash = ( hello => "foo", world => "bar");
my $hashref = \%hash;       # store a reference to the %hash
print $hashref->{hello};    # prints "foo"
$hashref->{world} = "quz";  # change "bar" to "quz" in orig. hash

References are explained quite nicely in perlreftut. If you've heard of the concept of pointers, references are kind of like "safer" pointers, and therefore less scary ;-) Two advantages of references are that (a) instead of copying data structures when they are passed as arguments to functions*, you can just pass a reference, which saves memory and allows the function to modify the original data if desired, and (b) you can build complex data structures out of them. For example, an array can contain references to hashes, which gives you an AoH (array of hashes), hash values can be references to arrays (HoA, hash of arrays), and so on. You can see plenty of examples of the latter in perldsc, and references are explained in detail in perlref.
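Here's a minimal sketch of both points. The sub name add_default and the sample data are made up purely for illustration:

```perl
use strict;
use warnings;

# (a) Pass a reference so the sub works on the caller's hash, not a copy.
# "add_default" is a hypothetical name used only for this example.
sub add_default {
    my ($href) = @_;
    $href->{color} = "red" unless exists $href->{color};
    return;
}

my %settings = ( size => 10 );
add_default( \%settings );
print "$settings{color}\n";          # prints "red"

# (b) Array of hashes (AoH): each element is a reference to a hash,
# and each "langs" value is in turn a reference to an array.
my @people = (
    { name => "Alice", langs => [ "Perl", "C" ] },
    { name => "Bob",   langs => [ "Python" ] },
);
print $people[1]->{name}, "\n";      # prints "Bob"
print $people[0]{langs}[1], "\n";    # prints "C"
```

Note that the arrow is optional between adjacent subscripts, which is why $people[0]{langs}[1] works without writing every ->.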

An "anonymous" hash or array is called that because it doesn't have a name. In "my $hashref = \%hash;", the hash being referenced by $hashref has a name, %hash. "my $hashref = {};" does basically the same thing, except that the hash referenced by $hashref is newly created and does not have a name (quite useful when building nested data structures).
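A small sketch of the difference, with made-up variable names and data; the anonymous array constructor [ ] works the same way as { } does for hashes:

```perl
use strict;
use warnings;

my %named        = ( hello => "foo" );
my $ref_to_named = \%named;              # reference to a hash that has a name
my $anon         = { hello => "foo" };   # reference to a new, nameless hash

# Both references are used in exactly the same way.
print $ref_to_named->{hello}, " ", $anon->{hello}, "\n";   # prints "foo foo"

# Hash of arrays (HoA) built from anonymous arrays.
my %hoa = ( fruits => [ "apple", "pear" ], veg => [] );
push @{ $hoa{veg} }, "leek";             # dereference to get at the array
print scalar( @{ $hoa{fruits} } ), "\n"; # prints "2"
print $hoa{veg}[0], "\n";                # prints "leek"
```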

... Data:Dumper. Isn't it contra-productive to use it when you are working with large data-sets and would end up printing a lot of content into the terminal?

It's just intended for debugging output - I figured that's what you wanted because you were using prints in your original code. It's no problem to comment them out, or probably even better to do something like:

my $DEBUG = 0;  # at the top of the program
...
$DEBUG and print Dumper(...);
# or
print Dumper(...) if $DEBUG;

I personally prefer the former because it's visually a bit easier to skip those lines when you're skimming the code. Or, if performance is a concern, the following will be optimized away at compile time as long as DEBUG is false (the disadvantage being it's a bit trickier to change a constant via a command-line option):

use constant DEBUG => 0;
...
DEBUG and print Dumper(...);
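One possible workaround for that disadvantage is to take the value from the environment instead of a command-line option, since %ENV is already populated when "use constant" runs at compile time. This is only a sketch: MYSCRIPT_DEBUG is a made-up variable name, and the data is a placeholder.

```perl
use strict;
use warnings;
use Data::Dumper;

# Assumption: MYSCRIPT_DEBUG is a hypothetical environment variable name.
# "use constant" is evaluated at compile time, so the value must be
# available that early; %ENV is, a normally parsed @ARGV option is not.
use constant DEBUG => $ENV{MYSCRIPT_DEBUG} ? 1 : 0;

my %data = ( answer => 42 );             # placeholder data
DEBUG and print Dumper( \%data );        # dropped at compile time if DEBUG is 0
print "done\n";
```

With a Unix-like shell this could be run as "MYSCRIPT_DEBUG=1 perl script.pl" to switch the debug output on.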

Minor edits for clarity.

* Update 2: This statement applies when using the common ways to access arguments: sub abc { my ($foo,$bar,...) = @_ } and sub abc { my $foo=shift; my $bar=shift; ... }. Don't worry about accessing the elements of @_ directly just yet ($_[0], $_[1], ...), that's a topic for another day.
