in reply to hash and arrays
With apologies if a brief tutorial was not what you wanted...
When to use Hash and Arrays in the program ?
Arrays and Hashes are similar in that they are containers for multiple things (scalars), and you access those things using a form of key. In the case of an Array the key (usually called an index) is an integer -- so $foo[25] is item 25 in the array @foo. In the case of a Hash the key is a string -- so $bar{'homburg'} is the homburg item in the hash %bar.
Arrays have an implied ordering: item 0 comes before item 1 and so on. The contents of an array may be treated as a list.
Hashes have no ordering whatsoever. So when keys(%bar) gives you a list of all the keys in the hash, they are in no predictable order -- in particular, not in the order things were put into the hash (except by pure chance). This is a common trap that people fall into.
Arrays are used where your key values are simple integers (in a reasonable range), or as containers for lists. For example, the array @days = ('Mon', 'Tue', 'Wed', 'Thurs', 'Fri', 'Sat', 'Sun') can be accessed by index: $days[4] gives 'Fri', day 4 of the week, where day 0 is 'Mon'. Or treated as a list: foreach my $d (@days) sets $d to each day name in turn, from 'Mon' to 'Sun'.
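Both uses of @days can be sketched in a few lines. The variable names $fifth and @seen are mine, purely for illustration:

```perl
use strict; use warnings ;

my @days = ('Mon', 'Tue', 'Wed', 'Thurs', 'Fri', 'Sat', 'Sun') ;

# Access by index: indexes start at 0, so index 4 is the fifth item.
my $fifth = $days[4] ;
print "day 4 is $fifth\n" ;               # day 4 is Fri

# Treat the array as a list: items arrive in index order.
my @seen ;
foreach my $d (@days) { push @seen, $d ; }
print join(' ', @seen), "\n" ;            # Mon Tue Wed Thurs Fri Sat Sun

# scalar(@days) is the number of items; $#days is the last index.
my $count = scalar(@days) ;
my $last  = $#days ;
print "$count items, last index $last\n" ;   # 7 items, last index 6
```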
Hashes are used where your keys are arbitrary -- noting that the keys you use are converted to strings. For example, the hash %day_num = ('Mon' => 0, 'Tue' => 1, 'Wed' => 2, 'Thu' => 3, 'Thurs' => 3, 'Fri' => 4, 'Sat' => 5, 'Sun' => 6) can be accessed by the name of a day to get the day number: $day_num{'Thurs'} gives 3. Note that $day_num{'sat'} will give undef, because there is no such entry -- hashes are very literal minded about the keys. The function exists will tell you whether a given key exists in the hash, so exists($day_num{'sat'}) would return false, in this case. Other essential functions for using hashes are keys, values and each. You will also find sort used quite a bit with keys, to impose order on chaos.
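Here is a short sketch of exists, keys, sort and each working on %day_num. The names $has_Sat, $has_sat, @by_num and $total are only there for the example:

```perl
use strict; use warnings ;

my %day_num = ('Mon' => 0, 'Tue' => 1, 'Wed' => 2, 'Thu' => 3, 'Thurs' => 3,
               'Fri' => 4, 'Sat' => 5, 'Sun' => 6) ;

# Keys are matched exactly, so 'sat' is not 'Sat'.
my $has_Sat = exists($day_num{'Sat'}) ? 'yes' : 'no' ;
my $has_sat = exists($day_num{'sat'}) ? 'yes' : 'no' ;
print "Sat: $has_Sat, sat: $has_sat\n" ;           # Sat: yes, sat: no

# keys() comes back in no useful order, so sort it: here by day number,
# then by name to settle the tie between 'Thu' and 'Thurs'.
my @by_num = sort { $day_num{$a} <=> $day_num{$b} or $a cmp $b } keys(%day_num) ;
print join(' ', @by_num), "\n" ;      # Mon Tue Wed Thu Thurs Fri Sat Sun

# each() hands back one key/value pair at a time, again in no useful order.
my $total = 0 ;
while (my ($name, $num) = each(%day_num)) { $total += $num ; }
print "sum of day numbers: $total\n" ;             # sum of day numbers: 24
```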
You can use a Hash to implement a sparse array. So, $sa{79}, $sa{200000}, $sa{123456} could be entries in a sparse array, noting that the absent entries would all appear as undef. Hash keys are strings, but if you use a number, as above, it will be converted to a string -- Perl is broadminded, and generally does not discriminate between string and numeric values (which has its own little quirks, but that's another story).
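A minimal sketch of the sparse array idea follows. The stored values 'a', 'b', 'c' and the names $count, $missing and @indexes are made up for the example:

```perl
use strict; use warnings ;

my %sa ;
$sa{79}     = 'a' ;
$sa{200000} = 'b' ;
$sa{123456} = 'c' ;

# An absent entry reads as undef, and reading it does not create it.
my $missing = defined($sa{80}) ? 'defined' : 'undef' ;
print "\$sa{80} is $missing\n" ;                    # $sa{80} is undef

# Only three entries exist, however large the "indexes" are.
my $count = scalar(keys(%sa)) ;
print "entries: $count\n" ;                         # entries: 3

# The keys are strings, so ask for a numeric sort to get index order.
my @indexes = sort { $a <=> $b } keys(%sa) ;
print join(', ', @indexes), "\n" ;                  # 79, 123456, 200000
```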
This is a big topic. I recommend a little reading !
In which scenario => is used?
As others have said, broadly speaking there is no difference between '=>' and ',' apart from the appearance. I used '=>' between each key and its value in the "literal hash" above. I could just as well have used ',', but the '=>' makes the key/value pairs more obvious.
However, that is not the whole story. Perl has the notion of a "bareword", whose genesis is lost in the mists of time. A bareword is simply a thing that looks like an identifier with no "sigil" (leading '$', '@', '%', ... etc) and no trailing '()'. Deciding what a bareword means is a bit of a problem for Perl. If it knows of a subroutine which is defined to take no arguments, Perl will (generally) treat a bareword as a call of that subroutine -- this is how "constants" (as defined by use constant) are implemented. Otherwise Perl either has to know what to do, or must guess. If you use strict (and, frankly, you need a good reason not to), Perl will throw a compile time error rather than guess at the meaning of a bareword.
There are cases where Perl knows what to do with barewords, and hashes are a prime example:
when accessing a hash entry you can write $bar{'homburg'} or $bar{homburg}. Between the '{}' Perl treats a bareword as a literal string. Anything else is treated as an expression, whose result is converted to string form, if required.
Note that in $spa{0001} the 0001 is not a bareword, it is a trivial expression, whose result is converted to the string '1'. So $spa{0001} is not equivalent to $spa{'0001'}, appearance notwithstanding.
this is where '=>' and ',' differ. The '=>' tells Perl to treat any bareword before it as a literal string. So the following are equivalent: ('Mon', 0), ('Mon' => 0) and (Mon => 0).
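The two points above can be checked with a small sketch. The hashes %p, %q, %r and the variable $same are names I have made up; %spa is the hash from the note above:

```perl
use strict; use warnings ;

# The three ways of writing the pair build identical hashes.
my %p = ('Mon', 0) ;
my %q = ('Mon' => 0) ;
my %r = (Mon => 0) ;
my $same = (join(',', %p) eq join(',', %q)
         && join(',', %q) eq join(',', %r)) ? 'same' : 'different' ;
print "$same\n" ;                                   # same

# Inside {} a number is an expression: 0001 evaluates to 1, hence key '1'.
my %spa ;
$spa{0001}   = 'numeric literal' ;
$spa{'0001'} = 'quoted string' ;
my @spa_keys = sort(keys(%spa)) ;
print join(', ', @spa_keys), "\n" ;                 # 0001, 1
```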
The relationship between "constants" and barewords is slightly tricky. Consider:
use strict; use warnings ;
use constant HOMBURG => 'HA' ;
my %bar = (homburg => 'homburg!', HOMBURG => 'HomBurg!', HOMBURG, 'Ha!') ;
print join(', ', %bar), "\n" ;
my $k = HOMBURG ;
print "\$k = HOMBURG -> \$k=$k\n" ;
print "\$bar{homburg}=$bar{ homburg }, \$bar{HOMBURG}=$bar{HOMBURG}, ",
      "\$bar{\$k}=$bar{$k}, \$bar{+HOMBURG}=$bar{+HOMBURG}\n" ;
whose output is:
homburg, homburg!, HA, Ha!, HOMBURG, HomBurg!
$k = HOMBURG -> $k=HA
$bar{homburg}=homburg!, $bar{HOMBURG}=HomBurg!, $bar{$k}=Ha!, $bar{+HOMBURG}=Ha!
which shows a number of things:
where Perl is expecting (but not requiring) a bareword, it does not treat the bareword as a "constant". So in HOMBURG => 'HomBurg!' and $bar{HOMBURG} the HOMBURG is not the "constant" whose value is 'HA'. This may or may not be a disappointment.
this special handling means that $bar{HOMBURG} and $bar{$k} are not equivalent, even though HOMBURG has been assigned to $k. (If there wasn't a faint whiff of magic, it wouldn't be Perl.)
if you want the value of the "constant" HOMBURG as a hash key, you need to persuade Perl that there's an expression. In this example I used '+HOMBURG', which is one convention. You could also write $bar{HOMBURG()}, to force Perl to treat it as the subroutine which the "constant" "is" (mostly, but that's another story). Though not shown, the same applies to +HOMBURG => 'Ha!'
If you're still awake, you may be asking yourself why +HOMBURG doesn't generate an error, given that the value of HOMBURG is manifestly not numeric. (The clever people who know the answer to this one can leave now.) So:
in numeric expressions Perl will generally accept a string that looks like a number, as a number. So if we start with the string '123 456' and split it my ($a, $b) = split(/ /, '123 456') it would be reasonable to suppose that $a, and $b were strings, and in a less enlightened language $a + $b would be an error (or might give '123456'). Perl happily returns the result 579, and you may never have considered that this might be surprising.
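of course, if the string was 'zlxq 456' then the addition cannot do anything useful: Perl has no sensible way of adding 'zlxq' and '456' together, so it treats 'zlxq' as 0 and gives 456, complaining (if you use warnings) that the argument isn't numeric. (Which may, or may not, be a surprise.)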
some apparently numeric operations, however, are defined to work on strings which do not look like numbers. In particular unary '+' and '-', so:
my $w = '0001' ; my $x = 'zlqq' ; my $y = '-zlxq' ; my $z = '+zlxq' ;
print " \$w=$w, \$x=$x, \$y=$y, \$z=$z\n" ;
print "+\$w=", +$w, ", +\$x=", +$x, ", +\$y=", +$y, ", +\$z=", +$z, "\n" ;
print "-\$w=", -$w, ", -\$x=", -$x, ", -\$y=", -$y, ", -\$z=", -$z, "\n" ;
gives:
 $w=0001, $x=zlqq, $y=-zlxq, $z=+zlxq
+$w=0001, +$x=zlqq, +$y=-zlxq, +$z=+zlxq
-$w=-1, -$x=-zlqq, -$y=+zlxq, -$z=-zlxq
because unary '+' is defined to have no effect whatsoever on its operand (so isn't in the slightest bit interested whether the operand looks like a number or like a bunch of bananas). Unary '-', on the other hand, will treat its operand as a number, if it can; otherwise it prefixes the string with a '-' character; unless the string starts with '+' or '-' ... (yes, this is all defined behaviour).
unary '-' is defined to accept a bareword. However, unlike '=>', "constants" are evaluated, so:
use strict; use warnings ;
use constant HOMBURG => 'HA' ;
print "-HOMBURG=", -HOMBURG, ", -foo=", -foo, "\n" ;
gives:
Ambiguous use of -HOMBURG resolved as -&HOMBURG() at sigs.pl line 5.
-HOMBURG=-HA, -foo=-foo
(where the warning indicates the degree of wonderfulness involved here).
conversely, where we have a numeric value Perl will happily convert it to a string if required. So, ($w + 9) . $x yields the string '10zlqq' (given that $w = '0001' and $x = 'zlqq', as above). This process is known as "stringification", and applies not just to simple numeric values but to almost everything -- in a number of simply wonderful ways. (In less enlightened languages there is a sharp distinction between strings and numbers, and it is up to the programmer to explicitly convert between the two.)
so far, so good. For (most) numeric operations we expect that Perl (helpfully) will convert strings to numbers, if it can. And, for string operations we expect that Perl (helpfully) will convert numbers to strings. So there's no effective difference between numbers and strings that can be converted to numbers...
...up to a point. The bitwise operations behave differently where Perl thinks the operand is a string. For example, unary '~' will take a numeric value (forced to integer form) and return the 1's complement. If the value is a string, however, it will not attempt to convert it to a number, but will return a string with every bit of the original string inverted. Thus:
my $x = '0x1234' ; my $y = 0x1234 ;
printf "\$x=%s, ~\$x=%s\n", show($x), show(~$x) ;
printf "\$y=0x%X, ~\$y=0x%X\n", $y, ~$y ;
printf "hex(\$x)=0x%X, ~hex(\$x)=0x%X\n", hex($x), ~hex($x) ;
printf "\$y=%s,\n ~\$y=%s\n", show($y), show(~$y) ;
sub show {
  return '"'. join('', map { sprintf('\\x%02X', ord($_)) } split(//, $_[0])) .'"' ;
} ;
produces:
$x="\x30\x78\x31\x32\x33\x34", ~$x="\xCF\x87\xCE\xCD\xCC\xCB"
$y=0x1234, ~$y=0xFFFFFFFFFFFFEDCB
hex($x)=0x1234, ~hex($x)=0xFFFFFFFFFFFFEDCB
$y="\x34\x36\x36\x30",
 ~$y="\x31\x38\x34\x34\x36\x37\x34\x34\x30\x37\x33\x37\x30\x39\x35\x34\x36\x39\x35\x35"
which shows the difference between the string operand $x and the numeric operand $y. (The last two lines show that there is no sleight of hand here -- show() simply renders the string form of its argument in hex.)
The binary bitwise operations '&', '|' and '^' will convert a string operand to numeric form if the other is numeric. But if both arguments are strings, then it will perform the operation, byte-wise, between the strings.
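As a rough sketch of how '&' and '|' treat numbers, mixed operands and pairs of strings. This assumes the "bitwise" feature is not switched on (it is off unless you ask for it, for example with use feature 'bitwise' or use v5.28), and the variable names are mine:

```perl
use strict; use warnings ;

# Both operands numeric: ordinary integer AND.
my $num = 12 & 3 ;                      # 0b1100 & 0b0011
print "12 & 3 = ", $num, "\n" ;         # 12 & 3 = 0

# One operand numeric: the string '12' is converted to the number 12.
my $mixed = '12' & 3 ;
print "'12' & 3 = ", $mixed, "\n" ;     # '12' & 3 = 0

# Both operands strings: the characters are combined byte by byte,
# '1' (0x31) & '3' (0x33) gives 0x31, and '&' stops at the shorter string.
my $str = '12' & '3' ;
print "'12' & '3' = ", $str, "\n" ;     # '12' & '3' = 1

# OR-ing with spaces sets bit 0x20, which lower-cases ASCII letters.
my $lower = 'AB' | '  ' ;
print "'AB' | '  ' = ", $lower, "\n" ;  # 'AB' | '  ' = ab
```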
The vec function also treats strings as collections of bits.
I leave as homework what happens with utf8 strings, and whether '<<' and '>>' will operate on strings.
Of course, this raises the question: how can you tell when Perl thinks something is a string or a number? AFAIK this is not defined. However, the result of a numeric operation can be relied on to be a number, and the result of a string operation can be relied on to be a string. I/O operations work with strings, so watch out for:
use strict; use warnings ;
my $z = <DATA> ;
printf "\$z=%s, ~\$z=%s\n", show($z), show(~$z) ;
sub show {
  return '"'. join('', map { sprintf('\\x%02X', ord($_)) } split(//, $_[0])) .'"' ;
} ;
__DATA__
0x1234
which gives:
$z="\x30\x78\x31\x32\x33\x34\x0A", ~$z="\xCF\x87\xCE\xCD\xCC\xCB\xF5"
To convert a string to a number you can simply add zero (as in: $x += 0 ; or ($x + 0)), though that assumes base 10. oct works for octal and numbers prefixed 0x and 0b, but it's a small disappointment that there isn't a single function that will convert a string as if it were a general Perl numeric literal.
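The conversions just mentioned, as a small sketch (the variable names and sample strings are chosen for the example):

```perl
use strict; use warnings ;

# Adding zero converts a decimal string to a number.
my $dec = '0042' + 0 ;
print "'0042' + 0 = ", $dec, "\n" ;     # '0042' + 0 = 42

# oct() reads its argument as octal, unless it starts 0x (hex) or 0b (binary).
my $octal  = oct('0755') ;
my $hexnum = oct('0x1234') ;
my $binary = oct('0b1010') ;
print "$octal $hexnum $binary\n" ;      # 493 4660 10

# hex() reads hexadecimal, with or without the 0x prefix.
my $h = hex('1234') ;
print "hex('1234') = ", $h, "\n" ;      # hex('1234') = 4660
```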
(Most numeric operations in Perl do not distinguish between floating point and integer arguments. Bitwise operations are an exception there too, requiring their operand(s) to be converted to integer form -- at which point it may start to matter to you whether your system supports 32 or 64 bit integers.)
Finally, while on the topic of quirks in Perl's handling of strings... when evaluated as a boolean (true/false) value an empty string is false and a non-empty string is considered true, except for '0'. Note that this is not treating the string as a number -- to be false a non-empty string must be exactly one character long, and that character must be '0' ! So:
use strict; use warnings ;
foreach my $z ('0', '0000', '+0', '-0', ' 0', '0 ', '0E0') { try($z) ; } ;
sub try {
  my ($z) = @_ ;
  my $bool = $z ? 'True' : 'False' ;
  my $zero = $z == 0 ? '==' : '!=' ;
  printf "%7s is %5s and %2s 0\n", "'$z'", $bool, $zero ;
} ;
gives:
'0' is False and == 0
'0000' is True and == 0
'+0' is True and == 0
'-0' is True and == 0
' 0' is True and == 0
'0 ' is True and == 0
'0E0' is True and == 0
You will often see code which tests strings thus: if ($string). This is shorter than if ($string ne ''), and has the advantage of working if $string is undef. If $string can ever be '0', however, you will regret not having written if (defined($string) && ($string ne '')), clumsy though that may appear !
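If you find yourself writing the long form a lot, it can be wrapped up. This is only a sketch: has_text() is a name I have made up, not a built-in, and it simply compares the short test with the long one for a few values.

```perl
use strict; use warnings ;

# has_text() is a hypothetical helper: true for any defined, non-empty string.
sub has_text {
  my ($string) = @_ ;
  return (defined($string) && ($string ne '')) ? 1 : 0 ;
}

foreach my $s (undef, '', '0', 'abc') {
  my $label = defined($s) ? "'$s'" : 'undef' ;
  my $plain = $s ? 'true' : 'false' ;           # what if ($string) sees
  my $safe  = has_text($s) ? 'true' : 'false' ;
  printf "%-6s if (\$string) is %-5s, has_text() is %s\n", $label, $plain, $safe ;
}
```

The only row where the two tests disagree is '0', which is the case the long form exists for.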