These two pieces of code do the same thing. Can you see what they do before reading the rest of this?
% perl -E'say for&{sub{"\U\x{fb01}\x{fb03}"=~/.{0,2}.{0,3}.{0,3}.{0,4}+(?{$_[++$#_]=rand})(*FAIL)/||pop;@_}}'
% perl -E'say for(*100=sub{$_[0]?(rand,(*{$_[0]-1}=*{$_[0]})->($_[0]-1)):()})->(100)'
If you want, run them so you can see what they do. Don’t worry: they’re perfectly harmless. Try perltidy on them: go ahead, I dare ya. :) And for the first, I would also suggest adding -Mre=debug. That should make it more obvious. Heck, might as well run them under the debugger, just in case.

Good luck!


Spoilers Below

So, howja do?

Both are fine illustrations of the importance of careful formatting and whitespace — and yes, sometimes even comments — to make the intentions clear.

First program, elaborated

Regular expressions are especially amenable to this bea(u)tification through the /x modifier. Expanding the first of those two programs above, we have this one:
use 5.010;

say for &{
    sub {
        "\U\x{fb01}\x{fb03}" =~ m((?mix-poop)

#include <stdlib.h>
#include <unistd.h>
#include <regex.h>

#include "perl.h"
#include "utf8.h"

#ifndef BROKEN_UNICODE_CHARCLASS_MAPPINGS

            .{0,2}
            .{0,3}
            .{0,3}
            .{0,4}

#define rand() (random()<<UTF_ACCUMULATION_SHIFT^random()&UTF_CONTINUATION_MASK)

          +(?{ $_ [++$#_] = rand()
                         || rand()
                         || UTF8_TWO_BYTE_LO (*PERL_UNICODE)
#else
                            (*PRUNE)
#define FAIL                (*ACCEPT)
             })
             (*FAIL)
#endif
             (*COMMIT)

        )poop

            ||

        pop @{ (*_{ARRAY}) }

    ;#;

        @{ (*SKIP:REGEX) }
        @{ (*_{ARRAY})   }

    }
}
Clearer? Well, maybe not.

Second program, elaborated

Now let’s try to embellish the second one, the one without the regex. In fact, I’ll use it in a program with comments telling you what it’s doing:
#!/usr/bin/perl -l

@vi            = 6->(6);                    # six random numbers

$dozen         = 12;
@dozen         = $dozen->($dozen);          # 12 random numbers
@baker's_dozen = &$dozen(++$dozen);         # 13 random numbers

print for 100->(100);                       # prints 100 random numbers!

@_ = 100;
print for &0;                               # ... and so does this!

BEGIN {
    # sure'd be a lot harder to understand w/o whitespace :)
    (
      *100 = sub {
          $_[0]
              ? ( rand, ( *{ $_[0] - 1 } = *{ $_[0] } )->( $_[0] - 1 ) )
              : ( )
      }
    )->( 100 );
}
There, now you know how it works, right?


Summary

Which one do you like better, and why?

Re: Unearthed Arcana
by BrowserUk (Patriarch) on May 13, 2011 at 08:32 UTC

    The only really arcane bit is that glob syntax allows you to bypass the sub naming rules.
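    For anyone who hasn't bumped into that before, here is a tiny sketch of what that bypass looks like (the sub body and the string are mine, purely illustrative):

        use strict;
        use warnings;
        use feature 'say';

        # "sub 42 { ... }" is a syntax error, but assigning a code ref
        # through a glob installs &main::42 just fine.
        *42 = sub { "the answer" };

        say &42;        # calls &main::42
        say 42->();     # so does this, as examples further down the thread show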

Re: Unearthed Arcana
by JavaFan (Canon) on May 13, 2011 at 08:13 UTC
    The second one creates a sub *100; calling it with argument N returns N random numbers and, along the way, aliases each of *(N-1) down to *0 to that same sub. You end up with subs *0 .. *100 that each return as many random numbers as their argument, with 0 <= N <= 100.
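    If it helps, here is roughly the same machinery spelled out without the glob games: a sketch using an ordinary hash of code refs instead of the numeric globs (the names are mine).

        use strict;
        use warnings;
        use feature 'say';

        my %sub;
        $sub{0} = sub { () };                              # zero args: empty list
        for my $n ( 1 .. 100 ) {
            my $prev = $n - 1;
            $sub{$n} = sub { ( rand, $sub{$prev}->() ) };  # one rand, then recurse
        }
        say for $sub{100}->();                             # 100 random numbers

    The one-liner does the aliasing lazily instead, installing *(N-1) only when *N is actually called.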
Re: Unearthed Arcana
by JavaFan (Canon) on May 13, 2011 at 09:34 UTC
    The first program generates a random number for each way it tries to match /.{0,2}.{0,3}.{0,3}.{0,4}+/ against "\U\x{fb01}\x{fb03}" (the uppercased fi and ffi ligatures, i.e. "FIFFI"), before giving up on the backtracking.

    It tries 101 times, populating @_ in each round. With the final pop, that results in 100 random numbers.
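    Here is a stripped-down sketch of that counting trick for anyone who wants to poke at it in isolation (the string and pattern are mine, not the ones from the node): the (?{...}) block runs each time the engine reaches that point in the pattern, and (*FAIL) forces it to backtrack through every possible path.

        use strict;
        use warnings;
        use feature 'say';

        my $attempts = 0;
        "abc" =~ / .{0,3} (?{ $attempts++ }) (*FAIL) /x;
        say $attempts;    # 10: 4+3+2+1 tries across the four starting positions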

    So, do you have any challenging puzzles? ;-)

      Completely off-topic, your post demonstrates the profound stupidity of Unicode ligatures. Ligatures are a typographic trick to make certain sequences of letters like "fi" and "ffi" look pretty when displayed in some media. Comically, the Unicode ligatures not only make life a royal pain for regular expression matching, but they're also ugly as sin (compare the actual "fi" to the "ﬁ"-ligature here). They're even less useful than pages of emoji.
        The reason Unicode has those particular ligatures is to preserve the originals when doing round‐trip conversions with legacy encodings that allowed such things to be specified with distinct, individual codes. In modern typesetting, such matters should be — and are — taken care of automatically.

        ¡Fontalicious!

        On the matter of being ugly as sin, here is my emoji example where I actually use fi ligatures three times, just because that was a posting where I was being extreme in the font games. If you look closely at that example, they do look marginally better there than the unkerned alternatives, although not so much that you would normally even notice them. Which is just as it should be.

        It certainly isn’t “ugly as sin”; it looks fine. Of course, if you’re using some brutish sans serif font as your default display and that font hasn’t made allowances for these legacy ligatures, so that you have to resort to some fallback font‐substitution glyph, then well that’s the price you pay for brutishness.   😜

        On the other hand, in this sample in Adobe Caslon Pro, I use no ligatures at all; all that is figured out for me by the font itself. For a somewhat subtler effect, here’s that sample again, this time in Adobe Garamond Pro. But for real sophistication, there’s just nothing like that same sample rendered in Zapfino.

        All three of those samples are fine examples of good kerning rules that don’t make the user say how and what and where things are tied together — that is, ligated. (Hey, did you know that ligar con alguien is Spanish slang for “to hook up”, as in “to get laid”?) It all magically falls out of the OpenType rules built into each respective font.

        NFKD($s) =~ /⋯/i

        Now, regarding the regex matter. The legacy ligatures are actually doing people a service here, because they make it obvious that you cannot just do blind searches on unnormalized Unicode text. Regexes make no allowances for things like default ignorables, diacritic‐insensitive comparisons, decompositions, or collation‐strength equivalences. And you need all those things.

        Now, it just so happens that Unicode does have case folds for the legacy ligatures, although these are the one‐to‐many full case folds that next to nobody but Perl even tries to handle. That means this works:

         % perl -E 'say "E\x{FB03}ciency" =~ /^effi/i || 0'
        1
        
        However, because we don’t allow incomplete matches stranding part of a code point, this doesn’t:
        
        % perl -E 'say "E\x{FB03}ciency"'
        Efficiency
         % perl -E 'say "E\x{FB03}ciency" =~ /^eff/i || 0'
        0
        
        That shows why you really want a compatibility decomposition for text searching:
        
         % perl -MUnicode::Normalize -E 'say NFKD("E\x{FB03}ciency") =~ /^effi/i || 0'
        1
        
         % perl -E 'say "3:15 \x{33D8}"'
        3:15 ㏘
         % perl -MUnicode::Normalize -E 'say NFKD("3:15 \x{33D8}") =~ /\bP\.?M\b/i || 0'
        1
        I’ll address collation‐strength equivalence, including but not limited to diacritic‐insensitive matching, some other day.
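        In the meantime, a crude approximation of just the diacritic-insensitive piece is to decompose and strip the combining marks. A sketch only (the string is made up, and this is nothing close to real collation-strength matching):

         % perl -MUnicode::Normalize -E '(my $s = NFD("Ren\x{E9}e")) =~ s/\p{Mn}//g; say $s =~ /^renee$/i || 0'
        1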
Re: Unearthed Arcana (intentions)
by tye (Sage) on May 13, 2011 at 19:40 UTC
    Both are fine illustrations of the importance of careful formatting and whitespace — and yes, sometimes even comments — to make the intentions clear.

    I don't see anything even close to "clear intentions". Sure, the first chunks of code were rather hard to discern the intentions of. But a lack of whitespace made the parsing of the code only slightly more difficult. The blown-up examples of code actually did very little to make any intentions clear to me. There seems to be a bunch of added "text" especially in one case that I find mostly contributes confusion.

    I think you've managed to instead demonstrate that whitespace, formatting, and comments are often not worth spit in the face of bizarre code. You've just reinforced my belief that writing clear code is much more important than any of whitespace, formatting, or comments... probably quite counter to your intentions for the above node.

    - tye        

      Both are fine illustrations of the importance of careful formatting and whitespace — and yes, sometimes even comments — to make the intentions clear.
      I don't see anything even close to "clear intentions".
      Ya think? :) Watch their hands, not their lips.
      I think you've managed to instead demonstrate that whitespace, formatting, and comments are often not worth spit in the face of bizarre code. You've just reinforced my belief that writing clear code is much more important than any of whitespace, formatting, or comments... probably quite counter to your intentions for the above node.
      No, you were right the first time. It’s the old Rob Pike thing about how comments don’t do one bit to turn confusing code into clear code. In fact, they can even make it worse. Not a single comment was in any way explanatory. In the first program, the comments are of course there only to daze and confuse. In the second, the comment is there for ironic effect. As you discovered with my first supercited line I opened this missive with, I don’t always lace my words with smirking emojic guideposts: that doesn’t mean they don’t apply. If you can’t laugh without a laugh track, how funny is it, really?

      And the comments in the first program are not as far from reality as you might think. I’d just written a program in a state of mild pique that very well could have done something like that. See, I was torqued off at perl -P being robbed from us with nothing but a big fat gaping hole left in the documentation in its stead.

      You can take my cpp when you pry it out of my cold, dead fingers

      So I wrote a program that did this:
      #define exec(arg)
      BEGIN { exec("cpp $0 | $^X") }                  # nyah nyah nyah-NYAH nyah!!
      #undef exec

      #define CPP(FN, ARG)  printf(" %6s %s => %s\n", main::short("FN"), q(ARG), FN(ARG))

      #define QS(ARG)       CPP(main::qual_string, ARG)
      #define QG(ARG)       CPP(main::qual_glob,   ARG)

      #define NL            say ""
      Which worked just fine. Here’s that whole program: Go ahead, just try writing that one without cpp or any fancy source filters: ENOFUN!
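      In case those first three lines read as black magic, here is my reading of the trick, boiled down to a sketch (assuming a cpp on your PATH that tolerates being fed Perl; the full-line comments are written as ;# so cpp won't mistake them for directives):

      #define exec(arg)
      BEGIN { exec("cpp $0 | $^X") }
      #undef exec
      ;# First pass: perl treats the #define and #undef lines as comments, so the
      ;# BEGIN above really runs and pipes this file through cpp back into perl ($^X).
      ;# Second pass: cpp has expanded exec(...) away, leaving an empty BEGIN block,
      ;# so control just falls through to the preprocessed code below.
      print "now running the cpp-expanded source\n";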

      Simple things should be simple, dang nabbit!


      The Undiscovered Namespace

      I was also having fun calling numerically named functions, and in other versions of the code I had numerically named arrays with things like:
      @12 = 12->(12);
      This was all prompted by a mistake in chromatic’s Modern Perl. It erroneously claims that my @3; is an invalid Perl identifier. That’s of course not true.

      First of all, my is not an identifier; @3 is. And it is a perfectly valid Perl identifier, as evidenced by:

      
      % perl -Mstrict -E '@4 = (4) x 4;  say "@4"'
      4 4 4 4
      
      As you see, you can strict it till you choke, but there it remains, perfectly pleased with itself.

      What my @3 is, is an invalid declaration of a perfectly healthy Perl identifier. Other sorts of declarations with it work just fine. Here’s a lexically scoped alias:

      
      % perl -Mstrict -E 'our @4 = (4) x 4;  say "@4"'
      4 4 4 4
      
      And here’s a dynamically scoped value:
      
      % perl -Mstrict -E 'local @4 = (4) x 4;  say "@4"'
      4 4 4 4
      
      Whereas here’s a — um, something else:
      
      % perl -Mstrict -E 'local our @4 = (4) x 4;  say "@4"'
      4 4 4 4
      
      But don’t expect a package to protect you. @4 is an über‐global:
      
      % perl -Mstrict -E 'say @4 = __PACKAGE__; { package Innumerable; @4 = __PACKAGE__ }  say "@4"'
      main
      Innumerable
      Without even resorting to hyperbole, Perl has billions and billions of these exquisite über‐globals. You could write all your programs just using them, and no strictures will ever whine at you.
      
      % perl -Mstrict -E 'say %3 = (1..4); say $3{3}'
      1234
      4
      With functions, all you have to do is name them in a somewhat circuitous fashion:
      
      % perl -Mstrict -E '*4 = sub { say "\Ufor@_" }; &4'
      FOR
      % perl -Mstrict -E '*4 = sub { say "\Ufor@_" }; &4(get=>)'
      FORGET
      % perl -Mstrict -E '*4 = sub { say "\Ufor@_" }; 4->(ever::)'
      FOREVER
      You will notice that I even get to call the function using a symbolic dereference, despite no strict "refs" being in force — if you can call that “force”.

      If you’re wondering why this exists, it’s of course an artifact of the way the numbered variables, $1 &c &c, work. But it also leaves the door open so that we can someday make this work:

      
      "800-555-1212" =~ /(\d+-?)+/;
      say "numbers were: ", join " and ", @1;
      numbers were: 800- and 555- and 1212
      And yeah, this will be hard on the people who write programs using only numbered variables and subroutines, but tough noogies.

      It’s one thing to present a simplified version of reality, but you can only bend the truth so far before it breaks. Not only is @3 a perfectly legal Perl identifier, there are a whole lot more where that came from.

      To say otherwise is — well, let’s just say it’s too chary of the truth for my conscience.

        It’s one thing to present a simplified version of reality, but you can only bend the truth so far before it breaks.

        I go so far as to say that that line in my book is a deliberate fib. By the time readers know enough Perl 5 to know why what I wrote isn't true in that specific case, they should know enough to know why it isn't true—and, hopefully, why I fibbed without a footnote.

        See. True to type.

        Mastery of the useful obscure, is...well...useful. Only occasionally, but still useful.

        Mastery of obscurity for its own sake--how is sub 1{ ... } any more useful than sub one{ ... }?--is naught more than a puerile attempt at one-upmanship. No attempt to teach or inform, nor even to sportingly challenge.

        Simply to say: I know; you don't. Pure egotism.

        Ah. So your post had intentions nearly as unclear as your code. I guess that's... something.

        - tye        

Re: Unearthed Arcana
by LanX (Saint) on May 13, 2011 at 08:30 UTC
    you might wanna use <spoiler></spoiler> tags. :)

    Cheers Rolf

Re: Unearthed Arcana
by lancer (Scribe) on May 17, 2011 at 11:26 UTC
    Sometimes I really can't see the point of obfuscating code...

    Ok, regexes could be a valid exception. There's only one language for describing regular expressions, as far as I know, and it's quite a rigid and dense one. Maybe a more readable notation could be created for regexes too, but right now it doesn't exist. So I excuse regexes from being unreadable.
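    That said, Perl's /x modifier already goes a fair way toward a more readable notation: it lets you lay a pattern out with whitespace and comments. A small sketch (the pattern and the number are invented for illustration):

        use strict;
        use warnings;
        use feature 'say';

        my $phone = qr{
            ^ (\d{3})       # area code
            - (\d{3})       # exchange
            - (\d{4}) $     # line number
        }x;

        say "parts: $1 $2 $3" if "800-555-1212" =~ $phone;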

    But otherwise, I think the best writing style for code is when I can glance at a page of it and see what it does, and move on to the next page.