in reply to Re^3: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
in thread Perl 5 Optimizing Compiler, Part 4: LLVM Backend?

Much of this stems from the fact that, the way the perl sources are structured, C compilers cannot easily optimise across compilation-unit boundaries, because they mostly(*) do compile-time optimisations. However, there is a whole class of optimisations that can be done at either link time or runtime that would hugely benefit Perl code.

(*) The MS compiler has the ability to do some link-time optimisations, and it would surprise me greatly if gcc didn't have similar features. It would also surprise me if these have ever been enabled for the compilation of Perl. They would need to be specifically tested on so many platforms that it would be very hard to do.

But something like LLVM can do link-time and runtime optimisations, because it targets not specific processors but a virtual processor (a "VM"), which allows its optimiser to operate in that virtual environment. Only once the VM code has been optimised is it finally translated into processor-specific machine code. That means you only need to test each optimisation (against the VM) once; and, independently, the translation to each processor.

64-bit VC builds have used LTCG basically from day one: http://perl5.git.perl.org/perl.git/commit/d921a5fbd57e5a5e78de0c6f237dd9ef3d71323c?f=win32/Makefile. A couple of months ago I compiled a Perl with LTCG in 32-bit mode; unfortunately I don't remember the VC version, whether it was my 2003 or my 2008. The DLL got slightly fatter from inlining (I don't remember by how many KB), but the inlined functions still existed as separate function calls, the assembly looked the same everywhere, and I didn't find anything (looking around randomly by hand) that got a non-standard calling convention except what was already a static function. I wrote it off as useless. 2003 vs 2008 for 32-bit code might make all the difference, though. I decided it wasn't worth writing up a patch and submitting it to P5P to change the makefile.

I believe nothing will come of Will's LLVM proposal unless some or all of the pp_ opcode functions, along with runops, are rewritten in "not C", or perl opcodes are statically analysed and converted to native machine data types, with SVs gone entirely. All the "inter-procedure optimizations" mentioned in this thread are gone the moment you create a function pointer; that is simply a consequence of the rules of C and of C's ABI on that OS: http://msdn.microsoft.com/en-us/library/xbf3tbeh%28v=vs.80%29.aspx.

I went searching through perl's pre and post preprocessor headers. I found some interesting things which prove that automatic IPO, on Perl, in C with any compiler is simply impossible.
    /* Enable variables which are pointers to functions */
    typedef void (*peep_t)(pTHX_ OP* o);
    typedef regexp* (*regcomp_t) (pTHX_ char* exp, char* xend, PMOP* pm);
    typedef I32 (*regexec_t) (pTHX_ regexp* prog, char* stringarg, char* strend,
                              char* strbeg, I32 minend, SV* screamer, void* data, U32 flags);
    typedef char* (*re_intuit_start_t) (pTHX_ regexp *prog, SV *sv, char *strpos,
                                        char *strend, U32 flags, re_scream_pos_data *d);
    typedef SV* (*re_intuit_string_t) (pTHX_ regexp *prog);
    typedef void (*regfree_t) (pTHX_ struct regexp* r);
    typedef regexp* (*regdupe_t) (pTHX_ const regexp* r, CLONE_PARAMS *param);
    typedef I32 (*re_fold_t)(const char *, char const *, I32);
    typedef void (*DESTRUCTORFUNC_NOCONTEXT_t) (void*);
    typedef void (*DESTRUCTORFUNC_t) (pTHX_ void*);
    typedef void (*SVFUNC_t) (pTHX_ SV* const);
    typedef I32 (*SVCOMPARE_t) (pTHX_ SV* const, SV* const);
    typedef void (*XSINIT_t) (pTHX);
    typedef void (*ATEXIT_t) (pTHX_ void*);
    typedef void (*XSUBADDR_t) (pTHX_ CV *);
    typedef OP* (*Perl_ppaddr_t)(pTHX);
    typedef OP* (*Perl_check_t) (pTHX_ OP*);
    typedef void (*Perl_ophook_t)(pTHX_ OP*);
    typedef int (*Perl_keyword_plugin_t)(pTHX_ char*, STRLEN, OP**);
    typedef void (*Perl_cpeep_t)(pTHX_ OP *, OP *);
    typedef void (*globhook_t)(pTHX);

    /* dummy variables that hold pointers to both runops functions, thus forcing
     * them *both* to get linked in (useful for Peek.xs, debugging etc) */
    EXTCONST runops_proc_t PL_runops_std INIT(Perl_runops_standard);
    EXTCONST runops_proc_t PL_runops_dbg INIT(Perl_runops_debug);

    START_EXTERN_C
    #ifdef PERL_GLOBAL_STRUCT_INIT
    #  define PERL_PPADDR_INITED
    static const Perl_ppaddr_t Gppaddr[]
    #else
    #  ifndef PERL_GLOBAL_STRUCT
    #    define PERL_PPADDR_INITED
    EXT Perl_ppaddr_t PL_ppaddr[] /* or perlvars.h */
    #  endif
    #endif /* PERL_GLOBAL_STRUCT */
    #if (defined(DOINIT) && !defined(PERL_GLOBAL_STRUCT)) || defined(PERL_GLOBAL_STRUCT_INIT)
    #  define PERL_PPADDR_INITED
    = {
        Perl_pp_null,
        Perl_pp_stub,
        Perl_pp_scalar,     /* implemented by Perl_pp_null */
        Perl_pp_pushmark,
        Perl_pp_wantarray,
        Perl_pp_const,
Now in C++, in theory, calling conventions don't exist unless you explicitly force one: the compiler is free to choose how it wants to implement vtables etc. MS's Visual C does, IMHO, generate some pretty good "random" calling conventions for static C functions on 32-bit x86. For 64-bit x86, Visual C never deviates from the one and only calling convention. The question is: are there any compilers daring enough to create a whole DLL/SO which contains exactly one function call, in C?

Not any professional compiler. On some OSes (x64 Windows) the ABI is enforced through the OS parsing your assembly code (x64 MS SEH; technically not always true, since if you are careful the OS will never have a reason to parse your asm). And on some CPUs (SPARC) calling conventions are enforced in hardware.

Another danger: there is a fine line between inlining/loop unrolling and making your L1 and L2 caches useless. Blindly inlining away all function calls will produce a multi-MB object file per Perl script and won't solve anything.

Re^5: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by BrowserUk (Patriarch) on Aug 28, 2012 at 11:04 UTC
    I went searching through perl's pre and post preprocessor headers. I found some interesting things which prove that automatic IPO, on Perl, in C with any compiler is simply impossible.

    But LLVM isn't a C compiler. It can compile C (amongst many other languages), but it doesn't (have to) follow C conventions.

    LLVM is far more an assembler targeting a user-definable virtual processor. As an illustration of the sorts of things it can and does do: can you think of any other compiler technology that will generate 832-bit integers as part of its optimisation pass?

    You have to stop thinking of LLVM as a C compiler before you can even begin to appreciate what it is potentially capable of. It is weird, and to my knowledge unique.

    In a world where everything -- processors, memory, disk, networking et al. -- is being virtualised, why not virtualise the compiler? Have it target a (user-configurable) virtual processor, and produce not just platform independence, but processor architecture independence and source language independence.

    Can it really tackle a hoary ol' dynamic language and apply those principles to it successfully? The simple answer is: I do not know. But neither does anyone else!

    Stop nay-saying based upon your knowledge of what C compilers do, and follow the mantra: (let someone else) try it!

    I first installed LLVM here (going by the date on the subdirectory) on the 6th May 2010:

    C:\test>dir /t:c . | find "llvm"
    06/05/2010  20:35    <DIR>          llvm
    06/05/2010  20:55    <DIR>          llvm-2.7

    I've been playing with it and reading about it on and off ever since, and I still keep learning new things about it all the time. It is unlike anything I've come across before, and defies my attempts at description. Virtual compiler, virtual interpreter, virtual assembler. Take your pick; or all three.

    Give it a try (or at least a read) before you summarily dismiss it out of hand.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

Re^5: Perl 5 Optimizing Compiler, Part 4: LLVM Backend?
by BrowserUk (Patriarch) on Aug 28, 2012 at 11:44 UTC

    Sorry for the second reply, but I responded before seeing the stuff below the code block.

    For 64-bit x86, Visual C never deviated from the one and only calling convention.
    1. Firstly, that is a good thing. Much better than the previous world of cdecl, pascal, fastcall et al.
    2. Whilst the MSC compiler won't vary the calling convention, it provides the hooks to allow the programmer to do so.

      See the FRAME attribute of the PROC directive and the .ENDPROLOG directive.

      Scant information, but the possibility.

    The question is, are there any compilers daring enough to create a whole DLL/SO which contains exactly 1 function call in C?

    Just this morning I read that the LLVM JIT had, until recently, a 16MB limitation on its JIT'ed code size, which has now been lifted.

    Besides which, I don't believe that you need to optimise across function boundaries to get some significant gains (over C compilers) out of the Perl sources.

    You pointed out that many of perl's opcodes and functions are huge. Much of the problem is not just that they are huge, but also that they are not linear. The macros that generate them are so heavily nested, and so frequently introduce new scopes and unwieldy asserts, that C compilers pretty much give up trying to optimise them, because they run out of whatever resources they use when optimising. Too many levels of scope is a known inhibitor of optimisers. That's where inlining can help.
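    A toy illustration of the scope-multiplication effect (these macros are invented for the example; perl's real macros such as SvIV and SvGETMAGIC are far deeper):

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Each macro layer adds a block scope, a temporary, and an assert,
     * loosely imitating perl's macro style.  After preprocessing, the
     * single CHECKED_DOUBLE() statement in main() is already several
     * scopes deep -- and perl nests such layers far deeper than this. */
    #define CHECKED_COPY(dst, src)            \
        do {                                  \
            int tmp1_ = (src);                \
            assert(tmp1_ >= 0);               \
            (dst) = tmp1_;                    \
        } while (0)

    #define CHECKED_DOUBLE(dst, src)          \
        do {                                  \
            int tmp2_;                        \
            CHECKED_COPY(tmp2_, (src));       \
            (dst) = tmp2_ * 2;                \
        } while (0)

    int main(void)
    {
        int out;
        CHECKED_DOUBLE(out, 21);   /* expands to nested blocks + assert */
        assert(out == 42);
        printf("%d\n", out);
        return 0;
    }
    ```

    Run the preprocessor alone (cc -E) over something like this and you can see the nesting the optimiser actually has to chew through.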

    Will LLVM fare any better? Once again, we won't know for sure unless someone tries it.

    Another danger, there is a fine line between inlining/loop unrolling, and making your L1 and L2 Caches useless. Blindly inlining away all function calls will cause a multi MB object file per Perl script that won't solve anything.

    Once again I ask: are you sure?

    If the JIT can determine that this variable -- hash(ref), array(ref) or scalar -- is not tied, has no magic, and never changes its type -- IV or NV to PV or vice versa -- within a particular loop, then it can throw away huge chunks of conditional code. Similarly for utf/non-utf string manipulations; similarly for all the context stuff for non-threaded code on threaded builds.
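    A sketch of what that specialisation could look like, in C for clarity (the toy_sv struct and flag names are invented; perl's real SV layout and flags differ):

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Invented model of a perl-ish scalar: flags mark magic/ties. */
    enum { FLAG_MAGIC = 1, FLAG_TIED = 2 };

    typedef struct { int flags; long iv; } toy_sv;

    /* Generic opcode: checks it must repeat on every single call. */
    static long add_generic(const toy_sv *a, const toy_sv *b)
    {
        if ((a->flags | b->flags) & (FLAG_MAGIC | FLAG_TIED)) {
            /* slow path: would call out to magic/tie handlers */
            return 0;   /* elided for the sketch */
        }
        return a->iv + b->iv;
    }

    /* What a JIT could emit once it has proven, for one particular
     * loop, that neither operand is tied or magical and both stay
     * integers: the conditional code is simply gone. */
    static long add_specialised(const toy_sv *a, const toy_sv *b)
    {
        return a->iv + b->iv;
    }

    int main(void)
    {
        toy_sv x = { 0, 40 }, y = { 0, 2 };
        assert(add_generic(&x, &y) == 42);
        assert(add_specialised(&x, &y) == 42);
        printf("ok\n");
        return 0;
    }
    ```

    The interesting question is whether LLVM's runtime information is enough to prove those properties hold for the duration of the loop; the sketch only shows what the payoff would be if it can.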

    Note: I say "can", not will. The only way we collectively will know for sure if it will, is to try it.

