Actually I ran into this problem with my Tree::RB::XS module when I wanted to case-fold the keys. The 'uc' operator has no clean equivalent in the C API. There are per-character API calls like 'toUPPER_utf8', but I didn't dig deep enough to find out whether there's a robust way to call them in a loop across all the different versions of perl. The implementation of the uc operator (grep for "pp_uc" in pp.c) has a bunch of ifdef conditionals, which have probably changed a lot over the years.

Since I want to support back to 5.8, I decided to just call out to the perl function with call_pv("CORE::fc", G_SCALAR). But, as the nearby comments mention, before perl 5.16 that wasn't a function, so I needed to wrap the op in a function like sub _fc_impl { lc shift } and then call that.
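
For anyone who wants to try the same trick, here is a rough sketch of what the Perl side of such a wrapper could look like. The package name My::FoldCase is made up for illustration, and the version check is my assumption about how one might choose between fc and the lc fallback; it is not copied from Tree::RB::XS.

```perl
use strict;
use warnings;

# Hypothetical package name, for illustration only.
package My::FoldCase;

BEGIN {
    if ($] >= 5.016) {
        # String eval so that older perls never try to compile CORE::fc.
        eval 'sub _fc_impl { CORE::fc($_[0]) } 1' or die $@;
    }
    else {
        # No fc before 5.16, so approximate case folding with lc.
        eval 'sub _fc_impl { lc($_[0]) } 1' or die $@;
    }
}

# The XS code would then call the wrapper by name, e.g.
#   call_pv("My::FoldCase::_fc_impl", G_SCALAR);
# with the usual perlcall stack setup around it.

package main;

print My::FoldCase::_fc_impl("Hello World"), "\n";
```

Defining the wrapper on every version means the XS side only ever needs to know one function name, at the cost of one extra sub call on newer perls.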

Since calling perl functions carries a decent bit of overhead, if you need this to run in a hot code path you might still be better off with your external Unicode library. Or, if you want to avoid that dependency and stick to recent versions of perl, you could just copy/paste most of the pp_uc implementation into your own function and call that (but be careful with copyrights there).

And... um... if you get a reasonably robust version made with the perl API, I'd love to improve the performance of Tree::RB::XS :-)


In reply to Re^5: Memory Leak with XS but not pure C by NERDVANA
in thread Memory Leak with XS but not pure C by FrankFooty
