in reply to substr function

Search for pp_substr in pp.c.

Modifying substr at that level is not a good idea. You can use subs (I think) to override substr with your own custom function. That is still a bad idea, but at least you can write your replacement in Perl that way.

What problem are you trying to solve that involves replacing substr?

Update: subs seems to be the wrong module, but I'm sure I remember a module that allowed one to replace built-ins. Maybe I'm just thinking of CORE::GLOBAL:: though:

BEGIN { *CORE::GLOBAL::substr = sub ($$$) { die }; }
print substr 'foo', 2, 3;

Re^2: substr function
by tej (Scribe) on Jan 13, 2011 at 15:02 UTC

    I want to understand how it works and then write my own subroutine for my script.

    Suppose I have a string that contains tags like "<bold>"; it should not count this tag. If the string has something like "<194>", I want substr to consider it as one character.

    Thank you very much

      Then write your own implementation of substr instead of replacing it. Most likely you'll need to write a parser for whatever language you're parsing. It shouldn't be hard, and "replace substr" clutters your problem with many hard (and weird) problems that are unrelated to the task.
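A minimal sketch of what such a stand-alone implementation might look like, under some loud assumptions: tags are taken to look like "<bold>" or "</bold>" and count as zero characters, "<194>"-style numeric codes count as one character, and the name tagged_substr is made up for illustration. Negative offsets and the four-argument form of substr are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: like substr($str, $offset, $length), but counting
# "<bold>"-style tags as zero characters and "<194>"-style codes as one.
sub tagged_substr {
    my ($str, $offset, $length) = @_;

    # Split into units: a numeric code, a markup tag, or any single character.
    my @units = $str =~ m{ <\d+> | </?[A-Za-z]\w*> | . }gsx;

    my $seen = 0;    # countable units passed so far
    my $out  = '';
    for my $unit (@units) {
        my $is_tag = $unit =~ m{\A</?[A-Za-z]\w*>\z};
        my $inside = $seen >= $offset
                  && (!defined $length || $seen < $offset + $length);
        $out .= $unit if $inside;       # tags inside the range are kept
        $seen++ unless $is_tag;         # but only non-tags are counted
    }
    return $out;
}

print tagged_substr('ab<bold>cd</bold>e<194>f', 1, 3), "\n";   # prints b<bold>cd
```

Keeping tags that fall inside the selected range is a design choice, not something the original question settles; it can leave a tag unbalanced, as in the output above.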

      Write an ordinary Perl program that filters the text, then measures its length using the built-in substr function.
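Something along these lines, as a rough sketch. It assumes the same notation as in the question (tags like "<bold>", numeric codes like "<194>"), and visible_length is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical helper: count the characters a reader would see.
sub visible_length {
    my ($text) = @_;
    $text =~ s{<\d+>}{?}g;             # each numeric code becomes one placeholder character
    $text =~ s{</?[A-Za-z]\w*>}{}g;    # markup tags are dropped entirely
    return length $text;
}

print visible_length('<bold>ab<194></bold>cd'), "\n";   # prints 5
```

The numeric codes are replaced first so that the tag pattern never has to tell the two notations apart.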

      You're trying to solve an everyday text processing problem in a very peculiar and unconventional way. Writing a custom substr function to measure a particular kind of "string" is like inventing a bathroom scale that knows when your hair is wet and you haven't removed your shoes, yet accurately measures your bone-dry, barefoot weight.

      I'm curious: Do you want to measure the lengths of the strings in bytes, in encoded characters (Unicode code points), or in real-world characters (Unicode extended grapheme clusters)?
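To make the three choices concrete, here is a small sketch using a string chosen purely for illustration: "e" followed by U+0301 COMBINING ACUTE ACCENT, which displays as a single character.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
my $str = "e\x{301}";

my $bytes      = length(encode('UTF-8', $str));   # length of the UTF-8 encoding
my $codepoints = length($str);                    # Unicode code points
my $graphemes  = () = $str =~ /\X/g;              # extended grapheme clusters

print "$bytes bytes, $codepoints code points, $graphemes grapheme clusters\n";
# prints: 3 bytes, 2 code points, 1 grapheme clusters
```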

      CLARIFICATION: I admit I sort of conflated substr and length in this post. My excuse is that I was fixated on the words "count" and "one character" in tej's restatement of his Y problem:

      Suppose I have a string that contains tags like "<bold>"; it should not count this tag. If the string has something like "<194>", I want substr to consider it as one character.
        I suspect <194> refers to a character, and that this notation is used for non-ASCII characters. If so, your last question is moot.