My guess is that there is no such ready-made subroutine in Perl because that kind of fancy substring extraction is usually handled by a regular expression. Regular expressions can cope with the huge variety of ways to end a string: a single character or end of string, a set of terminal characters (space or X or Z, whichever comes first), first occurrence of a single character or a maximum number of total characters, a terminal string rather than a single terminal character, and many, many more. Some examples:

    #from chr 2 to right before first space or to the end of $str
    #if no space is found
    # - ^.{2} = skip past first two characters
    # - \S = not whitespace, \s=whitespace
    # - (\S*) captures zero or more non-whitespace characters
    # - ($str =~ /^.{2}(\S*)/) is a list containing one string,
    #   i.e. ($1) where $1=what was captured by (\S*)
    printf "substr(2, first ' ' or end): %s\n", ($str =~ /^.{2}(\S*)/);

    #from chr 2 to lesser of 5 characters or first space
    #\S = not whitespace, \s=whitespace
    printf "substr(2, first ' ' or 5 chars): %s\n"
        , ($str =~ /^.{2}(\S{0,5})/);

    #from chr 3 to first X or end of $str
    printf "substr(3, first 'X' or end): %s\n"
        , ($str =~ /^.{3}([^X]*)/);

    #from chr 3 to lesser of first X or 5 chars
    printf "substr(3, first 'X' or 5 chars): %s\n"
        , ($str =~ /^.{3}([^X]{0,5})/);

    #from chr 3 to first occurrence of two or more A's or to the end if
    #no doubled A's are found
    # - (?:AA|$) is non-capturing, so the match returns only ($1)
    printf "substr(3,two or more A's or end): %s\n"
        , ($str =~ /^.{3}(.*?)(?:AA|$)/);

    #from chr 10 to lesser of 5 chars or first of run of 2 or more A's
    printf "substr(10,two or more A's or 5 chars): %s\n"
        , ($str =~ /^.{10}((?:[^A]|A(?!A)){0,5})/);

    #from chr 10 to lesser of 5 chars or first of run of 2 or more X's
    printf "substr(10,two or more X's or 5 chars): %s\n"
        , ($str =~ /^.{10}((?:[^X]|X(?!X)){0,5})/);

    #from chr 3 to first occurrence of two or more X's or to the end if
    #no doubled X's are found
    printf "substr(3,two or more X's or end): %s\n"
        , ($str =~ /^.{3}(.*?)(?:XX|$)/);

    #outputs
    substr(2, first ' ' or end): XCDEFDGHIXTAAGRAAAAAA
    substr(2, first ' ' or 5 chars): XCDEF
    substr(3, first 'X' or end): CDEFDGHI
    substr(3, first 'X' or 5 chars): CDEFD
    substr(3,two or more A's or end): CDEFDGHIXT
    substr(10,two or more A's or 5 chars): IXT
    substr(10,two or more X's or 5 chars): IXTAA
    substr(3,two or more X's or end): CDEFDGHIXTAAGRAAAAAA
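Two of the endings listed earlier, a set of terminal characters and a terminal string, can be handled the same way. What follows is only a rough sketch: the sample value of $str and the stop string 'TAAG' are assumptions made up for illustration, not values taken from the thread.

```perl
use strict;
use warnings;

# Assumed sample string, chosen only for this sketch.
my $str = 'ABXCDEFDGHIXTAAGRAAAAAA';

# from chr 3 to the first X, Z or space, whichever comes first,
# or to the end of $str if none of them is found
my ($to_set) = $str =~ /^.{3}([^XZ ]*)/;

# from chr 3 to the terminal string held in $stop, or to the end of $str
# - \Q...\E quotes any regex metacharacters that $stop might contain
my $stop = 'TAAG';    # hypothetical terminal string
my ($to_string) = $str =~ /^.{3}(.*?)(?:\Q$stop\E|$)/;

print "substr(3, first of X, Z, ' ' or end): $to_set\n";
print "substr(3, '$stop' or end): $to_string\n";
```

With the assumed string, the first match should yield CDEFDGHI and the second CDEFDGHIX.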

I grant you the syntax of those regular expressions above is somewhat arcane and cryptic. They aren't as obvious to the untrained eye as substr_chr($str,3,'A'). However, they give you much more flexibility to roll your own string endings with just a few keystrokes.
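If you would still rather have the subroutine form, it could be a thin wrapper around one such regex. This is a minimal sketch, not an existing Perl function: the name substr_chr and its argument order are taken from the call above, while the behavior (return everything from $start up to but not including the first $stop, or to the end of the string if $stop never occurs) is my assumption about what is wanted. It also assumes $start is a non-negative integer.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of core Perl.
# Returns the part of $str that begins at offset $start and runs up to,
# but does not include, the first occurrence of $stop. If $stop is not
# found, the rest of $str is returned. Returns undef when $str has fewer
# than $start characters.
sub substr_chr {
    my ($str, $start, $stop) = @_;

    # - ^.{$start}  skips the first $start characters
    # - (.*?)       captures as little as possible
    # - \Q...\E     treats $stop literally, even if it contains . or *
    # - \z          is the very end of the string
    # - /s          lets . match newlines as well
    my ($piece) = $str =~ /^.{$start}(.*?)(?:\Q$stop\E|\z)/s;
    return $piece;
}

# Sample string assumed for illustration only.
print substr_chr('ABXCDEFDGHIXTAAGRAAAAAA', 3, 'A'), "\n";
```

Because $stop is quoted rather than treated as a pattern, the same sub accepts a multi-character terminal string such as 'AA' without any changes.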

Have you had a chance to study perlretut and perlre? If not, consider doing so. If you extract strings based on characters or other textual considerations on a regular basis, you will find regexes a very powerful tool in your toolkit.

Update: fixed typos in output labels


In reply to Re: Substring consisting of all the characters until "character X"? by ELISHEVA
in thread Substring consisting of all the characters until "character X"? by TheMartianGeek
