I do a lot of fairly intensive file processing, and Perl makes it a lot easier. I like that split can break out fields on patterns instead of just a fixed set of characters (or, in some languages, just a single character). However, I see room for improvement. I think the ability to break out data on an array of patterns would be really helpful and a logical extension of split's current power. If the first argument to split were a reference to an array, the elements of the array would be treated as patterns to be applied in order when splitting out the data. For example:
$string = "a:b::c d";
@patterns = (':', '::', '\s+');
@fields = split(\@patterns, $string);
would yield ('a','b','c','d'): the first delimiter would be the first ':', the second delimiter would be '::', and the last delimiter would be whitespace. In addition, the ability to use split to break out fixed-length fields would be cool, and could be implemented by providing a reference to an integer, similar to the way the new input record separator ($/) works. So you could say something like:
split([ ':',',',\20,\10 ],$_);
Does anyone else think any of this would be useful? Are there easy ways to achieve the general ideas here without modifying split? Any comments would be appreciated.
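
For what it's worth, the general idea can probably be approximated in plain Perl today. What follows is only a minimal sketch, not a proposed implementation: split_seq is a hypothetical helper name, and treating whatever is left after the last delimiter as a final field is an assumption taken from the ('a','b','c','d') example above. Each pattern is used once, in order, and a reference to an integer is read as a fixed-width field.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): walk the string once, consuming
# one spec at a time. A plain string or qr// is a delimiter pattern; a
# reference to an integer (e.g. \20) is a fixed-width field.
sub split_seq {
    my ($specs, $string) = @_;
    my @fields;
    my $pos = 0;    # offset where the next field starts

    for my $spec (@$specs) {
        last if $pos >= length $string;

        if (ref $spec eq 'SCALAR') {
            # Fixed-width field: take the next $$spec characters.
            push @fields, substr($string, $pos, $$spec);
            $pos += $$spec;
        }
        else {
            # Delimiter: search from the current offset onward.
            pos($string) = $pos;
            last unless $string =~ /$spec/g;
            push @fields, substr($string, $pos, $-[0] - $pos);
            $pos = $+[0];    # resume just past the delimiter
        }
    }

    # Assumption: any remaining text becomes the last field.
    push @fields, substr($string, $pos) if $pos < length $string;
    return @fields;
}

my @fields = split_seq([':', '::', '\s+'], "a:b::c d");
print join(',', @fields), "\n";    # prints a,b,c,d

my @mixed = split_seq([':', ',', \3, \2], "ab:cd,xyz12rest");
print join(',', @mixed), "\n";     # prints ab,cd,xyz,12,rest
```

If a pattern fails to match, this version stops and returns the rest of the string as one field; a real extension to split would have to decide what it wants to do there.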

tigervamp

In reply to expanding the functionality of split by tigervamp
