Aside from the issues of how split works, it appears to me that your situation is better suited to a regex match or regex match global than to split.

- In general, use split when you know what to throw away and that "throw away separator" is an easy-to-identify sequence in the input.
- Use regex when you know what you want to keep and you can either (a) write one regex that describes all the "hunks" that you want or (b) enumerate the patterns easily.
- Sometimes the techniques are best combined, and that leads to more complicated regex patterns in the split. As a performance note, in many of my benchmarks a regex match/match global is faster than using a split. A complex regex in a split burdens the "slower but simple" split with something complicated.
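To make the first two rules concrete, here is a minimal sketch. The sub names and the sample strings are made up for illustration and have nothing to do with the original question: one case where the separator is what is known, and one where the "hunks" to keep are what is known.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split: we know what to throw away (a comma with optional spaces around it).
# fields_by_split is a hypothetical helper name.
sub fields_by_split {
    my ($line) = @_;
    return split /\s*,\s*/, $line;
}

# match global: we know what to keep (runs of digits), whatever surrounds them.
# numbers_by_match is a hypothetical helper name; call it in list context.
sub numbers_by_match {
    my ($line) = @_;
    return $line =~ /(\d+)/g;
}

print join('|', fields_by_split('red, green ,blue')), "\n";
print join('|', numbers_by_match('a1b22 c333')), "\n";
```

In list context, a match with /g returns every capture in one pass, so there is nothing to throw away afterwards.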

It looks to me like you want to "split" at the first "-" that comes before a number, and that really means that a regex match solution is in order rather than a split.

There are other regex solutions - I don't claim that this is the best, but I do recommend trying to formulate a single forward pass regex (no look ahead or look behind) wherever possible because it will typically be the fastest.

#!/usr/bin/perl -w
use strict;

while (<DATA>)
{
    next if /^\s*$/;   # skip blank lines
    # package name: letters and dashes; version: everything after the next "-"
    my ($package, $ver) = /^\s*([a-zA-Z-]+)-(.+)\s*$/;
    printf "%-15s %s\n", $package, $ver;
}

=prints
mono-basic      2.10
mono            2.10.2-r1
mono            2.10.5
=cut

__DATA__
mono-basic-2.10
mono-2.10.2-r1
mono-2.10.5

Update: if you want to know whether the regex succeeded, just check whether $ver is defined. If $ver is defined, then $package will be too. There is no need to chomp(), because the \s*$ will match the trailing \n character(s) and throw them away. Also, the regex substitution operation is very slow relative to just "match and capture", because the data has to be copied to "make room" for the new characters, so a "substitute and then split" strategy will be slow.
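The "check if $ver is defined" idea could look something like the sketch below. The parse_pkg sub and the second test line are assumptions added for illustration; the regex is the same one used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_pkg is a hypothetical helper: it returns (package, version),
# or an empty list when the line does not match.
sub parse_pkg {
    my ($line) = @_;
    # A failed match in list context returns an empty list,
    # so both variables stay undefined.
    my ($package, $ver) = $line =~ /^\s*([a-zA-Z-]+)-(.+)\s*$/;
    return defined $ver ? ($package, $ver) : ();
}

for my $line ("mono-2.10.2-r1\n", "not a package line\n") {
    my ($package, $ver) = parse_pkg($line);
    if (defined $ver) {
        print "$package => $ver\n";
    }
    else {
        print "no match: $line";
    }
}
```

Note that the lines are passed in with their trailing "\n" still attached and no chomp() is done anywhere; the version still comes back without the newline.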

In reply to Re: regex behaves differently in split vs substitute? by Marshall
in thread regex behaves differently in split vs substitute? by raygun
