in reply to To Match the text

How about using split with a third argument that limits the result to two fields, splitting on the first comma and whitespace character that is followed by a digit? It would of course split in the wrong place if the item itself contained <comma><space><digit>. Here, I printf'ed both fields so you can see where the demarcation lies.

use strict;
use warnings;

while( <DATA> ) {
    chomp;
    my( $item, $index ) = split m{,\s(?=\d)}, $_, 2;
    printf qq{%-42s%s\n}, $item, $index;
}

__END__
sclerosing 1954, 5-7, 54, 59f-60d, 90, 114
cribriform carcinoma, invasive, 89, 91-94, 112
comedo-type DCIS, 25-26
comedo-type necrosis, LCIS with, 55, 59, 65
complex sclerosing lesions (radial scar), 8-9, 54, 59, 90

Here is the output.

sclerosing 1954                           5-7, 54, 59f-60d, 90, 114
cribriform carcinoma, invasive            89, 91-94, 112
comedo-type DCIS                          25-26
comedo-type necrosis, LCIS with           55, 59, 65
complex sclerosing lesions (radial scar)  8-9, 54, 59, 90
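
To illustrate the caveat about an item containing <comma><space><digit>, here is a rough sketch. The sample line is made up for the demonstration and is not from the original data. The second approach is only one possible alternative, not part of the solution above: it assumes the index is always a trailing, comma-separated run of page tokens such as 54, 5-7 or 59f-60d, and it uses a hypothetical $page pattern to describe them.

```perl
use strict;
use warnings;

# Hypothetical entry (not from the original data) whose item text
# itself contains <comma><space><digit>.
my $line = 'carcinoma, 3 types of, 12, 15-17';

# The split shown above cuts at the FIRST ", " followed by a digit,
# which here falls inside the item text.
my( $item, $index ) = split m{,\s(?=\d)}, $line, 2;
print "split: [$item] [$index]\n";
# prints: split: [carcinoma] [3 types of, 12, 15-17]

# Assumed shape of one page token: digits, optional trailing word
# characters, optional "-" range (e.g. 54, 5-7, 59f-60d).
my $page = qr{\d\w*(?:-\d\w*)?};

# Alternative: require that everything after the cut, up to the end of
# the line, is a comma-separated list of page tokens.
my( $item2, $index2 ) = $line =~ m{^(.*?),\s($page(?:,\s$page)*)$};
print "regex: [$item2] [$index2]\n";
# prints: regex: [carcinoma, 3 types of] [12, 15-17]
```

This still cannot cope with an item that ends in something that looks like a page number (for example "Henry, 8, 12"), since that case is ambiguous in the data itself.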

I hope this is helpful.

Cheers,

JohnGG