This is a perfect task for the regex engine.

local our %count;
$str =~ /
    (.+)    # or .{N,} where N is minimum length.
    (?(?{ $count{$1} })
        (?!)
    )
    .*
    \1
    (?{ ($count{$1} ||= 1)++ })
    (?!)
/x;
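
To try the snippet above on a concrete string, it can be wrapped in a small helper. This is only a sketch: the name count_repeats and the sample string are made up for illustration, and it assumes a reasonably recent Perl (5.18 or later) where embedded code blocks see the enclosing scope reliably.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): returns a hash ref
# mapping each repeated substring to the number of times the regex counted it.
sub count_repeats {
    my ($str) = @_;
    local our %count;    # fresh package hash for the duration of this call
    $str =~ /
        (.+)                            # candidate substring
        (?(?{ $count{$1} }) (?!) )      # skip candidates that were already counted
        .*
        \1                              # a later occurrence of the candidate
        (?{ ($count{$1} ||= 1)++ })     # first hit stores 1, then every hit adds 1
        (?!)                            # always fail, to force backtracking
    /x;
    my %result = %count;                # copy before local restores the old value
    return \%result;
}

my $counts = count_repeats('abab');
printf "%-4s %d\n", $_, $counts->{$_} for sort keys %$counts;
```

Tracing 'abab' by hand, the substrings a, ab and b should each come out with a count of 2, and a string without repeats such as 'abc' should leave the hash empty.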

A more generalized version, where you can specify the minimum substring length and the minimum number of occurrences, is:

my $min_len = 2; # Substring is at least two chars long.
my $min_count = 3; # Substring occurs at least three times (must be >= 2, since the quantifier below uses $min_count - 2).

local our %count;
use re 'eval';
$str =~ /
  (.{$min_len,})
  (?(?{ $count{$1} })
      (?!)
  )
  (?>
    .*?
    \1
  ){@{[ $min_count - 2 ]}}
  .*
  \1
  (?{ ($count{$1} ||= $min_count-1)++ })
  (?!)
/x;
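
The generalized version can be exercised the same way. Again only a sketch: count_repeats_min, its argument order and the sample string are assumptions for illustration, and the explicit check on the minimum count is an addition that is not in the code above.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of the original post) around the
# generalized regex. Returns a hash ref of substring => count.
sub count_repeats_min {
    my ($str, $min_len, $min_count) = @_;

    # The quantifier is built from $min_count - 2, so 2 is the smallest
    # value that yields a valid {N} quantifier. This guard is an addition.
    die "min_count must be at least 2\n" if $min_count < 2;

    local our %count;
    use re 'eval';    # the pattern interpolates variables at run time
    $str =~ /
        (.{$min_len,})
        (?(?{ $count{$1} }) (?!) )
        (?> .*? \1 ){@{[ $min_count - 2 ]}}
        .*
        \1
        (?{ ($count{$1} ||= $min_count - 1)++ })
        (?!)
    /x;
    my %result = %count;    # copy before local restores the old value
    return \%result;
}

my $counts = count_repeats_min('abXabYab', 2, 3);
print "$_ => $counts->{$_}\n" for sort keys %$counts;
```

With a minimum length of 2 and a minimum count of 3, the only substring of 'abXabYab' that qualifies is ab, which a hand trace counts 3 times. Substrings that occur fewer times than the minimum never reach the counting block, so they do not appear in the result at all.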

lodin

Update:

While I was writing this, ikegami posted a very similar-looking reply. Although the two look very much alike, they work quite differently. ikegami's works by requiring that each match is repeated further into the string, and then goes on to count all those successive matches, much like a global match. Mine does the counting right away, and then forces the engine not to count those substrings again. So they work more or less opposite to each other.

I did a shallow benchmark. Mine seems to come out slightly ahead (5-10%) in many, but not all, situations. I also get the impression that ikegami's scales slightly better; see Re^5: how to count the number of repeats in a string (really!).

Update 2:

Added the comment in the regex.

Update 3:

Added the generalized version.

In reply to Re: how to count the number of repeats in a string (really!) [regexp solution] by lodin
in thread how to count the number of repeats in a string (really!) by blazar
