fishfork has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

A mathematical program I wrote in C++ multiplies things called B and C together then finds some properties of the new thing.

The output gives the thing, for example
BCBCBCBCBCCCB
then a list of properties.

I am now writing a perl script to process this output.

I want to replace
BCBCBCBCBCCCB
by
(BC)^{5}CCB
which is obviously a lot nicer.

However, some of the 'words' start with C, and the number of (BC)s ranges from 0 to some very large numbers.

How do I achieve this? I tried a few ideas but none of them worked and now I really don't have a clue what to do!

Thanks for reading,
Richard.
richard@sigma.ndo.co.uk

Edit by tye to preserve formatting


Replies are listed 'Best First'.
Re: Converting BCBCBCBCBCCCB to (BC)^{5}CCB
by gmax (Abbot) on Feb 06, 2002 at 14:54 UTC
    If the case is that simple (your string matches either /^BC/ or /^.*BC/), you can use the substitution operator s/// to get the right answer.
    #!/usr/bin/perl -w
    use strict;

    my $source = "BCBCBCBCBCCCB";
    my $group  = "BC";    # the repeating string

    # capturing the left part, if any,
    # and removing it from the source
    my $left = substr($source, 0, index($source, $group));
    substr($source, 0, index($source, $group)) = "";

    # counting the occurrence of the repeating group
    my $exp = $source =~ s/$group//g;

    # print the "formula"
    print "$left ($group)^{$exp}$source\n";
    Output:
    (BC)^{5}CCB
    Changing $source to "CCBCBCBCBCBCCCB", you get the output:
    CC (BC)^{5}CCB
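
Note that s/$group//g removes every BC in the word, not only the consecutive ones, and index() returns -1 when the group never occurs. A minimal sketch that collapses only the first consecutive run in a single substitution might look like the following. The helper name collapse_first_run is made up for illustration and is not from the thread; a word with no BC at all is returned unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collapse the first run of
# consecutive copies of $group inside $word into "($group)^{n}".
sub collapse_first_run {
    my ($word, $group) = @_;
    # \Q...\E quotes any regex metacharacters in $group.
    # /e evaluates the replacement as Perl code, so the repeat count
    # can be computed from the length of the captured run.
    $word =~ s{((?:\Q$group\E)+)}
              { sprintf "(%s)^{%d}", $group, length($1) / length($group) }e;
    return $word;    # unchanged when $group never occurs
}

print collapse_first_run($_, "BC"), "\n"
    for qw(BCBCBCBCBCCCB CCBCBCBCBCBCCCB CCCB);
```

The three sample words cover a word starting with the run, a word starting with C, and a word with no BC in it.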
Re: Converting BCBCBCBCBCCCB to (BC)^{5}CCB
by talexb (Chancellor) on Feb 06, 2002 at 15:42 UTC
    I'd use a regexp to capture the longest string with repeated BC elements, then divide the length of that string by the length of the original BC string to get the exponent, and finish by deleting the stuff just exponentiated.
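    $Code = "BCBCBCBCBCCCB";
    $What = "BC";
    # capture the whole run of repeated $What, not just one copy
    if ( $Code =~ m/((?:\Q$What\E)+)/ ) {
        $Run = $1;
        $Exp = length ( $Run ) / length ( $What );
        # no /e: the replacement is a plain interpolated string
        $Code =~ s/\Q$Run\E/($What)^{$Exp}/;
    }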
    This code is not tested; it's just the general idea.

    --t. alex

    "Of course, you realize that this means war." -- Bugs Bunny.
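
If a word can contain several separate runs, the same length-divided-by-length idea can be applied to each of them with the /g modifier. This is only a sketch under assumptions: the helper name collapse_all_runs is hypothetical, and leaving a lone BC unexponentiated is a design choice made here, not something the question asks for.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse every run of two or more consecutive
# copies of $group; a single isolated copy is left as it is.
sub collapse_all_runs {
    my ($word, $group) = @_;
    my $len = length $group;
    # {2,} requires at least two copies; /g repeats for every run,
    # /e computes the exponent from the captured run's length.
    $word =~ s{((?:\Q$group\E){2,})}
              { sprintf "(%s)^{%d}", $group, length($1) / $len }ge;
    return $word;
}

print collapse_all_runs("BCBCCBCBCBCB", "BC"), "\n";
```

Changing {2,} to + would also wrap single copies as (BC)^{1}, if that is preferred.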

Re: Converting BCBCBCBCBCCCB to (BC)^{5}CCB
by particle (Vicar) on Feb 06, 2002 at 17:26 UTC
    some good advice has been given above. if performance is of any concern, you'll save a little time by compiling the regex only once, using the /o modifier, such as my $exp = $source =~ s/$group//go; to borrow gmax's solution for illustration.

    of course, hard-coding the pattern instead of interpolating a variable will be even faster, such as my $exp = $source =~ s/BC//g;

    my benchmark results look like:

    C:\WINDOWS\Desktop>perl test_regex-o.pl
    Benchmark: timing 100000 iterations of s_direct, s_witho, s_withouto...
      s_direct:  4 wallclock secs ( 3.02 usr +  0.00 sys =  3.02 CPU) @ 33112.58/s (n=100000)
       s_witho:  4 wallclock secs ( 3.41 usr +  0.00 sys =  3.41 CPU) @ 29325.51/s (n=100000)
    s_withouto:  5 wallclock secs ( 4.28 usr +  0.00 sys =  4.28 CPU) @ 23364.49/s (n=100000)

    ~Particle
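
The script test_regex-o.pl itself is not shown in the thread. A comparable harness could be sketched with the core Benchmark module as below; the test string and the layout of the three subs are assumptions, and only the labels are taken from the output above. Timings will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my $word  = "BCBCBCBCBCCCB";    # assumed test input
my $group = "BC";

# Each candidate works on a fresh copy of the word and returns the
# number of substitutions made.
my %candidates = (
    s_direct   => sub { my $s = $word; return scalar($s =~ s/BC//g) },
    s_witho    => sub { my $s = $word; return scalar($s =~ s/$group//go) },
    s_withouto => sub { my $s = $word; return scalar($s =~ s/$group//g) },
);

timethese(100_000, \%candidates);
```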

Re: Converting BCBCBCBCBCCCB to (BC)^{5}CCB
by belg4mit (Prior) on Feb 06, 2002 at 14:37 UTC
    This is run-length encoding (RLE); searching the web for that term might help. There is mention of a perl module Marshal::Packed (by MUIR) which does RLE, but it does not appear to be available anywhere.
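
For comparison, plain run-length encoding of single characters can be sketched with one substitution, without any module. The helper name rle_chars and the ^{n} output notation are assumptions chosen to match the format in the question; runs of length 1 are left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: run-length encode repeated single characters,
# e.g. "CCC" becomes "C^{3}".
sub rle_chars {
    my ($word) = @_;
    # (.) captures one character, \2+ requires at least one repeat of
    # it, and the outer group $1 holds the whole run.
    $word =~ s{((.)\2+)}{ sprintf "%s^{%d}", $2, length($1) }ge;
    return $word;
}

print rle_chars("BCCCB"), "\n";
```

This differs from the original question in that the repeated unit is a single character rather than the two-character group BC.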