in reply to Generate unique ids of maximum length
A2990_du_1 A2990_du_2 A2990_du_3 A2990_du_4 A2990_du_5 A2990_du_6 A2990_du_7 A2990_du_8 A2990_du_9 A2990_du_10 LXP_01 LXP_02 LXP_03 LXP_04 LXP_05 LXP_06 LXP_07 LXP_08 LXP_09 LXP_10 LXP_11 LXP_12 LXP_13 LXP_14 LXP_15 LXP_16 LXP_17 LXP_18 Len3_ca_A Len3_ca_B Len3_ca_C Len3_du_1 Len3_du_2 Len3_du_3 Len5_ca_1 Len5_ca_2 Len5_ca_3 Len5_du_1 Len5_du_2 Len5_du_3 No_1 No_2 No_3 No_4 No_5 No_6
can be obtained using the following:
use strict; use warnings; sub add { my $p = \shift; $p = \( $$p->{$_} ) for @_; $$p->{''} = 1; } sub shorten_unsplit { our $fixed; local *fixed = \$_[0]; our $unsplit; local *unsplit = \$_[1]; for ($unsplit) { if ( s/^([^A-Za-z]+[A-Za-z]?)// ) { $fixed .= $1; redo; } if ( s/^(?=(.))[A-Z]*[a-z]*//s ) { $fixed .= $1; redo; } } } sub shorten { my @results; local *helper = sub { my ($trie, $fixed, $unsplit) = @_; my $single = ( keys(%$trie) == 1 ); shorten_unsplit($fixed, $unsplit) if !$single || exists($trie->{''}); for my $key ( sort {; no warnings 'numeric'; $a <=> $b || $a cmp $b } keys(%$trie) ) { if ($key eq '') { push @results, $fixed; } elsif ($single) { helper($trie->{$key}, $fixed, "$unsplit$key"); } else { helper($trie->{$key}, "$fixed$key", ''); } } }; my $trie; add($trie, /\d+|./sg) for @_; return if !$trie; helper($trie, '', ''); return @results; } { chomp( my @data = <DATA> ); print("$_\n") for shorten(@data); } __DATA__ ...
Useful test case: Add A2990_dualplayer_10.
The code can manipulated to remove less when desired.
Update: Fixed a bug.
Update: And another.
Update: Improved sorting (1,2,...,9,10 vs 1,10,2,...,9).
Update: Input strings can now be substrings of other input strings. Simplified code at the same time.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: Generate unique ids of maximum length
by lima1 (Curate) on Apr 13, 2010 at 12:00 UTC | |
|
Re^2: Generate unique ids of maximum length
by choroba (Cardinal) on Apr 13, 2010 at 12:24 UTC | |
by ikegami (Patriarch) on Apr 13, 2010 at 15:05 UTC |