Specific instance of a repeated string

Ionizor has asked for the wisdom of the Perl Monks concerning the following question:

Re: Specific instance of a repeated string by antirice (Priest) on Aug 09, 2003 at 22:00 UTC
What is `$switches` supposed to contain beyond the particular group of numbers it should split upon? Anyhow, this will be messy:

    #!/usr/bin/perl -wl
    use Data::Dumper;

    sub get_split {
        my $filename = shift;
        my $switches = shift;
        my %filesplit;
        # collect every run of digits in the filename
        my @digits = $filename =~ /(\d+)/g
            or die "Error: Could not extract a number from filename '$filename'.\n";
        $filesplit{digit} = $digits[$switches->{numindex}];
        # how many $filesplit{digit} can we find before this sucker?
        my $splits = 2 + grep($_ eq $filesplit{digit}, @digits[0..$switches->{numindex}-1]);
        my @temp = split(/$filesplit{digit}/, $filename, $splits);
        $filesplit{suffix} = pop(@temp);
        $filesplit{prefix} = join $filesplit{digit}, @temp;
        return \%filesplit;
    }

    print Dumper(get_split("01-file01.html", {numindex=>1}));
    print Dumper(get_split("01-file01and01.html", {numindex=>1}));
    print Dumper(get_split("02-file01tom34bill01.html", {numindex=>3}));

    __DATA__
    $VAR1 = {
              'digit' => '01',
              'suffix' => '.html',
              'prefix' => '01-file'
            };

    $VAR1 = {
              'digit' => '01',
              'suffix' => 'and01.html',
              'prefix' => '01-file'
            };

    $VAR1 = {
              'digit' => '01',
              'suffix' => '.html',
              'prefix' => '02-file01tom34bill'
            };

That is some ugly code. Hope this helps.

antirice
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1
Re^2: Specific instance of a repeated string by Ionizor (Pilgrim) on Aug 09, 2003 at 22:47 UTC
Switches contains all the command line switch settings:

SWITCHES

All switches are optional.

-n, --numeric-index - force the script to increment a specific number

The `numeric-index` switch should only be used when a filename has multiple numbers in it, e.g. 01-file01.html. This switch defaults to -1, which is the last number in the filename. Specifying the index as 1 will force the script to increment the first set of numbers. Specifying the index as 2 will force the script to increment the second set of numbers (which is redundant since the last set of numbers is the default anyway). Again, you get enough rope to hang yourself, so don't use an index higher than the count of numbers in the filename.

-p, --precision - how many digits of precision to use

The `precision` switch controls how many zeros are prepended to 'short' numbers, i.e. should the first file be file1.html, file01.html, file001.html, etc. For default values, the script first looks at the precision of `min` if it's present, then `max`. If neither value is specified, the script defaults to the precision in the input URI, meaning if you use the filename file23.html you'll get two digits of precision whether you want them or not.

-r, --reverse - print the list in reverse order

The `reverse` switch simply prints out the list of URIs in order from `max` to `min` rather than from `min` to `max`.

-v, --verbose - turns on some warnings and diagnostics

The `verbose` switch turns on some basic warnings such as the detected precision and whether or not the min and max values were swapped.

-- Grant me the wisdom to shut my mouth when I don't know what I'm talking about.
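For readers following along, here is a minimal sketch of how the four documented switches could be parsed with the core Getopt::Long module. The helper name `parse_switches` and the hash keys are assumptions made for illustration (only `numindex` appears elsewhere in the thread), and the posted script may well do this differently.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the four documented switches from a list of
# arguments and return them as a hash reference.
sub parse_switches {
    my @args = @_;
    my %switches = (
        numindex  => -1,     # documented default: the last number in the filename
        precision => undef,  # undef here means "detect it from the input"
        reverse   => 0,
        verbose   => 0,
    );
    GetOptionsFromArray(
        \@args,
        'numeric-index|n=i' => \$switches{numindex},
        'precision|p=i'     => \$switches{precision},
        'reverse|r'         => \$switches{reverse},
        'verbose|v'         => \$switches{verbose},
    ) or die "Error: could not parse command line switches.\n";
    return \%switches;
}

my $switches = parse_switches('--numeric-index', '1', '--reverse');
print "numindex=$switches->{numindex} reverse=$switches->{reverse}\n";
```

Parsing from an explicit list rather than `@ARGV` keeps the sketch easy to try out; a real script would more likely call `GetOptions` directly.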
Re: Specific instance of a repeated string by graff (Chancellor) on Aug 09, 2003 at 22:29 UTC
Unless there's some more definite design or convention for defining what the prefix and suffix are supposed to be, this endeavor is going to smell bad. Perhaps your notion of "prefix-number-suffix" began when there were only file names like "file-01.html" -- but now someone has invented file names like "01-file-01.html", and maybe next month, they'll come up with "01-file-01-new-01.html", "file-01-03-01-new.html", and who knows what else. Folks here may be able to help more if given a bit more information about your situation.
Re^2: Specific instance of a repeated string by Ionizor (Pilgrim) on Aug 09, 2003 at 22:51 UTC
The prefix can contain numbers, as can the suffix. The idea is to increment one number to generate a list, e.g.

    01-file-01.html
    01-file-02.html
    01-file-03.html
    ...

Only one number is ever going to form the basis for the list, so the other numbers in a filename should be part of either the prefix or the suffix. I get the feeling I'm not explaining this very well...
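The list generation described here can be sketched roughly as follows, assuming the prefix, suffix, range and precision are already known. The helper `make_list` and its parameter order are hypothetical and not taken from the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: build the filenames for $min..$max, zero-padding
# each number to $precision digits.
sub make_list {
    my ($prefix, $suffix, $min, $max, $precision) = @_;
    # %0*d takes the field width from the argument list ($precision here).
    return map { sprintf('%s%0*d%s', $prefix, $precision, $_, $suffix) } $min .. $max;
}

my @list = make_list('01-file-', '.html', 1, 3, 2);
print "$_\n" for @list;
```

Note that the `01` inside the prefix is left alone; only the number between prefix and suffix changes.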
Re: Specific instance of a repeated string by Ionizor (Pilgrim) on Aug 09, 2003 at 22:43 UTC
I was hoping what I was trying to accomplish would be clear without having to post all the code but apparently it's not. Here is the full code for my script, which is a rewrite of the script in this node to create a sequential list of files. Read more... (4 kB)
Re: Re: Specific instance of a repeated string by shenme (Priest) on Aug 10, 2003 at 08:50 UTC
Wow, it's 03:45, so I'm going to have to tickle the back of my throat and spew what I've got so far. Besides the comments and debug stuff, there are really only 8-10 lines of real code in the loop that does the good stuff. Anyway, pardon the mess, it's an inspiration still steaming from the source...

The idea I had was getting an RE to create an array of string pieces, alternating number and non-number chunks. If you could then figure out the array index of the requested number chunk, you'd operate on that, and then collapse the pieces back together into the updated filename string when needed. Read more... (3 kB)

This is a lot just to show an idea, but you might not have seen an RE do something like this before ...

    my @a = $filename =~ m/ (\D+)? (\d+)? /xg;
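A minimal sketch of the same alternating-chunks idea, swapping in `split` with a capturing group instead of the `m//g` expression above (which also returns undefined entries for the optional groups that did not match). The helper `bump_number` is hypothetical, and the index is assumed to be 0-based, as `numindex` is in antirice's code.

```perl
use strict;
use warnings;

# Hypothetical helper: add 1 to the $n-th number (0-based) in $filename,
# keeping its zero padding, and return the new name.
sub bump_number {
    my ($filename, $n) = @_;
    # The capturing group makes split return the separators too, so the
    # numbers sit at the odd indices: ('', '01', '-file', '01', '.html').
    my @chunks = split /(\d+)/, $filename;
    my $i = 2 * $n + 1;
    die "Error: no number at index $n in '$filename'.\n"
        if $n < 0 or $i > $#chunks;
    $chunks[$i] = sprintf('%0*d', length($chunks[$i]), $chunks[$i] + 1);
    return join('', @chunks);
}

print bump_number('01-file01.html', 1), "\n";
```

Because the chunks are joined back unchanged apart from the one number, everything before it acts as the prefix and everything after it as the suffix.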
Re^3: Specific instance of a repeated string by Ionizor (Pilgrim) on Aug 13, 2003 at 02:19 UTC
Some excellent inspiration. Thank you.
Re: Re: Specific instance of a repeated string by hangmanto (Monk) on Aug 10, 2003 at 11:55 UTC
Instead of using a regex followed by a split, try using just a regex.

    # non-greedy prefix, so that (\d+) captures the whole number before the dot
    if ( $filename =~ /(.*?)(\d+)\.(.*)$/ ) {
        $prefix = $1;
        $digit  = $2;
        $suffix = $3;
    }
    else {
        print "Error etc\n";
    }

You may need to tweak the regex. I am unable to test it from my current location.
Re^3: Specific instance of a repeated string by Ionizor (Pilgrim) on Aug 13, 2003 at 02:12 UTC
Unfortunately this won't do what I need it to do. This will only work for files that look like `foo01.bar`. Some of my files look like `foo10bar.baz`. This regex also assumes that it's the last number in the filename that I want to operate on, which isn't always the case. Thanks for the suggestion though, it is appreciated.
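One way a single regex could cope with both objections is to skip a chosen count of numbers before capturing. This is only a sketch: the helper `split_at` is hypothetical, and the index is assumed to be 0-based and counted from the left.

```perl
use strict;
use warnings;

# Hypothetical helper: split $filename around its $n-th number (0-based)
# with one regex, returning (prefix, digit, suffix).
sub split_at {
    my ($filename, $n) = @_;
    # (?>...) is an atomic group, so a skipped number can never be cut in
    # half by backtracking; {$n} skips exactly $n numbers.
    if ($filename =~ /^((?>\D*\d+){$n}\D*)(\d+)(.*)$/s) {
        return ($1, $2, $3);
    }
    die "Error: no number at index $n in '$filename'.\n";
}

print join('|', split_at('foo10bar.baz', 0)), "\n";
```

The suffix group is simply "whatever follows the chosen number", so it is not tied to the file extension.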
Re: Specific instance of a repeated string by CombatSquirrel (Hermit) on Aug 10, 2003 at 16:41 UTC
If you want to be able to change any number in the file name, you would probably like to have it split into chunks. That's my idea so far; I realize that shenme has already written a piece of code utilizing this, but TIMTOWTDI, and so I decided to write another piece of code:

    #!perl -w
    use strict;

    for my $name (<DATA>) {
        chomp $name;
        my %parts;
        # strip the file ending (everything from the last dot) and remember it
        $name =~ s/(.*)(\.[^.]*)/$1/;
        $parts{"suffix"} = $2 or die "Not a valid file name: $name";
        $parts{"prefix"} = [];
        $parts{"number"} = [];
        # eat (optional prefix, number) pairs from the front of the name
        while ($name =~ s/^(\D*)(\d+)//) {
            push @{$parts{"prefix"}}, $1;
            push @{$parts{"number"}}, $2;
        }
        $name and die "Invalid file name format. Rest '$name' remained";
        print "Filename splits as follows: ["
            . join("][", map { "(" . $parts{"prefix"}->[$_] . ")(" . $parts{"number"}->[$_] . ")" }
                         0 .. @{$parts{"number"}} - 1)
            . "]<" . $parts{"suffix"} . ">\n";
    }

    __DATA__
    01-html02.html
    01-htm23-43.htm
    01-file-01.html

The program first extracts the file suffix (the file ending after the dot; I hope that I didn't misunderstand you here) and then loops through the file name, taking (possibly) a prefix and (definitely) a number from it and storing them in anonymous arrays in `$parts{"prefix"}` and `$parts{"number"}`. If you want to increment the `$i`th number now, you would just have to write

    ++$parts{"number"}->[$i];
    $filename = join('', map { $parts{"prefix"}->[$_] . $parts{"number"}->[$_] }
                         0 .. @{$parts{"number"}} - 1)
              . $parts{"suffix"};

Hope that helped.
Re^2: Specific instance of a repeated string by Ionizor (Pilgrim) on Aug 13, 2003 at 02:17 UTC
I guess I wasn't clear. The file suffix is the part of the file after the number I'm operating on (for the file `foo10bar.baz` the suffix would be `bar.baz`). In most cases the suffix is just the file extension, but I'm trying to make the script handle any filename regardless of the position of the digits. I've gotten some inspiration from your code though, so thanks! I'll let you know how it turns out.
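The definition above (prefix is everything before the chosen number, suffix is everything after it) can be sketched by recording where each number starts and ends. The helper `split_filename` is hypothetical. The index is assumed to be 0-based, with negative values counting from the end so that -1 picks the last number; this is an assumption of the sketch and may not match the switch numbering in the actual script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the prefix, digit and suffix around the
# number selected by $index (0-based; negative counts from the end).
sub split_filename {
    my ($filename, $index) = @_;
    my @spans;
    # Record the start and end offset of every run of digits.
    while ($filename =~ /\d+/g) {
        push @spans, [ $-[0], $+[0] ];
    }
    my $span = $spans[$index]
        or die "Error: no number at index $index in '$filename'.\n";
    my ($start, $end) = @$span;
    return {
        prefix => substr($filename, 0, $start),
        digit  => substr($filename, $start, $end - $start),
        suffix => substr($filename, $end),
    };
}

my $parts = split_filename('foo10bar.baz', -1);
print "$parts->{prefix} | $parts->{digit} | $parts->{suffix}\n";
```

Working from offsets rather than splitting on the digit string avoids accidentally matching the same digits inside another number.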
