Re: File Parsing

Is this a Perl question? I am asking because you seem to be talking about $1, $2, ... as being the result of a split operation, but this is not the case in Perl, where $1, $2, ... are set in a totally different context (regex matches with capture groups). $1, $2, etc. do hold the fields of a split record in other languages such as awk, but I doubt awk is the right tool to sort version numbers.
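To make the difference concrete, here is a minimal sketch; the sample file name is only an assumption for illustration. In Perl, split hands back a list, whereas $1 and $2 are filled in by a successful regex match:

```perl
use strict;
use warnings;

my $name = 'bar_123.10.0_deb';    # sample file name, assumed for illustration

# split returns a list of fields; it does not set $1, $2, ...
my @fields = split /_/, $name;
print "split: @fields\n";         # split: bar 123.10.0 deb

# $1, $2, ... are set only by a successful match with capture groups
if ($name =~ /^([a-z]+)_([\d.]+)/) {
    print "match: $1 $2\n";       # match: bar 123.10.0
}
```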

If you intend to do it in Perl, then you probably want to create a data structure where each node contains the name of the original file on one hand and the various components of the name on the other hand. Then you can sort on the various parts and store the sorted order into an array which you can then use to figure out what you want to keep live and what you want to set aside.

Step one: splitting the names. Maybe something like this:

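my @to_be_sorted;
foreach my $filename (@filelist) {
     # Capture the root name and the dotted version, e.g. 'bar' and '123.10.0'
     my ($root, $version) = $filename =~ /([a-z]+)_(\d+\.\d+\.\d+)/
         or next;    # skip names that do not match the expected pattern
     # Break the version into its three numeric components
     my ($major, $minor, $third) = split /\./, $version;
     # One record per file: original name first, then the sort keys
     push @to_be_sorted, [$filename, $root, $major, $minor, $third];
}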

It could be done with shorter code, but I preferred to break the process into small parts for better comprehension. Now the records in the @to_be_sorted array look like this:

0  ARRAY(0x600500678)
      0  'bar_123.10.0_deb'
      1  'bar'
      2  123
      3  10
      4  0

Now you can sort on elements 1, 2, 3 and 4 of each record and store element 0 of each item (the original file name) into a new sorted array. Something like this (not really tested):

my @sorted_array = map {$_->[0]} 
                   sort { $a->[1] cmp $b->[1] 
                       || $a->[2] <=> $b->[2] 
                       || $a->[3] <=> $b->[3] 
                       || $a->[4] <=> $b->[4] }
                   @to_be_sorted;
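Once the records are in ascending order, deciding what to keep live and what to set aside could look roughly like the sketch below. It assumes that "keep" means the newest version of each root name, and the file names and the %newest, @keep and @set_aside variables are made up for the example:

```perl
use strict;
use warnings;

# Hypothetical sample names, for illustration only
my @filelist = qw(bar_123.10.0_deb bar_123.9.2_deb foo_1.0.0_deb foo_1.0.10_deb);

# Build records of the form [filename, root, major, minor, third]
my @records;
for my $filename (@filelist) {
    my ($root, @ver) = $filename =~ /([a-z]+)_(\d+)\.(\d+)\.(\d+)/ or next;
    push @records, [$filename, $root, @ver];
}

my @sorted = sort { $a->[1] cmp $b->[1]
                 || $a->[2] <=> $b->[2]
                 || $a->[3] <=> $b->[3]
                 || $a->[4] <=> $b->[4] } @records;

# In ascending order, the last record seen for a root is its newest version
my %newest;
$newest{ $_->[1] } = $_->[0] for @sorted;

my @keep      = sort values %newest;
my %is_kept   = map { $_ => 1 } @keep;
my @set_aside = grep { !$is_kept{$_} } map { $_->[0] } @sorted;

print "keep:      @keep\n";
print "set aside: @set_aside\n";
```

With these sample names, bar_123.10.0_deb and foo_1.0.10_deb are kept, and the two older files are set aside.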

The whole code shown above could be reduced to a single instruction using the clever Schwartzian Transform (see also Efficient sorting using the Schwartzian Transform), but I would not necessarily recommend it in this case, because the initial splitting is a bit tedious.
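For comparison, a single-statement version in the Schwartzian Transform style might look like the sketch below. The sample @filelist is an assumption, and the regex captures the three version components directly instead of going through split:

```perl
use strict;
use warnings;

# Hypothetical sample names, for illustration only
my @filelist = qw(foo_1.0.10_deb bar_123.10.0_deb foo_1.0.2_deb bar_123.9.2_deb);

# Read bottom-up: decorate each name with its sort keys, sort, then undecorate
my @sorted_array =
    map  { $_->[0] }                       # 3. keep only the original name
    sort { $a->[1] cmp $b->[1]             # 2. sort on root, then on each
        || $a->[2] <=> $b->[2]             #    version component numerically
        || $a->[3] <=> $b->[3]
        || $a->[4] <=> $b->[4] }
    map  { /([a-z]+)_(\d+)\.(\d+)\.(\d+)/  # 1. extract the keys, dropping
               ? [$_, $1, $2, $3, $4]      #    names that do not match
               : () }
    @filelist;

print "$_\n" for @sorted_array;
```

Because the components are compared numerically, bar_123.9.2_deb sorts before bar_123.10.0_deb, which a plain string sort would get wrong.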

Please note that I fully agree with the previous post by Anonymous Monk. I have just chosen one plausible way of sorting the version numbers; you may have to change it in accordance with the Debian version number conventions.


Re^2: File Parsing by perlfan (Parson) on Jun 22, 2014 at 20:23 UTC
> Is this a Perl question? I am asking because you seem to be talking about $1, $2, ... as being the result of a split operation, but this is not the case in Perl, where $1, $2, ... occur in a totally different context (regex matches). $1, $2, etc. appear in the split context in other languages such as awk, but I doubt awk is the right tool to sort version numbers.
I would hope that any Perlmonk who knows awk would have known what was meant here. I had no trouble "parsing" the intent. Just saying.
Re^3: File Parsing by Laurent_R (Canon) on Jun 23, 2014 at 06:36 UTC
Well, understanding the intent was not the problem, but reading the OP, I was truly wondering whether the poster really wanted to do it in Perl. Besides, assuming the OP wanted to do it in Perl, I thought it was useful to remind the OP that split does not store its results in $1, $2, etc. In addition, I took the time to provide actual code to solve the problem, so I would think the OP has no reason to complain about my post.
Re^4: File Parsing by kel (Sexton) on Jun 24, 2014 at 09:42 UTC
Dear Laurent, many thanks to yourself and others for your most noble efforts. But I do believe the most effective manner is: Debian::Dpkg::Version. I did not know the right question to ask the Oracle of CPAN. I am a humble bookseller, not an enlightened programmer. So I have frequently found need of Perl to manage my database scripts, and found it a wonderful tool, as well as an instructive one. The project would surely be a Perl effort. Actually I know it better than AWK! What is there not to love about it? It has everything, including GOTO. The passage of time has diminished my grief over line numbers.