WOW!
I wasn't expecting such an abundance of answers!
Thanks to everybody, folks! It's a real pleasure asking something in this Monastery.
Actually, I would say that the very idea of posting a question here has a truly beneficial effect, long before I start writing my stuff. Knowing that I am going to submit my thoughts to such an attentive and wise public has a healthy effect on my code, pushing me to clean it and to check every detail before posting. When I first decided to ask for your advice, my code was much messier than this. Just preparing for posting had the positive side effect of improving the unclean code to the shape that you have seen. I leave it to your imagination to decide how it was before :-)
And now, some cumulative comments on your useful contributions:
Lucky and japhy offered basically the same advice:
print unless $should_not_print++;
print if !$seen++;
This idea had crossed my mind, but I rejected it on the grounds that I would be incrementing a variable without need. Influence from C and Pascal, I guess, where you learn that by incrementing an integer endlessly you can eventually reach the upper boundary of the integer itself and wrap it around to 0 again. This used to be true when integers had an upper boundary of 0xffff. It is not the case in Perl, and I can see that these solutions have some grace that is lacking in my implementation. However, I have some sort of internal taboo against this incrementing, and I can't bring myself to adopt it. Excessively careful or simply foolish? I don't know the internals of Perl well enough to decide. Pending further knowledge, I would avoid it.
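A quick way to test the wrap-around worry is the sketch below. It starts the counter at the largest native unsigned integer (`~0`, whose exact value depends on how perl was built) and increments it once. The variable name `$seen` is borrowed from the snippet above; everything else is just scaffolding for the check.

```perl
use strict;
use warnings;

# Start from the largest native unsigned integer this perl can hold.
my $seen   = ~0;
my $before = $seen;

# One more increment: perl switches to a floating-point value here
# instead of wrapping around to 0 as a fixed-width C integer would.
$seen++;

print "before: $before\n";
print "after:  $seen\n";
print "still true, so 'print if !\$seen++' keeps suppressing output\n" if $seen;
```

Once the counter has become a float of that size, further increments may no longer change its value, but it stays nonzero, which is all the idiom needs.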
jeffa offered an OO solution.
It does exactly what I need, at the price of calling a method. I have been using OOP for a long time, and I am fully convinced that this should be the right approach.
A further improvement came from clintp.
The Tie solution is definitely my favorite. The same idea as jeffa, but a tied scalar in this context improves the semantics of the program:
tie my $first, "Trueonce";
while (<DATA>) {
    print if !$first;
}
I love it. Thanks a lot.
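For readers who have not seen the class behind that snippet, here is a minimal sketch of what a `Trueonce` package could look like. This is an assumption about its shape, not clintp's actual code: the scalar reads as true the first time it is fetched and as false ever after. The sample lines in the loop are made up for the demonstration.

```perl
use strict;
use warnings;

package Trueonce;

# Hypothetical implementation: the object is a reference to a flag
# that starts out true.
sub TIESCALAR {
    my ($class) = @_;
    my $state = 1;
    return bless \$state, $class;
}

# Every read returns the current flag, then clears it, so only the
# first read sees a true value.
sub FETCH {
    my ($self) = @_;
    my $value = $$self;
    $$self = 0;
    return $value;
}

# Assigning a true value re-arms the flag.
sub STORE {
    my ($self, $value) = @_;
    $$self = $value ? 1 : 0;
    return;
}

package main;

tie my $first, 'Trueonce';

my @printed;
for my $line ("header\n", "row 1\n", "row 2\n") {
    # Same test as in the snippet above: skipped on the first pass only.
    push @printed, $line if !$first;
}
print @printed;    # prints "row 1" and "row 2"
```

The point of the design is that the loop body never touches the flag explicitly; the bookkeeping lives entirely inside `FETCH`.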
Rhandom gave me food for thought, inviting me to rephrase my implementation. It's a healthy thing and I'll try not to forget it.
I tried this approach in a similar program before, and it worked fine.
I abandoned it because I had problems when dealing with large quantities of data. It was a different case and has little in common with this one, because there I had to keep my data within the amount of memory that my database can handle at once (16 MB). The bottom line is that I discarded the array path because of memory problems that I had in a similar case. I should think about it more deeply.
perrin offered a solution along the same lines. I like his splice approach. The same remarks as the previous one, concerning my unjustified fear about memory problems.
George_Sherston gave me two pieces of advice. The first one, about using two subs for the two different actions in my task, I found very sensible and in line with good principles of software engineering. Sometimes I get so carried away by the power Perl offers that I forget the simple things I am able to do in other languages, which could work in Perl as well. More food for thought and an invitation to balance the ingredients in my scripts. Thanks.
I can't fully appreciate the second piece in the same manner, since it doesn't seem to solve my problem. I may be misunderstanding your hint, but it seems that you are inviting me to create one query for each line of text in my input file.
I am not sure that I can simplify my task with DBI (which is happily crunching my data every day), since I would like to avoid creating 10_000 queries, and instead I need to create a query of 10_000 lines, which I can feed to the DB either through DBI or the standard MySQL client. It's true that DBI can quote my data much better, but it doesn't have any built-in facility for multi-line queries. Thanks anyway for taking the pains to give me a solution, and sorry for not saying anything about DBI from the beginning. I did not include DBI in my request because I wanted to concentrate on the main issue. I felt that my question was already too long and I didn't want to overdo it.
Finally, danger gave me three lovely technical tricks, which is exactly what I was hoping for when I posted my question. I completely overlooked the “..” operator, and now I should start reviewing the perlop manpage. Many thanks.
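As a reminder of how the “..” operator can stand in for a first-time flag, here is a small sketch. It is not necessarily the form danger suggested; the in-memory input and the output format are assumptions made only to keep the example self-contained. In scalar context a constant operand of “..” is compared with the current input line number `$.`, so `2 .. eof($fh)` is false on the first line and true from the second line to the last.

```perl
use strict;
use warnings;

# Hypothetical input, read from an in-memory filehandle so that $. is set.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @out;
while (my $line = <$fh>) {
    chomp $line;
    # Flip-flop: switches on when $. == 2 and stays on until end of file,
    # so a separator goes before every line except the first.
    push @out, ",\n" if 2 .. eof($fh);
    push @out, "('$line')";
}
close $fh;

print @out, ";\n";
```

No counter or flag variable is needed; the operator keeps its own state between iterations.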
I should say that I could not have hoped for anything better from this question. I got at least two immediately workable hints, and plenty of insight on how to re-think my task.
I wish a pleasant day to all the kind contributors to this node.
Gmax