in reply to which data structure do I need for this grouping problem?

The standard solution for handling files of character-separated values (CSV), including tab-separated values, is Text::CSV and, where available, its faster companion Text::CSV_XS. It is "the standard" because it not only splits (and joins) on the separator character(s), but also handles quoting, escaping, and all the nasty edge cases found in real-world CSV files.
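To make that concrete for the grouping question, here is a minimal sketch. The sample data, the choice of the first column as the grouping key, and the %groups layout are my assumptions, not taken from the original question. It only shows that quoted separators and doubled quotes survive parsing, and that a hash of arrays is enough to collect rows per key.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical sample data: one field contains the separator inside quotes,
# another contains escaped (doubled) quotes.
my $data = qq{fruit,"apple, red",3\nfruit,banana,5\nveg,"carrot ""baby""",7\n};

my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %groups;    # first column => list of the remaining columns, one entry per row
while (my $row = $csv->getline($fh)) {
    my ($key, @rest) = @$row;
    push @{ $groups{$key} }, \@rest;
}
close $fh;

for my $key (sort keys %groups) {
    printf "%s: %d row(s)\n", $key, scalar @{ $groups{$key} };
}
```

For tab-separated input, passing sep_char => "\t" to the constructor should be the only change needed.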

If you are used to working with relational databases and DBI, try DBD::CSV. It sits on top of Text::CSV and allows you to treat CSV files like tables in a relational database. In other words: you can use SQL to work directly with CSV files.
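A rough sketch of the DBD::CSV route, under some assumptions: the file name items.csv, its columns (category, name, qty), and the temporary directory are all made up for illustration. The first line of the file is taken as the column names, and f_ext tells the driver that table "items" lives in items.csv.

```perl
use strict;
use warnings;
use DBI;
use File::Temp qw(tempdir);

# Create a throwaway directory holding one hypothetical CSV file.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/items.csv" or die "Cannot write items.csv: $!";
print {$out} "category,name,qty\n",
             "fruit,apple,3\n",
             "fruit,banana,5\n",
             "veg,carrot,7\n";
close $out;

# Each *.csv file in f_dir is treated as a table named after the file.
my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
    f_dir      => $dir,
    f_ext      => '.csv/r',
    RaiseError => 1,
    PrintError => 0,
});

# Plain SQL with a placeholder, just as with any other DBI driver.
my $rows = $dbh->selectall_arrayref(
    'SELECT name, qty FROM items WHERE category = ?',
    undef, 'fruit',
);

print join(', ', @$_), "\n" for @$rows;
$dbh->disconnect;
```

Whether this beats a plain hash of arrays depends on how comfortable you are with SQL; for a single pass over one file, the Text::CSV loop is usually less machinery.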

All of those modules are currently maintained by our helpful Tux.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)