Two other techniques to keep an author from appearing twice in your final result are:
- Use a hash in which the key is the author's name; each time you find an author you increment its value, and you check that count to tell whether you have already seen that author
- Build a hash in which the key is the author's full name and the value is that author's index in the array; when you find the author's name again, use the hash to look up the index and set the array element at that index to a symbol that can't occur in your data (a number, a semicolon, a colon, ...), then skip those marked elements at the end
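
To make the two ideas a bit more concrete, here is a rough sketch of both. It is written in Python, where a dict plays the role of the hash, so treat it as an illustration rather than drop-in code for your script. The function names, the `SENTINEL` value and the sample names are all made up for the example, and the second function is only one way to read the index technique (it blanks the earlier occurrence and keeps the latest one).

```python
# Assumption: ";" never occurs as an author name in the data.
SENTINEL = ";"


def dedupe_by_count(authors):
    """Technique 1: count each author in a hash, keep only the first sighting."""
    seen = {}
    result = []
    for name in authors:
        seen[name] = seen.get(name, 0) + 1
        if seen[name] == 1:  # first time we meet this author
            result.append(name)
    return result


def dedupe_by_index(authors):
    """Technique 2: remember each author's index, blank out the older slot."""
    slots = list(authors)  # work on a copy, leave the input untouched
    index_of = {}          # author name -> index of the latest occurrence
    for i, name in enumerate(slots):
        if name in index_of:
            # Seen before: overwrite the earlier slot with the sentinel.
            slots[index_of[name]] = SENTINEL
        index_of[name] = i
    # Drop the blanked slots to get the final list.
    return [name for name in slots if name != SENTINEL]


authors = ["Ann Lee", "Bob Ray", "Ann Lee", "Cy Dow", "Bob Ray"]
print(dedupe_by_count(authors))  # ['Ann Lee', 'Bob Ray', 'Cy Dow']
print(dedupe_by_index(authors))  # ['Ann Lee', 'Cy Dow', 'Bob Ray']
```

Note that the two variants differ in which copy survives: the counting version keeps the first occurrence of each author, while this reading of the index version keeps the last one.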
I'd guess that both of these techniques are faster, but I'm not really sure since I've been too lazy to benchmark them :) (so if you want to be sure, you should benchmark them yourself)