For thousands of files, I don't think you need to optimize; you might consider it for hundreds of thousands. I think the approach you outline is going to be best. You'll probably want to print out the exceptions so you can handle them manually (or improve your tool): files where you can't find a "package" declaration, or where the package name has a leading path component that doesn't match. I think you want [\w:] instead of \S. Watch out for apostrophes, the old-style package separator (Foo'Bar means Foo::Bar) :)
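Here is a rough sketch of that matching and exception reporting, in Python only because it is quick to try out; the same pattern works as a Perl regex. The names `PACKAGE_RE`, `find_package` and `check_file` are made up for illustration, and the `.pm` suffix and forward-slash paths are my assumptions. It is a line-based match, so a line in POD that happens to start with "package" would fool it.

```python
import re

# [\w:'] is the character class suggested above, with the apostrophe added
# so that old-style names such as Foo'Bar are captured too.
PACKAGE_RE = re.compile(r"^\s*package\s+([\w:']+)", re.MULTILINE)

def find_package(source):
    """Return the first declared package name (with ' normalized to ::), or None."""
    m = PACKAGE_RE.search(source)
    if m is None:
        return None
    return m.group(1).replace("'", "::")

def check_file(path, source):
    """Return a message for the cases worth handling by hand, or None if all is well."""
    name = find_package(source)
    if name is None:
        return f"{path}: no package declaration found"
    # Assumption: Foo::Bar is expected to live in .../Foo/Bar.pm
    tail = name.replace("::", "/") + ".pm"
    if path != tail and not path.endswith("/" + tail):
        return f"{path}: package {name} does not match the path"
    return None
```

Anything `check_file` returns is one of the exceptions to print and look at manually.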
- Find all the unique directory paths from your list (via sorting or a hash table).
- Find all the unique roots (I guess there might be more than one). (a/b/c, a/b => a/b)
- Having vastly reduced the number of directories to check, for each unique root:
  - open one file in that root,
  - find its package declaration,
  - and parse it to determine what the directory entry should be.
- Examine the results for all the unique roots to see if you can merge the results.
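The steps above could be sketched roughly like this (Python, for brevity). All function names are hypothetical. I am assuming forward-slash paths, modules ending in `.pm`, and that "directory entry" means the library root left over once the package-derived part of the path is stripped. Extracting the package name from the file is left to the caller.

```python
import posixpath

def unique_dirs(paths):
    """The set of directories that contain at least one listed file."""
    return {posixpath.dirname(p) for p in paths}

def unique_roots(dirs):
    """Drop any directory that has an ancestor in the set: {a/b/c, a/b} -> [a/b]."""
    roots = []
    for d in sorted(dirs):  # a parent always sorts before its children
        if not any(d == r or d.startswith(r + "/") for r in roots):
            roots.append(d)
    return roots

def sample_file_per_root(paths):
    """Map each unique root to one file under it, the one to open and inspect."""
    roots = unique_roots(unique_dirs(paths))
    samples = {}
    for p in sorted(paths):
        d = posixpath.dirname(p)
        for r in roots:
            if d == r or d.startswith(r + "/"):
                samples.setdefault(r, p)  # keep the first file seen for this root
                break
    return samples

def lib_root(path, package):
    """Strip the package-derived tail (Foo::Bar -> Foo/Bar.pm) from path.

    Returns None when the path does not end with that tail, which is a
    case to report rather than guess at.
    """
    tail = package.replace("::", "/") + ".pm"
    if path == tail:
        return "."
    if path.endswith("/" + tail):
        return path[: -len(tail) - 1] or "/"
    return None
```

For the last step, collecting the `lib_root` results for every sampled file into a set shows at a glance whether several roots resolve to the same directory entry and can be merged.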