- get the user-specified sequence file, library file and cut-off percentage
- run fasta34.exe with the sequence file and library file
- open the file that was output by fasta34
- set INPUT_RECORD_SEPARATOR ($/) to ">>"
- reading each record of the fasta34 output file:
  -- the first read ends at the first ">>" and holds no hit record, so skip it
  -- match /(\d+\.\d+)% identity/ and save the captured string as $per
  -- push the record onto an "org" array
  -- if $per > cut-off value, push the record onto an "input" array

# You now have all the original fasta output in @org, and
# all the "above-threshold" records in @input. At this point,
# you want to run fasta again, but I can't figure out what
# input you want to give it, how many times you really need
# to run it, or how you should use the subsequent output.
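
The record-reading part of the outline above could be sketched roughly as follows. This is only an illustration, not your actual script: the sub name split_by_identity and the sample text are made up, the sample is much simpler than real fasta34 output, and the step that runs fasta34.exe is left out, so the sketch reads from an in-memory string instead of the real output file.

```perl
use strict;
use warnings;

# Hypothetical helper: read ">>"-separated records from $fh and return
# (\@org, \@input), where @input holds the records whose "% identity"
# value is greater than $cutoff.
sub split_by_identity {
    my ($fh, $cutoff) = @_;
    my (@org, @input);

    local $/ = '>>';        # one read per ">>"-terminated record
    my $first = 1;
    while (my $rec = <$fh>) {
        chomp $rec;         # chomp removes the trailing ">>" (the current $/)
        if ($first) {       # the first read holds no hit record
            $first = 0;
            next;
        }
        push @org, $rec;
        if ($rec =~ /(\d+\.\d+)% identity/) {
            my $per = $1;
            push @input, $rec if $per > $cutoff;
        }
    }
    return (\@org, \@input);
}

# Made-up sample standing in for the fasta34 output file.
my $sample = join '',
    "header text\n",
    ">>seqA\n 91.5% identity in 200 aa overlap\n",
    ">>seqB\n 42.0% identity in 180 aa overlap\n";

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my ($org, $input) = split_by_identity($fh, 50);
close $fh;

printf "%d records read, %d above the cut-off\n",
    scalar @$org, scalar @$input;
```

Scoping $per inside the match means a record with no "% identity" line is never compared against a stale value left over from the previous record. To use this on the real file, open it by name instead of opening the in-memory string.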