Hi zwon,

Thank you for your reply. In your example, it would be NP_001120800.1425.

The following script by james2vegas - with a little modification - solved my problem:

elsif (m/, Homo sapiens/) {
    my ($human) = m/((?:XP_|NP_)[\d. ]+)\s+/;   # capture the XP_/NP_ identifier
    print OUTFILE $human . "\t";
}

So it works in two steps: first select the line that contains ", Homo sapiens", and then look for the NP_ or XP_ identifier on that line.
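
For anyone who wants to try the two steps on their own, here is a minimal self-contained sketch. The sample lines and the names @lines and @human are made up for illustration; I am only assuming that the real input has the identifier followed by whitespace and that the species name appears after a comma.

```perl
use strict;
use warnings;

# Made-up sample lines; the real input format is an assumption.
my @lines = (
    'NP_001120800.1 example protein, Homo sapiens',
    'XP_000000001.2 example protein, Mus musculus',
);

my @human;
for (@lines) {
    # Step 1: keep only the lines that mention ", Homo sapiens".
    next unless m/, Homo sapiens/;

    # Step 2: capture the XP_/NP_ identifier, using the pattern from the snippet above.
    if ( my ($id) = m/((?:XP_|NP_)[\d. ]+)\s+/ ) {
        push @human, $id;
    }
}

print join( "\t", @human ), "\n";
```

Only the first sample line passes step 1, so only its identifier ends up in @human.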

I was also curious whether there is a way to pick up, say, the second "compact word" from the end (I do not know the proper term, but I mean a run of non-space characters). Another kind user answered this question: his solution splits the line on the space character, puts the pieces into an array, and picks element -2, which is the second from the end.
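
As I understand that suggestion, it could look roughly like the sketch below. The sample line and the variable names are made up, and I use split ' ' (which splits on any run of whitespace) instead of a single space character, which is my own choice. The last part is a regex-only variant that I believe gives the same result.

```perl
use strict;
use warnings;

# Made-up sample line, for illustration only.
my $line = 'alpha beta gamma delta';

# split ' ' splits on runs of whitespace and ignores leading whitespace.
my @words = split ' ', $line;

# Negative indices count from the end: -1 is the last element,
# -2 is the second from the end.
my $second_last = @words >= 2 ? $words[-2] : undef;

# Regex-only variant: a run of non-space characters, followed by
# whitespace and the final run of non-space characters on the line.
my ($by_regex) = $line =~ /(\S+)\s+\S+\s*$/;

print "$second_last\n" if defined $second_last;
print "$by_regex\n"    if defined $by_regex;
```

The check on the array size is there so that a line with fewer than two words gives undef instead of a surprising element.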

Thank you for your help, guys.