Re: Regex / "Sophisticated" End of Line

I don't exactly understand what you are trying to do. Could you tell us what you want to extract from the line

ACADM, Homo sapiensacyl-Coenzyme A dehydrogenase, C-4 to C-12 straight chain     NP_001120800.1425 aa

Is it acyl-Coenzyme or NP_001120800.1425?


Replies are listed 'Best First'.

Re^2: Regex / "Sophisticated" End of Line
by nofutur45 (Initiate) on Oct 27, 2010 at 01:17 UTC

Hi zwon,

Thank you for your reply. In your example, it would be NP_001120800.1425

The following script by james2vegas - with a little modification - solved my problem:

elsif (m/, Homo sapiens/) {
    # capture the XP_/NP_ identifier; print it only if the match succeeded
    if (my ($human) = m/((?:XP_|NP_)[\d. ]+)\s+/) {
        print OUTFILE $human . "\t";
    }
}

So it works in two parts: first find the line that includes ", Homo sapiens", and then look for the NP_ or XP_ identifier on that line.
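For anyone who wants to try the two-part idea outside the original script, here is a minimal, self-contained sketch. It reuses the regex from the snippet above unchanged; the helper name human_accession is made up for illustration, and the sample line is the one quoted by zwon.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the XP_/NP_
# identifier of a ", Homo sapiens" line, or undef if the line does not qualify.
sub human_accession {
    my ($line) = @_;

    # Part 1: only consider lines that mention ", Homo sapiens".
    return undef unless $line =~ /, Homo sapiens/;

    # Part 2: same pattern as in the post; it needs whitespace after the identifier.
    return $line =~ /((?:XP_|NP_)[\d. ]+)\s+/ ? $1 : undef;
}

my $line = 'ACADM, Homo sapiensacyl-Coenzyme A dehydrogenase, C-4 to C-12 straight chain     NP_001120800.1425 aa';
my $acc  = human_accession($line);
print defined $acc ? $acc : '(no match)', "\n";
```

Returning undef rather than printing inside the helper keeps the "did it match?" decision with the caller, so a stale $1 from an earlier match can never be printed by accident.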

I was curious whether there is a way to pick up, say, the second "compact word" from the end (I do not know the proper term, but I mean a run of non-space characters). Another kind user answered this question: his solution splits the line on spaces, puts the pieces into an array, and picks element -2, which is the second from the end.
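The split-and-index idea described above could look roughly like this. This is only a sketch, not the other user's actual code: the function names are hypothetical, and I am assuming that splitting on any run of whitespace (Perl's special split ' ' form) is acceptable. A regex-only variant is included for comparison.

```perl
use strict;
use warnings;

# Hypothetical name: second-to-last run of non-space characters,
# or undef when the line has fewer than two such runs.
sub second_from_end {
    my ($line) = @_;
    my @fields = split ' ', $line;   # ' ' splits on runs of whitespace, ignoring leading whitespace
    return @fields >= 2 ? $fields[-2] : undef;
}

# Regex-only variant: a word, whitespace, one more word, optional trailing whitespace, end of line.
sub second_from_end_re {
    my ($line) = @_;
    return $line =~ /(\S+)\s+\S+\s*$/ ? $1 : undef;
}

my $line = 'ACADM, Homo sapiensacyl-Coenzyme A dehydrogenase, C-4 to C-12 straight chain     NP_001120800.1425 aa';
print second_from_end($line),    "\n";
print second_from_end_re($line), "\n";
```

Negative array indices count from the end in Perl, so $fields[-1] is the last field and $fields[-2] the one before it. The split version is probably easier to adapt if you later want the third or fourth field from the end.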

Thank you for your help, guys.
