in reply to regex question: store multiple lines as a string

#!perl

use 5.12.0;
use warnings;

while (<DATA>) {
    chomp;

    # A new record starts at each "line=ULMNm" header; end the previous
    # record's output line first (but not before the very first record).
    if (m{ \A line=ULMNm }msx && $. > 1) {
        print qq{\n};
    }

    print;
}

__DATA__
line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3
 RMSD = 1.06 A
 MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632
 D 47 SER A 57 SER.?
 D 48 THR A 56 THR.?
 D 165 LYS A 33 LYS~?
line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3
 RMSD = 1.15 A
 MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449
 A 169 TYR A 41 TYR~?
 A 172 HIS A 95 HIS^?
 A 267 ASP A 98 ASP~?

Outputs:

$ multiline_join.pl
line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3 RMSD = 1.06 A MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632 D 47 SER A 57 SER.? D 48 THR A 56 THR.? D 165 LYS A 33 LYS~?
line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3 RMSD = 1.15 A MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449 A 169 TYR A 41 TYR~? A 172 HIS A 95 HIS^? A 267 ASP A 98 ASP~?

-- Ken

Re^2: regex question: store multiple lines as a string
by nurulnad (Acolyte) on Oct 12, 2010 at 09:10 UTC
    Sorry, could you please explain? If possible, could you tell me how to store these as variables?

      This concatenates the lines of each record and stores each resulting single string in its own array element:

      #!perl

      use 5.12.0;
      use warnings;

      my @joined = ();
      my $index  = 0;

      while (<DATA>) {
          chomp;

          # Each "line=ULMNm" header after the first starts a new element.
          if (m{ \A line=ULMNm }msx && $. > 1) {
              ++$index;
          }

          $joined[$index] .= $_;
      }

      for (@joined) { say }

      __DATA__
      line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3
       RMSD = 1.06 A
       MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632
       D 47 SER A 57 SER.?
       D 48 THR A 56 THR.?
       D 165 LYS A 33 LYS~?
      line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3
       RMSD = 1.15 A
       MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449
       A 169 TYR A 41 TYR~?
       A 172 HIS A 95 HIS^?
       A 267 ASP A 98 ASP~?

      Outputs:

      $ multiline_join_array.pl
      line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3 RMSD = 1.06 A MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632 D 47 SER A 57 SER.? D 48 THR A 56 THR.? D 165 LYS A 33 LYS~?
      line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3 RMSD = 1.15 A MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449 A 169 TYR A 41 TYR~? A 172 HIS A 95 HIS^? A 267 ASP A 98 ASP~?

      I don't know what subsequent processing you want to do, so I've just printed each array element to the screen (say prints its argument followed by a newline).
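
      If it would be more convenient to look a record up by name rather than by position, one possible variation is to key the joined strings by the ID in the header. This is only a sketch: it assumes every header looks like "line=ULMNm <number> <ID>" (which matches the two records shown, but may not hold for your whole file), the helper name records_by_id is made up for illustration, and it joins the lines with a single space. The sample data is cut down to the first two lines of each record.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: join the lines of each record into one string,
# keyed by the ID taken from the record's "line=ULMNm" header.
# Assumption: the header has the form "line=ULMNm <number> <ID> ...".
sub records_by_id {
    my @lines = @_;
    my (%record, $id);

    for my $line (@lines) {
        chomp $line;

        # A header line starts a new record; remember its ID.
        if ($line =~ m{ \A line=ULMNm \s+ \d+ \s+ (\S+) }msx) {
            $id = $1;
        }

        next unless defined $id;    # ignore anything before the first header

        # Append to the current record, separating lines with one space.
        $record{$id} = defined $record{$id} ? "$record{$id} $line" : $line;
    }

    return %record;
}

# Shortened sample: header and RMSD line of each record only.
my @lines = split /^/, <<'END_DATA';
line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3
RMSD = 1.06 A
line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3
RMSD = 1.15 A
END_DATA

my %record = records_by_id(@lines);

for my $id (sort keys %record) {
    print "$id => $record{$id}\n";
}
```

      With that, $record{'1fdy_07'} holds the whole first record as one string. The array version is simpler if you only ever process the records in order.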

      -- Ken