I really like Limbic~Region's approach, but here is my idea for your algorithm. Mine does not depend on fixed widths at all. It starts from the right and strips off the last six whitespace-delimited strings. What is left holds the first and second items, and the second one may contain spaces. My Perl here is a bit sloppy, but this proof of concept works.

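#!/usr/bin/perl
use strict;
use warnings;

while (<DATA>) {
    # Strip the last six whitespace-delimited fields off the end of the line.
    s/\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$//
        or next;    # skip lines without six trailing fields
    my @items = ($1, $2, $3, $4, $5, $6);
    # What remains is the first field plus a second field that may contain spaces.
    m/(\S+)\s+(.*)$/ or next;
    unshift @items, $1, $2;
    print "[$_] " foreach @items;
    print "\n";
}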

__DATA__
BAZ 'N3''  N  0 ? ? ? 1
BAZ 'N4''  N  0 ? ? ? 1
BAZ 'C8''  C  0 ? ? ? 1
BAZ C9     C  0 ? ? ? 1
BAZ ZN     ZN 0 ? ? ? 0
BAZ HN1    H  0 ? ? ? 1
BAZ 1HN2   H  0 ? ? ? 0
BAZ 2HN2   H  0 ? ? ? 0
001 F11  F 0 ? ? ? 1
001 C11  C 0 ? ? ? 1
001 O11  O 0 ? ? ? 1
001 N12  N 0 ? ? ? 1
001 C12  C 0 ? ? ? 1
001 C13  C 0 ? ? ? 1
001 C14  C 0 ? ? ? 1
001 C15  C 0 ? ? ? 1
001 C16  C 0 ? ? ? 1
BCB CBA   C  0 ? ? ? 1
BCB CGA   C  0 ? ? ? 1
BCB O1A   O  0 ? ? ? 1
BCB O2A   O  0 ? ? ? 1
BCB 'N B' N  0 ? ? ? 1
BCB C1B   C  0 ? ? ? 1
BCB C2B   C  0 ? ? ? 1
BCB C3B   C  0 ? ? ? 1
BCB C4B   C  0 ? ? ? 1
BCB CMB   C  0 ? ? ? 1

OUTPUT:

[BAZ] ['N3''] [N] [0] [?] [?] [?] [1] 
[BAZ] ['N4''] [N] [0] [?] [?] [?] [1] 
[BAZ] ['C8''] [C] [0] [?] [?] [?] [1] 
[BAZ] [C9] [C] [0] [?] [?] [?] [1] 
[BAZ] [ZN] [ZN] [0] [?] [?] [?] [0] 
[BAZ] [HN1] [H] [0] [?] [?] [?] [1] 
[BAZ] [1HN2] [H] [0] [?] [?] [?] [0] 
[BAZ] [2HN2] [H] [0] [?] [?] [?] [0] 
[001] [F11] [F] [0] [?] [?] [?] [1] 
[001] [C11] [C] [0] [?] [?] [?] [1] 
[001] [O11] [O] [0] [?] [?] [?] [1] 
[001] [N12] [N] [0] [?] [?] [?] [1] 
[001] [C12] [C] [0] [?] [?] [?] [1] 
[001] [C13] [C] [0] [?] [?] [?] [1] 
[001] [C14] [C] [0] [?] [?] [?] [1] 
[001] [C15] [C] [0] [?] [?] [?] [1] 
[001] [C16] [C] [0] [?] [?] [?] [1] 
[BCB] [CBA] [C] [0] [?] [?] [?] [1] 
[BCB] [CGA] [C] [0] [?] [?] [?] [1] 
[BCB] [O1A] [O] [0] [?] [?] [?] [1] 
[BCB] [O2A] [O] [0] [?] [?] [?] [1] 
[BCB] ['N B'] [N] [0] [?] [?] [?] [1] 
[BCB] [C1B] [C] [0] [?] [?] [?] [1] 
[BCB] [C2B] [C] [0] [?] [?] [?] [1] 
[BCB] [C3B] [C] [0] [?] [?] [?] [1] 
[BCB] [C4B] [C] [0] [?] [?] [?] [1] 
[BCB] [CMB] [C] [0] [?] [?] [?] [1]
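
For what it's worth, the same right-to-left idea can probably be folded into one anchored match. This is only a sketch, not a replacement for the script above: parse_line is a name I made up for illustration, and it assumes every record has exactly eight fields, with only the second one allowed to contain spaces. The lazy (\S.*?) keeps growing until the last six tokens line up with the end of the line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): split one record
# into eight fields with a single anchored match. In list context it
# returns the eight captures, or an empty list when the line does not fit.
sub parse_line {
    my ($line) = @_;
    return $line =~ m{
        \A \s*
        (\S+)    \s+      # field 1: first token
        (\S.*?)  \s+      # field 2: lazy, may contain embedded spaces
        (\S+) \s+ (\S+) \s+ (\S+) \s+ (\S+) \s+ (\S+) \s+ (\S+)   # last six tokens
        \s* \z
    }x;
}

for my $line ("BAZ 'N3''  N  0 ? ? ? 1", "BCB 'N B' N  0 ? ? ? 1") {
    my @items = parse_line($line);
    unless (@items) {
        warn "unparsable line: $line\n";
        next;
    }
    print join(' ', map { "[$_]" } @items), "\n";
}
```

For the two sample lines this prints [BAZ] ['N3''] [N] [0] [?] [?] [?] [1] and [BCB] ['N B'] [N] [0] [?] [?] [?] [1]. The trade-off is that a line with the wrong number of fields simply fails to match, so you have to decide what to do with it.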

--
Damon Allen Davison
http://www.allolex.net

In reply to Re: can't use unpack or split?? by allolex
in thread can't use unpack or split?? by seaver