I'm not sure this is responsive, but I've ignored every field other than the one you gave examples for, on the surmise that it's your problem area:

#!/usr/bin/perl -w
use 5.018;
use strict;
#1149716 

=pod

I would like to extract a piece of data from one field that has multiple fields in it. The original field is a long description that usually contains a #F123456, #123456, #123-F123456, #123-123456, or #12AB-123456 in it. This data floats around from left to right and there should be whitespace before the #. Also, the end of the data is either whitespace, or the end of the field.

=cut


# Sample descriptions; some contain embedded newlines and runs of padding spaces.
my @data = (
    "TRAY HINGED PLSTC 20 CAV #F32473",
    "BOX HSC,35-3/4X17-1/4 X 50-1/2  SIMULATOR TALL BOX",
    "PAD, FOAM, 24 X 24 X 1/4                #16193                                   
 112 SHEETS PER ROLL, ORDER IN FULL ROLLS",
    "PKG LIST,ASST ARM,RAD,300
 #F37784",
    "PAD, TOP CAP RE17-30048          #F30121                                      
 CORRUGATED ASSEMBLY, 22-7/8 X 21-1/8 X 4-3/4",
    "foo bar #379460 best F11",
    "F1234 SIMULATION",
);

for my $data (@data) {
    # say "\t|$data|\n\n";
    # Remove every newline, trailing or embedded; the space that follows
    # each embedded newline in the sample data keeps the fields apart.
    $data =~ s/\n//g;

    if ( $data =~ /(^.* #[A-Z]*\d+.*$)/m ) {
        say "\n\$data matches regex\n";
        $data =~ s/ +/ /g;        # clean up excess spaces
        say "$data \n";
    } else {
        say "\n\t The data, $data, does NOT MATCH\n";
    }
}

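The regular expression may look obscure, so here's an explanation. (It was generated without the /m modifier used in the script; since the newlines are stripped before the match, /m makes no difference here.)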

C:perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/(^.* #[A-Z]*\d+.*$)/)->explain();"
The regular expression:

(?-imsx:(^.* #[A-Z]*\d+.*$))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:   
   # NB: I did NOT need the parens as there's no use of the capture
   # My bad, but harmless except for shoving bits & bytes around
   # when they didn't need to be disturbed.
----------------------------------------------------------------------
    ^                        the beginning of the string
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
     #                       ' #'
----------------------------------------------------------------------
    [A-Z]*                   any character of: 'A' to 'Z' (0 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

And the output of the script is thus:

C:1149716.pl

$data matches regex

TRAY HINGED PLSTC 20 CAV #F32473


         The data, BOX HSC,35-3/4X17-1/4 X 50-1/2  SIMULATOR TALL BOX, does NOT MATCH


$data matches regex

PAD, FOAM, 24 X 24 X 1/4 #16193 112 SHEETS PER ROLL, ORDER IN FULL ROLLS


$data matches regex

PKG LIST,ASST ARM,RAD,300 #F37784


$data matches regex

PAD, TOP CAP RE17-30048 #F30121 CORRUGATED ASSEMBLY, 22-7/8 X 21-1/8 X 4-3/4


$data matches regex

foo bar #379460 best F11


         The data, F1234 SIMULATION, does NOT MATCH


HTH. You'll sometimes get better answers if you trim your code to the few (<20) lines that demonstrate only the problem you want addressed. I see that you also want advice on the rest of the code you supplied, but I don't have time to create the jumbled CSV that would be needed to assess its efficiency and/or clarity.

Spirit of the Monastery

In reply to Re: Extract data from CSV field. by ww
in thread Extract data from CSV field. by JobC
