Now, if you're like most people, most of your day is spent wandering around aimlessly and asking yourself, "Man, I wish I had a Perl script that would take like a huuuuge X12-formatted file, and split it up into input files, each one no greater than 1500KB, or 2500 claims, whichever comes first. Wow, if I had that... Man, I'd even be willing to edit the hardcoded output path in that script to suit where I wanted those chunks to go!" Well, look no further:
#!/usr/bin/perl
##
## X12Splitter written 043013 by Bowie J. Poag
##
## X12Splitter takes an X12-formatted .dat file, and splits it
## up into inputFiles no greater than 1500KB or 2500 claims,
## whichever comes first.
##
## Usage:
##
## x12splitter <filename>
##
## Example:
##
## x12splitter foo.dat
##

$|=1;
$numRecords=0;
$numBytes=0;
$fileName=$ARGV[0];

errorCheckAndPrep();
dumpChunks();

sub errorCheckAndPrep
{
    print "\n\nX12Splitter: Checking $fileName for any structural problems..";

    @inputFile=`cat $fileName`;
    @temp=`ls -l $fileName`;
    @fileDetails=split(" ",$temp[0]);
    $fileSize=$fileDetails[4]+0;
    $numElements=scalar(@inputFile);
    $numTotalBytes=length($inputFile[0]);

    ## A valid input file is one single line.
    if ($numElements > 1)
    {
        print "X12Splitter: Input file is malformed. Exiting..\n";
        exit();
    }
    else
    {
        print "..";
    }

    if ($fileSize!=$numTotalBytes)
    {
        print "X12Splitter: Payload size and stated file size mismatch. Exiting.\n";
        exit();
    }
    else
    {
        print "..";
    }

    if ($inputFile[0]=~/^ISA/)
    {
        print "Done.\n";
    }

    print "X12Splitter: Check complete. Parsing file..\n";

    ## Everything before the first ~ST is the envelope open;
    ## everything from the last ~GE onward is the envelope close.
    @payload=split("~ST",$inputFile[0]);
    $envelopeOpen=$payload[0];
    $envelopeClose=$payload[-1];
    $envelopeClose=~/~GE/;
    $envelopeClose="~GE$'";
    $payload[-1]=$`;

    if ($envelopeOpen=~/^ISA/ && $envelopeClose=~/~GE/)
    {
        print "X12Splitter: Envelope open and close chunks found successfully.\n";
    }
    else
    {
        print "X12Splitter: Unexpected problem with envelope open. Opening ISA header or ~GE close not found.\n";
        exit();
    }

    shift (@payload); ## Don't bother processing the envelope..

    foreach $item (@payload)
    {
        $recordCount++;
        $openRecordText=substr($item,0,15);
        $closeRecordText=substr($item,length($item)-40,40);
        printf ("\rX12Splitter: Record %6d: [%15s.....%-40s] \r", $recordCount, $openRecordText, $closeRecordText);
    }

    print "\nX12Splitter: $recordCount total records found. Splitting..\n";
}

sub dumpChunks
{
    $chunkPayload="";
    $chunkNum=0;
    $numBytesInThisChunk=0;
    $numRecordsInThisChunk=0;

    foreach $item (@payload)
    {
        $numBytesInThisChunk=length($chunkPayload);
        $numRecordsInThisChunk++;
        $chunkPayload.="~ST$item";

        ## Note: these cutoffs (2000 records / 1000000 bytes) are lower than
        ## the 1500KB / 2500-claim ceiling stated in the header.
        if ($numRecordsInThisChunk>2000 || $numBytesInThisChunk>1000000)
        {
            $chunkPayload="$envelopeOpen"."$chunkPayload"."$envelopeClose";
            open ($fh,'>',"/demo/fin/healthport/$fileName.part.$chunkNum");
            print $fh "$chunkPayload";
            close ($fh);
            print "X12Splitter: $numRecordsInThisChunk records saved to /demo/fin/healthport/$fileName.part.$chunkNum\n";
            $numBytesInThisChunk=0;
            $numRecordsInThisChunk=0;
            $chunkNum++;
            $chunkPayload="";
        }
    }

    ## Clean up the last of it..
    $chunkPayload="$envelopeOpen"."$chunkPayload"."$envelopeClose";
    open ($fh,'>',"/demo/fin/healthport/$fileName.part.$chunkNum");
    print $fh "$chunkPayload";
    close ($fh);
    print "X12Splitter: $numRecordsInThisChunk records saved to /demo/fin/healthport/$fileName.part.$chunkNum\n";
}

print "\n\n\n";
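One thing worth knowing about the loop in dumpChunks: it tests the limits after a record has already been appended, so a chunk can run a little past whatever cutoff you pick. If you need a hard ceiling, here is a minimal sketch of the same grouping step with the test done before the append. The name chunk_records and its parameters are made up for illustration and are not part of X12Splitter, and the byte count here ignores the envelope text that gets wrapped around each chunk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of X12Splitter): group records into chunks,
# testing the limits BEFORE a record is added so no chunk exceeds them.
sub chunk_records {
    my ($records, $max_records, $max_bytes) = @_;
    my @chunks;
    my @current;
    my $current_bytes = 0;

    foreach my $record (@$records) {
        my $record_bytes = length($record);

        # Start a new chunk if adding this record would break either limit.
        # A single oversized record still gets a chunk of its own.
        if (@current
            && (@current + 1 > $max_records
                || $current_bytes + $record_bytes > $max_bytes)) {
            push @chunks, [@current];
            @current       = ();
            $current_bytes = 0;
        }

        push @current, $record;
        $current_bytes += $record_bytes;
    }

    # Flush the final partial chunk, but never emit an empty one.
    push @chunks, [@current] if @current;
    return @chunks;
}

# Toy data: five 10-character "records", at most 2 records or 25 bytes per chunk.
my @records = map { sprintf("~ST*%06d", $_) } 1 .. 5;
my @chunks  = chunk_records(\@records, 2, 25);

foreach my $i (0 .. $#chunks) {
    my $bytes = length(join '', @{ $chunks[$i] });
    printf "chunk %d: %d records, %d bytes\n", $i, scalar @{ $chunks[$i] }, $bytes;
}
```

Each arrayref that comes back could then be joined, wrapped in $envelopeOpen and $envelopeClose, and written out the same way the original script does. Skipping the empty final chunk also avoids writing an envelope-only .part file when the record count divides evenly.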

In reply to X12Splitter: A Tool For Splitting X12-Formatted .dat Files by bpoag
