Cloudster has asked for the wisdom of the Perl Monks concerning the following question:
I wrote my first Perl program a couple of weeks ago and am quite happy with it (now I just have to figure out how to link it in to a web page and my PHP chat program, but I’ll figure that out eventually.) My job is database administration and development. Yesterday I sat down to do something that I do several times a week, reformat a script generated by SQL Server 2000’s Enterprise Manager when I realized that this was another opportunity to learn more of Perl and save quite a bit of time in the future, so I set forth and have fallen flat on my face. (I wrote most of this post this morning, I’ve since made most of it work)
Here’s what a SQL Server script looks like, this would be saved in a file:
CREATE TABLE [dbo].[CustomerBalances] ( [arbh_acct] [int] NOT NULL , [arbh_prop_code] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_ +AS NOT NULL , [arbh_unpd_bal] [decimal](10, 2) NOT NULL , [rCount] [tinyint] NOT NULL , [arbh_ar_cat] [tinyint] NULL ) ON [PRIMARY] GO
What I want to do is remove the square brackets, remove the COLLATE blah blah blah, insert three tabs in front of the data type, and insert a couple of tabs in front of the null option, etcetera. The tabs wouldn’t align perfectly in the end because of variance in the field name length, but that’s ok, TextPad is excellent for making that easy.
I’ve got most of that working. I have two final problems, both of which are beyond my skill. First, I can’t get the NULL/NOT NULL to parse correctly. I end up with NOT\t\tNULL.
Second, and most critical, is file size. If I’m dealing with a dinky little file like the above (300 bytes), it runs just fine. But if I feed it a 9k script file with 200 lines of code, I get a really weird result. The output file displays like it has an additional space between every character, there are lots of non-ASCII values in the file, and nothing has matched and been reformatted.
I have no idea what’s going on with my program. My intent was that a line would be read, tested to see if it contains a space and a data type name, and if it did, replace that space with three tabs. My result file is very badly mangled, and no longer ASCII. Obviously I’m missing something.
This is what the first part of the file looks like when it’s done processing:
CREATE TABLE 搀戀漀 .䌀甀猀琀漀洀攀爀䈀愀氀愀渀挀攀猀 ( ऀ arbh_acct int 一伀吀 一唀䰀䰀 Ⰰഀഀ 愀爀戀栀开瀀爀漀瀀开挀漀搀攀 瘀愀爀挀栀愀爀 (30) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL , ऀ arbh_unpd_bal decimal⠀ Ⰰ ㈀⤀ 一伀吀 一唀䰀䰀 Ⰰഀഀ 爀䌀漀甀渀琀 琀椀渀礀椀渀琀 NOT NULL , ऀ arbh_ar_cat tinyint 一伀吀 一唀䰀䰀 ഀഀ ) ON 倀刀䤀䴀䄀刀夀 䜀伀ഀഀ 䌀刀䔀䄀吀䔀 吀䄀䈀䰀䔀 dbo⸀ EmailText ⠀ഀഀ 䴀攀猀猀愀最攀䈀漀搀礀 瘀愀爀挀栀愀爀 (255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL , ऀ InsertSequence bigint 䤀䐀䔀一吀䤀吀夀 ⠀Ⰰ ⤀ 一伀吀 一唀䰀䰀 ഀഀ ) ON 倀刀䤀䴀䄀刀夀 䜀伀ഀഀ 䌀刀䔀䄀吀䔀 吀䄀䈀䰀䔀 dbo⸀ ErrorCodes ⠀ഀഀ 䔀爀爀漀爀䌀漀搀攀 猀洀愀氀氀椀渀琀 NOT NULL , ऀ ErrorDesc varchar ⠀㐀 ⤀ 䌀伀䰀䰀䄀吀䔀 匀儀䰀开䰀愀琀椀渀开䜀攀渀攀爀愀氀开䌀倀开䌀䤀开䄀匀 一伀吀 一唀䰀䰀 Ⰰഀഀ 䔀爀爀漀爀䰀漀挀愀琀椀漀渀 挀栀愀爀 (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ⤀ 伀一 PRIMARYഀഀ GO ഀഀ
I’ve looked at the source file, and it’s definitely ASCII text, not Unicode. Here’s my program, I was trying to have all of the data types in one array and work it from that angle, but that’s beyond my skill right now:
Any suggestions would be most welcome. Yes, it's a rather brute-force approach, but I'm new to Perl and it does what I want it to (mostly).print "\nThis program reformats scripts produced by SQL Server 2000 En +terprise Manager\n"; print "to remove brackets and tab out data types and null settings.\n\ +n"; print "You provide a file name, this program reads it and produces a n +ew file\n"; print "with a .out extension.\n\n"; print "File name to process? (<enter> to end program.) "; chomp($sqlfile = <stdin>); $outfile = $sqlfile . ".out"; $datatypes = " binary/ bigint/ bit/ char/ datetime/ " . " decimal/ float/ image/ int/ money/ " . " nchar/ ntext/ nvarchar/ numeric/ real/ " . " smalldatetime/ smallint/ smallmoney/ sql_variant/ " . " sysname/ text/ timestamp/ tinyint/ varbinary/ varchar/ " . " uniqueidentifier"; open(IN, $sqlfile) || die "cannot open $sqlfile for input: $!"; open(OUT, ">$outfile") || die "cannot open $outfile for output: $!"; while (<IN>) { chomp; #delete or comment out the two following lines #if you need brackets around your object names. s/\[//g; s/\]//g; s/ \(/\(/g; s/ \,/\,/g; s/COLLATE SQL_Latin1_General_CP1_CI_AS//g; s/ON PRIMARY//g; s/\sbinary/\t\t\tbinary/g; s/\sbigint/\t\t\tbigint/g; s/\sbit/\t\t\tbit/g; s/\schar/\t\t\tchar/g; s/\sdatetime/\t\t\tdatetime/g; s/\sdecimal/\t\t\tdecimal/g; s/\sfloat/\t\t\tfloat/g; s/\simage/\t\t\timage/g; s/\sint/\t\t\tint/g; s/\smoney/\t\t\tmoney/g; s/\snchar/\t\t\tnchar/g; s/\sntext/\t\t\tntext/g; s/\snvarchar/\t\t\tnvarchar/g; s/\snumeric/\t\t\tnumeric/g; s/\sreal/\t\t\treal/g; s/\ssmalldatetime/\t\t\tsmalldatetime/g; s/\ssmallint/\t\t\tsmallint/g; s/\ssmallmoney/\t\t\tsmallmoney/g; s/\ssql_variant/\t\t\tsql_variant/g; s/\ssysname/\t\t\tsysname/g; s/\stext/\t\t\ttext/g; s/\stimestamp/\t\t\ttimestamp/g; s/\stinyint/\t\t\ttinyint/g; s/\svarbinary/\t\t\tvarbinary/g; s/\svarchar/\t\t\tvarchar/g; s/\suniqueidentifier/\t\t\tuniqueidentifier/g; s/NOT NULL/\t\tNOT NULL/s; s/NULL/\t\tNULL/s; print "$_\n"; print OUT "$_\n"; } close(OUT) || die "problem closing new $outfile: $!"; close(IN) || die "problem closing original $sqlfile: $!";
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Problem with larger files (and s/)
by jds17 (Pilgrim) on Jun 24, 2008 at 22:44 UTC | |
by Cloudster (Novice) on Jun 25, 2008 at 14:57 UTC | |
|
Re: Problem with larger files (and s/)
by moritz (Cardinal) on Jun 24, 2008 at 22:12 UTC | |
by Cloudster (Novice) on Jun 25, 2008 at 14:51 UTC | |
|
Re: Problem with larger files (and s/)
by jethro (Monsignor) on Jun 24, 2008 at 22:40 UTC | |
by Cloudster (Novice) on Jun 25, 2008 at 14:54 UTC | |
by jethro (Monsignor) on Jun 25, 2008 at 16:39 UTC | |
by Cloudster (Novice) on Jul 02, 2008 at 16:26 UTC |