brkstr has asked for the wisdom of the Perl Monks concerning the following question:
|
Re: Removing Delimiters
by revdiablo (Prior) on Aug 14, 2003 at 23:44 UTC | |
The opposite of a join is generally a split. So if you say $output = join ',', @parsDat;, then the opposite would be @parsDat = split ',', $output;. (Of course, any commas in the values of the original @parsDat will get clobbered and become delimiters themselves, though this won't be a problem with the sample data you posted.) In this case, however, I must agree with thor's post. Since your original script already does all the work of slicing and dicing the data into a usable structure, why not take advantage of that and build the new requirements directly into it?
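A minimal sketch of that round trip, for illustration only. The field values in @parsDat below are made up and are not taken from the original script. The second half shows the caveat about embedded commas.

```perl
use strict;
use warnings;

# Hypothetical field values; the real @parsDat comes from the original script.
my @parsDat = ('060', 'SMITH', '19630814');

# join and split are inverses as long as no value contains the delimiter.
my $output   = join ',', @parsDat;
my @restored = split /,/, $output;
print scalar(@restored), " fields restored from '$output'\n";

# A value with an embedded comma comes back as two fields.
my @tricky  = ('060', 'SMITH, J', '19630814');
my @clobber = split /,/, join(',', @tricky);
print scalar(@clobber), " fields after the round trip, not ", scalar(@tricky), "\n";
```

Note that split drops trailing empty fields by default; passing a limit of -1 (split /,/, $output, -1) keeps them.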
|
Re: Removing Delimiters
by esh (Pilgrim) on Aug 14, 2003 at 22:15 UTC | |
Please provide specific examples of the input file contents and the desired output file contents. BTW, you can append multiple files on *nix using cat(1). If you just want to remove all instances of a specific delimiter character from the file (assuming no escapes or quoting), you could do that in place on the file itself. This assumes that your delimiter is a comma (,); replace that with the character you want to remove.
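One way the in-place removal could look as a small script. This is a sketch, not the code from the original post: the sub name strip_delimiter_in_place is invented here, and it keeps a .bak copy of each file, which the original suggestion may not have done.

```perl
use strict;
use warnings;

# Remove every occurrence of $delim from the file at $path, in place.
# A backup copy is kept as "$path.bak". Assumes no quoting or escaping.
sub strip_delimiter_in_place {
    my ($path, $delim) = @_;
    local ($^I, @ARGV) = ('.bak', $path);   # same mechanism as perl -i.bak
    while (my $line = <>) {
        $line =~ s/\Q$delim\E//g;           # \Q...\E treats the delimiter literally
        print $line;                        # written back into $path
    }
    return;
}

# Process every file named on the command line, using a comma as the delimiter.
my @files = @ARGV;
strip_delimiter_in_place($_, ',') for @files;
```

Keeping the backup means a wrong delimiter choice can be undone by restoring the .bak file.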
-- Eric Hammond
|
Re: Removing Delimiters
by Aristotle (Chancellor) on Aug 14, 2003 at 22:09 UTC | |
"I need to append all the files together again and remove the delimiters." With some code posted, that might have been sufficient, but with neither code nor a more detailed explanation, we can't even begin to guess what you're trying to do. Please be more specific.
Makeshifts last the longest.
|
Re: Removing Delimiters
by thor (Priest) on Aug 14, 2003 at 23:37 UTC | |
thor
|
Re: Removing Delimiters
by brkstr (Novice) on Aug 14, 2003 at 22:29 UTC | |
I used this to define where the delimiters were needed:
I used this to place the delimiters:
As far as the input is concerned, I have about 16 different files to append into one. Thanks
edited by ybiC: balanced <code> tags, s/tab/spacespacespace/
by graff (Chancellor) on Aug 15, 2003 at 04:27 UTC | |
Well, that would be fine, so long as two conditions are met (one of which was mentioned in one of the other main replies on this thread):

In other words, reverting to the original format is easy, so long as each of the field values still has exactly the same number of characters as before. If that condition is met, then you don't need anything more than a regex substitution or replacement. Either of the following will do (let's assume that $line contains the comma-delimited, human-readable/spreadsheet-portable form):

Now, if the corrected data happens to end up with wider values than the original data, then your script has to either reject the data with a warning or die with a message saying that the data cannot be converted back because of a too-wide field value. The message should be specific: which input file, which line, which field, what its value is, and how wide it's supposed to be.

If the data has fewer characters than the original, you'd have to figure out whether you can pad these values, and if so, what the proper way to do the padding is (leading zeros? leading spaces? something else?). Or maybe you should just reject these as well.

This is all based on a guess about your task, but depending on what happens to the data in its "parsed" form, this is something you need to be very clear and careful about. A good way to incorporate field-width checks is to have an array that can drive both the initial parsing from the original file(s) and the width checking when "re-joining" the parsed data back into those ugly strings; e.g., based on your code snippet:
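A sketch of one array driving both directions. It is not the code from the original post: the widths (3, 10, 8), the sub names and the choice to pad short values with trailing spaces are all assumptions made for illustration. The real widths would come from the substr offsets in the original script.

```perl
use strict;
use warnings;

# Hypothetical layout: one width per field, in record order.
my @widths = (3, 10, 8);

# Template "A3 A10 A8", built from the same array that drives the checks.
my $template = join ' ', map { "A$_" } @widths;

# Fixed-width record -> comma-delimited line.
# "A" strips trailing spaces, so values come back unpadded.
sub parse_record {
    my ($record) = @_;
    return join ',', unpack($template, $record);
}

# Comma-delimited line -> fixed-width record, with a width check per field.
sub rejoin_record {
    my ($line, $file, $line_no) = @_;
    my @fields = split /,/, $line, -1;      # -1 keeps trailing empty fields
    die sprintf("%s line %d: expected %d fields, got %d\n",
                $file, $line_no, scalar @widths, scalar @fields)
        unless @fields == @widths;
    for my $i (0 .. $#widths) {
        my $len = length $fields[$i];
        die sprintf("%s line %d: field %d ('%s') is %d wide, limit is %d\n",
                    $file, $line_no, $i + 1, $fields[$i], $len, $widths[$i])
            if $len > $widths[$i];
    }
    # Short values are padded with trailing spaces (an assumed padding rule).
    return pack($template, @fields);
}
```

Because pack and unpack share one template, changing a width in @widths updates the parsing, the check and the padding together.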
by esh (Pilgrim) on Aug 14, 2003 at 22:58 UTC | |
It seems to me that you have provided a sample of your input for the second process and a segment of code which was used to generate that from the first process. Please provide a sample of your desired output from the second process. Note: For the sample code you provided, instead of a bunch of substr calls, you could probably use a single call to unpack like:
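For illustration, here is how a run of substr calls compares with a single unpack. The record and the field widths (3, 10 and 8 characters) are hypothetical, since the original snippet is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record: 3 + 10 + 8 characters.
my $record = '060SMITH     19630814';

# One substr call per field ...
my @by_substr = (
    substr($record, 0, 3),
    substr($record, 3, 10),
    substr($record, 13, 8),
);

# ... or a single unpack. "a" keeps the padding; "A" would strip trailing spaces.
my @by_unpack = unpack 'a3 a10 a8', $record;

print join(',', @by_unpack), "\n";
```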
-- Eric Hammond
|
Re: Removing Delimiters
by brkstr (Novice) on Aug 15, 2003 at 15:19 UTC | |
I have a process in mind for completing this project. I may be wrong in my interpretation of it (due to my limited experience in this language), but nonetheless the process should be the same. I have input data like this:
I have 16 different record types (they differ by the first three characters of the data file). Then I use the same code as previously shown to parse the records and put them into files to import into Access, where the data is changed/modified:
And...
The result I get is:
So far this part works fine. My next issues are to get these files into Access (for modifying), append all the files, and then remove the delimiters. Every field in every record has a fixed length, and those lengths should not be changed by the manipulation done in Access. Thanks for the input, I hope this is a better post.
by esh (Pilgrim) on Aug 15, 2003 at 16:51 UTC | |
You have indeed clearly explained the part of the process where you do not have a problem. Unfortunately, the part where you are asking for help is still unclear to me. Are you asking for help to "get these files into Access"? If so, you'll need to provide details on the Access schema so that somebody with Access knowledge can help. How can the data be modified in Access but "not changed by the manipulation in Access"? Is the data coming out of Access to be merged or are you just merging the original output files you already have? Have you tried to append all the files and remove the delimiters using my instructions in a previous post? Here's an even clearer sample:
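A sketch of the append-and-strip step as one script, under the assumption that the delimiter is a comma and never appears inside a field value. The sub name and the script name combine.pl are invented for illustration; this is not the sample from the original post.

```perl
use strict;
use warnings;

# Append every input file to $out_path, removing $delim from each line.
# Assumes no field value legitimately contains the delimiter.
sub append_without_delimiter {
    my ($out_path, $delim, @in_paths) = @_;
    open my $out, '>', $out_path or die "Cannot write $out_path: $!\n";
    for my $in_path (@in_paths) {
        open my $in, '<', $in_path or die "Cannot read $in_path: $!\n";
        while (my $line = <$in>) {
            $line =~ s/\Q$delim\E//g;
            print {$out} $line;
        }
        close $in;
    }
    close $out or die "Cannot close $out_path: $!\n";
    return;
}

# Usage (hypothetical script name): perl combine.pl combined.txt tmp_060 ...
if (@ARGV >= 2) {
    my ($out_path, @in_paths) = @ARGV;
    append_without_delimiter($out_path, ',', @in_paths);
}
```

The input files are read in the order given, so the order of the arguments determines the order of the records in the combined file.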
-- Eric Hammond
by brkstr (Novice) on Aug 15, 2003 at 17:37 UTC | |
I am currently trying the suggestions you gave me; I just haven't put it all together yet. Another question comes to mind, though: when I parse the data and make separate files, they come out with names like "tmp_060". Is there a way within the script to set a name for each? Thank you