Re: Delete line if REGEX == true

Re^2: Delete line if REGEX == true by Tanktalus (Canon) on Aug 07, 2005 at 03:21 UTC
Since Unix is much more than an environment to run perl under, I'll offer another solution, which may make more sense if all you're using perl for is deleting lines based on a regex. There's no point in dragging in a good general-purpose tool over a good specific-purpose tool when that specific purpose is exactly what you need. ;-) `sed -e '/,,,,,/d' < source.txt > destination.txt` There are a lot of good tools that come with Unix. They can be worth the time to investigate: even if you don't use them, they can be invaluable guides in designing your own code to do something similar in perl.
Re^3: Delete line if REGEX == true by davidrw (Prior) on Aug 07, 2005 at 14:25 UTC
Good point. And of course besides `sed` there's `grep`, ~~which I think is even better here (read: teaching environment) because of the direct parallel to perl.~~ `grep -v ,,,,, source.txt > destination.txt`
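For anyone who wants to check that the `sed` command from the parent post and the `grep` command here really do produce the same output, a rough sketch follows. The sample data and the temporary file are made up for illustration and are not from the original question; the `awk` line is an extra phrasing of the same filter, added only for comparison.

```shell
#!/bin/sh
# Hypothetical sample input: file contents are invented for this demo.
tmpdir=$(mktemp -d)
src="$tmpdir/source.txt"
printf '%s\n' 'a,b,c' 'x,,,,,y' 'keep me' ',,,,,' > "$src"

# sed phrasing: "delete every line that contains five commas in a row"
sed_out=$(sed -e '/,,,,,/d' < "$src")

# grep phrasing: "select every line that does NOT contain five commas in a row"
grep_out=$(grep -v ',,,,,' "$src")

# awk phrasing: "print each line unless it matches the pattern"
awk_out=$(awk '!/,,,,,/' "$src")

# Show the surviving lines and report whether the three approaches agree.
printf '%s\n' "$sed_out"
if [ "$sed_out" = "$grep_out" ] && [ "$sed_out" = "$awk_out" ]; then
    echo "all three agree"
fi

rm -rf "$tmpdir"
```

With this sample input, the lines `a,b,c` and `keep me` survive in all three cases. Only the way each command is phrased differs.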
Re^4: Delete line if REGEX == true (semantics) by Tanktalus (Canon) on Aug 07, 2005 at 15:28 UTC
"Better" is a matter of perspective, then. You can do what the OP asked for in perl, sed, grep, or awk, and, of course, in perl you can do it the sed way, the grep way, or any number of other ways. However, only the sed way (and, in perl, the emulation of sed) does it the way the OP asked for it.

The sed command says, "Delete all lines that contain five commas in a row." The grep command says, "Find all lines that don't have five commas in a row." The end result is the same, but the meaning is quite different. Since the OP is thinking in terms of editing a file line by line, anything else they want to do will probably also be thought of in a stream-editing manner. Doing what they wanted in the semantic style in which they asked for it is therefore likely to extend to any future (or simply not-yet-stated) requirements they may have.

Of course, the fact that we're looking for lines with five commas may point to a better solution using DBD::CSV and some convoluted SQL statement. After all, if the file is semantically a CSV, then you're probably best off treating it as a CSV. This is, again, a matter of extensibility: the future of a CSV file will usually include more queries against its data (what else is SQL designed for but querying data?), and it also has a reasonable chance of being loaded into an RDBMS later, where the only real access method available is SQL. That said, this is much more work up front for possibly zero gain in the future. It's up to the OP to evaluate the advice, since the OP knows more about their context than we do.

I'm a firm believer in solutions being in the same space as the problem. Thinking outside the box is fine for evaluating what problem space you're really dealing with, but phrasing the eventual solution in the same terms as the problem can really ease maintenance and adding new features.