RE: What's Wrong With Reference Counting?
by tye (Sage) on Aug 01, 2000 at 01:30 UTC
And I like not noticing. Even if my code ran
a little faster overall with mark-and-sweep GC, I'd rather
have the consistency of performance provided by RC. Now,
if my code ran 10 times faster (or even 3 times faster)
with GC, I'd put up with moderate burstiness. But
Lisp programs spend 25% to 40% of their time doing GC.
Are you really claiming that RC overhead is anywhere near that
high?? Or is the nature of Perl that GC in it will be
tons more efficient than it is in Lisp?
Are you implying that it will be easier to implement a
correct version of GC than of RC? The research says that
RC is easier to implement. Sure, we run into bugs in Perl's
RC, especially when we extend a large, old code base in ways
not originally envisioned. You propose that we won't run
into any bugs in GC??
A question: Does mark-and-sweep even preserve destruction
order? It doesn't sound like it, but I've only read high-level
descriptions.
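For the record, here's the sort of guarantee I'm used to getting from reference counting -- a minimal sketch in current Perl, with a made-up Tracker class just to make the destruction visible:

    use strict;

    package Tracker;                  # toy class, invented for illustration
    sub new     { my ($class, $name) = @_; bless { name => $name }, $class }
    sub DESTROY { my $self = shift; print "DESTROY $self->{name}\n" }

    package main;

    {
        my $obj = Tracker->new("inner");
        print "before end of block\n";
    }                                 # refcount drops to zero right here...
    print "after end of block\n";     # ...so "DESTROY inner" prints before this line

Under reference counting that output order is guaranteed; under a tracing collector the DESTROY could fire at any later point, or be put off until program exit.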
I'm only speculating here, but: Yes, I believe that Perl
can make more efficient use of GC than Lisp does. You can't
swing a dead cat in a Lisp program without allocating and
freeing scads of conses. Perl, being written in C or something
equally capable of low-level manipulation, can avoid hot
spots in GC as profiling reveals them.
Based on lurking around the gcc mailing lists,
I think the reason the conservative GC can *be* conservative
is that it's given a relatively small number of "root
pointers" that are the only valid sources for reference
chains of GC'able objects. If you miss a root pointer,
you get memory corruption. But I'd rather do that than
count references.
But as for the GC library itself: If we use the Boehm GC library,
which is somewhere between version 5 and version 6... NO,
I don't expect that the GC mechanism will have any bugs,
at least none worth speaking of. It's been put through
the wringer too many times.
No, destruction order is not maintained. But we've already
figured out that we want to separate end-of-scope cleanup
per se from object destruction. I wouldn't be
surprised to see Perl go the way of Java and not even have
any actual destructors. (I hope I'm not misstating
how Java works....)
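To illustrate the separation I mean, here's a sketch in today's Perl 5 terms, with a made-up Handle class; the expensive resource is released explicitly at end of scope, and actual destruction can happen whenever the memory manager gets around to it:

    use strict;

    package Handle;                    # hypothetical class, for illustration only
    sub new     { my $class = shift; bless { open => 1 }, $class }
    sub close   { my $self = shift; $self->{open} = 0; print "resource released\n" }
    sub DESTROY { print "memory reclaimed\n" }

    package main;

    {
        my $h = Handle->new;
        # ... use $h ...
        $h->close;       # end-of-scope cleanup: deterministic, because we ask for it
    }                    # under RC, DESTROY also fires here; under a tracing
                         # collector it could fire much later -- which is fine,
                         # since the resource was already released explicitly

Once cleanup is explicit (or handled by some end-of-scope mechanism), the exact timing of the destructor proper stops mattering much.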
Actually, the gcc example is a good one. They've converted
an existing program from non-GC to GC and they're happy
with the results.... How much better off will we be if we
design Perl 6 to use GC from day one?
-- Chip Salzenberg, Free-Floating Agent of Chaos
As for consistency of runtime ... you've already lost it.
I use a time-slicing
system for programming (Linux). Most programmers do, I
suspect, now that Windows and its cousins are also MT
(if badly). And then there's virtual memory....
You can have MT and VM, or consistency. GC doesn't even
come into it most of the time.
-- Chip Salzenberg, Free-Floating Agent of Chaos
Your idea of mark-and-sweep garbage collection is about twenty years out of date.
Modern garbage collectors are not bursty. They run incrementally
and the time taken by each pass can be tuned.
Taking a trip into the memory basement, I came up with the Dr. Dobb's Journal for April 2000, where Joshua W. Burton discusses "various" GC algorithms and implementations. That article also references a DDJ Dec. 1997 article by Mike Spertus.
Joshua is/was a software engineer at Geodesic Systems (the makers of Great Circle AFAIK).
The focus of the article is on incremental garbage collectors with low latency, since latency is the biggest problem GC faces nowadays. In terms of raw throughput, an atomic collector (like "mark and sweep", where one pass marks everything still reachable from the root set and a second pass releases everything left unmarked for recycling) will always be the fastest; but on the other side you have the latency problem: a full "mark and sweep" pass stops the program for however long marking and sweeping take.
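As a toy model of the two passes, in Perl (the object graph and names here are made up for illustration):

    use strict;

    # Toy heap: each "object" is a hash holding references (by id) to other objects.
    my %heap;
    $heap{$_} = { refs => [] } for qw(a b c d e);
    push @{ $heap{a}{refs} }, 'b';           # a -> b
    push @{ $heap{b}{refs} }, 'c';           # b -> c
                                             # d and e are unreachable garbage
    my @roots = ('a');

    # Mark pass: flag everything reachable from the roots.
    my @todo = @roots;
    while (my $id = pop @todo) {
        next if $heap{$id}{marked}++;
        push @todo, @{ $heap{$id}{refs} };
    }

    # Sweep pass: release everything left unmarked.
    for my $id (keys %heap) {
        delete $heap{$id} unless $heap{$id}{marked};
    }
    print join(' ', sort keys %heap), "\n";  # prints: a b c

Both passes have to run to completion before any memory comes back, and that is where the latency goes.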
Reference counting looks like it should give low latency, but even there a single release can cascade: dropping the last reference to the head of a large structure frees everything hanging off it in one go, which pushes the latency right back up.
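You can see such a cascade in today's (reference-counted) Perl with nothing more than a linked list; the Node class is invented just to make the DESTROY calls visible:

    use strict;

    package Node;                               # toy linked-list node
    sub new     { my ($class, $n, $next) = @_; bless { n => $n, next => $next }, $class }
    sub DESTROY { my $self = shift; print "freeing node $self->{n}\n" }

    package main;

    my $head;
    $head = Node->new($_, $head) for 1 .. 5;    # build a five-node chain

    print "dropping the head...\n";
    undef $head;                                # one statement, five DESTROYs:
    print "done\n";                             # the whole chain is freed in a burst

Make the list a few million nodes long and that single undef becomes a very noticeable pause.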
The article then moves on to describe a coloring collector, which at any given moment colors memory into three categories:
- black - live and fully scanned
- grey - known to be live but not yet fully scanned; may still contain pointers to white objects
- white - objects whose liveness is in doubt
A collection begins with the root set grey and everything else white, and it is finished when there is no grey left; everything still white at that point is garbage and can be released whenever the collector likes. The collector maintains a list of the pages it has already scanned and tracks writes to them (a write to an already-scanned page may have stored a new pointer there, so that page has to be scanned again). Because of that bookkeeping, the algorithm can also be interrupted and resumed without starting over from the beginning, provided there is a mechanism (such as CPU-level page protection) to notify the collector when a scanned page is written to.
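A rough Perl sketch of the coloring idea (worklist only, no write barrier; the graph and names are invented for illustration):

    use strict;

    # Toy object graph: id => list of ids it points to.
    my %refs = (
        root => ['a'],
        a    => ['b', 'c'],
        b    => [],
        c    => ['a'],         # a cycle, which plain reference counting would leak
        d    => ['e'],         # unreachable
        e    => [],
    );

    my %color;
    $color{$_} = 'white' for keys %refs;   # everything starts out white
    my @grey = ('root');                   # the root set starts out grey
    $color{root} = 'grey';

    # One "increment" of collection: scan a few grey objects, then return,
    # so the program can run in between (a real collector needs the write
    # barrier described above to catch pointers stored into scanned pages).
    sub collect_a_bit {
        my $budget = shift;
        while ($budget-- and @grey) {
            my $id = shift @grey;
            for my $child (@{ $refs{$id} }) {
                next unless $color{$child} eq 'white';
                $color{$child} = 'grey';
                push @grey, $child;
            }
            $color{$id} = 'black';         # fully scanned
        }
        return scalar @grey;               # 0 means the collection is complete
    }

    collect_a_bit(2) while @grey;          # run in small slices until no grey is left
    print "garbage: ",
          join(' ', sort grep { $color{$_} eq 'white' } keys %color), "\n";
    # prints: garbage: d e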
I'll admit that I have no real experience with garbage collection (outside Perl) and little technical knowledge of it. I'm just going by what was referenced in the request itself. One of those references was a 1998 survey that, under "Mark and Sweep", says:
"As with all tracing schemes, once GC is started, it has to go through all the live objects non-stop in order to complete the marking phase. This could potentially slow down other processes significantly, not to mention having to stop the requestor process completely."
You must be talking about something other than "mark and sweep" (which is what the suggestion proposed), or your definition of mark and sweep doesn't match the survey's.
If this is using concepts that are 20 years out of date, then a link to more appropriate reference material would be appreciated.
And it's ridiculous to talk about what "lisp programs" do, since there have been hundreds
of implementations of Lisp over the last forty-five years,
and they all have different garbage collectors. Saying that "lisp programs
do this" or "lisp programs do that" is almost as meaningless as saying "computers do this" or "computers do that".
You might want to have a look at
ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps
which is a fairly recent survey of garbage collection techniques,
or at some of the other resources at
http://www.cs.ukc.ac.uk/people/staff/rej/gc.html
Assume we are using the Boehm GC library in some version of Perl, and we have an object, $parent, that contains a reference to another object, $child. For simplicity, assume that eventually all other references to both objects are removed. When garbage collection hits, is $parent guaranteed to be destroyed before $child?
Based on the suggestion and the web page that it references, it sounded like this guarantee would not be retained. I hope I've jumped to the wrong conclusion here.
This, along with the potentially long delay before destructors fire, would make destructors nearly useless, which takes away one of the biggest advantages of OO, IMHO.
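For concreteness, here's the kind of code I'm worried about (class names invented for the example); under today's reference counting the parent's destructor can rely on the child still being alive:

    use strict;

    package Child;
    sub new     { bless { log => [] }, shift }
    sub DESTROY { print "child destroyed\n" }

    package Parent;
    sub new     { my $class = shift; bless { child => Child->new }, $class }
    sub DESTROY {
        my $self = shift;
        # Under reference counting, $self->{child} is still a live object here,
        # because the parent's DESTROY runs before its fields are released.
        push @{ $self->{child}{log} }, 'parent says goodbye';
        print "parent destroyed\n";
    }

    package main;

    {
        my $parent = Parent->new;
    }   # prints "parent destroyed" then "child destroyed";
        # with mark-and-sweep nothing seems to guarantee that order,
        # which is exactly what I'm asking about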