27 May, 2008

Collecting garbage

Ours is one of the fastest-advancing fields in computer science. Still, some misconceptions are hard to break, and one of them concerns garbage collection. Most game programmers hate it. Yet most game programmers work in frameworks that do at least reference counting (usually a very basic kind: not thread safe, and unable to handle cyclic references correctly), and many times, when two reference-counted systems have to be bridged, we end up designing our own bad, slow implementations of garbage collection.

The best examples of garbage collectors are the ones built into the Java and C# virtual machines; those systems manage to be faster than explicit allocators on server workloads. They are well tested and they perform excellently. Why do most people still think that garbage collection is slow?

Well, I think there are various reasons behind it. First of all, of course, early GC implementations, conservative and stop-the-world, were not as refined as current ones. Then there is probably a misconception that really relates to desktop Java programs. Java used to suck for such tasks, mostly due to slow graphics and slow IO, which were never related to the GC itself. Last, but not least (at all), a GC on a desktop system has to live alongside other programs and the OS itself, which do not understand GC concepts. That usually means the GC will allocate a huge heap to manage by itself, thus eating up (or seeming to eat) a lot of memory even for the most trivial hello-world program.

Moreover, for typical game systems we don't rely on heap allocators either, at least for performance-critical code. In games we usually want close to zero allocations/deallocations, we want complete control over the memory budget of each subsystem, and most of the time we just craft a bunch of ad-hoc class pools wherever memory management is a performance problem. Class pools can be built in a GC system as well, so it shouldn't really matter whether we use GC or manual allocation, at all.
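
To make "class pool" concrete, here is a minimal sketch of the kind of ad-hoc pool I mean (the Particle type and the pool size are made up for the example): all memory is reserved up front and "allocating" or "freeing" an object is just a couple of pointer operations on a free list, never a trip to the general-purpose heap.

```cpp
#include <cstddef>
#include <new>

// Hypothetical payload type, just for the example.
struct Particle { float pos[3]; float vel[3]; float life; };

// Fixed-capacity pool: memory is reserved up front, alloc/free are O(1).
template <typename T, std::size_t N>
class Pool {
    union Slot { T object; Slot* next; };  // free slots store the free-list link
    Slot  storage[N];
    Slot* freeList;
public:
    Pool() : freeList(nullptr) {
        for (std::size_t i = 0; i < N; ++i) {   // chain every slot into the free list
            storage[i].next = freeList;
            freeList = &storage[i];
        }
    }
    T* alloc() {                                // pop a slot, or nullptr if the budget is spent
        if (!freeList) return nullptr;
        Slot* s = freeList;
        freeList = s->next;
        return new (&s->object) T();            // placement-new into the slot
    }
    void free(T* p) {                           // push the slot back onto the free list
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        s->next = freeList;
        freeList = s;
    }
};

Pool<Particle, 4096> gParticles;                // the per-subsystem budget, fixed up front
```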

But first of all, why should we strive for a GC world? Why should we like it? Where are the advantages?

Many people think that a key advantage is that GC ensures you don't leak memory. What a small feature! In any decent C++ codebase you will have your own debug memory allocator that makes tracking and squashing leak bugs a trivial task. Also, while you can't leak memory, you can still leak other resources, and GC makes RAII more difficult to implement, as you usually don't know when your destructors are going to be called!
GC does not help with other memory-related bugs, like memory stomping; preventing that is a language feature (disallowing out-of-bounds access and pointer arithmetic). It's true that such a feature is usually available in languages that use GC (because it makes the design of the collector much easier; otherwise only conservative collectors could be engineered), but it's still not strictly a GC feature. GC does protect you from double deletes and from accessing deleted memory, but those bugs are not too hard to detect with a debugging allocator either.
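
For instance, here is a minimal sketch of what I mean by a debug allocator (the names are made up): wrap allocation and deallocation, record every live block, and you can report leaks at shutdown and catch double deletes as they happen.

```cpp
#include <cstdio>
#include <cstdlib>
#include <map>

// Tracks every live allocation. A real engine version would also record the
// file/line of the call site, poison freed memory, add guard bands, etc.
class DebugHeap {
    std::map<void*, std::size_t> live;
public:
    void* alloc(std::size_t size) {
        void* p = std::malloc(size);
        if (p) live[p] = size;
        return p;
    }
    void free(void* p) {
        if (!p) return;
        auto it = live.find(p);
        if (it == live.end()) {                 // double or invalid delete caught here
            std::fprintf(stderr, "double/invalid delete of %p\n", p);
            return;
        }
        live.erase(it);
        std::free(p);
    }
    void reportLeaks() const {                  // call at shutdown
        for (const auto& block : live)
            std::fprintf(stderr, "leak: %p (%zu bytes)\n", block.first, block.second);
    }
};
```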

The real benefit is not that. It's composability. Simply put, manual allocation does not compose (in a way somewhat similar to explicit locking). If you allocate an object and then pass it to another subsystem, either you completely transfer its ownership (and that should be clearly documented, as it's not the standard policy) and forget about it (don't store a pointer to it), or you have to manage the subsystem as well, making sure it will not use that object after you destroy it.
That's why we prefer to use reference counting. Garbage collection is just a usually more powerful, and faster, implementation of the same idea: decoupling memory management from the rest of the code to make composition easier.
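
A tiny sketch of the kind of intrusive reference counting I mean (Mesh and Renderer are hypothetical names; no thread safety, no cycle handling, just the idea): lifetime follows the count, so neither side has to negotiate ownership with the other.

```cpp
// Minimal intrusive reference counting: the object dies when the last holder
// releases it, so ownership never has to be negotiated between the subsystems
// that share it. Not thread safe, no cycle detection.
struct RefCounted {
    int refs = 0;
    virtual ~RefCounted() {}
    void addRef()  { ++refs; }
    void release() { if (--refs == 0) delete this; }
};

struct Mesh : RefCounted { /* ... */ };

struct Renderer {
    Mesh* current = nullptr;
    void setMesh(Mesh* m) {           // takes a share, not ownership
        if (m) m->addRef();
        if (current) current->release();
        current = m;
    }
};
```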

There is also another benefit: knowing where all the pointers are (a non-conservative GC) also lets you move objects around in memory (a moving GC). That's why GCs usually have much lower fragmentation than the best non-fragmenting heap allocators, and that's another big win, as fragmentation is usually a nasty problem. It also lets the system improve cache and page friendliness (this is done automatically by most GCs, but nothing prevents you from implementing a system where you can explicitly hint at a given memory ordering in a moving GC), and it's one of the key features that lets some GC systems be faster than manual heap allocators. Heap allocations are not as simple as they might seem...

Update: an interesting read here and here
Update: I know there are people who believe that manual allocation is still better because you can code a reference counting system, or even a (usually incredibly bad) GC system, on top of it. So you get more choice, and more power. That's a typical C++ mantra, sung without any real-world experience. The problem is that since manual allocation is bad, everyone will need at least RC, so everyone will code their own system; because they have the power to choose, they will choose, in different ways. Different and incompatible. Different and incompatible ways that will eventually meet and produce tons of slow, bad, ugly code just to let them talk to each other. That's it, typical C++ code.

12 comments:

Unknown said...

The problem with statements like "GC is faster than alternatives such as reference counting" is that they ignore the fact that GC languages typically churn out much more garbage than manual-allocation languages. Most memory in C/C++ (above 90%) is allocated on the stack, which is either completely free or a single push/pop instruction.

DEADC0DE said...

But I was talking about GC, not about any specific language...

Anyway, that's another misconception: even for languages that have only reference types, like Java (C# has structs, which are value types), it's perfectly possible for the VM to know that the lifetime of an object is limited to the scope of a function, and so not use the GC heap for those objects.

It's also fairly easy to do that kind of check: if you never copy the reference outside a given scope, you know that scope bounds the life of the object...
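
A rough C++ analogue of the check I'm describing (the real thing is done by the VM on bytecode; this is just to show the idea, and the Matrix/registry names are made up): a reference that never leaves its scope can be turned into a stack allocation, while an escaping reference cannot.

```cpp
#include <vector>

struct Matrix { float m[16]; };

std::vector<Matrix*> registry;    // hypothetical global store

// The pointer never leaves this scope: an escape-analysis pass may replace
// the heap allocation with a stack one, or elide it entirely.
float traceOfTemp() {
    Matrix* t = new Matrix();     // does not escape
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i) sum += t->m[i * 4 + i];
    delete t;
    return sum;
}

// The pointer is stored outside the scope: it escapes, so the object really
// has to live on the (GC) heap.
void registerMatrix() {
    Matrix* m = new Matrix();     // escapes via the global registry
    registry.push_back(m);
}
```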

Unknown said...

Nice write up.

I'd like to add that every memory allocation on GC-based platforms, like the JVM, is almost as fast as allocating on the stack in C++. The reason is that a big chunk of physical memory is allocated for the heap at the beginning and then merely assigned to objects. What is really much slower is the collection and freeing.
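
A minimal sketch of why such allocation is cheap (this is just the bump-pointer idea used by compacting collectors, not any particular VM's implementation): the heap chunk is reserved once, and each allocation is a bounds check plus a pointer increment.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Bump-pointer allocator over one big pre-reserved chunk: allocation is a
// check and an increment, which is why it competes with stack allocation.
// A compacting GC reclaims space by evacuating live objects and resetting
// the bump pointer; there is no per-allocation bookkeeping or free list.
class BumpHeap {
    std::uint8_t* base;
    std::size_t   used;
    std::size_t   capacity;
public:
    explicit BumpHeap(std::size_t bytes)
        : base(static_cast<std::uint8_t*>(std::malloc(bytes))), used(0), capacity(bytes) {}
    ~BumpHeap() { std::free(base); }

    void* alloc(std::size_t size) {
        size = (size + 15) & ~std::size_t(15);      // keep 16-byte alignment
        if (used + size > capacity) return nullptr; // a real GC would trigger a collection here
        void* p = base + used;
        used += size;
        return p;
    }
};
```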

Ian's point also holds true. However, Java 7 is expected to include full escape analysis, which allows for lots of optimizations. Of course GC alone is not enough; dynamic compilation is also required for these:

1. Conversion of heap allocations to stack allocations when possible
This should get us close to the percentage Ian mentioned. It is comparable to pointer analysis for statically compiled languages, but is not bound by virtual methods and module linking.

2. Synchronization elision / lock coarsening
No unnecessary synchronization for multithreaded code. Already implemented in Java 6.

3. Breaking up objects, or scalar replacement
A further step toward preventing fragmentation: parts of objects might end up in different memory locations to further increase page friendliness.

DEADC0DE said...

Exactly, escape analysis is capable of converting heap allocations to stack ones if needed.

GC allocation performance is very good in moving, compacting collectors, as you have much lower fragmentation than heap allocators.

And while it may be true that deallocations are slower with a GC (well, that's really implementation dependent), the truth is that we don't use standard heap allocators much anyway: we use class pools, which are perfectly possible in GC environments as well, or reference counting, which is usually inferior to GC as it trashes the caches when chain deletions occur (and still uses the slower, more fragmentation-prone heap allocator for object creation).

won3d said...

Plainly, GC as a way to simplify memory management is not entirely true, since it only transparently helps in the simplest cases. Note that in Java you must still assign references to null when you are done with them, and there are also three different kinds of weak, non-owning references. GC does not mean that you can ignore ownership semantics, which must still be solved in design.

However, GC is still very important and necessary. You are certainly right about composition, particularly in the face of non-determinism. Defining lock-free data structures is far simpler when you can rely on automatic garbage collection -- it merely requires heroic effort, not divine powers.

There is one important thing that is only practically achievable via garbage collection: type safety. Manually collected handles may dangle and pun an object of a different type. This is impossible with garbage collection.

DEADC0DE said...

That was my point: GC is not about simplifying, it's about delegating memory management to a separate system, which in turn makes composition of other systems easier.

About type safety I don't agree; again, we are confusing language features with GC features. If you use a GC in C++ (e.g. the Boehm GC) then you'll have a GC system, but of course you won't end up with any more safety than you had with explicit memory allocation. It is true that most languages that natively use GC are also type safe, as this makes writing the GC easier, but that is still a language feature.

won3d said...

Believe me, I'm a fan of GC.

What I meant was that GC is practically necessary, but not sufficient, for type safety. One alternative to GC is to never collect objects, but that is hardly practical.

GC is kind of tricky in a concurrent environment, since simple stop-the-world collection doesn't really work. And it is hard to do compaction incrementally. Etc, etc.

That being said, modern GC works pretty damn well.

DEADC0DE said...

Type safety is a language feature, not something that has to do with GC. C++ without reinterpret_cast could still be type safe even with new and delete...

won3d said...

C++ would need more than a prohibition on reinterpret_cast to be type safe (e.g. pointer arithmetic, unchecked downcasts, etc.). I would argue it needs more than would be practical or useful. I like C++, btw.

Example of why you need automatic garbage collection for type safety: say I free an object. Malloc can return that same pointer value in a future malloc, but I could still dereference the original handle, and the types no longer have to match. This can't happen with automatic garbage collection, since holding a handle guarantees that the type is stable.
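
A sketch of that scenario (whether malloc actually reuses the address is up to the allocator, and the Enemy/Portal types are invented, so treat this as illustrative): the stale handle ends up punning an object of a completely different type.

```cpp
#include <cstdio>
#include <cstdlib>

struct Enemy  { int hitPoints; };
struct Portal { void (*teleport)(int); };   // very different layout and meaning

int main() {
    Enemy* e = static_cast<Enemy*>(std::malloc(sizeof(Enemy)));
    e->hitPoints = 100;
    std::free(e);                           // e is now dangling

    // malloc is free to hand the same address back for an unrelated type...
    Portal* p = static_cast<Portal*>(std::malloc(sizeof(Portal)));

    // ...and if it does, the stale handle now puns an Enemy onto a Portal.
    // Even this comparison and dereference are undefined behaviour in C++;
    // a tracing GC makes the situation impossible, because holding e would
    // have kept the original object alive.
    if (static_cast<void*>(p) == static_cast<void*>(e))
        std::printf("same address reused, stale read: %d\n", e->hitPoints);

    std::free(p);
    return 0;
}
```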

DEADC0DE said...

True that C++ needs more than removing reinterpret_cast, it was just an example...

And you're also completely right about the malloc problem; indeed it does violate type safety, but that is still a C++ problem. You could easily do type-safe explicit memory allocation, for example by storing a header with type info and checking it before dereferencing a pointer...
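
A quick sketch of what I mean (the tag scheme and names are made up; a real system would use something sturdier than an integer id): each allocation carries a type tag in a header, and every dereference goes through a check.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Every block carries a type tag just before the object; handles are checked
// before use. The tag values are arbitrary, only the mechanism matters.
struct Header { unsigned typeTag; };

template <typename T>
void* typedAlloc(unsigned tag) {
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + sizeof(T)));
    h->typeTag = tag;
    return new (h + 1) T();          // the object lives right after its header
}

template <typename T>
T* checkedCast(void* p, unsigned expectedTag) {
    Header* h = reinterpret_cast<Header*>(p) - 1;
    assert(h->typeTag == expectedTag && "type-safety violation caught at runtime");
    return static_cast<T*>(p);
}

// usage sketch:
//   void* raw = typedAlloc<Enemy>(ENEMY_TAG);          // Enemy/ENEMY_TAG are hypothetical
//   Enemy* e  = checkedCast<Enemy>(raw, ENEMY_TAG);    // asserts if the tag doesn't match
```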

In general, every memory problem can violate type safety in C++ (this is another very cool example: http://www.bluebytesoftware.com/blog/2008/02/10/ViolatingTypeSafetyWithTornReads.aspx), but that's because C++ does not do any runtime checks; it's not something inherent to explicit memory management in general.

On the other hand, I can't think of any easy way to statically enforce type safety with malloc and free, hmm, but that could just be due to my lack of imagination...

won3d said...

Yeah, I guess since we were talking about C++ I was presuming a static solution, but if you go dynamic then you're all good. And then you can have compiler analysis remove redundant checks, etc, but this starts moving away from a low-level systems language. Anyway, if you think of an alternative, then be sure to tell people!

DEADC0DE said...

No, my post was about GC in general, not about any given language. But your observations were really interesting, thanks!