You still need a general-purpose processor to control it, but these cards are far more efficient at running parallel (massively multi-threaded) algorithms. The CPU handles the fetch and forward and the GPU handles the execute, which pretty much negates the use of the CPU's ALU (which is really only for basic computation anyway!).

In time we may see the traditional x86 CPU evolve into a chip that has some of its components decoupled from the silicon for general-purpose numerical computation, but for now at least we have to write specific proprietary code that targets a particular vendor's card.
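
To make the split concrete, here is a minimal sketch of what that vendor-specific code can look like, assuming an NVIDIA card and the CUDA toolkit (other vendors need different code). The CPU side only allocates memory, moves data and launches the work; the GPU side runs one lightweight thread per array element. The names saxpy, check and saxpy_max_error are made up for this illustration, as are the array size and block size.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Device code: each GPU thread updates one element, y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

// Host code (hypothetical helper): the CPU sets everything up, hands the
// arithmetic to the GPU, then compares the result with the expected value.
float saxpy_max_error(int n, float a) {
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    size_t bytes = n * sizeof(float);

    float *dx = nullptr, *dy = nullptr;
    check(cudaMalloc(&dx, bytes), "cudaMalloc x");
    check(cudaMalloc(&dy, bytes), "cudaMalloc y");
    check(cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice), "copy x");
    check(cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice), "copy y");

    int threads = 256;                         // threads per block (assumed)
    int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
    saxpy<<<blocks, threads>>>(n, a, dx, dy);
    check(cudaGetLastError(), "kernel launch");

    // This blocking copy also waits for the kernel to finish.
    check(cudaMemcpy(y.data(), dy, bytes, cudaMemcpyDeviceToHost), "copy back");
    cudaFree(dx);
    cudaFree(dy);

    float expected = a * 1.0f + 2.0f;
    float max_err = 0.0f;
    for (float v : y) max_err = std::fmax(max_err, std::fabs(v - expected));
    return max_err;
}

int main() {
    float err = saxpy_max_error(1 << 20, 2.0f);
    std::printf("max error: %g\n", err);
    return err < 1e-5f ? 0 : 1;
}
```

Something like nvcc saxpy.cu -o saxpy should build it on a machine with the CUDA toolkit installed. Note that none of this will run on another vendor's hardware as written, which is exactly the portability problem described above.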

It's a watch-this-space type of technology (whilst we crunch away on our quads and PS3s).