"ChipKill" - it doesn't sound good does it but if you're using large amounts of memory in a server then a faulty DIMM would bring the whole lot down without this.

Basically, the Chipkill memory design has its own dedicated Chipkill processor on the main board which just calculates the bit distribution: the controller spreads the bits of each ECC word across different memory chips. If a memory chip fails, you lose just one bit per word, which can be corrected by ECC.
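
To make the bit-distribution idea concrete, here's a toy sketch in Python. It is not how any real controller is implemented: the 72-bit word size, the 4-bit-wide chips, the one-chip-per-bit-position layout and all the function names are my own assumptions for illustration. It only shows that when each chip holds a single bit of each ECC word, killing a whole chip costs every word exactly one bit.

```python
# Toy model only: sizes and layout are assumptions, not a real controller design.
WORD_BITS = 72        # assumed: 64 data bits + 8 ECC check bits per word
CHIP_WIDTH = 4        # assumed: each chip supplies 4 bits per access (an x4 part)
N_CHIPS = WORD_BITS   # one chip per bit position in this toy layout


def scatter(words):
    """Spread CHIP_WIDTH ECC words so that chip c holds bit c of every word."""
    assert len(words) == CHIP_WIDTH
    return [[(w >> c) & 1 for w in words] for c in range(N_CHIPS)]


def gather(chips):
    """Rebuild the ECC words from the bits stored on each chip."""
    words = [0] * CHIP_WIDTH
    for c, bits in enumerate(chips):
        for i, bit in enumerate(bits):
            words[i] |= bit << c
    return words


def kill_chip(chips, dead):
    """Model a whole-chip failure by inverting every bit that chip holds."""
    damaged = [list(bits) for bits in chips]
    damaged[dead] = [b ^ 1 for b in damaged[dead]]
    return damaged


def bit_errors_per_word(original, readback):
    """Count how many bits differ in each word after the read-back."""
    return [bin(a ^ b).count("1") for a, b in zip(original, readback)]


words = [0x0123456789ABCDEF01, 0xFF, 0x0, (1 << WORD_BITS) - 1]
readback = gather(kill_chip(scatter(words), dead=17))
errors = bit_errors_per_word(words, readback)
print(errors)  # every word has lost exactly one bit
```

A single-bit error per word is within what ordinary single-error-correcting ECC can repair, which is the whole point of spreading the bits out.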

If you're using servers with large amounts of memory (roughly 16GB and above) then I would recommend this feature.

If you look closely at memory resilience, you really do see the importance of memory protection mechanisms. For example:

ECC has a typical availability of 99.989% (57.15 mins/year)
ECC + Chipkill has 99.999% (0.39 mins/year)
ECC + Chipkill + hot-spare has 99.999998% (0.012 mins/year)
ECC + Chipkill + hot-spare + memory mirroring has 99.999999999% (0.00005 mins/year)
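
If you want to sanity-check figures like these, here's a quick sketch of the arithmetic that links an availability percentage to minutes of downtime per year. It assumes a 365-day year, and the function names are mine rather than from any vendor tool. The percentages above look rounded or truncated, so don't expect the results to line up exactly with the quoted minutes; 99.989%, for instance, works out to roughly 57.8 minutes.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_minutes_per_year(availability_percent):
    """Convert an availability percentage into expected downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def availability_percent(downtime_minutes):
    """Convert minutes of downtime per year back into an availability percentage."""
    return (1 - downtime_minutes / MINUTES_PER_YEAR) * 100


for pct in (99.989, 99.999, 99.999998):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.4f} mins/year")
```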

I would strongly recommend the last one when you have servers with around 256GB - 2TB (or more) of memory, as the chances of a chip failure are higher with that many chips. Bear in mind that with memory mirroring you halve the total amount of accessible memory. ...and before you say it, yes I do have servers that can handle 2TB of memory, although it is currently only available for our large Itanium 'mainframe class' platforms.