Top 4 Things To Know About Cassandra in EC2

If you ask people in the know about best practices for running Cassandra in the cloud (specifically Amazon's EC2), they'll usually just tell you not to. Cassandra is designed to run on bare-metal commodity hardware. Luckily, at the excellent Cassandra SF 2011 Conference, a few key points were repeated by presenters who are actually running it in EC2:

1. Keep Cassandra's datastore on ephemeral drives, not EBS volumes

It seems counter-intuitive, but analysis of production workloads has produced a consensus that ephemeral drives have better and more consistent performance than EBS.
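
As a rough illustration of this tip, here is a minimal Python sketch that flags any Cassandra data directory not located on an ephemeral drive. It assumes the settings have already been loaded from cassandra.yaml into a dict, and that the instance-store volume is mounted at /mnt; the mount point, the example paths, and the helper name dirs_off_ephemeral are all hypothetical, so adjust them to your own instances.

```python
from pathlib import PurePosixPath

# Assumption: the instance-store (ephemeral) volume is mounted here.
# Check the actual mount points on your own instances.
EPHEMERAL_MOUNTS = ("/mnt",)


def dirs_off_ephemeral(settings, ephemeral_mounts=EPHEMERAL_MOUNTS):
    """Return the data_file_directories entries that are NOT under an ephemeral mount.

    `settings` is a dict holding values parsed from cassandra.yaml.
    """
    mounts = [PurePosixPath(m) for m in ephemeral_mounts]
    offenders = []
    for directory in settings.get("data_file_directories", []):
        path = PurePosixPath(directory)
        # A directory qualifies if it is a mount point or sits below one.
        on_ephemeral = any(path == m or m in path.parents for m in mounts)
        if not on_ephemeral:
            offenders.append(directory)
    return offenders


if __name__ == "__main__":
    # Hypothetical settings: one directory on ephemeral storage, one elsewhere.
    example = {"data_file_directories": ["/mnt/cassandra/data", "/ebs/cassandra/data"]}
    print(dirs_off_ephemeral(example))
```

The check compares path components rather than string prefixes, so a directory such as /mntx/data is not mistaken for one under /mnt.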

Although Amazon is tight-lipped about its actual EC2 infrastructure, a couple of speculative reasons have been offered to explain why ephemeral drives outperform EBS. First, because EBS volume traffic must travel over the same network interface as all the other traffic on the system, EBS performance is thought to degrade as network usage rises. Second, since it is assumed that EBS volume…