This question has long been a recurring annoyance online. A recent Slashdot post reminded us of an earlier discussion about how hard databases are to learn, and about why a database like Cassandra feels easy to pick up.
Many other databases are as easy as Cassandra. They never claimed to be easy to learn, or quick to learn, but their advocates did point out that Cassandra serves a fairly narrow application domain. It is also convenient to talk about Cassandra as simple because its query language, CQL, looks much like the SQL people are already used to. Cassandra builds on ideas from several earlier databases, and we try to keep the experience even easier for people, at least for the most common use case. Still, it is not realistic to say, “Come on, let’s build this simple database that we can learn in a few weeks.” A database is like a library: it is not something you can pick up on your own and master in a few weeks; you have to keep working at it if you want to get good.
We try to introduce each new database concept in a simple, easy-to-follow way that assumes no prior knowledge. It looks something like this:
First, you decide on the structure you would like to use in the database. Then you pick a data type for what you would like to store, say “text”, which holds a string of any length. That gives you a basic form of data representation and a simple syntax for editing it.
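The step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `users` table, its column names, and the `create_table_cql` helper are hypothetical and were picked for the example. Only the CQL types (`uuid`, `text`) and the `CREATE TABLE` form are standard.

```python
# Hypothetical structure for the example: column name -> CQL data type.
USER_COLUMNS = {
    "user_id": "uuid",
    "name": "text",
    "email": "text",
}


def create_table_cql(table, columns, primary_key):
    """Build a CQL CREATE TABLE statement from a column mapping."""
    if primary_key not in columns:
        raise ValueError(f"primary key {primary_key!r} is not a column")
    column_defs = ", ".join(
        f"{name} {cql_type}" for name, cql_type in columns.items()
    )
    return (
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"({column_defs}, PRIMARY KEY ({primary_key}));"
    )


if __name__ == "__main__":
    print(create_table_cql("users", USER_COLUMNS, "user_id"))
```

Keeping the structure as plain data means the statement never has to be typed by hand, and the same mapping can be reused wherever the structure is needed.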
With one simple expression, the structure of a user and the method that creates the user can be exchanged for one another. Once this form of data exchange is in place, we can do almost anything with it.
The second thing we want is to make it easy to join different data, because this lets us build our own schema. Most of us come to Cassandra used to relational structures, so we already have an idea of what we are working with, even though CQL has no JOIN clause of its own. If we have an existing structure, create a new entity that we want to store, and can figure out which other structure to join it with, then a query that requires a join becomes really easy to put together. This is part of what we are used to doing with databases, and that is really convenient.
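A join of this kind can be sketched as follows, assuming it runs in application code on rows that have already been read from two tables. The `users` and `orders` rows, their field names, and the `join_users_with_orders` helper are all hypothetical; the technique is an ordinary hash join on `user_id`.

```python
from collections import defaultdict


def join_users_with_orders(users, orders):
    """Hash join: attach each user's orders to the user row via user_id."""
    orders_by_user = defaultdict(list)
    for order in orders:
        orders_by_user[order["user_id"]].append(order)
    # One output row per user; users without orders get an empty list.
    return [
        {**user, "orders": orders_by_user.get(user["user_id"], [])}
        for user in users
    ]


if __name__ == "__main__":
    users = [{"user_id": 1, "name": "Ada"}, {"user_id": 2, "name": "Lin"}]
    orders = [
        {"order_id": 10, "user_id": 1, "item": "book"},
        {"order_id": 11, "user_id": 1, "item": "pen"},
    ]
    for row in join_users_with_orders(users, orders):
        print(row["name"], [order["item"] for order in row["orders"]])
```

Indexing the orders once keeps the join linear in the number of rows, instead of rescanning every order for every user.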
We can make this even easier with a quick command that changes the original structure. And to be sure that the structure does not change underneath us when we move data around later, we can generate a new database structure.
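As a rough sketch of what such a command might produce, the snippet below compares an old and a new structure and emits CQL `ALTER TABLE ... ADD` statements for the columns that were added. The `alter_table_cql` helper and the column names are hypothetical, and the sketch handles added columns only, not dropped or retyped ones.

```python
def alter_table_cql(table, old_columns, new_columns):
    """Return ALTER TABLE statements for columns present only in new_columns."""
    statements = []
    for name, cql_type in new_columns.items():
        if name not in old_columns:
            statements.append(f"ALTER TABLE {table} ADD {name} {cql_type};")
    return statements


if __name__ == "__main__":
    old_structure = {"user_id": "uuid", "name": "text"}
    # Build a new structure instead of mutating the original one.
    new_structure = {**old_structure, "phone": "text"}
    for statement in alter_table_cql("users", old_structure, new_structure):
        print(statement)
```

Because the new structure is a separate mapping, the original stays untouched and can still be used to check data that was written before the change.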
The other thing we want is a query language, so that we do not have to create the database structure and then write every query by hand. The query generation engine generates all the queries for us.
You can see the query language that we are using; it is the same one we use for almost every query that we run.
We have a query that connects our application to a user that we have stored. The query requires that you use the structure you have, the structure of a user, but that structure can be changed. That is what allows this one easy-to-understand operation to also work a lot faster: we can make some changes and the query still makes sense. The user lives in a certain structure, so when we move the user from one storage location to another, for example, we do not have to replace that structure completely. It still makes sense. This makes us think more about the structure and what we are working with, yet the queries remain easy and quick to generate.
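To make the idea concrete, here is a minimal sketch of query generation driven by the structure. It is not the engine itself: the `select_cql` and `insert_cql` helpers, the `users` table, and its columns are hypothetical. The `?` bind markers are the standard placeholders for prepared statements in CQL.

```python
def select_cql(table, columns, key):
    """Generate a SELECT that looks up one row by its key column."""
    column_list = ", ".join(columns)
    return f"SELECT {column_list} FROM {table} WHERE {key} = ?;"


def insert_cql(table, columns):
    """Generate an INSERT with one bind marker per column."""
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders});"


if __name__ == "__main__":
    structure = ["user_id", "name"]
    print(select_cql("users", structure, "user_id"))
    print(insert_cql("users", structure))

    # Changing the structure only changes the input; the queries are
    # regenerated and still make sense.
    changed = structure + ["email"]
    print(select_cql("users", changed, "user_id"))
    print(insert_cql("users", changed))
```

Since every query is derived from the structure, a change to the structure is made in one place and the generated queries follow along.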
That’s the entire engine.