Cassandra¶
Overview¶
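Cassandra is a partitioned row store: the rows of a table are distributed over all nodes of the cluster by their partition key
Data is replicated to several nodes for high availability (HA)
Designed from the ground up as a distributed database with peer-to-peer communication (no master node)
CQL, its query language, feels a lot like SQL as used in MySQL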
Cassandra Demo App https://github.com/snazy/barker
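To make the partitioning and replication points above more concrete, here is a minimal toy model in Python. It is only a sketch: the class ToyRing, the node names and the use of MD5 as hash function are assumptions made for illustration, and real Cassandra uses its own partitioner, virtual nodes and replication strategies.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the toy ring (MD5 is a stand-in hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Toy token ring: every node owns one token, no node is a master."""

    def __init__(self, nodes, replication_factor=1):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication_factor must be between 1 and the node count")
        self.replication_factor = replication_factor
        # Sort nodes by their token so the ring can be searched with bisect.
        self.ring = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, partition_key: str):
        """Return the nodes that store the row with this partition key."""
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.ring)
        # The following nodes on the ring hold the additional copies.
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


# Hypothetical three-node cluster keeping two copies of every row.
ring = ToyRing(["node-a", "node-b", "node-c"], replication_factor=2)
owners = ring.replicas("balle")
print("row 'balle' is stored on:", owners)
```

Every node can compute the same answer from the key alone, which is why no master node is needed to locate a row.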
Specifics¶
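INSERT and UPDATE are both upserts: an UPDATE of a non-existent row inserts it, and an INSERT of an existing row overwrites the given columns (unless IF NOT EXISTS is used)
A SELECT ... WHERE clause can only restrict primary key columns or indexed columns, unless ALLOW FILTERING is appended (which makes Cassandra scan and filter rows, so it can be slow)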
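The two rules above can be sketched with a small in-memory model in Python. This is not how Cassandra is implemented and it does not use any Cassandra driver; the class ToyTable and its method names are hypothetical and only mimic the visible behaviour.

```python
class ToyTable:
    """In-memory stand-in for a table with a single primary key column."""

    def __init__(self, key_column):
        self.key_column = key_column
        self.rows = {}  # primary key value -> dict of the other columns

    def _upsert(self, key, columns):
        # Create the row if it is missing, otherwise overwrite only the given columns.
        self.rows.setdefault(key, {}).update(columns)

    def insert(self, row, if_not_exists=False):
        key = row[self.key_column]
        if if_not_exists and key in self.rows:
            return False  # like IF NOT EXISTS: the write is not applied
        self._upsert(key, {c: v for c, v in row.items() if c != self.key_column})
        return True

    def update(self, key, **columns):
        self._upsert(key, columns)

    def select_by_key(self, key):
        # Cheap: a direct lookup by primary key.
        return self.rows.get(key)

    def select_filtering(self, **conditions):
        # Like ALLOW FILTERING: every row is scanned and checked.
        return [
            {self.key_column: key, **cols}
            for key, cols in self.rows.items()
            if all(cols.get(c) == v for c, v in conditions.items())
        ]


users = ToyTable("username")
users.update("balle", firstname="Bastian")                  # row did not exist: it is created
users.insert({"username": "balle", "surname": "Ballmann"})  # row exists: columns are merged
applied = users.insert({"username": "balle", "firstname": "Other"}, if_not_exists=True)
matches = users.select_filtering(surname="Ballmann")
```

After these calls the row for 'balle' still has firstname 'Bastian', because the last insert was guarded and therefore not applied.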
Start CQL Client¶
bin/cqlsh
Create new database¶
cqlsh> CREATE KEYSPACE app WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE app;
Create new table¶
cqlsh:app> CREATE TABLE users (username TEXT PRIMARY KEY, firstname TEXT, surname TEXT, password BLOB, last_login TIMESTAMP);
INSERT, UPDATE, SELECT, DELETE¶
cqlsh:app> INSERT INTO users (username, firstname, surname) VALUES ('balle', 'Sebastian', 'Ballmann') IF NOT EXISTS;
cqlsh:app> UPDATE users SET firstname='Bastian' WHERE username = 'balle';
cqlsh:app> SELECT username, last_login FROM users WHERE surname='Ballmann' ALLOW FILTERING;
cqlsh:app> DELETE FROM users WHERE username='balle';
Create a cluster¶
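Edit conf/cassandra.yaml on every node
Set at least cluster_name (identical on all nodes) and listen_address (the node's own address)
Include all nodes in the seed provider list (seeds under seed_provider); larger clusters usually list only a few seed nodes
Start Cassandra on all nodes
Check the cluster status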
bin/nodetool status
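As a sketch of the configuration steps above, the following Python helper renders the cassandra.yaml settings that have to be adjusted per node. The function name render_node_config, the cluster name and the IP addresses are hypothetical; the keys cluster_name, listen_address and seed_provider are the ones found in a stock conf/cassandra.yaml, but compare the result with the file shipped with your version before using it.

```python
def render_node_config(cluster_name, listen_address, seeds):
    """Return the cassandra.yaml lines to adjust for one node."""
    seed_list = ",".join(seeds)
    lines = [
        f"cluster_name: '{cluster_name}'",      # must be identical on all nodes
        f"listen_address: {listen_address}",    # address of this node
        "seed_provider:",
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "      parameters:",
        f'          - seeds: "{seed_list}"',    # same seed list on all nodes
    ]
    return "\n".join(lines) + "\n"


# Hypothetical three-node cluster in which every node is also a seed.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
fragments = {
    address: render_node_config("app-cluster", address, nodes)
    for address in nodes
}
print(fragments["10.0.0.2"])
```

Only listen_address differs between the generated fragments, which matches the steps above: the cluster name and the seed list are shared by all nodes.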
Import data from MySQL¶
Install Sqoop (http://sqoop.apache.org/)
sqoop import --connect jdbc:mysql://127.0.0.1/dev --username root --cassandra-keyspace dev --cassandra-create-schema