
Riak Data Types (also referred to as CRDTs) add counters, sets, and maps to Riak, allowing for better conflict resolution. They let developers spend less time thinking about the complexities of vector clocks and sibling resolution and focus instead on using familiar, distributed data types to support their applications’ data access patterns.

CRDTs offer a principled approach to resolving conflicts and remove the need to compose complex merge functions. Justin Sheehy has a great quote describing what CRDTs are:

CRDTs can provide values that appear similar to simple, well-known data types (like integers, sets, and maps) but with an internal structure that makes it safe to update them without any coordination between writers and without any loss of information in the face of concurrency.

The “C” in CRDT can stand for three different things. “convergent” if the underlying implementation is state-based, “commutative” if the underlying implementation is op-based, “conflict-free” if you wish to describe both/either at once without referring to the specifics of your internal choices.
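To make the "state-based" flavour concrete, here is a minimal sketch of a convergent counter in Python. It is an illustration only, not Riak's internal implementation: the class name PNCounter, the replica names, and the merge strategy (per-replica maximum) are assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PNCounter:
    """Hypothetical state-based counter: one tally per replica (not Riak's code)."""

    incs: dict = field(default_factory=dict)
    decs: dict = field(default_factory=dict)

    def update(self, replica: str, delta: int) -> None:
        # A replica only ever changes its own slot, so each tally only grows.
        slot = self.incs if delta >= 0 else self.decs
        slot[replica] = slot.get(replica, 0) + abs(delta)

    def merge(self, other: "PNCounter") -> "PNCounter":
        # Taking the per-replica maximum is commutative, associative and idempotent.
        def join(a: dict, b: dict) -> dict:
            return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

        return PNCounter(join(self.incs, other.incs), join(self.decs, other.decs))

    @property
    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())


# Two replicas accept writes without coordinating with each other.
a, b = PNCounter(), PNCounter()
a.update("node_a", 20)
b.update("node_b", -1)

merged_ab = a.merge(b)
merged_ba = b.merge(a)
print(merged_ab.value, merged_ba.value)  # 19 19
```

Whichever order the two states are merged in, both replicas end up with the same value, which is the property the quote describes.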

The original paper about CRDTs can be found here.

Basho has released a developer preview of Riak 2.0 with these data types. You can download the open source preview from here and start experimenting over HTTP.

Bucket Types

Bucket types were added in 2.0 to eliminate the need for each bucket to have its own custom bucket properties. If buckets share the same custom bucket properties, a bucket type can be created for them.

To get started, you'll want to create a new bucket type with:

riak-admin bucket-type create <bucket_type_name> <bucket_props>  

A simplified example of creating bucket types for the 3 data types is below:

riak-admin bucket-type create counters '{"props":{"datatype":"counter"}}'  
riak-admin bucket-type create sets '{"props":{"datatype":"set"}}'  
riak-admin bucket-type create maps '{"props":{"datatype":"map"}}'  

Enabling Bucket Types

After the types are created, they will need to be enabled with:

riak-admin bucket-type activate <bucket_type_name>  

The 3 example bucket types can be activated with:

riak-admin bucket-type activate counters
riak-admin bucket-type activate sets
riak-admin bucket-type activate maps
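If you script your cluster setup, the create and activate steps can be generated rather than typed by hand. The sketch below only builds the command strings shown above; it does not run them. The helper names create_command and activate_command and the BUCKET_TYPES table are made up for this example.

```python
import json
import shlex

# Bucket type name -> datatype, matching the three examples in this post.
BUCKET_TYPES = {"counters": "counter", "sets": "set", "maps": "map"}


def create_command(type_name: str, datatype: str) -> str:
    """Return the riak-admin command that creates a bucket type (hypothetical helper)."""
    props = json.dumps({"props": {"datatype": datatype}}, separators=(",", ":"))
    # shlex.quote wraps the JSON in single quotes so the shell passes it as one argument.
    return f"riak-admin bucket-type create {type_name} {shlex.quote(props)}"


def activate_command(type_name: str) -> str:
    """Return the riak-admin command that activates a bucket type (hypothetical helper)."""
    return f"riak-admin bucket-type activate {type_name}"


commands = []
for type_name, datatype in BUCKET_TYPES.items():
    commands.append(create_command(type_name, datatype))
    commands.append(activate_command(type_name))

print("\n".join(commands))
```

Each bucket type gets its create command followed by its activate command, so the printed list can be pasted into a shell on a Riak node.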

Data Types

Data Types were added in Riak 2.0 to enable developers to use primitive data types they are accustomed to, while also removing the need for developing merge functions. When using these data types, the data structure is no longer opaque to Riak, with Riak detecting conflicts and performing merges for you.

When reading a Data Type value you will only ever see a single value. That value is still eventually consistent, but it will be as correct as it can be given the amount of entropy in the database, and when the system is quiescent, all values will converge on a single, deterministic, correct value.

Using Data Types

To write and read data types over HTTP, the following URL structure is used:

/types/<bucket_type_name>/buckets/<bucket>/datatypes/<key>

As an example, a counter in the bucket goals for the key pominville would be accessed at:

/types/counters/buckets/goals/datatypes/pominville

To delete a data type, the associated key must be deleted:

curl -X DELETE http://127.0.0.1:10018/types/counters/buckets/goals/keys/pominville  

Issuing a delete against the ../datatypes/.. path will not work.
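Because updates and deletes use different paths, it is easy to mix them up. Here is a small sketch of two URL builders, one per path. The function names datatype_url and key_url are hypothetical, and the host and port are simply the ones used in this post's examples.

```python
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:10018"  # host and port taken from the examples in this post


def datatype_url(bucket_type: str, bucket: str, key: str, base: str = BASE_URL) -> str:
    """URL for reading and updating a data type (hypothetical helper)."""
    parts = (quote(p, safe="") for p in (bucket_type, bucket, key))
    return "{}/types/{}/buckets/{}/datatypes/{}".format(base, *parts)


def key_url(bucket_type: str, bucket: str, key: str, base: str = BASE_URL) -> str:
    """URL for deleting the key behind a data type (hypothetical helper)."""
    parts = (quote(p, safe="") for p in (bucket_type, bucket, key))
    return "{}/types/{}/buckets/{}/keys/{}".format(base, *parts)


print(datatype_url("counters", "goals", "pominville"))
print(key_url("counters", "goals", "pominville"))
```

The first URL is the one to POST to and GET from; the second is the one to send DELETE to.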

Creating and Accessing Counters

When creating a counter, the value the counter is initially created with is its starting value. For example, we'll set the counter pominville in the bucket goals to 1:

curl -X POST http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville -H "content-type: application/json" -d 1  

On retrieval, the counter is set to 1:

curl http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville  
{"type":"counter","value":1}

Subsequent POSTs will increment or decrement that value, NOT replace it with a new value:

curl -X POST http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville -H "content-type: application/json" -d 19  
curl http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville  
{"type":"counter","value":20}
curl -X POST http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville -H "content-type: application/json" -d -1  
curl http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville  
{"type":"counter","value":19}
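The same three POSTs can be prepared from Python with the standard library. This is a sketch that builds the requests without sending them and replays the deltas locally to show why the counter ends at 19; counter_request is a hypothetical helper, and the running total is plain local arithmetic, not a response from Riak.

```python
import json
from itertools import accumulate
from urllib.request import Request

COUNTER_URL = "http://127.0.0.1:10018/types/counters/buckets/goals/datatypes/pominville"


def counter_request(delta: int, url: str = COUNTER_URL) -> Request:
    """Build (but do not send) the POST that adds `delta` to the counter."""
    return Request(
        url,
        data=json.dumps(delta).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


deltas = [1, 19, -1]  # the three request bodies used in the curl examples
requests = [counter_request(d) for d in deltas]

# Each POST is an increment or decrement, so the counter is the running sum.
running_values = list(accumulate(deltas))
print(running_values)  # [1, 20, 19]
```

Passing one of these Request objects to urllib.request.urlopen would send it to a running node.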

Creating and Accessing Sets

Note: The json_pp command line tool, installed with the JSON::PP Perl module, is used in the following examples to output human-readable JSON. If Perl is installed, JSON::PP can be installed with cpanp -i JSON::PP.

Items in a set in Riak can be added and removed with the add, remove, add_all, and remove_all operations.

When creating or adding to a set in Riak, the add or add_all operation can be used:

add

curl -X POST http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild -H "content-type: application/json" -d '{"add":"Koivu"}'  
curl -X GET http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild?include_context=false | json_pp

{
   "value" : [
      "Koivu"
   ],
   "type" : "set"
}

add_all

curl -X POST http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild -H "content-type: application/json" -d '{"add_all":["Pominville", "Suter", "Parise", "Gaborik", "Clutterbuck"]}'  
curl -X GET http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild?include_context=false | json_pp

{
   "value" : [
      "Clutterbuck",
      "Gaborik",
      "Parise",
      "Pominville",
      "Suter"
   ],
   "type" : "set"
}

Items can be removed with the remove or remove_all operations:

remove

curl -X POST http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild -H "content-type: application/json" -d '{"remove":"Clutterbuck"}'  
curl -X GET http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild?include_context=false | json_pp

{
   "value" : [
      "Gaborik",
      "Parise",
      "Pominville",
      "Suter"
   ],
   "type" : "set"
}

remove_all

curl -X POST http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild -H "content-type: application/json" -d '{"remove_all":["Gaborik", "Clutterbuck"]}'  
curl -X GET http://127.0.0.1:10018/types/sets/buckets/teams/datatypes/wild?include_context=false | json_pp

{
   "value" : [
      "Parise",
      "Pominville",
      "Suter"
   ],
   "type" : "set"
}
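To keep the four operations straight, here is a local, single-writer model of what each JSON body does to a set. It is a sketch for reasoning about the examples, not a client: apply_set_op is a hypothetical function, it ignores contexts and concurrent writers entirely, and silently skipping a missing element on removal is an assumption of the model rather than documented Riak behaviour.

```python
def apply_set_op(members: frozenset, op: dict) -> frozenset:
    """Apply one add/add_all/remove/remove_all body to a local copy of the set."""
    result = set(members)
    for name, arg in op.items():
        if name == "add":
            result.add(arg)
        elif name == "add_all":
            result.update(arg)
        elif name == "remove":
            result.discard(arg)  # assumption: removing an absent member is a no-op
        elif name == "remove_all":
            result.difference_update(arg)
        else:
            raise ValueError(f"unknown set operation: {name}")
    return frozenset(result)


roster = frozenset()
for op in [
    {"add_all": ["Pominville", "Suter", "Parise", "Gaborik", "Clutterbuck"]},
    {"remove": "Clutterbuck"},
    {"remove_all": ["Gaborik", "Clutterbuck"]},
]:
    roster = apply_set_op(roster, op)

print(sorted(roster))  # ['Parise', 'Pominville', 'Suter']
```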

Creating and Accessing Maps

Maps in Riak are collections of data types. They can be composed of any combination of counters, sets, maps, registers, and flags. Registers and flags have not been covered so far: registers hold binary values, and flags hold boolean values.

Items in a map in Riak can be added and removed with the add, remove, and update operations. The add and remove operations take only a name or list of names as their parameters. When using the add operation, values are initialized with a default:

add

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{"add":"goal_counter"}'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "goal_counter" : 0
   },
   "type" : "map"
}

The update operation can be used to add name : value pairs to the map with data (as opposed to initializing with defaults using add).

update

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{"update":{"goal_counter":127,"goals_against_counter":130}}'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp  
{
   "value" : {
      "goal_counter" : 127,
      "goals_against_counter" : 130
   },
   "type" : "map"
}

When adding or editing data types in a map, the operations of the standalone data types need to be used. We'll update the goal and goals_against counters, as well as add a new set representing the starting lineup:

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{  
    "update": {
        "goal_counter": -1,
        "goals_against_counter": 1,
        "starters_set": {
            "add_all": [
                "Parise",
                "Coyle",
                "Niederreiter",
                "Suter",
                "Brodin",
                "Kuemper"
            ]
        }
    }
}'
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "starters_set" : [
         "Brodin",
         "Coyle",
         "Kuemper",
         "Niederreiter",
         "Parise",
         "Suter"
      ],
      "goal_counter" : 126,
      "goals_against_counter" : 131
   },
   "type" : "map"
}
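Hand-writing nested JSON inside a shell string gets error-prone as maps grow. As a sketch, the body of the update above could be assembled like this; map_update is a hypothetical helper, and it only covers counter deltas and add_all on sets, mirroring the example.

```python
import json


def map_update(counters=None, sets=None) -> str:
    """Build the JSON body for a map update (hypothetical helper, counters and sets only)."""
    update = {}
    for name, delta in (counters or {}).items():
        update[name] = delta  # counters take a plain integer delta
    for name, members in (sets or {}).items():
        update[name] = {"add_all": list(members)}  # sets reuse the standalone set operation
    return json.dumps({"update": update})


body = map_update(
    counters={"goal_counter": -1, "goals_against_counter": 1},
    sets={
        "starters_set": ["Parise", "Coyle", "Niederreiter", "Suter", "Brodin", "Kuemper"]
    },
)
print(body)
```

The resulting string can be passed to curl with -d in place of the inline JSON.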

Registers and Flags in Maps

Two other data types are available in maps: registers and flags. Registers are a way to store binary data in a map, while flags are a way to store boolean data.

Registers can be added to maps as shown below:

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{ "update" : { "name_register" : "Minnesota Wild" }}'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "starters_set" : [
         "Brodin",
         "Coyle",
         "Kuemper",
         "Niederreiter",
         "Parise",
         "Suter"
      ],
      "name_register" : "Minnesota Wild",
      "goal_counter" : 126,
      "goals_against_counter" : 131
   },
   "type" : "map"
}

Flags can be enabled in a map:

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{ "update" : { "playing_flag" : "enable" }}'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "playing_flag" : true,
      "starters_set" : [
         "Brodin",
         "Coyle",
         "Kuemper",
         "Niederreiter",
         "Parise",
         "Suter"
      ],
      "name_register" : "Minnesota Wild",
      "goal_counter" : 126,
      "goals_against_counter" : 131
   },
   "type" : "map"
}

and disabled:

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{ "update" : { "playing_flag" : "disable" }}'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "playing_flag" : false,
      "starters_set" : [
         "Brodin",
         "Coyle",
         "Kuemper",
         "Niederreiter",
         "Parise",
         "Suter"
      ],
      "name_register" : "Minnesota Wild",
      "goal_counter" : 126,
      "goals_against_counter" : 131
   },
   "type" : "map"
}
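Note that a flag is written as the string "enable" or "disable" but read back as true or false. A small sketch of that translation when building update bodies from Python values follows; update_body and field_update are hypothetical helpers, and treating every str as a register and every bool as a flag is a convention of this example only.

```python
import json


def field_update(name: str, value) -> dict:
    """Translate one Python value into the form used in the update bodies above."""
    if isinstance(value, bool):
        # Flags are switched with the words "enable" and "disable".
        return {name: "enable" if value else "disable"}
    if isinstance(value, str):
        # Registers are sent as plain strings.
        return {name: value}
    raise TypeError(f"unsupported value for {name}: {value!r}")


def update_body(**fields) -> str:
    """Build the JSON body for a map update containing registers and flags."""
    update = {}
    for name, value in fields.items():
        update.update(field_update(name, value))
    return json.dumps({"update": update})


body = update_body(name_register="Minnesota Wild", playing_flag=True)
print(body)
```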

Deleting name : value pairs from a Map

The remove operation can be used to delete a name : value pair, or a set of name : value pairs from a map using the name(s).

remove

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{ "remove" : "playing_flag" }'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "starters_set" : [
         "Brodin",
         "Coyle",
         "Kuemper",
         "Niederreiter",
         "Parise",
         "Suter"
      ],
      "name_register" : "Minnesota Wild",
      "goal_counter" : 126,
      "goals_against_counter" : 131
   },
   "type" : "map"
}

Passing a list of names removes several pairs in one request:

curl -X POST http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild -H "content-type: application/json" -d '{ "remove" : ["name_register", "starters_set"] }'  
curl -X GET http://127.0.0.1:10018/types/maps/buckets/teamstats/datatypes/wild?include_context=false | json_pp

{
   "value" : {
      "goal_counter" : 126,
      "goals_against_counter" : 131
   },
   "type" : "map"
}
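The remove operation accepts either a single name or a list of names. Here is a local sketch of that behaviour on a plain dictionary; apply_remove is a hypothetical function that models the two requests above and does not talk to Riak.

```python
def apply_remove(fields: dict, names) -> dict:
    """Return a copy of `fields` without the named entries (name or list of names)."""
    if isinstance(names, str):
        names = [names]
    doomed = set(names)
    return {k: v for k, v in fields.items() if k not in doomed}


team = {
    "playing_flag": False,
    "starters_set": ["Brodin", "Coyle", "Kuemper", "Niederreiter", "Parise", "Suter"],
    "name_register": "Minnesota Wild",
    "goal_counter": 126,
    "goals_against_counter": 131,
}

team = apply_remove(team, "playing_flag")
team = apply_remove(team, ["name_register", "starters_set"])
print(team)  # {'goal_counter': 126, 'goals_against_counter': 131}
```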

Riak 2.0 also brings other new features, such as the new Riak Search, which leverages the Apache Solr full-text document indexing engine directly. Riak users now get the power of Solr with the availability and scalability of Riak. I will take a look at the new search in a later post.

Joel Jacobson

Passionate hacker, speaker & trainer, with an interest in NoSQL & Distributed Systems.