Amazon DynamoDB
The following are the characteristics of Amazon DynamoDB:
- It simplifies the hardware provisioning, setup, configuration, replication, and cluster scaling of a NoSQL database.
- It is a fully managed NoSQL database that provides fast, low-latency performance.
- It automatically distributes the data and traffic for a table over multiple partitions.
- It automatically adds enough infrastructure capacity to support the requested throughput levels, adding or removing infrastructure and adjusting the internal partitioning accordingly.
- To provide fast performance, all table data is stored on high-performance SSD disk drives.
- DynamoDB performance, transaction rates, and overall throughput can be monitored with Amazon CloudWatch.
- It also provides automatic high availability and durability by replicating data across multiple Availability Zones within an AWS Region.
The data model has three basic components: tables, items, and attributes. The figure below shows the relation between them.
Each item also has a primary key that uniquely identifies the item.
Amazon DynamoDB only requires that a table have a primary key; it does not require you to define all of the attribute names and data types in advance. Individual items in an Amazon DynamoDB table can have any number of attributes, although there is a limit of 400KB on the item size.
Each attribute in an item is a name/value pair. An attribute can be single-valued or a multi-valued set. For example, a book item can have title and authors attributes. Each book has one title but can have many authors. The multi-valued attribute is a set; duplicate values are not allowed. Data is stored in Amazon DynamoDB in key/value pairs such as the following:
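As a minimal sketch, the book item described above could be modeled in Python as a dictionary of name/value pairs. The attribute names and values here (ISBN, Title, Authors) are hypothetical and chosen only for illustration; a Python set stands in for the multi-valued attribute because it also rejects duplicates.

```python
# Hypothetical book item: each attribute is a name/value pair.
book = {
    "ISBN": "978-0-00-000000-0",          # assumed primary key attribute
    "Title": "Example Book",              # single-valued attribute
    "Authors": {"Author A", "Author B"},  # multi-valued attribute (a set)
}

# Adding a duplicate author leaves the set unchanged.
book["Authors"].add("Author A")
author_count = len(book["Authors"])
```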
Applications can connect to the Amazon DynamoDB service endpoint and submit requests over HTTP/S to read and write items to a table or even to create and delete tables. DynamoDB provides a web service API that accepts requests in JSON format.
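To illustrate the JSON request format, the sketch below builds the body of a PutItem request using only the Python standard library. The table name Books and the attribute names are assumptions for illustration. In the low-level API each value is wrapped in a type descriptor such as S (String), N (Number), or SS (String Set), and numbers are sent as strings. The sketch only constructs the JSON; it does not sign or send the request.

```python
import json


def build_put_item_body(table_name, item):
    """Return the JSON body for a PutItem request (not signed or sent)."""
    return json.dumps({"TableName": table_name, "Item": item}, sort_keys=True)


# Hypothetical table and attributes, for illustration only.
body = build_put_item_body(
    "Books",
    {
        "ISBN": {"S": "978-0-00-000000-0"},
        "Title": {"S": "Example Book"},
        "Price": {"N": "29.99"},  # numbers are transmitted as strings
        "Authors": {"SS": ["Author A", "Author B"]},
    },
)
```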
Amazon DynamoDB supports a large number of data types, which are discussed below.
Scalar Data Types
A scalar type represents exactly one value. Amazon DynamoDB supports the following five scalar types:
- String: Text and variable-length characters up to 400KB. Supports Unicode with UTF-8 encoding.
- Number: Positive or negative number with up to 38 digits of precision.
- Binary: Binary data, images, or compressed objects up to 400KB in size.
- Boolean: Binary flag representing a true or false value.
- Null: Represents a blank, empty, or unknown state. String, Number, Binary, and Boolean cannot be empty.
Set Data Types
Sets are useful to represent a unique list of one or more scalar values. Each value in a set needs to be unique and must be the same data type. Sets do not guarantee order. Amazon DynamoDB supports three set types: String Set, Number Set, and Binary Set.
- String Set: Unique list of String attributes.
- Number Set: Unique list of Number attributes.
- Binary Set: Unique list of Binary attributes.
Document Data Types
Document types are useful to represent multiple nested attributes, similar to the structure of a JSON file. Amazon DynamoDB supports two document types: List and Map. Multiple Lists and Maps can be combined and nested to create complex structures.
- List: Each List can be used to store an ordered list of attributes of different data types.
- Map: Each Map can be used to store an unordered list of key/value pairs. Maps can be used to represent the structure of any JSON object.
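The scalar, set, and document types above can be illustrated with a small converter that wraps a Python value in a type descriptor (S, N, B, BOOL, NULL, SS, NS, BS, L, M). This is a sketch under assumptions: the function name to_attribute_value is hypothetical, and binary values are left as raw bytes here, whereas the JSON wire format base64-encodes them.

```python
def to_attribute_value(value):
    """Wrap a Python value in a DynamoDB-style type descriptor (sketch)."""
    def is_number(v):
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):
        return {"BOOL": value}
    if is_number(value):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}  # raw bytes here; base64 on the wire
    if isinstance(value, (set, frozenset)):
        members = list(value)
        if not members:
            raise ValueError("a set must contain at least one value")
        # Every member of a set must have the same scalar type.
        # Sorting is only for deterministic output; sets are unordered.
        if all(isinstance(v, str) for v in members):
            return {"SS": sorted(members)}
        if all(is_number(v) for v in members):
            return {"NS": [str(v) for v in sorted(members)]}
        if all(isinstance(v, bytes) for v in members):
            return {"BS": sorted(members)}
        raise TypeError("set members must all be String, Number, or Binary")
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Lists and Maps recurse, which is how nested structures are built up from the simpler types.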
The primary key uniquely identifies each item in a table. Amazon DynamoDB supports two types of primary keys, and the key configuration cannot be changed once the table is created.
- Partition Key: The primary key is made of one attribute, a partition (or hash) key.
- Partition and Sort Key: The primary key is made of two attributes. The first attribute is the partition key and the second is the sort (or range) key. Two items can have the same partition key value, but they must have different sort key values.
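The uniqueness rule for a partition and sort key can be sketched with a toy in-memory table. The class CompositeKeyTable and its method names are hypothetical and only model the rule that the pair (partition key, sort key) identifies exactly one item; they are not part of any AWS SDK.

```python
class CompositeKeyTable:
    """Toy in-memory table keyed by (partition key, sort key)."""

    def __init__(self):
        self._items = {}

    def put(self, partition_key, sort_key, attributes):
        # The pair (partition_key, sort_key) identifies exactly one item,
        # so writing the same pair again overwrites the earlier item.
        self._items[(partition_key, sort_key)] = dict(attributes)

    def get(self, partition_key, sort_key):
        # Returns None when no item has this key pair.
        return self._items.get((partition_key, sort_key))

    def items_in_partition(self, partition_key):
        # Items that share a partition key, ordered by sort key.
        keys = sorted(k for k in self._items if k[0] == partition_key)
        return [self._items[k] for k in keys]
```

For example, two messages in the same hypothetical forum thread can share the partition key "thread-1" as long as their sort keys (say, posting timestamps) differ.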
To handle the expected read and write workloads, you provision a certain amount of read and write capacity for each Amazon DynamoDB table; these values can be scaled up or down by using the UpdateTable action. Each operation performed on a table consumes some of the provisioned capacity units.
For example, given a table without a local secondary index, you will consume 1 capacity unit if you read an item that is 4KB or smaller. Similarly, for write operations you will consume 1 capacity unit if you write an item that is 1KB or smaller. This means that if you read an item that is 110KB, you will consume 28 capacity units, or 110 / 4 = 27.5 rounded up to 28.
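The arithmetic in this example can be sketched as two small helpers. The function names are hypothetical; the sketch covers only the case described above (one capacity unit per 4KB read or per 1KB written, rounded up).

```python
import math


def read_capacity_units(item_size_kb):
    """Capacity units for one read: 1 unit per 4KB, rounded up."""
    return max(1, math.ceil(item_size_kb / 4))


def write_capacity_units(item_size_kb):
    """Capacity units for one write: 1 unit per 1KB, rounded up."""
    return max(1, math.ceil(item_size_kb))


# The 110KB item from the text: 110 / 4 = 27.5, rounded up.
units_for_110kb_read = read_capacity_units(110)
```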
A secondary index lets you query the data in a table using an alternate key. There are two types of indexes.
- Global Secondary Index: An index with a partition and sort key that can be different from those of the table. This index can be created or deleted at any time.
- Local Secondary Index: An index that has the same partition key attribute as the primary key of the table but a different sort key. It can only be created when the table is created.
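As a rough analogy for querying by an alternate key, the sketch below groups a list of items by a different partition attribute and orders each group by a different sort attribute. The function build_index and the order attributes are hypothetical, and this is not how DynamoDB maintains indexes internally.

```python
from collections import defaultdict


def build_index(items, partition_attr, sort_attr):
    """Group items by an alternate partition attribute, ordered by sort_attr."""
    index = defaultdict(list)
    for item in items:
        # Items that lack the index key attributes are not indexed.
        if partition_attr in item and sort_attr in item:
            index[item[partition_attr]].append(item)
    for bucket in index.values():
        bucket.sort(key=lambda it: it[sort_attr])
    return dict(index)


# Hypothetical orders table keyed by CustomerId (partition) and OrderId (sort).
orders = [
    {"CustomerId": "c1", "OrderId": "o1", "Status": "SHIPPED", "OrderDate": "2024-02-01"},
    {"CustomerId": "c2", "OrderId": "o2", "Status": "SHIPPED", "OrderDate": "2024-01-15"},
    {"CustomerId": "c1", "OrderId": "o3", "Status": "PENDING", "OrderDate": "2024-03-01"},
]

# Alternate view: look up orders by Status, ordered by OrderDate.
by_status = build_index(orders, "Status", "OrderDate")
```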
Reading & Writing Data:
Items can be read from or written to a table once the table, with its primary key and any secondary indexes, has been created.
There are three API actions to create, update, and delete items: PutItem, UpdateItem, and DeleteItem.
Calls to PutItem will replace an existing item if an item with the same primary key already exists. PutItem only requires a table name and a primary key; any additional attributes are optional.
The UpdateItem action finds an existing item based on its primary key and replaces the specified attributes. This operation is useful for updating a single attribute while leaving the other attributes unchanged.
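The difference between the three write actions can be sketched with a plain dictionary standing in for a table. The function names mirror the API actions but are hypothetical local helpers, not SDK calls; the point is only that a put writes the whole item while an update touches just the named attributes.

```python
table = {}  # maps primary key -> item (a dict of attributes)


def put_item(key, attributes):
    """Write the whole item; an existing item with this key is overwritten."""
    table[key] = dict(attributes)


def update_item(key, changes):
    """Change only the named attributes; the others are left as they are."""
    table.setdefault(key, {}).update(changes)


def delete_item(key):
    """Remove the item if it exists."""
    table.pop(key, None)
```

With this model, updating only the price of a book keeps its title, whereas putting an item that contains only a title discards the price.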
GetItem allows you to retrieve an item based on its primary key. All of the item’s attributes are returned by default, and you have the option to select individual attributes to filter down the results.
If a primary key is composed of a partition key, the entire partition key needs to be specified to retrieve the item. If the primary key is a composite of a partition key and a sort key, GetItem will require both the partition and sort key as well. Each call to GetItem consumes read capacity units based on the size of the item and the consistency option selected.
Amazon DynamoDB also provides several operations designed for working with large batches of items, including BatchGetItem and BatchWriteItem. Using the BatchWriteItem action, you can perform up to 25 item puts or deletes with a single operation.
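Because a single batch is limited to 25 writes, a larger workload has to be split before it is submitted. A minimal sketch of that splitting, with a hypothetical helper name and no network calls:

```python
def chunk_requests(requests, batch_size=25):
    """Split a list of write requests into batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        requests[start:start + batch_size]
        for start in range(0, len(requests), batch_size)
    ]


# Sixty pending writes become three batches.
batches = chunk_requests(list(range(60)))
batch_sizes = [len(batch) for batch in batches]
```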
Amazon DynamoDB also gives you two operations, Query and Scan, that can be used to search a table or an index. A Query operation is the primary search operation you can use to find items in a table or a secondary index using only primary key attribute values.
A Scan operation will read every item in a table or a secondary index. By default, a Scan operation returns all of the data attributes for every item in the table or index. Each request can return up to 1MB of data.
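Since each request returns at most 1MB, a full Scan of a large table arrives as a series of pages. The sketch below approximates that behavior in memory: scan_pages is a hypothetical generator, and item_size is a caller-supplied function that reports each item's size in bytes. It is an approximation for illustration, not the service's exact paging rule.

```python
def scan_pages(items, item_size, page_limit_bytes=1024 * 1024):
    """Yield pages of items, each page holding at most page_limit_bytes."""
    page, page_bytes = [], 0
    for item in items:
        size = item_size(item)
        # Start a new page when adding this item would exceed the limit.
        if page and page_bytes + size > page_limit_bytes:
            yield page
            page, page_bytes = [], 0
        page.append(item)
        page_bytes += size
    if page:
        yield page
```

A caller would loop over the pages and process each one in turn, in the same way an application keeps issuing Scan requests until no more data remains.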
Scaling and Partitioning:
With Amazon DynamoDB, you can create tables that scale to hold a virtually unlimited number of items with consistent low-latency performance. An Amazon DynamoDB table can scale horizontally through the use of partitions to meet the storage and performance requirements of your application. Each individual partition represents a unit of compute and storage capacity.
Amazon DynamoDB stores the items of a single table across multiple partitions, as represented in the figure. Amazon DynamoDB decides which partition to store an item in based on its partition key. The partition key is used to distribute new items among all of the available partitions, and items with the same partition key are stored on the same partition.
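The idea of distributing items by partition key can be sketched with a stable hash. The hash function DynamoDB actually uses is internal to the service; the use of MD5 and a modulo here, along with the function name partition_for, are assumptions made purely for illustration.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition number using a stable hash."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % partition_count
```

Because the hash depends only on the key, every item with the same partition key maps to the same partition, while different keys tend to spread across all of them.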
As the number of items in a table grows, additional partitions can be added by splitting an existing partition. The provisioned throughput configured for a table is also divided evenly among the partitions. After a partition is split, however, it cannot be merged back together.
When a table is created, Amazon DynamoDB configures the table’s partitions based on the desired read and write capacity. One single partition can hold about 10GB of data and supports a maximum of 3,000 read capacity units or 1,000 write capacity units.
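Using the per-partition limits above, a rough estimate of how many partitions a table needs, and how much throughput each one receives, can be sketched as follows. Combining the limits this way (the larger of the count needed for storage and the count needed for throughput) is an assumption for illustration, and the function names are hypothetical; the service's actual partitioning decisions are internal.

```python
import math


def estimate_partitions(table_size_gb, read_capacity_units, write_capacity_units):
    """Rough partition estimate from the 10GB / 3,000 RCU / 1,000 WCU limits."""
    by_size = math.ceil(table_size_gb / 10)
    by_throughput = math.ceil(
        read_capacity_units / 3000 + write_capacity_units / 1000
    )
    return max(by_size, by_throughput, 1)


def throughput_per_partition(capacity_units, partitions):
    """Provisioned throughput is divided evenly among the partitions."""
    return capacity_units / partitions
```

For instance, under these assumptions a table provisioned with 5,000 read and 2,000 write capacity units needs more partitions for throughput than an 8GB data set needs for storage alone.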