Amazon AWS Database
A database is the engine that allows your application to manage, access, and search large volumes of data. It is the backend support for every application that needs to store or access large amounts of data.
Database engines and systems can be grouped into two categories:
Relational Database Management Systems: Users read and write data in tables through commands or queries written in SQL (Structured Query Language). A relational database consists of one or more tables, each of which contains rows and columns. A column holds a specific attribute of the records, such as name, age, number, or address, whereas a row defines an individual record, such as the full details of a person employed by an organization. Each row is identified by a unique ID called the primary key; in the employee example, the Employee ID is the primary key of each record. A record in one table can reference a record in another table by that record's primary key; this reference is called a foreign key.
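As a minimal sketch of primary and foreign keys, the example below uses Python's built-in sqlite3 module rather than an Amazon RDS engine. The department and employee tables, their columns, and the sample values are assumptions made up for illustration.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE department (
        department_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,   -- unique ID of each record
        name          TEXT NOT NULL,
        age           INTEGER,
        department_id INTEGER REFERENCES department(department_id)  -- foreign key
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 34, 10)")

# Follow the foreign key from the employee row to its department row.
row = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e
    JOIN department AS d ON d.department_id = e.department_id
    WHERE e.employee_id = 1
""").fetchone()
print(row)

# A reference to a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ben', 41, 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

The same table definitions would look very similar on MySQL or PostgreSQL; only the connection step and some type names differ.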
Relational databases fall into two categories of systems:
- Online Transaction Processing (OLTP): Transaction-oriented applications that change data frequently through reads and writes, for example e-commerce.
- Online Analytical Processing (OLAP): Mostly used for reporting and for analyzing large data sets.
NoSQL Database Systems: These database systems are non-relational and do not have the fixed table and column structure of relational database systems. NoSQL databases are used as key/value stores or document stores with flexible schemas. Common use cases for NoSQL include managing user sessions, user profiles, and shopping cart data.
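To make the key/value and flexible-schema idea concrete, here is a toy in-memory store written in plain Python. It is only a sketch of the access pattern, not a real NoSQL client; the KeyValueStore class, the session keys, and the document fields are all hypothetical.

```python
import json


class KeyValueStore:
    """Toy in-memory stand-in for a key/value or document store."""

    def __init__(self):
        self._items = {}

    def put(self, key, document):
        # Documents are stored as JSON text; no schema is declared up front.
        self._items[key] = json.dumps(document)

    def get(self, key):
        raw = self._items.get(key)
        return None if raw is None else json.loads(raw)


store = KeyValueStore()
# Two session documents with different fields: the schema is flexible.
store.put("session:42", {"user": "asha", "cart": ["book", "pen"]})
store.put("session:43", {"user": "ben", "last_page": "/checkout", "theme": "dark"})

cart = store.get("session:42")["cart"]
missing = store.get("session:99")
print(cart, missing)
```

Every lookup goes through a single key, which is what makes this style of store easy to spread across many servers.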
A data warehouse is a repository that collects data from various sources. Companies use it to compile reports and to run highly complex queries as part of their day-to-day work.
Amazon Relational Database Service (Amazon RDS):
It is a service that provides managed relational databases on AWS. It can launch one of several popular database engines, ready to start taking SQL transactions. Amazon RDS also offloads common tasks such as backups, patching, scaling, and replication.
Amazon RDS exposes a database endpoint to which client software connects and executes SQL. Amazon RDS does not provide shell access to DB instances and restricts access to certain tables that require advanced privileges.
Amazon provides methods to query, analyze, modify, and administer databases.
Database Instances: Amazon RDS provides an API that helps you create and manage one or more DB instances. Each DB instance is an isolated database environment that runs a database engine such as MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, or Amazon Aurora. A DB instance can contain multiple different databases, which are managed by executing SQL commands through the Amazon RDS endpoint.
The following API actions are used to create and resize a DB instance:
- CreateDBInstance: Launches a new DB instance.
- ModifyDBInstance: Changes or resizes an existing DB instance.
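The CreateDBInstance action can be sketched with boto3, the AWS SDK for Python, where it is exposed as the create_db_instance method of the RDS client. To keep the example runnable without AWS credentials, a MagicMock stands in for the real client; the identifier, engine, storage size, user name, and password below are illustrative assumptions, not recommended values.

```python
from unittest.mock import MagicMock


def launch_instance(rds, identifier, password):
    """Request a new DB instance.

    `rds` is expected to behave like boto3.client("rds").
    """
    return rds.create_db_instance(
        DBInstanceIdentifier=identifier,  # hypothetical instance name
        Engine="mysql",
        DBInstanceClass="db.t2.micro",    # smallest class mentioned in this article
        AllocatedStorage=20,              # storage in GiB, assumed value
        MasterUsername="admin",           # assumed user name
        MasterUserPassword=password,
    )


# Stand-in client so the sketch runs offline. With real credentials this
# would be: import boto3; rds = boto3.client("rds")
rds = MagicMock()
launch_instance(rds, "demo-db", "change-me-please")

# Inspect the request that would have been sent to Amazon RDS.
request = rds.create_db_instance.call_args.kwargs
print(request["DBInstanceIdentifier"], request["DBInstanceClass"])
```

Passing the client in as a parameter keeps the function easy to test and leaves region and credential handling to the caller.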
The range of DB instance classes extends from db.t2.micro, with 1 vCPU and 1 GB of memory, to db.r3.8xlarge, with 32 vCPUs and 244 GB of memory.
Following table provides the operational benefits of Amazon RDS over other.
Amazon RDS supports six database engines: MySQL, PostgreSQL, MariaDB, Oracle, Microsoft SQL Server, and Amazon Aurora.
- MySQL: An open source database that supports a wide range of workloads, from small personal blogs to large websites. Amazon RDS supports MySQL versions 5.7, 5.6, 5.5, and 5.1. You can connect to Amazon RDS MySQL using standard MySQL tools such as MySQL Workbench or SQL Workbench/J. Amazon RDS MySQL supports Multi-AZ deployments and read replicas.
- PostgreSQL: Amazon RDS supports several versions of PostgreSQL, including 9.5.x, 9.4.x, and 9.3.x. Amazon RDS PostgreSQL can be managed with standard tools such as pgAdmin. Amazon RDS supports Multi-AZ deployments for high availability and read replicas for horizontal scaling.
- MariaDB: MariaDB adds features that enhance the performance, availability, and scalability of MySQL. Amazon RDS supports MariaDB version 10.0.17. For MariaDB instances, Amazon RDS supports the XtraDB storage engine as well as Multi-AZ deployments and read replicas.
- Oracle: Oracle is one of the most popular relational databases, used mostly in enterprises, and is supported by Amazon RDS. Amazon RDS supports Oracle 11g and 12c, in three different editions of the database engine: Standard Edition One, Standard Edition, and Enterprise Edition.
- Microsoft SQL Server: Another very popular relational database used in enterprises. Amazon RDS allows database administrators (DBAs) to connect to their SQL Server DB instance in the cloud using tools such as SQL Server Management Studio. The supported versions are SQL Server 2008 R2, SQL Server 2012, and SQL Server 2014.
AWS offers two licensing models: License Included and Bring Your Own License.
- License Included: In this model, the license is held by AWS and is included in the Amazon RDS instance price. For Oracle, this covers Standard Edition One. For SQL Server, it covers SQL Server Express Edition, Web Edition, and Standard Edition.
- Bring Your Own License: In this model, you provide your own license. For Oracle, you must hold the appropriate Oracle Database license for the DB instance class and Oracle Database edition you want to run. This model supports Standard Edition One, Standard Edition, and Enterprise Edition.
Amazon Aurora is an enterprise-grade commercial database that can deliver up to five times the performance of MySQL for most web applications.
When you create an Amazon Aurora instance, you create a DB cluster. A DB cluster has one or more instances and includes a cluster volume that manages the data for those instances. An Amazon Aurora DB cluster consists of two types of instances:
- Primary Instance: This is the main instance, which supports both read and write workloads. When you modify data, you are modifying it on the primary instance. Each Amazon Aurora DB cluster has one primary instance.
- Amazon Aurora Replica: This is a secondary instance that supports only read operations. Each DB cluster can have up to 15 Amazon Aurora Replicas in addition to the primary instance. Distributing the read workload among several instances increases performance.
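The split between one writable primary and several read-only replicas can be sketched as application-side routing. This is an illustration of the idea only, not how Amazon Aurora itself balances connections; the ClusterRouter class and the endpoint names are hypothetical, and treating every statement that starts with SELECT as a read is a deliberate simplification.

```python
import itertools

MAX_REPLICAS = 15  # replica limit per DB cluster, as stated above


class ClusterRouter:
    """Send writes to the primary and spread reads over the replicas."""

    def __init__(self, primary, replicas):
        if len(replicas) > MAX_REPLICAS:
            raise ValueError("a DB cluster supports at most 15 replicas")
        self.primary = primary
        # With no replicas, the primary serves reads as well.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ClusterRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)

targets = [
    router.endpoint_for("SELECT * FROM orders"),
    router.endpoint_for("SELECT * FROM users"),
    router.endpoint_for("UPDATE users SET age = 35 WHERE id = 1"),
    router.endpoint_for("SELECT * FROM carts"),
]
print(targets)
```

Round-robin is the simplest policy; a real application might also take replica lag or connection counts into account.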
Amazon RDS is built on Amazon EBS, which allows you to select the right storage option based on your application requirements and cost. Depending on the database engine and workload, you can scale up to between 4 TB and 6 TB of provisioned storage.
Amazon RDS supports three storage types:
- Magnetic: Also called standard storage, it offers cost-effective storage and is ideal for applications with light I/O requirements.
- General Purpose (SSD): Also called gp2 storage, it provides faster access than magnetic storage and can deliver burst performance for small to medium-sized databases.
- Provisioned IOPS (SSD): Designed for I/O-intensive workloads that are sensitive to storage performance and require consistent random-access I/O throughput.
Backup & Recovery:
There are two methods for backup and recovery, and we will discuss them one by one. First, let's define two important terms.
RPO: Recovery Point Objective is the maximum period of data loss that is acceptable in case of a failure.
RTO: Recovery Time Objective is the maximum amount of downtime that is permitted to recover from backup and resume processing.
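A short worked example may help separate the two objectives. The sketch below compares a hypothetical incident against an RPO and RTO; the timestamps and the 4-hour and 1-hour targets are invented for illustration.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, failure, service_restored, rpo, rto):
    """Return (rpo_met, rto_met) for a single incident."""
    data_loss_window = failure - last_backup  # changes made after the last backup are lost
    downtime = service_restored - failure     # time the service was unavailable
    return data_loss_window <= rpo, downtime <= rto


last_backup = datetime(2024, 1, 1, 2, 0)   # nightly backup at 02:00
failure = datetime(2024, 1, 1, 9, 30)      # failure at 09:30
restored = datetime(2024, 1, 1, 10, 15)    # service back at 10:15

rpo_met, rto_met = meets_objectives(
    last_backup, failure, restored,
    rpo=timedelta(hours=4),
    rto=timedelta(hours=1),
)
print(rpo_met, rto_met)
```

Here 7.5 hours of changes are lost, which exceeds the 4-hour RPO, while the 45 minutes of downtime stays within the 1-hour RTO, so the result is (False, True).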
Automated Backup: Amazon RDS tracks changes and backs up your database automatically. It takes a snapshot of the entire DB instance volume, not just of individual databases. You can define a retention period for these backups; by default the retention period is one day, but it can be extended up to 35 days.
If a DB instance is deleted, all of its automated backups are deleted as well and cannot be recovered.
Manual Backup: You can take a manual DB snapshot at any point in time and later restore the DB instance to the specific state captured in that snapshot. These snapshots are kept until you delete them manually, for example through the Amazon RDS console.
Recovery: When you restore a database, a new DB instance is created; the restore does not take place on the existing DB instance. Only the default DB parameter group and security groups are associated with the restored instance. Once the restore is complete, you can associate any custom DB parameter or security groups with it.
Multi-AZ with High Availability:
With Multi-AZ, Amazon RDS creates a database cluster across multiple Availability Zones. This helps meet demanding RPO and RTO targets by using synchronous replication to minimize RPO and fast failover to reduce RTO to minutes.
Multi-AZ allows you to keep a secondary copy of your database in another Availability Zone for disaster recovery. Multi-AZ deployments are available for all Amazon RDS database engines. Whenever a Multi-AZ DB instance is created, a primary instance is created in one Availability Zone and a secondary instance is created in another Availability Zone.
When a DB instance is created, you are assigned a DB instance endpoint in the form of a DNS name. DNS resolves this endpoint name to the IP address of the instance.
Amazon RDS automatically replicates data from the master database (the primary instance) to the secondary instance. Amazon RDS performs a failover in the following events:
- Loss of availability in the primary Availability Zone
- Loss of connectivity to the primary database
- Component failure on the primary database
- Storage failure on the primary database
Failover from the primary to the secondary happens automatically. The DNS name of the endpoint remains the same, but the Amazon RDS service changes the CNAME to point to the standby.
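The effect of the CNAME change can be illustrated with a toy model in which a dictionary plays the role of DNS. This is a simulation of the idea only, not Amazon RDS behavior in detail; the MultiAZDeployment class, the endpoint name, and the host names are hypothetical.

```python
class MultiAZDeployment:
    """Toy model: one stable endpoint name, two hosts, one CNAME record."""

    def __init__(self, endpoint, primary_host, standby_host):
        self.endpoint = endpoint
        self.primary_host = primary_host
        self.standby_host = standby_host
        # The "DNS" record maps the endpoint to the current primary.
        self.cname = {endpoint: primary_host}

    def resolve(self):
        # Clients only ever look up the endpoint name.
        return self.cname[self.endpoint]

    def fail_over(self):
        # Promote the standby and repoint the CNAME; the endpoint is unchanged.
        self.primary_host, self.standby_host = self.standby_host, self.primary_host
        self.cname[self.endpoint] = self.primary_host


db = MultiAZDeployment("mydb.example.internal", "host-az-a", "host-az-b")
before = db.resolve()
db.fail_over()
after = db.resolve()
print(db.endpoint, before, after)
```

Because the application connects by endpoint name rather than by IP address, it needs no configuration change after a failover, only a reconnect.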
Scaling Up and Out:
As requirements grow, a DB instance can be scaled vertically or horizontally, depending on how you plan to scale.
To scale vertically, you can schedule the change for the next maintenance window or start it immediately using the ModifyDBInstance action. To change compute and memory, you select a different DB instance class. Once a smaller or larger DB instance class is selected, Amazon RDS automates the migration to the new class with only a short disruption.
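In boto3, the ModifyDBInstance action corresponds to the modify_db_instance method of the RDS client, and its ApplyImmediately parameter chooses between an immediate change and the next maintenance window. The sketch below uses a MagicMock in place of a real client so that it runs without AWS access; the instance identifier and target class are assumptions chosen for illustration.

```python
from unittest.mock import MagicMock


def resize_instance(rds, identifier, new_class, apply_immediately=False):
    """Ask Amazon RDS to move a DB instance to a different instance class.

    `rds` is expected to behave like boto3.client("rds"). With
    apply_immediately=False the change waits for the next maintenance window.
    """
    return rds.modify_db_instance(
        DBInstanceIdentifier=identifier,
        DBInstanceClass=new_class,
        ApplyImmediately=apply_immediately,
    )


# Stand-in client so the sketch runs offline. With real credentials this
# would be: import boto3; rds = boto3.client("rds")
rds = MagicMock()

# Schedule the resize for the next maintenance window.
resize_instance(rds, "demo-db", "db.r3.8xlarge")
scheduled = rds.modify_db_instance.call_args.kwargs

# Apply the resize right away, accepting a short disruption.
resize_instance(rds, "demo-db", "db.r3.8xlarge", apply_immediately=True)
immediate = rds.modify_db_instance.call_args.kwargs
print(scheduled["ApplyImmediately"], immediate["ApplyImmediately"])
```

Defaulting to the maintenance window is the more cautious choice, since the disruption then happens at a time you have already planned for.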
Horizontal Scalability with Partitioning: Partitioning, or sharding, spreads data across multiple database instances so that the system can handle more user requests. The application has to decide how to route each database request to the correct shard, and it becomes limited in the types of queries it can perform across shards. NoSQL databases such as Amazon DynamoDB or Cassandra are designed to scale horizontally.
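The routing decision that the application has to make can be sketched as hash-based sharding. This is one common scheme among several, shown under assumptions: the shard names are hypothetical, and the key is taken to be a user ID string.

```python
import hashlib

# Hypothetical shard endpoints; each would be a separate DB instance.
SHARDS = ["shard-0.example.internal", "shard-1.example.internal", "shard-2.example.internal"]


def shard_for(user_id, shards=SHARDS):
    """Map a key to a shard with a stable hash.

    hashlib is used instead of the built-in hash(), whose result for
    strings changes between Python processes.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# The same user always lands on the same shard.
first = shard_for("user-1001")
second = shard_for("user-1001")
print(first == second)

# Group users by shard to see how requests would be distributed.
placement = {}
for n in range(1000):
    placement.setdefault(shard_for(f"user-{n}"), []).append(n)
print({shard: len(users) for shard, users in placement.items()})
```

A query that needs rows from several users, such as a join or an aggregate, now has to visit more than one shard, which is the query limitation mentioned above. Simple modulo routing also remaps most keys when the number of shards changes.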