Last November, Amazon Web Services (AWS) announced a new database service, named Aurora, which appears to be a real challenger to commercial database systems. AWS will offer this service at a very competitive price, which they claim is one-tenth that of leading commercial database solutions. Aurora has a few drawbacks some of which are temporary, but the many benefits far outweigh them.
Benefits
Using hardware specially built for Aurora, AWS has come up with a service that tightly integrates database with hardware, Aurora delivers over 500,000 SELECTS/sec and 100,000 updates/sec. That is five times higher than MySQL 5.6 running the same benchmark on the same hardware. This new service utilizes a sophisticated redundancy method of six-way replication across three availability zones (AZs), with continuous backups to AWS Simple Storage Service (S3) to maintain 99.999999999% durability. In the event there is a crash, Aurora is designed to almost instantaneously recover and continue to serve your application data by performing an asynchronous “crash” recovery on parallel threads. Because of the amount of replication, disk segments are easily repaired from other copies that make up the cluster. This ensures that the repaired segment is current, which avoids data loss and reduces the odds of needing to perform a point-in-time recovery from S3. In the event that a point-in-time recovery is needed, the S3 backup can restore to any point in the retention period up to the last five minutes. Aurora also has survivable cache which means the cache is maintained after a database shutdown or restart so there is no need for the cache to “warm up” from normal database use. It also offers a custom feature to input special SQL commands to simulate database failures for testing.
Aurora consists of two types of instances:
- Primary instance – Supports read-write workloads, and performs all of the data modifications to the cluster volume. Each Aurora DB cluster has one primary instance.
- Aurora Replica – Supports only read operations. Each DB cluster can have up to 15 Aurora Replicas in addition to the primary instance, which supports both read and write workloads. Multiple Aurora Replicas distribute the read workload, and by locating Aurora Replicas in separate AZs you can also increase database availability.
Aurora replicas are very interesting. In terms of read scaling, Aurora supports up to 15 replicas with minimal impact on the performance of write operations while MySQL supports up to 5 replicas with a noted impact on the performance of write operations. Aurora automatically uses its replicas as failover targets with no data loss while MySQL replicas can have this done manually with potential data loss. Since these replicas share the same underlying storage as the primary, they lag behind the primary by only tens of milliseconds. For many use cases this might be as good as synchronous.
In terms of storage scalability, I asked AWS some questions about how smoothly Aurora will grant additional storage in the event that an unusually large amount of it is being consumed since they’ve stated it will increment 10GB at a time up to a total of 64TB. I wanted to know where the threshold for the autoscaling was at and if it were possible to push data in faster than it could allocate space. According to the response I received from an AWS representative, Aurora begins with an 80GB volume assigned to the instance and allocates 10GB blocks for autoscaling when needed. The instance has a threshold to maintain at least an eighth of the 80GB volume as available space (this is subject to change). This means whenever the volume reaches 10GB of free space or less, the volume is automatically grown by 80GB. This should provide a seamless experience to Aurora customers since it is unlikely you could add data faster than the system can increment the volume. Also, AWS only charges you for the space you’re actually using, so you don’t need to worry about provisioning additional space.
Aurora also uses write quorums to reduce jitter by sending out 6 writes, and only waiting for 4 to come back. This helps to isolate outliers while remaining unaffected by them, and it keeps your database at a low and consistent latency.
Pricing
For the time being AWS Aurora is free if you can get into the preview which is like a closed beta test. Once it goes live there are a few things to keep in mind when considering the pricing. You pay an hourly price for each instance (primary and replica). Storage is $0.10 a month per 1GB used, and IOs are $0.20 per million requests.
Backups to S3 are free up to the current storage being actively used by the database, but historical backups that are not in use will have standard S3 rates applied. Another option is to purchase reserved instances which can save you money if your database has a steady volume of traffic. If your database has highly fluctuating traffic then on-demand instances tend to be the best so you only pay for what you need.
For full details, please visit their pricing page.
Drawbacks
Currently, the smallest Aurora instances start at db.R3.large and scale up from there. This means once the service goes live there will be no support for smaller instances like they offer for other RDS databases. Tech startups and other small business owners may want to use these more inexpensive instances for testing purposes. So if you want to test out the new Aurora database for free you better try apply for access to the preview going on right now. AWS currently does not offer cross region replicas for Aurora, so all of the AZs are located within a single region. On the other hand, that does mean that latency is very low.
It only supports InnoDB and any tables from other storage engines are automatically converted to InnoDB.
Another drawback is that Aurora does not support multiple tablespaces, but rather one global tablespace. This means features such as compressed or dynamic row format are unavailable, and it affects data migration. For more info about migrations, please visit the AWS documentation page.
Temporary Drawbacks
During the preview, Aurora is only available in the AWS North Virginia data center. The AWS console is the only means for accessing Aurora during the preview, but other methods such as CLI or API access may be added later. Another important thing to note is that during preview, Aurora does not support encryption at rest, but they plan to add it in the near future (probably before the preview is over). Another feature to be added at a later date is the MySQL 5.6 memcached option, which is a simple key-based cache.
Conclusion
All in all, it sounds amazing, and I for one am very excited to see how it will play out once it moves out of the preview phase. Once it is fully released, we may even do a performance experiment to load test a site that relies on an Aurora DB to see how it holds up. If you’re intrigued enough to try and get into the preview, you can sign up for it here.