Serverless Storage
Lambda functions and Fargate serve a great purpose and might accomplish what you need for a very small, hello world-type project, but chances are you want to do a bit more with your serverless applications. Maybe you have a subscription-based product and you have user profiles with data. Or maybe your application is an e-commerce site and you need to store sensitive information, like credit card details. Or maybe you’re just trying to save some images or files. Whatever the reason, you clearly need a storage solution, and this is where our serverless storage options come in.
Depending on what you’re storing, how frequently you access stored items, and how you want to interact with them, you have a number of choices for file and image stores and serverless databases. Over the years, AWS has released many options (some, one may argue, don’t really belong in the ‘serverless’ category even when named as such), but the number one choice in AWS’s humble beginnings was Amazon S3.
Amazon S3
S3 is short for ‘Simple Storage Service’, and it is what is known as object storage. You can store any number of items in S3, from documents to images, videos, entire file system backups, and so on. In order to store these resources, you create your own storage space, known as an S3 ‘Bucket’. The easiest way to describe a bucket is to compare it to a folder on your computer. Your ‘Documents’ folder, for instance, may contain all of your Word documents, and you may have subfolders within it, say for ‘recipes’ or ‘projects’ or ‘schoolwork’. S3 buckets operate similarly, and just like your Documents folder, a bucket can hold subfolders and all different resource types (under the hood, a ‘folder’ is just a shared prefix in the names of your objects). S3 is also a very popular choice for hosting static websites, and just like any other cloud offering, there are plenty of security controls for you to take advantage of, like bucket encryption, read-only access, and more.
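To make the folder comparison a bit more concrete, here is a minimal sketch in plain Python (no AWS calls) of how a flat list of object keys can be presented as folders by splitting on a prefix and a delimiter. The function name `list_folder` and the sample keys are made up for illustration; the real S3 listing API works along similar lines but has more options than this.

```python
from typing import Iterable


def list_folder(
    keys: Iterable[str], prefix: str = "", delimiter: str = "/"
) -> tuple[list[str], list[str]]:
    """Mimic a prefix + delimiter listing over a flat set of object keys.

    Returns (objects directly under the prefix, "subfolder" prefixes).
    """
    objects: list[str] = []
    subfolders: set[str] = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # More path remains after the delimiter, so group it as a "subfolder".
            subfolders.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(subfolders)


# Hypothetical keys in a single bucket; there are no real directories here.
KEYS = [
    "documents/resume.docx",
    "documents/recipes/lasagna.docx",
    "documents/recipes/pho.docx",
    "documents/projects/roadmap.docx",
    "images/logo.png",
]

files, folders = list_folder(KEYS, prefix="documents/")
print(files)    # ['documents/resume.docx']
print(folders)  # ['documents/projects/', 'documents/recipes/']
```

The takeaway is that "moving into a folder" is really just filtering on a longer prefix, which is why naming your objects consistently pays off later.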
Of course, there are tons of other great advantages to using S3 buckets. Many of AWS’s other services, like Lambda, integrate directly with S3, making it easy to read from and write to your bucket. S3 also has different storage classes, like Intelligent-Tiering and the Glacier classes, that are more cost-effective for items that are accessed infrequently or for very large storage amounts. One thing to keep in mind: although each bucket lives in a specific Region, bucket names share a global namespace, meaning that the name of your bucket must be unique across all AWS accounts and Regions within a partition (so yes, it is likely ‘test-bucket’ is taken). Whatever your storage need may be, S3 can be an effective solution, and one that is often cheaper and easier to secure than an on-premises solution or an office filing cabinet.
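Since short, obvious bucket names tend to be taken, one common workaround is to append a random suffix to a readable base name. The sketch below is only an illustration: the helper names are hypothetical, and the check covers just the core naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). A random suffix makes a collision unlikely, not impossible, so only the actual create call can confirm the name is free.

```python
import re
import uuid

# Core naming rules only. S3 has a few further restrictions (for example,
# names formatted like IP addresses) that this sketch does not check.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if the name passes the core bucket naming rules."""
    return _BUCKET_NAME.fullmatch(name) is not None and ".." not in name


def unique_bucket_name(base: str) -> str:
    """Append a 12-character random hex suffix to a readable base name."""
    candidate = f"{base}-{uuid.uuid4().hex[:12]}"
    if not is_plausible_bucket_name(candidate):
        raise ValueError(f"not a valid bucket name: {candidate!r}")
    return candidate


# The suffix is random, so the output differs on every run.
print(unique_bucket_name("test-bucket"))
```

Some teams prefer a deterministic suffix, such as an account ID plus a Region, so names stay predictable across environments; the random version is simply the shortest to show.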
Amazon DynamoDB
As developers, sometimes we are working with larger, very structured datasets that have specific access patterns. For these, a database solution makes much more sense, and luckily we have a serverless option for that too! DynamoDB is a NoSQL database solution that offers lightning-fast performance, ‘near unlimited’ throughput and storage, and backup, restore, and multi-region replication to help keep your data available and secure. You can interact with your database using queries, scans, reads and writes, and more, all from the AWS console or the comfort of your favorite programming language.
DynamoDB is a powerful service, but you really will not be able to capitalize on all it has to offer unless you structure your data for Single Table Design. I will admit, mastering the art of Single Table Design (I just tried to abbreviate that, and I will never do that again) is not easy; I would go so far as to say that it is probably one of the most difficult parts of serverless development (if you’re having trouble, the resources Alex DeBrie puts out for this are unparalleled). You really need to have a good grasp on your dataset and understand ALL of your access patterns up front, so you can plan your Primary Keys (PKs) and Global Secondary Indexes (GSIs) accordingly. I don’t want to go too in-depth into this topic, since there are literally full books on the subject, but just know that although I think the advantages you get when using DynamoDB to its full potential are well worth it, this can be a steep learning curve, and DynamoDB isn’t going to be the database solution for everyone.
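To give a feel for what planning keys around access patterns means, here is a small sketch in plain Python (no AWS calls). It assumes a composite primary key made of a partition key and a sort key, stored in attributes named `PK` and `SK`; the `CUSTOMER#<id>` and `ORDER#<date>` formats, the sample items, and the `query` helper are all hypothetical, and a real DynamoDB query has more moving parts than this.

```python
from typing import Any

Item = dict[str, Any]

# One "table" holding two entity types: customer profiles and their orders.
TABLE: list[Item] = [
    {"PK": "CUSTOMER#42", "SK": "PROFILE", "name": "Ada", "plan": "pro"},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2024-01-15", "total": 30},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2024-03-02", "total": 75},
    {"PK": "CUSTOMER#7", "SK": "PROFILE", "name": "Lin", "plan": "free"},
    {"PK": "CUSTOMER#7", "SK": "ORDER#2024-02-20", "total": 12},
]


def query(table: list[Item], pk: str, sk_prefix: str = "") -> list[Item]:
    """Return items in one partition whose sort key starts with sk_prefix,
    ordered by sort key."""
    matches = [
        item for item in table
        if item["PK"] == pk and item["SK"].startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item["SK"])


# Access pattern 1: fetch a customer's profile.
profile = query(TABLE, "CUSTOMER#42", "PROFILE")[0]
print(profile["name"])  # Ada

# Access pattern 2: list a customer's orders, oldest first.
orders = query(TABLE, "CUSTOMER#42", "ORDER#")
print([o["SK"] for o in orders])  # ['ORDER#2024-01-15', 'ORDER#2024-03-02']

# Access pattern 3: fetch the profile and all orders in a single lookup.
everything = query(TABLE, "CUSTOMER#42")
print(len(everything))  # 3
```

Notice that each access pattern is answered by the key design alone, with no joins and no filtering on other attributes. A pattern the keys do not cover, such as "all orders over a certain total", is where a secondary index (or a rethink of the keys) comes in.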
Amazon EFS, Amazon RDS Proxy, Amazon Aurora Serverless, Amazon Redshift Serverless, Amazon Neptune Serverless
This is my serverless storage catch-all section. In my opinion, S3 and DynamoDB are really the storage solutions you need for serverless development; however, you may find some use cases for these remaining options. EFS is the Elastic File System, a shared file system that automagically grows and shrinks as you add and remove files (S3 covers many file storage situations, but if you specifically need a file system you can mount, you may want to use EFS). RDS Proxy sits between your application and a relational database, pooling and sharing connections so that a burst of Lambda invocations doesn’t exhaust the database’s connection limit. Aurora, Redshift, and Neptune Serverless are the on-demand scaling versions of AWS’s relational database, data warehouse, and graph database offerings, respectively. Whether some of these really qualify as serverless is up for debate in the serverless community, but since they are not services I have had a use for in my short career, I will abstain from weighing in here.
Well, this about sums up the serverless storage solutions available. As always, there are definitely equivalents of these resources in your cloud provider of choice, and you can’t go wrong with any service you choose, as long as you’re evaluating based on your specific application needs. Join me tomorrow, as we take another step further into serverless with API design.*
*This is part of a series that will be covered here, but I also encourage you to follow along with the rest of the series on 90DaysOfDevOps.