Serverless technology is all the rage these days for many reasons. At the top of the list is cost-effectiveness. Since any company that adopts serverless no longer has to worry about building and managing costly infrastructure, that money can be used for more important things, such as building and deploying more robust and feature-rich applications.
For those who have yet to run into the serverless concept, the name can be a bit confusing. There are servers involved, but they are typically virtual machines running on cloud-hosted infrastructure. The idea is that you aren’t in charge of the servers; someone else is. Ergo, your company can run all its apps and services without having a single server in your building.
Hence, serverless.
Now, there are all types of services that run on serverless environments, and all the big players offer serverless deployments. From AWS to Google to Azure, you’ll find serverless a part of just about everything.
Now that you understand the basic concept of serverless, you might want to know about some of the technology that runs on serverless environments. Let’s take a look at two particular tools, both of which run on AWS: AWS Athena and AWS Glue.
AWS Athena vs. AWS Glue
Feature | Athena | Glue |
---|---|---|
Creation Year | 2016 | 2017 |
Creator | Amazon Web Services | Amazon Web Services |
Documentation | Athena Docs | Glue Docs |
Language | SQL queries | ETL scripts in Python or Scala (Apache Spark) |
TIOBE Rating | Not ranked in TIOBE index | Not ranked in TIOBE index |
Popularity | Growing rapidly (Google Trends Data) | Growing rapidly (Google Trends Data) |
Applications | Interactive query and analysis on data stored in Amazon S3 | Data integration, ETL (Extract, Transform, Load) |
Performance | Athena is optimized for quick and interactive query performance on large-scale datasets. | Glue offers good performance for data integration and ETL workflows. |
Stability | Athena has been stable and reliable over the years, handling large-scale workloads. | Glue is generally stable with regular updates from AWS. |
Learning Curve | Moderate – Users need to learn SQL for querying data and managing data partitions. | Moderate – Users need to understand AWS Glue’s ETL concepts and integration with other AWS services. |
Community Support | Athena has a large and active community, with extensive documentation and AWS support resources. | Glue has a growing community with various online learning resources and active AWS forums. |
Development Time | Athena offers quick and easy setup, especially when analyzing data stored in Amazon S3. | Glue provides a quick way to set up ETL jobs with its managed service. |
Key Advantages | | |
Key Disadvantages | | |
Famous Companies | FINRA, Siemens, Lenovo, Red Hat, etc. | 21st Century Fox, Expedia Group, Autodesk, Merck, etc. |
Cross-Platform Support | Web-based console accessible from various devices; specific to the AWS ecosystem | Web-based console accessible from various devices; specific to the AWS ecosystem |
First, let’s start by saying this isn’t an either-or proposition. AWS Athena and AWS Glue are simultaneously complementary and competitive. You can use them separately, but you benefit most from using them together. But what are these two pieces of technology? Let’s take a look.
AWS Athena
AWS Athena is an interactive query service used to analyze data stored in Amazon S3. Athena is easy to use because the two services are already interconnected. All you have to do is point Athena at your S3 data, define the schema you want to use, and start querying with SQL.
Even better, you don’t have to load or move your data before querying it; Athena reads it in place from S3. Because of this, all you need is basic SQL skills to analyze even massive and complex datasets housed in S3.
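As a rough sketch of that point, define, and query flow, the Python below builds a `CREATE EXTERNAL TABLE` statement and then shows how a query could be submitted through boto3's Athena client. The bucket, database, table, and column names are all hypothetical, and the DDL assumes comma-separated files with a header row; treat it as an illustration under those assumptions rather than a production setup.

```python
def build_table_ddl(database, table, columns, s3_location):
    """Return a CREATE EXTERNAL TABLE statement for CSV data stored in S3.

    `columns` maps column names to Athena/Hive types, e.g. {"id": "int"}.
    """
    column_lines = ",\n".join(
        f"  `{name}` {col_type}" for name, col_type in columns.items()
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"{column_lines}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'\n"
        "TBLPROPERTIES ('skip.header.line.count'='1')"
    )


def run_query(sql, database, output_location):
    """Submit a statement to Athena and return its query execution ID.

    Needs boto3 and valid AWS credentials. boto3 is imported lazily so the
    string-building helper above can be used without it.
    """
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical names throughout: replace with your own database and bucket.
ddl = build_table_ddl(
    database="sales_db",
    table="orders",
    columns={"order_id": "string", "amount": "double"},
    s3_location="s3://example-bucket/orders/",
)
query = "SELECT order_id, amount FROM sales_db.orders WHERE amount > 100 LIMIT 10"
print(ddl)

# With credentials configured, the two statements could be submitted like this:
# run_query(ddl, "sales_db", "s3://example-bucket/athena-results/")
# run_query(query, "sales_db", "s3://example-bucket/athena-results/")
```

Athena runs queries asynchronously, so the returned execution ID would then be used to poll for status and fetch results.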
The benefits of Athena include:
- It’s serverless, with no infrastructure to manage and no setup fees.
- Tap into all your data with basic SQL knowledge.
- Pay per query means you’re only paying for what you do.
- Supports most standard data formats, such as CSV, JSON, ORC, Avro, and Parquet.
- Queries run in parallel, so results come back quickly even on large datasets.
So, if you have massive amounts of data stored in Amazon S3, your best route for analyzing and querying is probably AWS Athena.
AWS Glue
AWS Glue is an ecosystem of tools to empower schema discovery and ETL (Extract, Transform, Load) using auto-generated scripts. To put it more succinctly, AWS Glue is a serverless data integration service that eases the complexity of discovering, preparing, moving, and integrating data from multiple sources.
The most widely used tools in Glue are the Glue Data Catalog (a serverless Hive-compatible metastore that can be used in place of a self-managed Hive metastore) and Glue ETL (a Spark service that allows customers to run Spark jobs without having to first deal with complicated configurations or manage the Spark infrastructure).
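To make the Extract, Transform, Load pattern concrete, here is a minimal plain-Python sketch of the three stages. It does not use Glue's APIs or Spark, and the sample data, field names, and cleaning rules are invented for illustration. A Glue ETL job follows the same pattern, but runs it as a Spark job over data in S3 and other stores.

```python
import csv
import io
import json

# Invented sample data standing in for a raw file in a source system.
RAW_CSV = """order_id,amount,country
A1,19.99,us
A2,,de
A3,250.00,US
"""


def extract(csv_text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: drop rows with no amount, fix types, normalize country codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append({
            "order_id": row["order_id"],
            "amount": float(row["amount"]),
            "country": row["country"].upper(),
        })
    return cleaned


def load(rows):
    """Load: serialize rows as JSON Lines, standing in for a write to the target store."""
    return "\n".join(json.dumps(row) for row in rows)


output = load(transform(extract(RAW_CSV)))
print(output)
```

Keeping the three stages as separate functions mirrors how ETL jobs are usually organized: each stage can be tested or swapped out on its own.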
AWS Glue can be used for:
- Analytics
- Machine learning
- Application development
- Authoring and running jobs
- Implementing business workflows
Where AWS Athena is a means to interact with data, AWS Glue makes it easier for you to integrate data from multiple storage services.
One very nice feature of AWS Glue is AWS Glue Studio, a graphical interface that simplifies creating, running, monitoring, and managing all your data integration jobs. With it you can create workflows visually and run them on Glue’s serverless, Apache Spark-based ETL engine.
AWS Glue Studio also simplifies the task of gathering, transforming, and cleaning your data.
What Companies Use AWS Athena?
Companies that are using or have used AWS Athena include the following:
- FINRA
- Siemens
- Cargotec
- Betterment
- Lockheed Martin Corporation
- LiveIntent
- Lenovo Group Ltd
- Red Hat, Inc.
The top industries that work with AWS Athena include:
- Information technology
- Computer software
- ISPs
- Financial services
- Marketing and advertising
- Insurance
- Telecommunications
- Retail
- Hospital and healthcare
- Higher education
AWS Athena also partners with Upsolver (a data lake ETL service) and Trianz (which accelerates and scales solutions for both Athena and AWS services).
What Companies Use AWS Glue?
Companies that use AWS Glue include:
- 21st Century Fox
- News Corp
- Apps Associates
- OLX Group
- OST
- myTomorrows
- Full 360
- Upserve
- Merck
- Expedia Group
- Autodesk
- NTT Docomo
- HappyFresh
- Pilot Flying J
- Unicorn
- MediaMath
- Amazon Prime Air
How Much Do These Services Cost?
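As with most everything in the cloud, the cost of AWS Athena and AWS Glue depends on how much you use them. Both are pay-as-you-go services: Athena bills by the amount of data each query scans, while Glue bills mainly by the compute time its jobs and crawlers consume. To estimate your own costs, take a look at the handy price calculators for AWS Athena and AWS Glue.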
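For a rough sense of how pay-as-you-go billing adds up, the sketch below estimates charges from usage. The rates in it are placeholder assumptions (5 USD per terabyte scanned for Athena and 0.44 USD per DPU-hour for Glue), not quoted prices. Actual rates vary by region and change over time, so substitute figures from the official pricing pages or calculators. The sketch also ignores minimum charges and other billing details.

```python
# Placeholder rates: assumptions for illustration, not authoritative prices.
ATHENA_USD_PER_TB_SCANNED = 5.00
GLUE_USD_PER_DPU_HOUR = 0.44

BYTES_PER_TB = 1024 ** 4


def athena_query_cost(bytes_scanned, usd_per_tb=ATHENA_USD_PER_TB_SCANNED):
    """Estimate the cost of one Athena query from the bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


def glue_job_cost(dpus, runtime_minutes, usd_per_dpu_hour=GLUE_USD_PER_DPU_HOUR):
    """Estimate the cost of one Glue job run from its DPU count and runtime."""
    return dpus * (runtime_minutes / 60) * usd_per_dpu_hour


# Hypothetical workload: one query scanning 200 GiB, one 30-minute job on 10 DPUs.
query_cost = athena_query_cost(200 * 1024 ** 3)
job_cost = glue_job_cost(dpus=10, runtime_minutes=30)
print(f"Athena query: ${query_cost:.2f}")
print(f"Glue job:     ${job_cost:.2f}")
```

Because the Athena estimate scales with bytes scanned, anything that reduces the data a query reads, such as partitioning or a columnar format like Parquet, lowers the bill.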
Conclusion
If you’re serious about getting more out of the data you host on Amazon S3, you should consider looking into both AWS Athena and AWS Glue. These two pieces of serverless technology can transform how your company uses data and greatly improve how your development and Ops teams work with, analyze, and manage it.