Noel Anson

Summary

The content provides a comparative guide between Amazon Kinesis and Amazon Managed Streaming for Apache Kafka (MSK) focusing on their capabilities, ease of use, integration with AWS services, cost-effectiveness, and message delivery semantics to determine the best AWS streaming service for real-time analytics.

Abstract

The article compares two AWS streaming data services: Amazon Kinesis and Amazon Managed Streaming for Apache Kafka (MSK), to help organizations decide which service best fits their need for real-time data processing and analytics. Kinesis is noted for its high availability, easy management, and superior integration with other AWS services. It is suitable for low-volume workloads and those needing quick deployment with minimal operational management. MSK, on the other hand, is highlighted as a robust option for high-volume workloads and businesses that prefer or already use open-source Apache Kafka. The article emphasizes the importance of considering availability, scalability, price, and message delivery (Kinesis offers at-least-once delivery, while MSK provides exactly-once semantics) when selecting a streaming service. A cost analysis reveals that while Kinesis has a pay-as-you-go model based on shard-hours and payload, MSK pricing is based on instance and storage volume usage. The choice between Kinesis and MSK will depend on the specific needs of the organization, the scale of data processing, expertise in managing Kafka clusters, and the importance of cost and message delivery guarantees.

Opinions

  • Kinesis is preferred for ease of use and minimal operational management, making it ideal for starting with real-time streaming architectures and low volumes of data.
  • MSK is recommended for organizations with high-volume workloads, existing on-premise Kafka clusters, or specific requirements for exactly-once message delivery.
  • Kinesis is presented as more cost-effective for low to medium volume workloads, while the cost-effectiveness of MSK at high volume workloads depends on optimal cluster configuration.
  • The article suggests that true cost calculations should factor in the operational and team resources required to support and maintain the service, not just service pricing.
  • There is an opinion that AWS's Kinesis, being a fully-managed service, may impose limitations and higher costs at scale compared to a self-managed MSK cluster.
  • It is conveyed that AWS Kinesis offers better integration with other AWS services, which can expedite the development and deployment of real-time analytics solutions within the AWS ecosystem.

A Guide to Choosing the Right AWS Streaming Service: Kinesis vs MSK

Organizations with a relentless focus on their customers need the ability to collect a variety of data points, process and analyze them, then act upon the data as quickly as possible to improve their customers’ experience. As a result, these organizations are moving away from traditional batch data flows and adopting event-driven data pipelines to enable real-time analytics. This is essential in a highly competitive and dynamic market, where we must be able to grow with the needs of our clients.

Building event-driven systems is no trivial task. One of the more complex decisions is the careful selection of a scalable, durable, highly available streaming platform that is able to best collect, store, and analyze events. While there are services from multiple cloud providers to choose from, this post focuses on the options within the AWS platform.

OK, what are my options?

There are two AWS services to choose from:

  • Amazon Kinesis
  • Amazon Managed Streaming for Apache Kafka (Amazon MSK)

Both services are publish-subscribe (pub-sub) systems, which means producers publish messages to Kinesis/MSK and consumers subscribe to Kinesis/MSK to read those messages. An inherent benefit of adopting pub-sub systems is the decoupling of message producers from message consumers. Producers can produce messages at an incredibly fast pace without worrying about the messages’ downstream consumption, while consumers gain the benefit of consuming messages at a pace that does not overwhelm their resources.
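The decoupling described above can be illustrated with a toy, in-memory stand-in for a stream. This is only a sketch: `InMemoryStream`, `publish`, and `poll` are hypothetical names for illustration, not part of the Kinesis or Kafka APIs. It shows producers appending to a log while each consumer tracks its own read position and reads at its own pace.

```python
class InMemoryStream:
    """Toy append-only log standing in for a Kinesis stream or Kafka topic.

    Hypothetical illustration only; real services add partitioning,
    durability, and retention on top of this idea.
    """

    def __init__(self):
        self._log = []       # messages in arrival order
        self._offsets = {}   # consumer name -> index of next message to read

    def publish(self, message):
        # Producers only append; they never wait for consumers.
        self._log.append(message)

    def poll(self, consumer, max_records=10):
        # Each consumer advances its own offset independently.
        start = self._offsets.get(consumer, 0)
        batch = self._log[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


stream = InMemoryStream()
for i in range(5):
    stream.publish(f"event-{i}")

fast_batch = stream.poll("fast-consumer", max_records=5)  # reads everything at once
slow_batch = stream.poll("slow-consumer", max_records=2)  # reads only two for now
print(len(fast_batch), len(slow_batch))
```

Because the slow consumer keeps its own offset, its next `poll` picks up where it left off, regardless of how far ahead the fast consumer is.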

Kinesis is AWS's principal service that provides powerful capabilities to collect, process, and analyze real-time streaming data. It is a fully managed service allowing you to build streaming applications while abstracting away the underlying infrastructure. In 2019, AWS made Amazon Managed Streaming for Apache Kafka (MSK) generally available. Apache Kafka is a distributed open source streaming platform originally developed at LinkedIn and later donated to the Apache Software Foundation. MSK takes away much of the operational burden of managing an Apache Kafka cluster.

Either service can scale to process petabyte-scale data volumes from a large number of sources with millisecond latency, enabling real-time analytics. However, selecting the right one for your use case requires careful consideration of the effort involved in scaling these services, the ease of development with them, and the cost of adopting them in your architecture.

Noteworthy: The original creators of Kafka also started Confluent and built the Confluent Platform as a fully managed enterprise event streaming platform that can be run on AWS, GCP, and Microsoft Azure. For the purpose of this post we are limiting our comparison to AWS services only.

Tell me, which one is better?

There are four key considerations to make when determining which service better fits your use case. Let’s get started!

1. Availability, Scalability, and Ease of Management

Kinesis synchronously replicates data across three availability zones providing high availability and data durability by default. AWS manages the infrastructure, storage, networking, and configuration needed to collect and store streaming data. You start by picking a name for the stream and selecting the number of shards. One shard provides an ingest capacity of 1MB/sec or 1000 records/sec. Kinesis provides auto-scaling capabilities using APIs that can trigger scaling actions based on usage metrics. AWS also provides utilities that can be used to auto-scale a Kinesis stream based on record throughput. The Kinesis streams remain fully functional during the scaling process and producers & consumers can continue to read/write to the streams during these operations.
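As a back-of-the-envelope aid, the per-shard ingest limits quoted above (1 MB/sec or 1000 records/sec) can be turned into a shard count. The helper below is a minimal sketch; `shards_for_ingest` is a made-up name, and it assumes traffic is spread evenly across shards, which a skewed partition key would break.

```python
import math

# Per-shard ingest limits as stated in the text above.
SHARD_INGEST_MB_PER_SEC = 1.0
SHARD_INGEST_RECORDS_PER_SEC = 1000


def shards_for_ingest(mb_per_sec, records_per_sec):
    """Smallest shard count that satisfies both ingest limits.

    Assumes records are distributed evenly across shards.
    """
    by_bytes = math.ceil(mb_per_sec / SHARD_INGEST_MB_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_INGEST_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


# Many small records: the record-count limit dominates.
print(shards_for_ingest(mb_per_sec=5, records_per_sec=12000))  # 12
# Few large records: the byte limit dominates.
print(shards_for_ingest(mb_per_sec=10, records_per_sec=2000))  # 10
```

Whichever limit is hit first determines the shard count, so small, frequent records can need more shards than their byte volume alone suggests.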

MSK requires a cluster sizing exercise prior to resource provisioning. This exercise must factor in your use case, availability, and scaling needs. AWS provides guidelines to size your cluster, but these tend to be a directional starting point. Identifying the right cluster size is an iterative process and requires cluster management expertise. There are some MSK defaults that you can rely on to gain high availability. For example, when you configure an MSK cluster, the brokers are spread across three availability zones by default. If one availability zone goes down, the system is able to recover with no data loss — as long as replication has been enabled. Using MSK APIs you can scale out your MSK cluster by adding more brokers. At the time of writing this post, MSK cluster brokers cannot be scaled up.

If you are starting with real-time streaming architectures and are working with low volumes of data, Kinesis is a preferred service because of its ease of use and minimal operational management. However, if you currently operate a Kafka cluster on-premises or have high volume workloads and are evaluating your options in AWS, MSK may be the correct choice.

2. Integration with other AWS Services

Real-time stream processing is an essential component of streaming pipelines. Computations like filters, joins, type conversions, and aggregation windows are ubiquitous operations that derive insights from data. Your choice of platform affects the stream processors available to you. Kinesis tightly integrates with multiple AWS services, making processing real-time data simple and accessible. This integration with the AWS ecosystem drastically reduces the time it takes to set up your data pipeline and start serving value back to customers.

With Kinesis, you can build streaming applications using:

  • Kinesis Data Firehose: To load data into S3, Redshift, or Amazon Elasticsearch Service.
  • Kinesis Data Analytics: To build and deploy SQL or Flink applications.
  • AWS Lambda: Serverless compute to perform custom stream processing.
  • Amazon EMR: To process big data leveraging the Spark or Flink framework.
  • EC2 / Fargate / EKS: To build custom streaming applications.

With MSK, you can build streaming applications using:

  • Kinesis Data Analytics: To build and deploy SQL or Flink applications.
  • Amazon EMR: To process big data leveraging the Spark or Flink framework.
  • EC2 / Fargate / EKS: To build custom streaming applications.

There are some very powerful, easy-to-use, open source streaming frameworks specific to Kafka, like KSQL and the Kafka Streams API, that expedite pipeline development. However, these do require operational expertise to be deployed in a scalable manner.

Since Kinesis is AWS’s principal streaming service, it is no surprise that it provides superior integration with other AWS services. If these AWS services are already part of your organization’s toolkit, your timeline to build and deploy a real-time analytics solution will be significantly shorter in comparison.

3. Price and Cost

As a fully managed streaming service, Kinesis uses a pay-as-you-go pricing model. Pricing is based on shard hours and on PUT payload units, where each 25 KB of a record's payload counts as one unit. One shard provides ingest capacity of 1 MB/sec or 1000 records/sec. Shard hours are the number of shards in your stream multiplied by the hours they run, charged at an hourly rate. With MSK, you pay for the number of instances in your cluster and the storage volumes attached to them. Use these pricing examples to calculate the cost for your use case:

  • Kinesis pricing example
  • MSK pricing example
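To make the two pricing models concrete, here is a rough cost sketch. Everything numeric in it is an assumption: the function names are hypothetical, 730 is used as an average number of hours in a month, and the prices passed in the example are placeholders rather than real AWS rates. Look up current prices for your region before relying on any result.

```python
import math

HOURS_PER_MONTH = 730  # assumption: average hours in a month
PUT_UNIT_KB = 25       # payload is billed in 25 KB units, per the text above


def kinesis_monthly_cost(shards, records_per_sec, avg_record_kb,
                         shard_hour_price, price_per_million_put_units):
    """Shard-hour charge plus payload charge for one month."""
    shard_cost = shards * HOURS_PER_MONTH * shard_hour_price
    # A record is rounded up to whole 25 KB units.
    units_per_record = math.ceil(avg_record_kb / PUT_UNIT_KB)
    records_per_month = records_per_sec * 3600 * HOURS_PER_MONTH
    put_units = records_per_month * units_per_record
    return shard_cost + put_units / 1_000_000 * price_per_million_put_units


def msk_monthly_cost(brokers, broker_hour_price,
                     storage_gb_per_broker, storage_price_per_gb_month):
    """Broker instance charge plus attached storage charge for one month."""
    instance_cost = brokers * HOURS_PER_MONTH * broker_hour_price
    storage_cost = brokers * storage_gb_per_broker * storage_price_per_gb_month
    return instance_cost + storage_cost


# Placeholder prices for illustration only; they are NOT real AWS prices.
kinesis = kinesis_monthly_cost(shards=2, records_per_sec=100, avg_record_kb=30,
                               shard_hour_price=0.01,
                               price_per_million_put_units=0.01)
msk = msk_monthly_cost(brokers=3, broker_hour_price=0.2,
                       storage_gb_per_broker=100,
                       storage_price_per_gb_month=0.1)
print(f"Kinesis: {kinesis:.2f} per month, MSK: {msk:.2f} per month")
```

The shape of the two formulas is the useful part: Kinesis cost moves with shards and record volume, while MSK cost is fixed by the cluster you provision, whether or not you use its full capacity.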

For low volume workloads (up to 10 MB/sec), Kinesis is cheaper to set up and operate compared to the fixed cost of setting up and operating an MSK cluster to process the same volume. The smallest recommended MSK cluster you can provision currently is a 3x m5.large cluster, which is more expensive than the minimum 1 shard stream you can create with Kinesis.

For medium volume workloads (up to 100 MB/sec) and high volume workloads (up to 1000 MB/sec and above), there are other factors that play a role in determining cost. Kafka configuration settings dictate the performance you gain from your MSK cluster: number of partitions, replication factor, compression type, and security protocol are a few of many configurations to consider. The smaller the cluster you can tune to handle your workload, the less you pay for it. With Kinesis, there is little configuration needed in comparison, so the cost scales directly with the number of shards used.

It is important to consider the limitations of the Kinesis service, which can make it an expensive solution at scale. A shard in Kinesis supports reading data at a maximum of 2 MB/sec. To gain 4 MB/sec read performance you would have to distribute your data across 2 shards, thereby doubling your cost. By default, that 2 MB/sec per shard of output is shared between all the consumers of data in that shard. If you need to provide each consumer with 2 MB/sec of throughput, you have to use enhanced fan-out at an additional price, which makes the service very expensive. With MSK, you can configure your data to be retrieved at a lower per-unit price at higher throughput in comparison.
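The effect of the shared read limit can be worked through with a small calculation. This is a sketch under stated assumptions: the function names are hypothetical, every consumer is assumed to read the full stream, and data is assumed to be spread evenly across shards.

```python
import math

# Per-shard limits as stated in the text above.
SHARD_READ_MB_PER_SEC = 2.0
SHARD_INGEST_MB_PER_SEC = 1.0


def per_consumer_read_mb_per_sec(consumers, enhanced_fan_out=False):
    """Read throughput each consumer gets from a single shard."""
    if enhanced_fan_out:
        # With enhanced fan-out, each consumer gets the full 2 MB/sec.
        return SHARD_READ_MB_PER_SEC
    # By default the 2 MB/sec is shared between all consumers of the shard.
    return SHARD_READ_MB_PER_SEC / consumers


def shards_for_shared_read(ingest_mb_per_sec, consumers):
    """Shards needed so every consumer keeps up without enhanced fan-out.

    Assumes each consumer reads the whole stream and load is evenly spread.
    """
    for_ingest = math.ceil(ingest_mb_per_sec / SHARD_INGEST_MB_PER_SEC)
    for_reads = math.ceil(ingest_mb_per_sec * consumers / SHARD_READ_MB_PER_SEC)
    return max(1, for_ingest, for_reads)


print(per_consumer_read_mb_per_sec(consumers=4))                   # 0.5
print(shards_for_shared_read(ingest_mb_per_sec=10, consumers=2))   # 10
print(shards_for_shared_read(ingest_mb_per_sec=10, consumers=4))   # 20
```

With one or two consumers the ingest limit sets the shard count, but from the third consumer onward the shared read limit takes over, and you pay for shards you do not need for ingest.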

Ultimately, true cost calculations must include the cost of the teams needed to support and maintain the service, along with the effort of implementing best practices, managing resources efficiently, and prioritizing cost optimization while maintaining the elasticity to meet customer demand.

4. Message Delivery

Message delivery semantics are critical to building fault-tolerant streaming data pipelines. Both Kinesis and MSK are distributed streaming platforms where producers, consumers, and the streaming platform itself can fail independently of each other. Therefore, it is critical to understand the design implications of message delivery guarantees for a streaming platform to be able to build fault-tolerant real-time analytics applications. There are three message delivery semantics:

  • At-least-once delivery: In the event a system fails or a network issue occurs, a message producer may continue to retry a message until it receives a successful acknowledgement. This can cause message duplication in the streaming platform. Hence, consumer applications must handle these scenarios by explicitly de-duplicating messages.
  • At-most-once delivery: If message producers are configured to not retry messages, it can lead to data loss if the streaming platform fails to commit and acknowledge the message. Consumers are only guaranteed messages that were successfully written to the streaming platform. Data loss is a serious problem for most businesses, but its significance can vary based on your use case, especially if the original message can be recovered or reproduced easily.
  • Exactly-once delivery: The perfect world where a producer sends a message, it is written exactly once to the streaming platform, and no duplication of messages or data loss occurs when a consumer reads that message.

Kinesis provides at-least-once message delivery, while MSK (Kafka) can provide exactly-once semantics when its idempotent producer and transactions API are used. The amount of complexity you are willing to take on in building your application will help inform your decision. If you select Kinesis, your application must anticipate and handle duplicate records, following the guidance AWS provides. If you select MSK (Kafka), it is important to read and understand the usage of its transactions API to utilize exactly-once delivery.
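One common way to tolerate at-least-once delivery is to make the consumer idempotent by skipping records it has already handled. The sketch below assumes each record carries a unique `id` field assigned by the producer, which is an assumption and not something Kinesis adds for you. `process_once` and the in-memory `seen_ids` set are hypothetical; a real application would keep the seen IDs in a durable store so they survive restarts.

```python
def process_once(records, seen_ids, handler):
    """Call handler for each record whose id has not been seen before.

    Assumes every record is a dict with a producer-assigned unique "id".
    Returns the number of records actually handled.
    """
    handled = 0
    for record in records:
        record_id = record["id"]
        if record_id in seen_ids:
            continue  # duplicate caused by a producer or consumer retry
        handler(record)
        # Mark as seen only after the handler succeeds, so a failure
        # leaves the record eligible for a retry.
        seen_ids.add(record_id)
        handled += 1
    return handled


seen = set()      # in-memory for illustration; use a durable store in practice
results = []
batch = [
    {"id": "a1", "value": 10},
    {"id": "a2", "value": 20},
    {"id": "a1", "value": 10},  # redelivered duplicate
]
count = process_once(batch, seen, results.append)
print(count)  # 2
```

Marking the ID after the handler runs keeps the at-least-once guarantee intact, so the handler itself should still be safe to run twice if a crash happens between the two steps.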

Conclusion

There may be other service-specific features, such as producer/consumer libraries written in your preferred programming language or ease of monitoring, that seem extremely appealing and make this decision a challenging one. That is why I recommend staying laser-focused on the objective that started you on your journey: improving customer experience. If you’re looking for a solution that can quickly get you started, is fully managed, and does not require multidisciplinary expertise, consider Kinesis. If you have the resources to build and manage a highly configurable streaming platform and find value in adopting open source technology (Kafka), MSK will serve you better.

References

Amazon Kinesis

Easily collect, process, and analyze video and data streams in real time Amazon Kinesis makes it easy to collect…

aws.amazon.com

Amazon MSK - Amazon Web Services (AWS)

Fully managed, highly available, and secure Apache Kafka service Amazon MSK is a fully managed service that makes it…

aws.amazon.com

Exactly-once Semantics is Possible: Here's How Apache Kafka Does it

I'm thrilled that we have hit an exciting milestone the Kafka community has long been waiting for: we have introduced…

www.confluent.io
