In the last few years, event-driven architecture has gained tremendously in popularity. COVID-19 was a major factor in that rise, forcing companies in many industries, especially retail, to adopt digital transformation.
As companies rushed to establish their presence online, some had the appropriate tools and skills to manage the migration while others did whatever was necessary to keep things running. As the storm settles, with COVID-19 thankfully in the rearview mirror, companies now have some breathing room to take a step back and solidify their architecture.
A core component of a mature architecture is an event broker which is crucial in enabling you to implement event-driven architecture. It allows you to decouple your applications, migrate from monolithic applications to microservices, efficiently stream real-time data across datacenters and a lot more.
But as with many things, there are a lot of different options available and some architects might find themselves overwhelmed by them. A specific event broker might seem like an obvious option to them because they read about it in some book, or saw a tutorial on it on YouTube, or maybe just because it was the broker that their previous company used.
In my role as a solutions architect, I occasionally have conversations with enterprise architects about what factors they should consider when picking an event broker that is the best for their use case. In this post, my goal is to share my thoughts and my experience and how I have seen some very senior enterprise architects successfully pick an event broker.
Here is my list (in no particular order):
1. Define your Use Case
There is no such thing as a right or wrong technology. It’s either right for your specific use case or it is not. So, before you start exploring and speaking with vendors about what they have to offer, spend some time defining your use case and the key problems the solution should be able to solve.
This might seem very obvious to you, but I have been on many calls where the architects had not put in writing what they were looking to achieve with an event broker. Because there are countless system design videos on YouTube these days, many technologists believe they must have a certain technology to be able to design a successful system. What they don’t really focus on is the problem that component is designed to solve and whether that’s a problem of concern for them.
We have all seen videos of how Netflix, Instagram, or Tinder have scaled their applications for millions of users with a certain technology stack composed of different components. But does the problem you are trying to solve require you to work at Netflix’s scale? Do you really need a load balancer or a cache? If you do, write down the reasons. What problems are you trying to solve? Write them down. Take a step back and think from the perspective of the business your org supports. What are the benefits to your business of implementing this solution? Will it increase system availability and bring in more orders? Will it help process the orders faster and generate more revenue?
As a sample, here is a snippet of a use case that I found to be helpful:
2. Have an Open Mind
I have been in way too many meetings, more than I would like to admit, where I am asked questions like “do you have KSQL?”. No, but let’s talk about why you are asking and whether you really need that specific feature. The question is then eventually changed to “I would like my application to be able to filter streams of data.” Now that’s a broader requirement that can be discussed.
Again, with so many resources out there, some products, especially open-source products, get selected as the standard for reference architecture. If you were writing a book on event-driven architecture, you wouldn’t write it about brokers that are not freely available to everyone. That leads to everyone thinking that a certain open-source product is the best and the de facto standard for implementing event-driven architecture. And it might very well be! There is a reason why it’s famous, but as with everything, make sure to have an open mind.
Now that you have a use case written out explaining what your business and technology department is trying to achieve, use it to explore available options without blindly picking what others are talking about. Speak to different vendors, see what they have to share about their experiences, build a framework for evaluating multiple options, run a Proof of Concept (POC), and then, finally, pick the product that meets your use case’s requirements.
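One lightweight way to build such an evaluation framework is a weighted scoring matrix. The sketch below is only an illustration: the criteria, weights, broker names, and scores are all hypothetical placeholders, and you would replace them with the requirements from your own written use case and the results of your POC.

```python
from dataclasses import dataclass

# Hypothetical criteria and weights; derive your own from your written use case.
WEIGHTS = {
    "meets_use_case": 0.40,
    "protocol_support": 0.20,
    "total_cost": 0.25,
    "vendor_support": 0.15,
}


@dataclass
class Candidate:
    name: str
    scores: dict  # criterion -> score from 1 (poor) to 5 (excellent)


def weighted_score(candidate, weights):
    """Return the weighted average of a candidate's scores."""
    missing = set(weights) - set(candidate.scores)
    if missing:
        raise ValueError(f"{candidate.name} is missing scores for {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(candidate.scores[c] * w for c, w in weights.items()) / total_weight


def rank(candidates, weights):
    """Return candidates ordered from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Made-up brokers and scores, for illustration only.
candidates = [
    Candidate("Broker A", {"meets_use_case": 4, "protocol_support": 2,
                           "total_cost": 3, "vendor_support": 4}),
    Candidate("Broker B", {"meets_use_case": 4, "protocol_support": 5,
                           "total_cost": 4, "vendor_support": 3}),
]

for c in rank(candidates, WEIGHTS):
    print(f"{c.name}: {weighted_score(c, WEIGHTS):.2f}")
```

The numbers matter less than the exercise: agreeing on the criteria and their weights before you talk to vendors forces you to write down what actually matters to your business.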
I have met architects who are tied down to a specific technology and I have met architects who are always willing to learn about options, always evaluating, and always picking the appropriate technology that supports their business’s use case. Guess which ones are more successful in the long run? And guess which ones are able to adjust when today’s buzz-worthy technology is replaced by another one in a few years?
3. Which Protocols Do You Need?
I am trying to stay away from technical features that different brokers support in this post but this is a crucial question to ask when you are evaluating different brokers. Which protocol or protocols will you be using? There are different protocols for streaming events and they vary in the use cases and features they support. For example, some popular open-standard protocols are AMQP and MQTT.
MQTT is very popular for IoT use cases, so many companies in that space pick MQTT as their messaging protocol and then a broker that supports it.
Other companies might need multiple protocols, such as a lightweight protocol like MQTT for their portable devices, some proprietary protocol for backend enterprise processes, REST for cloud-native services, and WebSockets for HTML5 dashboards.
If you can relate to the second example then look for event brokers that support more than just one protocol and focus on ones that do it natively, without any moving components/proxies that add overhead to your architecture. And, needless to say, think about what your requirement might be in the future. Will you stick with one protocol or will you potentially expand to multiple protocols?
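To make the protocol question concrete, you can list the protocols your clients need and check each candidate broker against that list. This is a minimal sketch under stated assumptions: the broker names and their support levels are hypothetical, and the proprietary backend protocol from the example above is left out because it would be specific to your environment.

```python
# Protocols each class of client needs, taken from the multi-protocol example above.
REQUIRED = {
    "portable devices": "MQTT",
    "cloud-native services": "REST",
    "HTML5 dashboards": "WebSocket",
}

# Hypothetical brokers: protocol -> "native" or "proxy" (needs an extra component).
BROKERS = {
    "Broker A": {"MQTT": "native", "REST": "proxy"},
    "Broker B": {"MQTT": "native", "REST": "native", "WebSocket": "native"},
}


def coverage(supported, required):
    """Group the required protocols by how a broker supports them."""
    report = {"native": [], "proxy": [], "missing": []}
    for protocol in sorted(set(required.values())):
        # Anything the broker does not list counts as missing.
        report[supported.get(protocol, "missing")].append(protocol)
    return report


for name, supported in BROKERS.items():
    print(name, coverage(supported, REQUIRED))
```

Every entry under "proxy" is an extra moving component to deploy and operate, and every entry under "missing" is a gap you would have to bridge yourself.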
Ultimately, you want to pick a solution that meets your requirements with fewer moving components and at a lower cost. See the next point.
4. Overall Cost of the Solution
It’s time to put our business hat on and think about the cost. Many architects and engineers don’t think about the cost soon enough as a factor. They put in all the effort in picking a solution, deploying it, migrating pilot use cases to adopt the technology, and then realize that they simply cannot afford the cost to scale this technology. What happens next? They have to repeat the process and search for another technology.
There are many business models out there designed to get you to quickly start using a product for free, and that’s great because you can start small and prove that a solution works. But with anything free, there is always a catch, and you would be better off keeping an eye out for it early on.
Some points I like to highlight when I am speaking with enterprise architects:
- For your specific use case, what would be the upfront cost of one solution vs another?
- Is it a pay-as-you-go model that seems crazy attractive right now but will be prohibitively expensive when business booms?
- Have you taken into account the cost of the underlying resources such as number of VMs that need to be provisioned and the storage that’s required? (Do you really need the broker to have infinite storage? Who will pay for this infinite storage?)
- What about the cost of the number of engineers required to manage and support this solution? Is this solution easy to use, configure, and monitor or will you need an army of engineers to get it to run smoothly?
- What if you need to take advantage of any of the additional features such as adding support for another protocol? For example, does adding REST support require you to spin up a proxy on a separate VM (additional cost), provide HA/DR for it (additional cost), and then manage/support/monitor it (additional cost)?
- If you need help with professional services, will the vendor be able to provide you with resources which won’t cost you an arm and a leg?
- If you need to scale your solution in the next 5 years, will your costs balloon exponentially or will they stay predictable?
Be mindful of your costs from Day 1 and you will be better off. I have been in several meetings where architects start the conversation with “Our costs are increasing exponentially so we are exploring alternatives.”
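A rough projection can surface these surprises early. The sketch below compares a usage-priced model with a self-managed deployment as event volume grows. Every number in it (prices, growth rate, VM capacity, staffing cost) is a made-up assumption for illustration, not real vendor pricing, so plug in your own quotes and estimates.

```python
import math


def usage_priced_monthly(events, price_per_million):
    """Monthly cost when you pay per million events."""
    return events / 1_000_000 * price_per_million


def self_managed_monthly(events, events_per_vm, vm_cost, storage_cost,
                         staff_cost, min_vms=3):
    """Monthly cost of VMs, storage, and the engineers who run them."""
    # Keep a minimum cluster size (for HA) even at low volume.
    vms = max(min_vms, math.ceil(events / events_per_vm))
    return vms * vm_cost + storage_cost + staff_cost


def projected_volume(start_events, annual_growth, years):
    """Monthly event volume at year 0, 1, ..., years with compound growth."""
    return [start_events * (1 + annual_growth) ** y for y in range(years + 1)]


# Hypothetical inputs: 50M events/month today, doubling every year for 5 years.
for year, events in enumerate(projected_volume(50_000_000, 1.0, 5)):
    usage = usage_priced_monthly(events, price_per_million=20.0)
    managed = self_managed_monthly(events, events_per_vm=500_000_000,
                                   vm_cost=400, storage_cost=300,
                                   staff_cost=8_000)
    print(f"year {year}: usage-priced ${usage:,.0f}/mo, "
          f"self-managed ${managed:,.0f}/mo")
```

With these particular made-up inputs, the usage-priced option starts far cheaper and overtakes the self-managed one within a few years. Your own numbers may point the other way, which is exactly why it is worth running them before you commit.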
5. Will You be a Valued Customer?
The more experience I gain in my field and the more I interact with technologists, the more I value how we treat each other. At the end of the day, you should pick a technology that meets your business requirements in a cost-effective manner. But if you find yourself in a position where you have multiple contenders, look at the people you are dealing with.
When you are still exploring and doing a POC, will the vendor provide you with resources for free to help you run tests and have free educational sessions? When you need to purchase, will they be available to work with you to meet your budget? When you need help with implementing the technology, will you be able to easily get skilled resources from the vendor? When you need urgent support due to a production issue, will the vendor help on a weekend to fix the issue?
One of the most important factors is the quality of the vendor’s support. I don’t need to tell you because I am sure everyone has experienced shitty support at least once. When you raise a support ticket, does a human actually respond to you with something useful or is it just an automated update telling you they will respond within 24 hours? And once those 24 hours are about to be up, do they respond at the last minute saying you need to provide additional information? Does the issue then drag on for weeks without a resolution?
These are all questions worth asking, because the answers tell you how easy it will be to work with the vendor to help your business succeed.
To end the post, I will just leave you with a few thoughts. Picking an event broker is a crucial decision because it will be the nervous system of your architecture. Many of your applications will be relying on it and integrating with it. It will need to be deployed in different regions on different types of VMs, on-prem or in the cloud, and via Docker or binaries. Once deployed, it becomes very difficult to replace because of the number of applications it impacts, so spend some extra time and do your due diligence to find an event broker that suits your use case!