I have been meaning to get around to this article for a long time. In the community I have read some articles which demonstrate sending a larger message through Azure Service Bus by using a session with the message. This is helpful, but I always felt there was a gap in the wider architecture discussion around these approaches. There are a few different ways you can handle larger messages with Azure Service Bus and each has its own considerations. It’s important to be aware of these factors before choosing an approach, rather than just using the first code sample you find on a Google search. In this article I want to cover the different options with examples and discussion around each approach.
Before we jump into more detail, let’s first understand why this limitation exists. Azure Service Bus is a multi-tenanted PaaS service where many different users can be using sandboxed namespaces within the platform. Underlying the platform is a large set of shared infrastructure. In order to offer a true PaaS pay-as-you-go experience where all users have a guaranteed service level, there need to be some constraints which users must work within. One of these constraints is message size. If, for example, a user were to send a very large message or a bunch of them, the cost of buffering those messages in memory could affect the SLA Microsoft is able to maintain for all users. The constraint lets users have a reliable SLA and still pay per message, because they are using a multi-tenanted platform and don’t need their own dedicated infrastructure, which would need to be priced differently.
Hopefully the above gives a reasonable explanation of why the Service Bus constraint exists. It is my opinion, based on my interpretation of the documentation.
Now with that said we still need to consider how we deal with larger message scenarios. In the rest of this article I will aim to talk through some of the different options available and provide some architectural considerations for each and a demo of how each will work.
The different approaches I will discuss in this article include:
- Session with Transaction
- Session without Transaction
- Cache a-side Message
- Use Service Bus Server
In exploring the options, where required I will use a demo of an RPC style pattern where a large message is sent to the Service Bus and a response message is returned via a response queue indicated in the ReplyTo address.
Illustrating the challenge
If you are not familiar with this architecture challenge the video below will illustrate how message size can be important for Service Bus based messaging.
Using Session with a Transaction
In this approach the idea is that we break up a large message into a series of smaller messages. These related messages are sent to Service Bus as part of a session. When the receiver collects the messages it combines them to rebuild the original message. The messages are sent to Service Bus and received from Service Bus as part of a transaction.
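The split and merge steps can be sketched without any SDK. This is a minimal illustration, not the code from the demo: `ChunkMessage`, `split_into_chunks` and `merge_chunks` are hypothetical names, and the class is only a stand-in for a real Service Bus message.

```python
import uuid
from dataclasses import dataclass


@dataclass
class ChunkMessage:
    """Stand-in for a Service Bus message (hypothetical, not an SDK type)."""
    session_id: str
    body: bytes


def split_into_chunks(payload: bytes, chunk_size: int) -> list[ChunkMessage]:
    """Split a payload into messages that all share one session id."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    session_id = str(uuid.uuid4())
    # An empty payload still yields one empty message so the receiver sees a session.
    pieces = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)] or [b""]
    return [ChunkMessage(session_id=session_id, body=piece) for piece in pieces]


def merge_chunks(chunks: list[ChunkMessage]) -> bytes:
    """Rebuild the payload, assuming the chunks are already in send order."""
    return b"".join(chunk.body for chunk in chunks)
```

In a real sender each `ChunkMessage` would become one message sent on the same session, and the receiver would call something like `merge_chunks` once it has drained the session.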
The below video will provide a walk through of this scenario:
There are a number of key things you need to consider when chunking a message into a series of messages.
Why use the Transaction?
When we have a series of messages which we wish to send to Service Bus we need to consider whether we wrap the sends inside a transaction or not. By default it is probably a good idea to wrap the send in a transaction. The reason is that we can then guarantee that the commit of messages is all or nothing. We don’t want to risk one of the messages in the series going missing, as this would mean we cannot merge the messages successfully in the receiver. Using the transaction also means that the receiver can rely on all messages being on the queue once it starts to process the session.
Transaction Limitation
When using Azure Service Bus one of the limitations is the number of messages you can send within a single transaction. That number is 100. This means that you can send at most 100 x 256KB messages in one transaction, roughly 25MB in total. Note that the message size also includes the message properties and not just the message body, so the usable body size per message is slightly smaller.
This means that the session with transaction approach might be quite effective but it will not solve the problem for every scenario. If you had an even larger message you might not be able to use the transaction.
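A rough way to check whether a payload fits in one transaction is to work out how many chunks it needs once space for properties is set aside. The sketch below takes the 100-message limit from the text. The per-message size limit and the property overhead are parameters you would supply for your own namespace, and the function names are hypothetical.

```python
import math

MAX_MESSAGES_PER_TRANSACTION = 100  # limit quoted in the text above


def chunks_needed(payload_bytes: int, max_message_bytes: int, property_overhead_bytes: int) -> int:
    """Number of messages needed once room for properties is reserved in each one."""
    body_capacity = max_message_bytes - property_overhead_bytes
    if body_capacity <= 0:
        raise ValueError("properties leave no room for a message body")
    return max(1, math.ceil(payload_bytes / body_capacity))


def fits_in_one_transaction(payload_bytes: int, max_message_bytes: int, property_overhead_bytes: int) -> bool:
    """True when the chunked payload stays within the per-transaction message limit."""
    needed = chunks_needed(payload_bytes, max_message_bytes, property_overhead_bytes)
    return needed <= MAX_MESSAGES_PER_TRANSACTION
```

If `fits_in_one_transaction` returns `False`, the payload is a candidate for one of the other options discussed later.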
Message Sequence
When the original message was broken into chunks and sent to the Service Bus the receiver needs to ensure that when it merges the series of messages back together it does so in the correct order so that the message is reformed correctly.
Sometimes you can rely on the first-in-first-out semantics of the queue to guarantee the messages are reformed in order, but if you are using the SendAsync methods on the Service Bus SDK you cannot guarantee that messages reach the queue in the order you sent them. With this in mind it’s a good idea when chunking a message to include properties on each message that identify its position in the sequence and how many messages there are in the sequence.
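Tagging each chunk with its position and the total count lets the receiver rebuild the message regardless of arrival order. A minimal sketch follows. The property names `sequence_number` and `total_chunks` are assumptions made for illustration, not names the SDK defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequencedChunk:
    sequence_number: int  # 1-based position in the sequence (hypothetical property)
    total_chunks: int     # how many chunks make up the whole message
    body: bytes


def tag_chunks(pieces: list[bytes]) -> list[SequencedChunk]:
    """Attach position and count to each piece before sending."""
    total = len(pieces)
    return [SequencedChunk(i, total, piece) for i, piece in enumerate(pieces, start=1)]


def reassemble(chunks: list[SequencedChunk]) -> bytes:
    """Rebuild the message in sequence order, refusing to merge if a chunk is missing."""
    if not chunks:
        raise ValueError("no chunks received")
    expected = chunks[0].total_chunks
    by_position = {chunk.sequence_number: chunk for chunk in chunks}
    missing = [n for n in range(1, expected + 1) if n not in by_position]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(by_position[n].body for n in range(1, expected + 1))
```

Because the receiver looks chunks up by position rather than trusting arrival order, out-of-order delivery no longer corrupts the merged message.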
Routing
When we are sending the message which is now chunked over a series of messages we need to ensure that all of the messages have any properties which may be used for routing. This isn’t a problem if we are only sending the message to a queue. If we are using a topic with some routing then we need to make sure all messages in the series are routed to the right subscription.
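One way to keep routing consistent is to stamp the same routing properties on every chunk before sending. The sketch below uses plain dictionaries for message properties. `matches_subscription` is a toy equality filter standing in for a subscription rule, and the `Region` property used in the usage note is a made-up example.

```python
def apply_routing_properties(chunk_properties: list[dict], routing: dict) -> list[dict]:
    """Return new property dicts in which every chunk carries the routing values."""
    # Routing values are applied last so a chunk cannot accidentally override them.
    return [{**properties, **routing} for properties in chunk_properties]


def matches_subscription(properties: dict, required: dict) -> bool:
    """Toy equality filter: every required property must be present with the same value."""
    return all(properties.get(name) == value for name, value in required.items())
```

For example, `apply_routing_properties([{"SequenceNumber": 1}, {"SequenceNumber": 2}], {"Region": "EU"})` gives two property sets that both satisfy a filter requiring `Region` to equal `"EU"`, so both chunks would land on the same subscription.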
Use Session without Transaction
In this approach we will discuss the option of using the message session but without a transaction. As mentioned earlier there are limits on the number of messages you can send within a transaction so therefore the overall data size which can be communicated within a transaction based session is limited.
Removing the transaction from the approach we used earlier means we can send more messages in the session, but we make a trade-off to get this. The trade-off is that we cannot guarantee that all of the messages within the session will be committed to the queue. We should be able to send a much larger overall message by splitting it across more than 100 messages, but in exception scenarios some of the messages in the sequence may not make it to the queue, and the receiver will then be unable to reform the original message. You also need to consider what the receiver would do if it did not receive all of the messages. Firstly, you would need to decide how long to wait for the additional messages. Then you would need to decide what to do if a message is still missing.
The below video will provide a walk through of implementing this approach.
With this option you can see that it’s possible to flow a much larger message, but to achieve this you need to make a trade-off and deal with some complex considerations.
Managing Lock Duration
When I am processing the messages in a session without a transaction, the point at which I acknowledge messages is important. If I keep collecting hundreds of messages and work with them, I may go past the lock duration. The message lock will then expire and the messages will be made available to the next receiver instance on this session. If I acknowledge messages before I have finished processing them, the trade-off is that I could lose messages in failure scenarios.
Losing a Message
I need to consider how I would handle lost messages. If I do not acknowledge them then the messages will return to the queue for retry and if I do acknowledge them they may never be properly processed. When no transaction is involved from the receiver side I have the problem that the messages I think are missing might just be delayed.
Incomplete Sequence
If a message sequence is incomplete and I think I cannot process it, at what point do I decide to abandon the messages? What approach should I take for abandoning them, and is there some point from which I can recover?
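One possible answer to these questions is to track partial sequences with a deadline and abandon any that stay incomplete for too long. The sketch below is an assumption about how a receiver could do this, not the demo’s implementation. `SequenceTracker` and its methods are hypothetical, and the wait time is something you would tune for your own scenario.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PartialSequence:
    total_chunks: int
    first_seen: float
    bodies: dict = field(default_factory=dict)  # sequence number -> chunk body


class SequenceTracker:
    """Collects chunks per session and reports sessions that have waited too long."""

    def __init__(self, max_wait_seconds: float, clock=time.monotonic):
        self._max_wait = max_wait_seconds
        self._clock = clock
        self._partial = {}

    def add(self, session_id, sequence_number, total_chunks, body):
        """Store a chunk; return the full payload once every chunk has arrived, else None."""
        if session_id not in self._partial:
            self._partial[session_id] = PartialSequence(total_chunks, self._clock())
        sequence = self._partial[session_id]
        sequence.bodies[sequence_number] = body
        if len(sequence.bodies) < sequence.total_chunks:
            return None
        del self._partial[session_id]
        return b"".join(sequence.bodies[n] for n in sorted(sequence.bodies))

    def expired_sessions(self):
        """Sessions whose first chunk arrived more than max_wait_seconds ago."""
        now = self._clock()
        return [sid for sid, seq in self._partial.items() if now - seq.first_seen > self._max_wait]

    def abandon(self, session_id):
        """Drop a partial sequence and hand back what was held, e.g. for logging."""
        sequence = self._partial.pop(session_id, None)
        return dict(sequence.bodies) if sequence else {}
```

What happens to the abandoned chunks (dead-lettering them, logging them, or asking the sender to resend) remains a design decision for the scenario at hand.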
Cache a-side Message
In the cache a-side approach, rather than breaking up a message, we place the full message in some kind of storage and then flow a simple message through the Service Bus which provides a pointer to where the actual message is stored.
An example of this could involve the message sender saving a copy of the full message to the Azure Cache and then creating a simple message which provides a reference to where the actual message is stored. This reference message would then flow through the Service Bus. The receiver would recognise that the message body is stored somewhere separate and retrieve it from there.
This approach can allow you to store a message of almost any size somewhere separate but only flow a very small message through service bus which acts as a pointer to the message in the message store.
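The pointer idea can be sketched with an in-memory dictionary standing in for the external store. Everything here is illustrative: `InMemoryStore`, the JSON field names and the two helper functions are assumptions, and a real implementation would swap the dictionary for a cache, blob container or database.

```python
import json
import uuid


class InMemoryStore:
    """Stand-in for the external message store (cache, blob storage, database)."""

    def __init__(self):
        self._items = {}

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value

    def get(self, key: str):
        return self._items.get(key)


def build_pointer_message(store: InMemoryStore, payload: bytes) -> str:
    """Save the full payload in the store and return the small message to send."""
    key = str(uuid.uuid4())
    store.put(key, payload)
    return json.dumps({"payload_key": key, "payload_size": len(payload)})


def resolve_pointer_message(store: InMemoryStore, message: str) -> bytes:
    """Follow the pointer back to the payload; fail loudly if the body is gone."""
    pointer = json.loads(message)
    payload = store.get(pointer["payload_key"])
    if payload is None:
        raise LookupError(f"no payload stored under {pointer['payload_key']}")
    return payload
```

The message that travels through Service Bus stays well under a kilobyte however large the stored payload is, which is the whole point of the pattern.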
The below video illustrates this option.
In this option there are some things to consider which are discussed below.
Where do I store the message?
The store you choose to use for the real message body is very important. Your choices probably include:
- SQL Azure DB
- DocumentDB
- Redis Cache
- Azure Blob Store
- Azure Table Store
Each of these options will have its own pros and cons. You need to think about things like:
- What kind of read/write latency do I require to the message store
- How will I index and find messages to retrieve later
- Can I apply any security on the message store
- How much data can I store in a single message
- How many messages and what would be the total amount of data I can store
- How will messages be removed from the store once they are used
- Will the store evict old messages that are no longer used or will I need custom clean up
- Is the store resilient
- Do I need to worry about missing messages
This choice will depend upon your requirements.
How long do I store the message for?
The length of time a message is stored for can have an impact on the costs of the message store you choose. In most scenarios I would expect a receiver to collect a message soon after it receives its pointer message from Service Bus. If there are lots of messages being processed then there could be a backlog, which would mean that the message store could get quite large. If you had chosen a memory based cache you might find that you start to get close to the limit of the chosen cache size, and new messages being added might result in old ones being evicted. For some scenarios this is a real risk, and a receiver could be left holding a pointer to a message body that no longer exists.
If I had a pub/sub messaging scenario it may also be difficult to know when all recipients have processed their copy of the message, so how would I know when to delete the message from the store?
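One possible way to answer the pub/sub question is to record which subscribers still need the payload and delete it only when the last one acknowledges. This sketch assumes the publisher knows the set of subscribers at publish time, which will not hold in every topology. `SharedPayloadStore` is a hypothetical in-memory stand-in for the real store.

```python
class SharedPayloadStore:
    """Keeps a payload until every expected subscriber has acknowledged it."""

    def __init__(self):
        self._payloads = {}
        self._pending = {}

    def put(self, key: str, payload: bytes, subscribers) -> None:
        self._payloads[key] = payload
        self._pending[key] = set(subscribers)

    def read(self, key: str) -> bytes:
        return self._payloads[key]

    def acknowledge(self, key: str, subscriber: str) -> bool:
        """Mark one subscriber as done; delete and return True once nobody is pending."""
        pending = self._pending[key]
        pending.discard(subscriber)
        if pending:
            return False
        del self._payloads[key]
        del self._pending[key]
        return True

    def __contains__(self, key: str) -> bool:
        return key in self._payloads
```

A subscriber that never acknowledges would keep the payload alive forever, so in practice this would probably be paired with an expiry time on the stored item.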
Do I need to secure the message?
If the message is in a separate message store then I need to consider whether it is secured appropriately. I may need to sign and encrypt the message before writing it to the store. I may also need to consider whether the chosen message store allows me to separate messages intended for different recipients, and to make sure one recipient cannot delete the message before the others have read it.
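The signing half of this can be sketched with the Python standard library. The example below prepends an HMAC-SHA256 signature to the payload before it is stored and checks it on the way out. It assumes sender and receiver share a secret key, and it does not cover encryption, which would need a separate cryptography library.

```python
import hashlib
import hmac

SIGNATURE_BYTES = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_payload(secret: bytes, payload: bytes) -> bytes:
    """Return signature + payload, ready to be written to the message store."""
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return signature + payload


def verify_payload(secret: bytes, stored: bytes) -> bytes:
    """Check the signature on a stored item and return the original payload."""
    signature, payload = stored[:SIGNATURE_BYTES], stored[SIGNATURE_BYTES:]
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("payload signature does not match")
    return payload
```

A receiver that gets a `ValueError` here knows the stored body was altered or written by someone without the key, and can reject the pointer message rather than process it.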
In this case if there are external parties involved I may choose to wrap up the message store behind an API.
What happens if the message is lost?
If the message body is lost from the message store then I would need to consider how my application would handle getting a pointer message but having no associated message in the chosen store. This could be a breaking scenario and it could influence which type of store I choose to reduce the risk of this happening.
Message Lock Duration
When I am reading a message from the message store I will need to consider if I maintain a lock on the message I have received from Service Bus. If I maintain the lock and the message is large then the message lock may expire while I am downloading the message. This would cause another receiver to potentially collect the message. If I acknowledge the message from Service Bus before downloading the message body and the download fails I may find it difficult to retry and the message will no longer be on the queue.
Use Service Bus Server
The final option you could consider is to use Service Bus Server. If you chose this option in the context we are discussing here, you would probably deploy it on a bare-metal server, or on an Azure Virtual Machine or set of Virtual Machines on Azure IaaS. Service Bus Server runs on a dedicated set of infrastructure, which is why it has a higher message size threshold. Service Bus Server will allow you to send a single message of 50MB.
You could also combine this with the approaches above to implement chunking if you needed really big messages.
Pros & Cons
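As you can see, handling large messages in Service Bus is not a straightforward choice. It is easy to pull together a simple demo which illustrates how you could do it, but transferring that hello world demo to the real world brings in a number of different things you need to think about. I have tried to summarise these in the below table.
Option | Pros | Cons |
---|---|---|
Session + Transaction | All-or-nothing commit of the chunks; the receiver knows every chunk is on the queue when it starts on the session | Limited to 100 messages per transaction; chunks need sequence and routing properties |
Session and No Transaction | Can split a message across more than 100 chunks | No guarantee every chunk reaches the queue; receiver must manage lock duration, lost messages and incomplete sequences |
Cache a-side | Message body can be almost any size; only a small pointer message flows through Service Bus | A separate store to choose, secure and clean up; the body may be lost or evicted; lock may expire while downloading |
Service Bus Server | Dedicated infrastructure; single messages of up to 50MB | Overhead of hosting and managing it yourself |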
Message Exchange Scenarios
In reality everyone is likely to have slightly different requirements for large messages. I think the best way to approach your own situation is to consider the message exchange pattern you need and any non-functional requirements you have alongside it. Below are a couple of scenarios I think people might see.
Getting Data via Queue based RPC
One pattern might be using queued messaging but implementing an RPC style pattern where data is requested and returned on a response queue, as described earlier. If you are returning a large amount of data then you have probably queried a system and have a view of a data entity to return. You are probably not worried about durability of the message and want a quick response. Here, using a cache to store the message body and flowing a pointer to the message, as in our cache a-side example, is probably the best way to go. This will also allow for responses that vary in size.
If the request message was large rather than the response then it could be likely you are about to commit a lot of changes to data. In this case you will need durability for the message. You also need some kind of guarantee about the message. This will probably push you towards the Session + Transaction option or the Cache-a-side but choosing something like DocumentDB rather than a volatile cache to store the message.
One Way Send
If you’re doing a one way send of data to update the state of something then you probably need durability and a transactional guarantee. As above, I would suggest that Session + Transaction or Cache a-side with a durable message store would be your best options.
One Way Stream of Data
If you are sending a lot of data in a fire and forget fashion to another system but do not care about message loss, then the Session without Transaction option is probably the one for you. Your receiver may or may not need to stitch the chunked message back together, depending on the scenario. The key thing here is that you don’t care about potential message loss.
With this scenario I am thinking of something like a collection of hourly updates which is too big to include in a single message but could be pushed through as a sequence of messages. If you knew this state was volatile, changed regularly and perhaps needed publish and subscribe, then you could push these messages through in this fashion. I’m imagining something like stock updates. If a message was lost it may not matter because another update for the same entity will come through very soon.
What about Service Bus Server?
At this point you may wonder why I have not pushed Service Bus Server a little more. In this article I wanted to focus on the cloud offering for Service Bus. I believe that using Service Bus Server is a viable option for handling your message size increases, but I think the overhead of managing this yourself is not insignificant. For some it will be a reasonable choice, but if you’re looking for cloud based solutions then your best approach is to consider the other options first and only fall back to hosting Service Bus Server on Azure VMs if you’re sure the other options will not work for you.
Source Code
The source code used in these demos is available from the following link:
http://cscblogsamples.blob.core.windows.net/publicblogsamples/ServiceBus.LargeMessage.Samples.zip
Summary
Hopefully this article gives you plenty of ideas and things to consider around large message patterns, along with some dos and don’ts, that will help when it comes to implementing real world solutions.