Handling large messages: BizTalk or SSIS?

Based on my experience, here are some thoughts on how to handle large inbound messages.

I would recommend that you first analyze what kind of data the message carries. Large messages or files are rarely exchanged for transactional activities; in most scenarios a large file arrives as a day-end activity. Do not jump to the conclusion that every message should be handled in BizTalk itself. If the work is more data integration than business process integration, I would recommend using SSIS instead.

If your situation does warrant using BizTalk to process the message, my suggestions are as follows.

Understand the basics first

The fundamental rule for handling a large message is to break it into small pieces and handle each piece as a separate message. The objective is never to load the entire message into memory at once. Breaking a huge message into smaller ones also improves overall performance, because smaller messages benefit from BizTalk's parallel processing capabilities.
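
To make the idea concrete, here is a minimal sketch in Python rather than in BizTalk itself. It reads an XML batch as a stream and emits each record as its own small message, so the whole batch is never held in memory. The function name `debatch`, the `Orders`/`Order` element names, and the sample data are all assumptions made up for illustration; a real BizTalk solution would do this inside a disassembler or a custom pipeline component.

```python
import io
import xml.etree.ElementTree as ET


def debatch(stream, record_tag):
    """Yield each <record_tag> element as its own small XML message."""
    # iterparse reads the stream incrementally instead of building the full tree
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield ET.tostring(elem, encoding="unicode")
            root.clear()  # drop records already emitted so memory use stays flat


# Hypothetical batch with three records
batch = io.BytesIO(
    b"<Orders><Order id='1'/><Order id='2'/><Order id='3'/></Orders>"
)
messages = list(debatch(batch, "Order"))
```

Each string in `messages` can then be handled on its own, which is what lets the smaller pieces be processed in parallel.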

BizTalk 2006 uses virtual streams to handle inbound messages. Any message over 1 MB is buffered to files and swapped to and from main memory as needed, so disk I/O penalties do apply.
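
The behaviour can be illustrated with an analogy from the Python standard library. This is not BizTalk's implementation, only a sketch of the same idea: `tempfile.SpooledTemporaryFile` keeps data in memory until it grows past a size limit and then moves it to a file on disk. The helper name `buffer_message` and the `THRESHOLD` constant are my own assumptions, with the constant set to the 1 MB figure mentioned above.

```python
import tempfile

THRESHOLD = 1024 * 1024  # 1 MB, mirroring the figure mentioned in the text


def buffer_message(chunks, threshold=THRESHOLD):
    """Buffer a message chunk by chunk, spilling to disk past the threshold."""
    # Data stays in memory until more than `threshold` bytes have been written,
    # after which the buffer is transparently moved to a temporary file.
    buf = tempfile.SpooledTemporaryFile(max_size=threshold)
    for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)  # rewind so the caller can read the message from the start
    return buf
```

The caller reads from the returned object the same way in both cases, which is the point of a virtual stream: the code consuming the message does not need to know whether the bytes are in memory or on disk, but it does pay for the disk access.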

Tips

  1. Try to avoid using maps. If possible, achieve the objective with promoted fields instead.
  2. Most problems with large messages arise in custom pipelines or .NET components, because the code loads the entire message into memory before operating on it. Always use virtual streams when reading message streams.
  3. If the message is very large, debatching is the best technique. Develop a custom pipeline component to break the incoming message into smaller parts.
  4. If you need to process the large message end-to-end (inbound to outbound as a single unit), debatch the message at the inbound side and aggregate all the parts at the outbound side using sequential convoys.
  5. Adjust the "message size threshold". Any message larger than this size will be buffered to physical disk.
  6. Adjust the "message fragment size" property so that the message is split into fragments. MSDTC will be used to write them into the MessageBox.
  7. The 64-bit version of SQL Server is recommended.
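
The debatch-then-aggregate approach from tip 4 can be sketched as follows, again in Python purely for illustration. In BizTalk the aggregation would be an orchestration using a sequential convoy with a correlation set; here a plain class stands in for it. The class name `Aggregator`, the `batch_id` correlation key, the `is_last` flag, and the `<Batch>` wrapper element are all hypothetical.

```python
from collections import defaultdict


class Aggregator:
    """Collect the parts of each batch in arrival order and rebuild the batch."""

    def __init__(self):
        # One list of parts per correlation key (the batch identifier)
        self._parts = defaultdict(list)

    def receive(self, batch_id, body, is_last):
        """Store one part; return the combined message once the last part arrives."""
        self._parts[batch_id].append(body)
        if not is_last:
            return None
        # Remove the finished batch so its parts do not accumulate in memory
        parts = self._parts.pop(batch_id)
        return "<Batch>" + "".join(parts) + "</Batch>"


agg = Aggregator()
first = agg.receive("b1", "<Order id='1'/>", is_last=False)
combined = agg.receive("b1", "<Order id='2'/>", is_last=True)
```

The sketch assumes the parts of a batch arrive in order and that the sender marks the final part. A real convoy also needs a timeout or a part count so that a batch with a missing part does not wait forever.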
