Understanding Exchange Task Throughput

Document created by user.oxriBaJeN4 Employee on Sep 15, 2015

This article discusses how an Exchange task processes mailboxes, and builds on this to explore how to plan the number of Exchange tasks and Mimecast Synchronization Engine sites required to host an implementation (a Mimecast Synchronization Engine site represents an instance of the Mimecast Synchronization Engine installed on a server).

 

Processing

 

Common considerations

  • Exchange tasks are applied to groups of mailboxes.
  • During an execution, an Exchange task processes mailboxes sequentially, one by one.
  • While an Exchange task is processing a mailbox, a Mimecast-specific lock is placed on the mailbox so that only one task can access it at a time. If another task running on the same server tries to access the mailbox, the second task goes into a waiting state until the lock is removed.
  • The length of time that a mailbox will take to process varies based on: 
    • the feature being used,
    • the definition configured for the feature,
    • the number of mail items in a folder and the number of folders.
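
The sequential processing and locking behavior described above can be pictured with a minimal sketch. This is not Mimecast's implementation: the names `run_task`, `_lock_for`, and `process` are hypothetical, and an in-process `threading.Lock` per mailbox stands in for the Mimecast-specific lock.

```python
import threading

# Hypothetical stand-in for the Mimecast-specific mailbox lock: one lock per mailbox.
_locks = {}
_locks_guard = threading.Lock()


def _lock_for(mailbox):
    """Return the lock for a mailbox, creating it on first use."""
    with _locks_guard:
        return _locks.setdefault(mailbox, threading.Lock())


def run_task(task_name, mailboxes, process, log):
    """Process mailboxes one by one, holding the mailbox lock while working."""
    for mailbox in mailboxes:
        # A second task reaching the same mailbox waits here until the lock is released.
        with _lock_for(mailbox):
            process(mailbox)
            log.append((task_name, mailbox))
```

Two tasks that share a mailbox never process it at the same moment; the later one simply waits, which is why overlapping groups can lengthen a task's run time.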

 

Feature specific considerations

Mailbox Folder Replication

  • The first time a mailbox is processed, the metadata of all messages and folders is replicated.
    • Only the metadata is replicated as the content is assumed to already be in the Mimecast archive.
    • The metadata of a message is enough for Mimecast to find the associated message content and display it to end users when requested.
  • Depending on the number of folders in the mailbox and the number of items in each folder, this can take a long time to complete.
  • Once the first full replication of a mailbox is complete a state file for the mailbox is saved to the Mimecast Synchronization Engine server.
  • This state file is linked to the Mimecast Synchronization Engine site and the task that the mailbox is processed by.
    • It allows for all subsequent mailbox replications to leverage the incremental change system that Exchange offers to only replicate changes since the last replication.
    • Consequently all replications post the first full replication are significantly faster to complete per mailbox.
    • To enable this the mailbox must be processed by the same Mimecast Synchronization Engine site and Exchange task. Changing the task will result in full replications starting over again.
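
As a rough illustration of why changing the site or task restarts full replication, the sketch below models the state file as a key made up of site, task, and mailbox. The function name `plan_replication` and the use of an in-memory set are assumptions for illustration only; the actual state file format is not described in this article.

```python
def plan_replication(saved_state, site, task, mailbox):
    """Return 'full' or 'incremental' for this run and record the state for next time.

    saved_state is a set of (site, task, mailbox) keys, a hypothetical stand-in
    for the state files saved on the Mimecast Synchronization Engine server.
    """
    key = (site, task, mailbox)
    mode = "incremental" if key in saved_state else "full"
    saved_state.add(key)
    return mode


state = set()
print(plan_replication(state, "site-1", "task-a", "alice"))  # first run for this key
print(plan_replication(state, "site-1", "task-a", "alice"))  # same site and task
print(plan_replication(state, "site-1", "task-b", "alice"))  # task changed
```

In this model, moving a mailbox to a different task produces a new key, so no saved state is found and the next run is a full replication again.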

 

Calendar Replication

  • The first time a mailbox is processed all appointments and calendars are replicated to the Mimecast platform. This includes the upload of all attachments found on calendar appointments.
  • Depending on the number of appointments in the calendar(s) and the number of attachments that need to be uploaded, this can take a long time to complete.
  • Once the first full replication of a calendar is complete a state file for the mailbox is saved to the Mimecast Synchronization Engine server.
  • This state file is linked to the Mimecast Synchronization Engine site and the task that the mailbox is processed by.
    • It allows for all subsequent mailbox replications to leverage the incremental change system that Exchange offers to only replicate changes since the last replication.
    • Consequently all replications post the first full replication are significantly faster to complete per mailbox.
    • To enable this the mailbox must be processed by the same Mimecast Synchronization Engine site and Exchange task. Changing the task will result in full replications starting over again.

 

Mailbox Storage Management

  • The first time a mailbox is processed all messages / attachments matching the admin definition need to be processed.
  • Depending on the number of items in the mailbox this can take a long time.
  • The next time a mailbox is processed all items that have already been stubbed or deleted will not be considered or even present in the mailbox, significantly reducing the time to process a mailbox.
  • As message / attachment candidates to be stubbed / deleted are identified, the Exchange task will send a request to the Mimecast archive to check that the message is already archived and that the user of the requesting source mailbox has permission to the message. 
    • If the message is confirmed it will be stubbed / deleted.
    • If the message is not confirmed it will not be stubbed / deleted. Requests for messages not found can cause extended delays in processing time. It is important to only run Mailbox Storage Management when all Exchange messages are known to be in the Mimecast archive.
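
The confirmation step above can be sketched as a simple filter. The function `split_candidates` and the `is_confirmed` callback are hypothetical; the callback stands in for the request to the Mimecast archive, whose real interface is not described here.

```python
def split_candidates(candidates, is_confirmed):
    """Separate candidates into those safe to stub / delete and those to leave alone.

    is_confirmed(message_id) is a hypothetical callback representing the archive
    check: the message is archived and the mailbox user has permission to it.
    """
    confirmed, skipped = [], []
    for message_id in candidates:
        if is_confirmed(message_id):
            confirmed.append(message_id)
        else:
            skipped.append(message_id)
    return confirmed, skipped


archived = {"msg-1", "msg-3"}  # illustrative data only
to_stub, left_alone = split_candidates(["msg-1", "msg-2", "msg-3"], archived.__contains__)
```

Every skipped message still costs a request to the archive, which is one way to picture why unarchived mail slows the task down.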

 

Managed Folders

  • The first time a mailbox is processed, it is likely that there will be a high number of messages older than the "delete messages older than" time span specified in the Managed Folders definition.
  • Depending on how high this number is and how many items there are in the mailbox this can take a long time.
  • The next time a mailbox is processed, the number of candidates to be deleted is likely to be significantly lower, reducing the time to process a given mailbox.
  • As message candidates to be "managed" are identified, the Exchange task will send a request to the Mimecast archive to check that the message is already archived and that the user of the requesting source mailbox has permission to the message.
    • If the message is confirmed it will be "managed."
    • If the message is not confirmed the message will not be "managed." Requests for messages not found can cause extended delays in processing time. It is important to only run Managed Folders tasks when all Exchange messages are known to be in the Mimecast archive.

 

Planning

Given the considerations described above it is important to properly plan the implementation of the Mimecast Synchronization Engine Exchange services features.

First, decide which features should be implemented. This will mostly depend on the drivers for using the feature and the user experience being enabled. Once the feature set has been decided upon, the guidelines below will help to ensure the best experience:

 

  1. Typically, Exchange tasks will be applied to Active Directory or Exchange groups. Where possible, use groups dedicated to one or more Exchange tasks. This helps to keep the experience predictable.
  2. Start small: run the chosen definition on just a few mailboxes to begin with.
  3. Record how long the first execution takes to process, as well as the time to process subsequent executions. This will give a baseline to work from when adding more mailboxes to the implementation.
    • For example, if a mailbox with 10,000 items takes 15 minutes to process on the first execution, and then 2 minutes on each execution thereafter, this will provide an indication on how long it takes to process a mailbox in the given Exchange environment with the given definition.
    • This data can be used to estimate timings for the wider implementation. As a rule of thumb, mailboxes with fewer mail items should complete more quickly than the baseline, and mailboxes with more mail items will take longer.
  4. The baseline results, the frequency that a mailbox should be processed (for example hourly, daily, weekly), and the total number of mailboxes to be processed will affect how many tasks will be required for each feature definition.
  5. As described in the considerations section above, an Exchange task processes mailboxes sequentially, one at a time. However, it is possible to apply a given definition to more than one Exchange task and have those tasks execute at the same time, on the same or another Mimecast Synchronization Engine site. This increases throughput and reduces the time taken to process all of the mailboxes in the organization with the given definition(s).
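
As a rough sketch of how the baseline, the processing frequency, and the mailbox count combine, the minimum number of tasks can be estimated by dividing the total processing time by the time window available. The function name `tasks_required` and the figure of 1,000 mailboxes are illustrative assumptions; the 2 minute timing is the subsequent-execution figure from the baseline example above.

```python
import math


def tasks_required(mailboxes, minutes_per_mailbox, window_minutes):
    """Lower bound on concurrent tasks needed to process every mailbox within the window."""
    total_minutes = mailboxes * minutes_per_mailbox
    return math.ceil(total_minutes / window_minutes)


# Hypothetical: 1,000 mailboxes at 2 minutes each, every mailbox processed daily.
minimum_tasks = tasks_required(1000, 2, window_minutes=24 * 60)
```

This is only a lower bound. Because timings vary with load and folder sizes, it is prudent to plan for more tasks than the bare minimum.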

 

Worked example

These figures are not exact representations and are provided for demo purposes.

 

Features and baselines

 

Mailboxes | Average mailbox | Features | Frequency | Timings
5,000 | 10,000 items | Mailbox Folder Replication | Each mailbox should be processed once per day. | First execution = 25 minutes; subsequent executions = 2 minutes
5,000 | 10,000 items | Mailbox Storage Management | Each mailbox should be processed every 2 days. | First execution = 35 minutes; subsequent executions = 5 minutes

Note on timings: while baselines give a good indication of the time to process, timings are not linear and depend on a number of factors, including:

  • Exchange load at the time of processing
  • Network load at the time of processing
  • The number of items per folder. Typically, the more items in a folder, the longer a task will take to process.
    • For example, a mailbox with 10,000 items split into 10 folders of 1,000 items is likely to complete much more quickly than a mailbox with all 10,000 items in a single folder.
    • Furthermore, in Exchange 2003 environments, once a folder grows above 5,000 items the time taken to process that folder starts to increase exponentially due to the limitations of MAPI.

Calculations to scale

When calculating how to scale using baseline data, use the timings recorded after the first execution. As discussed above, first executions are expected to take longer than regular executions.

 

Mailbox Folder Replication

  • 5,000 mailboxes multiplied by 2 minutes implies 10,000 minutes or approximately 7 days to complete if using 1 task.
  • To align with the requirement to process each mailbox once per day, a feasible approach would be to split the 5,000 mailboxes into 20 groups of 250 mailboxes which would take approximately 500 minutes or just over 8 hours to complete all mailboxes per task.
  • This will result in 20 Exchange tasks running concurrently.
  • Consider starting the tasks at different times to distribute the load on the Mimecast Synchronization Engine server.
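
One possible way to stagger start times is to spread them evenly across the processing window. This is a sketch under assumptions, not a Mimecast recommendation: the function `staggered_starts` is hypothetical, and even spacing is just one reasonable choice.

```python
def staggered_starts(tasks, task_minutes, window_minutes):
    """Return start offsets in minutes so that every task still finishes inside the window."""
    slack = window_minutes - task_minutes
    if slack < 0:
        raise ValueError("a single task run does not fit in the window")
    if tasks == 1:
        return [0]
    step = slack // (tasks - 1)  # whole minutes between consecutive starts
    return [i * step for i in range(tasks)]


# 20 tasks of roughly 500 minutes each, spread over one day (1,440 minutes).
starts = staggered_starts(tasks=20, task_minutes=500, window_minutes=24 * 60)
```

With these figures the last task starts late enough to reduce overlap but still completes within the day. Tasks will still overlap for much of their run, so staggering smooths the load rather than removing concurrency.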

 

Mailbox Storage Management

  • 5,000 mailboxes multiplied by 5 minutes implies 25,000 minutes or approximately 17 days to complete if using 1 task.
  • To align with the requirement to process each mailbox every 2 days, a feasible approach would be to split the 5,000 mailboxes into 20 groups of 250 mailboxes, which would take approximately 1,250 minutes, or just under 21 hours, to complete all mailboxes per task.
  • This will result in 20 Exchange tasks running concurrently.
  • Consider starting the tasks at different times and on alternate days to distribute the load on the Mimecast Synchronization Engine server.
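
The arithmetic behind the two calculations above can be checked with a short sketch. The function name `minutes_per_task` is hypothetical; the mailbox counts and per-mailbox timings are the ones from the worked example.

```python
import math


def minutes_per_task(mailboxes, minutes_per_mailbox, tasks):
    """Time for one task to work through its share of the mailboxes sequentially."""
    mailboxes_per_task = math.ceil(mailboxes / tasks)
    return mailboxes_per_task * minutes_per_mailbox


for feature, per_mailbox in [
    ("Mailbox Folder Replication", 2),
    ("Mailbox Storage Management", 5),
]:
    single = minutes_per_task(5000, per_mailbox, tasks=1)
    split = minutes_per_task(5000, per_mailbox, tasks=20)
    print(f"{feature}: 1 task = {single / 1440:.1f} days, "
          f"20 tasks = {split / 60:.1f} hours each")
```

With 20 tasks, each task handles 250 mailboxes, giving 500 minutes per task for Mailbox Folder Replication and 1,250 minutes per task for Mailbox Storage Management.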

 

Total

Based on the calculations in this example there are 40 Exchange tasks potentially running concurrently.

 

From an Exchange perspective this is only 40 additional connections, so it is unlikely to be an issue. From the Mimecast Synchronization Engine perspective, however, it would be sensible to run 2 Mimecast Synchronization Engine sites, one for each feature, to avoid mailbox locking and to ensure that adequate hardware resources are available.

 

Changes to Active Directory Groups

When resolving the users in an Active Directory group, the Mimecast Synchronization Engine connects to the local Active Directory, not the Mimecast platform, and keeps a local cache of the group membership it discovers.

 

This cache lasts for 1 hour. Consequently if users are added to or removed from a group, it can take up to an hour for the change to be reflected in an Exchange Task.
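
The effect of the cache can be pictured with a minimal time-based cache. This is an illustrative sketch only: the class `GroupMembershipCache` and the `resolve_group` callback are hypothetical, and the only detail taken from this article is the 1 hour lifetime.

```python
import time

CACHE_TTL_SECONDS = 60 * 60  # the article states that the cache lasts for 1 hour


class GroupMembershipCache:
    """Caches group membership and re-resolves it once the entry is an hour old."""

    def __init__(self, resolve_group, clock=time.monotonic):
        self._resolve_group = resolve_group  # hypothetical Active Directory lookup
        self._clock = clock
        self._entries = {}  # group name -> (time cached, members)

    def members(self, group):
        now = self._clock()
        cached = self._entries.get(group)
        if cached is not None and now - cached[0] < CACHE_TTL_SECONDS:
            return cached[1]  # still fresh: directory changes are not yet visible
        members = list(self._resolve_group(group))
        self._entries[group] = (now, members)
        return members
```

A user added to the group just after the membership was cached is not seen by the task until the entry expires, which matches the delay of up to an hour described above.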
