Zipped EML Data

Document created by user.oxriBaJeN4 Employee on Sep 7, 2015
Version 1

Only the following two file types are accepted for Legacy Archive Data Management.  Data of any other format that is provided will not be considered.


  • .PST: Microsoft Outlook accessible PST files containing standard message format per user, Standard Journal Format (SJF) or Microsoft Exchange Envelope Journal Format (EJF) messages only.
  • .EML (must be zipped): Formatted as either Internet standard RFC 822 or EJF messages only. The text layout of EMLs in either format can vary dramatically depending on the export utility, so contact your local Legacy Archive Data Management team to discuss your data.
  • Other formats, such as the .MSG file type (e.g. items copied from a Microsoft Outlook mailbox), are not accepted.
  • All files of each type must be grouped together under a separate parent folder per type as detailed below.  This helps to clearly identify the different methods required to process each type of folder.
  • EMLs must be provided in zipped folders. Raw unzipped EML files will not be accepted.

Zipping EMLs

EMLs will only be accepted if they are in a zipped container according to the following criteria:


  • The file format must be .zip. No other compression file format is supported.
  • The recommended Zip file size is 2 GB for optimal performance and management.
  • The maximum Zip file size accepted is 15 GB.
  • Zip files should contain as many EMLs as possible within the recommended or maximum size.
    Note: EML files must not be zipped individually (one EML per Zip file).
  • Zipped files must not be password protected
  • The Zip64 file type is supported
  • Zipped PSTs are not supported
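The size criteria above can be illustrated with a minimal Python sketch that groups EML files into batches near the recommended size and writes each batch to one unencrypted .zip. The function names `plan_batches` and `write_zip` are hypothetical, treating 1 GB as 1024^3 bytes is an assumption, and uncompressed file sizes are used as a conservative estimate of the final Zip size.

```python
import os
import zipfile

RECOMMENDED_ZIP_BYTES = 2 * 1024**3  # 2 GB recommended size (binary GB is an assumption)
MAX_ZIP_BYTES = 15 * 1024**3         # 15 GB maximum accepted size


def plan_batches(files, target=RECOMMENDED_ZIP_BYTES):
    """Group (name, size_in_bytes) pairs into batches that stay within `target`.

    A single file larger than `target` is placed in a batch of its own.
    """
    if target > MAX_ZIP_BYTES:
        raise ValueError("target exceeds the maximum accepted Zip size")
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Start a new batch when adding this file would exceed the target.
        if current and current_size + size > target:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


def write_zip(zip_path, root, relative_names):
    """Write files (named relative to `root`) into one .zip without a password."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        for name in relative_names:
            zf.write(os.path.join(root, name), arcname=name)
```

Each list returned by `plan_batches` can be passed to `write_zip` along with the folder the names are relative to. `allowZip64=True` lets the archive grow past the classic 4 GB Zip limit if a larger target is chosen.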


The Zip format stores a checksum for each file it contains, which verifies the file's integrity and makes data corrupted during transfer easy to identify. Zipping also compresses the data and reduces the number of individual files, which translates to faster transfer and copying of data to Mimecast.
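As an illustration of the checksum verification described above, the following Python sketch checks a Zip file before it is sent. It is not Mimecast's own validation: `check_zip` is a hypothetical helper built on the standard `zipfile` module, whose `testzip()` method re-reads every member and compares it with the stored CRC.

```python
import zipfile


def check_zip(path):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if not path.lower().endswith(".zip"):
        problems.append("file extension is not .zip")
    if not zipfile.is_zipfile(path):
        problems.append("not a readable zip archive")
        return problems
    with zipfile.ZipFile(path) as zf:
        # Bit 0 of the general-purpose flags marks an encrypted member.
        encrypted = [i.filename for i in zf.infolist() if i.flag_bits & 0x1]
        for name in encrypted:
            problems.append(f"{name}: password protected")
        if not encrypted:
            # testzip() returns the first member whose CRC check fails, or None.
            bad = zf.testzip()
            if bad is not None:
                problems.append(f"{bad}: failed checksum verification")
    return problems
```

Running the same check again after the copy to the transfer media is one way to confirm that nothing was corrupted in between.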


File and Message Sizes

Only individual message items smaller than 512 MB that are accessible and do not contain any corrupt components are processed. Messages over this size will be ignored.
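A short Python sketch of a pre-flight size check follows. The helper name `split_by_size` is hypothetical, and interpreting 512 MB as 512 × 1024^2 bytes is an assumption; the check only looks at file size and does not detect corrupt components.

```python
import os

MAX_MESSAGE_BYTES = 512 * 1024**2  # 512 MB (binary MB is an assumption)


def split_by_size(paths, limit=MAX_MESSAGE_BYTES):
    """Split file paths into (accepted, oversized) lists by on-disk size."""
    accepted, oversized = [], []
    for path in paths:
        if os.path.getsize(path) < limit:
            accepted.append(path)
        else:
            oversized.append(path)
    return accepted, oversized
```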


Data Grouping

Data is accepted when grouped either Per User (PU) or Not Per User (NPU), and you will need to decide which grouping is most suitable for your data.  Each grouping provides different results and has different requirements and processing methods, as detailed below.


Per User (PU)

PU data is where a user’s EML folder contains data that must be available to at least that user’s Mimecast account (typically their primary SMTP email address). This does not rely on the recipient details found in the message headers, which exclude Blind Carbon Copy (BCC) and distribution list member recipients. The mailbox owner is defined by the user’s EML folder name, as explained in the Windows Folder Names and Structures section below. These messages will also be available to any other Mimecast account whose SMTP address is found as a recipient or sender of the message.


Public folder data may also be provided in EML folders to group each folder’s messages within Mimecast and access them like any other Mimecast account’s items.  Users can access this data via an alternate address or by logging in directly to the account via Mimecast Personal Portal or Mimecast for Outlook. This is done by naming the EML folders with an email address like a standard user’s.


Going forward, only external messages emailed to a public folder will be accessible to this Mimecast account. Internal messages and messages moved directly into the public folder will not. Again, a message will also be available to any other Mimecast account whose SMTP address is found as a recipient or sender of the message.


If the mailbox owner’s SMTP address is not found in the message as a sender or recipient (commonly because an Exchange address was not resolved from the AD), the mailbox owner is still able to access the message within Mimecast Personal Portal and Mimecast for Outlook. Such messages are listed under the Online Inbox and not Online Sent Items, irrespective of whether the mailbox owner originally sent them. They will, however, be listed under the Sent Items folder within the Exchange Folders view if they were originally in that folder.
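The listing behaviour for PU data can be sketched in Python as follows. This is an illustrative model only, not Mimecast's implementation: `pu_online_folder` is a hypothetical function, the choice of the From and Sender headers as the "sender field" is an assumption, and so is checking the sender before the recipients when an owner appears in both.

```python
from email import message_from_string
from email.utils import getaddresses


def _addresses(msg, headers):
    """Collect lower-cased addresses from the given header names."""
    values = []
    for header in headers:
        values.extend(msg.get_all(header, []))
    return {addr.lower() for _, addr in getaddresses(values) if addr}


def pu_online_folder(owner, msg):
    """Return the online folder a PU message is listed under for its owner."""
    if owner.lower() in _addresses(msg, ["From", "Sender"]):
        return "Online Sent Items"
    # Received messages, and messages where the owner's SMTP address is not
    # found at all (e.g. an unresolved Exchange address), go to the inbox.
    return "Online Inbox"
```

For example, a message whose From header holds an unresolved Exchange address is reported as "Online Inbox" for its owner, even if the owner originally sent it.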


Not Per User (NPU)

NPU data is where each EML folder does not contain data for individual users, but instead contains messages grouped in another way (typically by date), and usually consists of SJF or EJF messages. The format of the messages of each type differs significantly, as does the method used for processing the emails and their results:


  • SJF makes an exact copy of a message as the user’s mailbox receives it with no additional envelope information, and does not include BCC recipients or distribution list (DL) members that received a message.
  • EJF is typically the most comprehensive and complete format that can be provided, as it contains the SMTP addresses of all senders and the final recipients in complete envelope information, which includes BCC recipients and all distribution list (DL) members.


Based on the information available within an SJF message, if a user’s mailbox was sent the message as a BCC recipient or a DL member, the message will not be available to that user in the Mimecast Archive. If the same item is provided as EJF, that account’s address is in the envelope details, and the message will be available to the user in the Archive.
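The difference can be illustrated with a small Python sketch that compares the addresses visible in each format. The function names are hypothetical, and the EJF envelope is represented here as plain Python values and not parsed from a real journal report, whose exact layout is not described in this document.

```python
from email import message_from_string
from email.utils import getaddresses

HEADER_FIELDS = ("From", "To", "Cc")  # assumed set of visible header fields


def sjf_accounts(msg):
    """Addresses visible in an SJF message: header senders and recipients only."""
    values = [v for field in HEADER_FIELDS for v in msg.get_all(field, [])]
    return {addr.lower() for _, addr in getaddresses(values) if addr}


def ejf_accounts(envelope_sender, envelope_recipients):
    """Addresses visible in an EJF message: the full envelope, including
    BCC recipients and expanded distribution list members."""
    return {addr.lower() for addr in [envelope_sender, *envelope_recipients]}


msg = message_from_string(
    "From: alice@example.com\nTo: sales-dl@example.com\n\nQuarterly update"
)
# bob is a member of sales-dl, carol was BCC'd; neither is in the headers.
envelope = ["bob@example.com", "carol@example.com"]

sjf = sjf_accounts(msg)
ejf = ejf_accounts("alice@example.com", envelope)
```

In this example `sjf` contains only alice and the distribution list address, while `ejf` also contains bob and carol, which is why only the EJF copy is available to them in the Archive.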


File and Item Types: Advantages and Disadvantages

PU

Advantages:
  • Mailbox owner can access all messages

Disadvantages:
  • Message format varies significantly and may not be accepted; often requires testing
  • Messages cannot be displayed in folders
  • Messages with multiple internal recipients are duplicated in provided files and folders (not single instance)
  • Mailbox owner’s unresolved sent messages are listed in Online Inbox and not Online Sent Items

NPU - SJF

Advantages:
  • Messages with multiple internal recipients are not duplicated in provided files and folders (single instance)

Disadvantages:
  • User cannot access messages received as a BCC recipient or a DL member
  • Messages cannot be displayed in folders
  • Additional Recipient Resolution may be required to improve user access to their messages, which takes longer

NPU - EJF

Advantages:
  • Messages with multiple internal recipients are not duplicated
  • Every original recipient, including BCC recipients and DL members, can access the message
  • All messages are reliably listed in Online Inbox or Online Sent Items

Disadvantages:
  • Messages are not displayed in folders

Recipient Address Resolution

A user has access to all messages where the SMTP email address they use when logging into Mimecast Personal Portal or Mimecast for Outlook is found within any sender or recipient field. Therefore, a message provided NPU that does not contain a user’s exact SMTP address is not available to that user.  Common reasons for this are:


  • When the user was BCC’d
  • When the user is a member of a distribution list
  • If the email is still in the proprietary format of the system it was exported from (e.g. Exchange addresses (LEDN), Lotus Notes, etc.).


Similarly, received messages are listed in the Online Inbox and sent messages in Online Sent Items, depending on whether the user’s SMTP address is found in either the recipient or sender field respectively within each message.


Irrespective of which of the following data groups is received, all Ingested messages are available to Mimecast Administrators that have access to the Archive menu within the Administration Console. Review the considerations in the table below for more information:


Data Type: PU

Description: The PU method ensures that the mailbox owner’s account (indicated by the name of the user EML folder) has access to at least all messages within their EML folders when logged in with that SMTP address, irrespective of whether their SMTP address exists as a recipient or sender within an email. If their SMTP address is not listed in a message as a recipient or sender but the message is within that user’s PU EML folder, the user’s Mimecast account can still access the message. It is displayed in the Online Inbox (even if it was a Sent Item), and is available to the Archive Search.

Data Type: NPU - SJF

Description: NPU data in SJF format often does not contain the SMTP addresses of recipients or the sender. It never contains BCC and distribution list member recipients, and as the EML folder name does not indicate the mailbox owner as it does with PU data, this method relies solely on the address portion of the recipient and sender entries found within each message. This is used to ascertain the accounts that have access to the message. To make this information as complete as possible, an additional Recipient Resolution step can be done, which significantly extends the Legacy Archive Data Management process.

Typically, Recipient Resolution begins by testing a small portion of the data to ascertain the number and percentage of recipient and sender addresses that do not have valid SMTP addresses, and of Exchange addresses that are not found within any Mimecast configured directory connection. The results of this testing are reported to you so that you can decide whether you would like this additional step completed for all data, which results in the Legacy Archive Data Management process taking significantly longer to complete. If you decide to proceed, the resolution process is run on all data and lists all internal recipient and sender addresses that are not found in AD (i.e. ‘unresolved’). This list is submitted to the customer so that the SMTP address can be entered next to each internal recipient, and is then returned to Mimecast. Those addresses are replaced with the relevant SMTP addresses and the messages are ingested.

Data Type: NPU - EJF

Description: NPU data in EJF format always contains a list of all recipients and senders and their SMTP addresses. This includes recipients emailed as a BCC and internal distribution list members. Because this format provides complete information, its import is highly reliable and produces the best results. All items the user originally received that are provided for import are available within their Mimecast account.

This format does not provide the messages in folders within each mailbox.
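The Recipient Resolution step described for SJF data can be sketched in Python as below. This is a simplified illustration under stated assumptions: the function names are hypothetical, an address counts as "unresolved" purely because it does not look like an SMTP address (the real process also checks directory connections), and the mapping stands in for the list the customer returns.

```python
import re

# Loose pattern for "looks like an SMTP address"; an assumption for this sketch.
SMTP_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_unresolved(addresses):
    """Return the sorted, de-duplicated addresses that are not SMTP addresses."""
    return sorted({a for a in addresses if not SMTP_RE.match(a)})


def unresolved_percentage(addresses):
    """Percentage of address entries that are not SMTP addresses."""
    if not addresses:
        return 0.0
    unresolved = sum(1 for a in addresses if not SMTP_RE.match(a))
    return 100.0 * unresolved / len(addresses)


def apply_mapping(addresses, mapping):
    """Replace unresolved addresses using the customer-supplied mapping."""
    return [mapping.get(a, a) for a in addresses]
```

A typical round trip is to run `find_unresolved` on a sample, report `unresolved_percentage`, collect the SMTP address for each listed entry, and then pass that mapping to `apply_mapping` before the messages are ingested.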


File Names

All file names must include the appropriate file extension of the accepted file formats i.e. .eml.


Windows Folder Names and Structures

The top few levels of the Windows folder structure must indicate the format and grouping of all or each portion of the provided data, as each one is processed by very different methods. Each folder tree must contain data of that grouping and format only, and not a mix of both.


Only PU EMLs require an additional subfolder for each user’s EML files. Each user folder must be named with the email address of the mailbox owner’s Mimecast account that should have access to the ingested messages. This is typically their full primary SMTP email address.


Each user’s EMLs must be within their Windows folder. The Windows subfolder structure under the user folder is unimportant, and is ignored during the Legacy Archive Data Management process. Mimecast accepts a maximum of 60,000 items within any individual folder.
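The folder rules for PU data can be checked before submission with a sketch like the following. The helper `check_per_user_root` is hypothetical, the email pattern is deliberately loose, and counting both files and subfolders as "items" is an assumption.

```python
import os
import re

MAX_ITEMS_PER_FOLDER = 60_000
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # loose check, an assumption


def check_per_user_root(per_user_root, max_items=MAX_ITEMS_PER_FOLDER):
    """Return a list of problems found under a PerUser parent folder."""
    problems = []
    for entry in sorted(os.listdir(per_user_root)):
        user_dir = os.path.join(per_user_root, entry)
        if not os.path.isdir(user_dir):
            problems.append(f"{entry}: not inside a user folder")
            continue
        if not EMAIL_RE.match(entry):
            problems.append(f"{entry}: folder name is not an email address")
        # Subfolder layout is ignored during processing, but every folder
        # must still stay within the item limit.
        for dirpath, dirnames, filenames in os.walk(user_dir):
            count = len(dirnames) + len(filenames)
            if count > max_items:
                rel = os.path.relpath(dirpath, per_user_root)
                problems.append(f"{rel}: {count} items exceeds the limit of {max_items}")
    return problems
```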


Data Type, Data Grouping and Message Format

  • Per User (PU)
    Example: E:\JosCoLtd_ing1234\eml\PerUser\\Anyname035kfa.eml
  • Not Per User (NPU), SJF
    Example: E:\JosCoLtd_ing1234\pst\NotPerUser_SJF\StandardJournal1AnyName.eml
  • Not Per User (NPU), EJF
    Example: E:\JosCoLtd_ing1234\pst\NotPerUser_Ejf\EJFJournal1Anyname.eml