PST Data

Document created by user.oxriBaJeN4 Employee on Sep 7, 2015. Last modified by user.oxriBaJeN4 Employee on Nov 15, 2016.
Version 2

Only the following two file types are accepted for Legacy Archive Data Management. Data provided in any other format will not be considered:

  • .PST: Microsoft Outlook accessible PST files containing standard message format per user, Standard Journal Format (SJF), or Microsoft Exchange Envelope Journal Format (EJF) messages only.
  • .EML (must be zipped): Formatted as either Internet standard RFC 822 or EJF messages only. The layout of EML files in either format can vary dramatically depending on the export utility, so contact your local Legacy Archive Data Management team to discuss your data.
  • Any other format, including the .MSG file type (e.g. items copied from a Microsoft Outlook mailbox), is not accepted.
  • All files of each type must be grouped together under a separate parent folder per type. This helps to clearly identify the different methods required to process each type of folder.
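
As a rough illustration of the rules above, the following sketch sorts file names into the two accepted types and rejects everything else. It assumes that zipped EML collections carry a .zip extension and that numbered PSTs (e.g. .pst.57) count as PSTs; the function names are hypothetical and are not part of any Mimecast tooling.

```python
import re

# Matches ".pst" optionally followed by a numeric suffix such as ".pst.57".
_PST_PATTERN = re.compile(r"\.pst(\.\d+)?$", re.IGNORECASE)


def classify_file(name: str) -> str:
    """Return 'PST', 'EML (zipped)' or 'rejected' for a file name."""
    if _PST_PATTERN.search(name):
        return "PST"
    # Assumption: zipped EML data is delivered with a .zip extension.
    if name.lower().endswith(".zip"):
        return "EML (zipped)"
    return "rejected"


def group_by_type(names):
    """Group file names by type, mirroring one parent folder per type."""
    groups = {}
    for name in names:
        groups.setdefault(classify_file(name), []).append(name)
    return groups
```

Grouping the names first makes it easy to see which files belong under which parent folder, and which files (loose .eml or .msg items, for example) need attention before submission.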

Message Types (Message Class)

 

A PST file may contain many types of individual items, identified by their message class. The message class determines how the item is displayed in Outlook. Only standard message classes / types in PSTs are processed. These include:

 

Message Class | Description / Example
ipm.note.* | Standard email messages.
report.ipm* | Notifications (e.g. delivery receipt, read receipt, non-delivery report, etc.).
ipm.schedule.* | Appointment messages (e.g. calendar invites, meeting request acceptance, etc.).
  • Tasks, Calendar entries, Notes, Journal items, third party stubs, and other items not listed in the table are not processed, and will therefore not be ingested.
  • Any third party stub items that exist in a PST (i.e. any message class that contains the word "shortcut") are not processed during the Legacy Archive Data Management process.
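
The message class rules above can be sketched as a simple filter. This is only an illustration: the patterns are copied from the table, the comparison is assumed to be case-insensitive, and the bare class without a suffix (e.g. IPM.Note) is assumed to match its wildcard pattern. The stub class used in the usage note is a made-up example.

```python
from fnmatch import fnmatchcase

# Patterns taken from the message class table (lower-cased for comparison).
PROCESSED_PATTERNS = ("ipm.note.*", "report.ipm*", "ipm.schedule.*")


def is_processed(message_class: str) -> bool:
    """Return True if an item with this message class would be processed."""
    cls = message_class.strip().lower()
    # Third party stubs: any message class containing "shortcut" is skipped.
    if "shortcut" in cls:
        return False
    for pattern in PROCESSED_PATTERNS:
        # Assumption: the bare class (e.g. "ipm.note") also counts as a match.
        bare = pattern.rstrip("*").rstrip(".")
        if cls == bare or fnmatchcase(cls, pattern):
            return True
    return False
```

For example, `is_processed("IPM.Note")` is true, while `is_processed("IPM.Task")` and `is_processed("IPM.Note.Vendor.Shortcut")` are both false.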

File and Message Sizes

 

Mimecast recommends that PSTs are provided as 5 - 10 GB files. However, we will accept and process PSTs up to 50 GB, as long as they are not corrupt and are accessible to Outlook. Processing files larger than 20 GB takes an extended amount of time. Only individual message items smaller than 512 MB that are accessible and do not contain any corrupt components are processed. Messages over this size are ignored.
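
The size limits above can be expressed as a small pre-submission check. This is a minimal sketch that assumes binary units (1 GB = 1024³ bytes); the function names and status strings are hypothetical, and the check says nothing about whether a file is corrupt.

```python
GB = 1024 ** 3  # Assumption: sizes are measured in binary units.
MB = 1024 ** 2


def pst_size_status(size_bytes: int) -> str:
    """Classify a PST file size against the limits described above."""
    if size_bytes > 50 * GB:
        return "too large"
    if size_bytes > 20 * GB:
        return "accepted, extended processing time"
    if 5 * GB <= size_bytes <= 10 * GB:
        return "recommended"
    return "accepted"


def message_is_processable(size_bytes: int) -> bool:
    """Only message items smaller than 512 MB are processed."""
    return size_bytes < 512 * MB
```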

See the File Accessibility section below for more information on corrupt files.

File Accessibility

 

Some export utilities (e.g. Microsoft Exchange Exmerge 2003) can corrupt PST files when exporting more than 1.97 GB of data, causing them to be inaccessible. This issue is avoided by defining multiple shorter export time periods to create smaller PSTs. Alternatively, export using Outlook 2003 or later, or another client that supports larger PSTs.

Using Outlook, confirm that any PSTs between 1.97 GB (2,065,694 KB) and 2 GB are accessible, and that multiple random messages in multiple folders within them can be opened. Also spot-check that Outlook can access PSTs of other sizes, especially larger ones. If any of the data is inaccessible, you will need to re-export the data for those mailboxes and split it into smaller files using shorter time periods.
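
To build a shortlist of the PSTs that fall in this size range, a sketch along the following lines may help. It only compares sizes, using the 2,065,694 KB lower bound and a 2 GB upper bound (taken here as 2 × 1024 × 1024 KB, which is an assumption about units). It does not open the files, so the Outlook check still has to be done by hand.

```python
KB = 1024
LOWER_KB = 2_065_694        # Lower bound quoted in the text above.
UPPER_KB = 2 * 1024 * 1024  # 2 GB expressed in KB (binary units assumed).


def needs_access_check(size_bytes: int) -> bool:
    """True if the PST size falls in the range most prone to corruption."""
    size_kb = size_bytes / KB
    return LOWER_KB <= size_kb <= UPPER_KB


def shortlist(sizes_by_name: dict) -> list:
    """Return the names of PSTs to verify first in Outlook, sorted by name."""
    return sorted(name for name, size in sizes_by_name.items()
                  if needs_access_check(size))
```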

Inaccessible PSTs are reported once all data has been processed. You can then resolve the issues, confirm that the files are accessible, and resubmit the repaired files within a separate Ingestion project. Mimecast will accept up to two submissions of the same data that you have attempted to fix.

 

Data Grouping

 

Data is accepted when grouped either Per User (PU) or Not Per User (NPU), and you'll need to decide which grouping is most suitable for your data. Each grouping provides different results and has different requirements and processing methods, as detailed below.

 

Per User (PU)

 

PU data is where a user’s PST(s) contain data that needs to be available at least to that user’s Mimecast account, typically identified by their primary SMTP email address. This does not rely on the recipient details found in the message headers, which exclude Blind Carbon Copy (BCC) and distribution list member recipients. The mailbox owner is defined by the PST file name, as explained in the File Names section below. These messages will also be available to any other Mimecast account whose SMTP address is found as a recipient or sender of the message.

 

Public folder data may also be provided in PST files to group each folder’s messages in Mimecast and access them like any other Mimecast account’s items.  Users can access this data via an alternate address or by logging in directly to the account via Mimecast Personal Portal or Mimecast for Outlook. This is done by naming the PST folders with an email address like a standard user’s PST.

 

Going forward, only external messages emailed to a public folder will be accessible to this Mimecast account. Internal messages and messages moved directly into the public folder will not be. Again, a message will also be available to any other Mimecast account whose SMTP address is found as a recipient or sender of the message.

 

If the mailbox owner’s SMTP address is not found in the message as a sender or recipient, commonly because an Exchange address was not resolved from Active Directory (AD), the mailbox owner is still able to access the message in Mimecast Personal Portal and Mimecast for Outlook under the Online Inbox, but not under Online Sent Items. This is irrespective of whether the mailbox owner originally sent the message. However, such messages are listed under the Sent Items folder in the Exchange Folders view if they were originally in that folder in the processed PST.

 

Not Per User (NPU)

 

NPU data is where each PST does not contain data for an individual user. Instead, it typically contains SJF or EJF messages grouped by date. The two message formats differ significantly, as do the methods used to process them and the results:

  • SJF makes an exact copy of a message as the user’s mailbox receives it with no additional envelope information, and does not include BCC recipients or distribution list (DL) members that received a message.
  • EJF is typically the most comprehensive and complete format that is possible to provide, as it contains the SMTP addresses of all senders and the final recipients in complete envelope information, which includes BCC recipients and all distribution list (DL) members.

 

Based on the information available in an SJF message, if a user’s mailbox received the message as a BCC recipient or as a DL member, the message will not be available to that user in the Mimecast Archive. If the same item were provided in EJF, that account’s address would be in the envelope details, and the message would be available to the user in the Archive.
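
The difference between the two formats can be illustrated with a small model. This is a simplified sketch, not Mimecast's processing logic: the Message class, its field names and the addresses are all hypothetical, and address comparison is assumed to be case-insensitive.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    header_recipients: set  # To / Cc addresses visible in the headers
    # Final recipients from the envelope (BCC and expanded DL members).
    # Only EJF carries this information.
    envelope_recipients: set = field(default_factory=set)


def addresses_with_access(msg: Message, fmt: str) -> set:
    """Return the SMTP addresses that can be matched for a message."""
    addresses = {msg.sender} | msg.header_recipients
    if fmt == "EJF":
        addresses |= msg.envelope_recipients
    return {a.lower() for a in addresses}


msg = Message(
    sender="alice@example.com",
    header_recipients={"team-dl@example.com"},
    envelope_recipients={"bob@example.com", "carol@example.com"},
)
```

Here `bob@example.com` (a DL member) and `carol@example.com` (a BCC recipient) appear in the result only when the format is `"EJF"`. With `"SJF"` only the sender and the header recipients are matched.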

 

File and Item Types: Advantages and Disadvantages

 

PU

Advantages:
- Mailbox owner can access all messages.
- Messages can be displayed in folders.

Disadvantages:
- Messages with multiple internal recipients are duplicated in provided files and folders (not single instance).
- Mailbox owner’s unresolved sent messages are listed in Online Inbox and not Online Sent Items.

NPU - SJF

Advantages:
- Messages with multiple internal recipients are not duplicated in provided files and folders (single instance).

Disadvantages:
- User cannot access messages where their SMTP address isn’t found as a sender or recipient, or where they received the message as a BCC recipient or DL member.
- Messages cannot be displayed in folders.
- Additional Recipient Resolution may be required to improve user access to their messages, but takes longer.

NPU - EJF

Advantages:
- Messages with multiple internal recipients are not duplicated in provided files and folders (single instance).
- Every original recipient, including BCC recipients and DL members, has access to all their messages.
- All of the mailbox owner’s sent messages are reliably listed in Online Inbox or Online Sent Items.

Disadvantages:
- Messages cannot be displayed in folders.

 

Recipient Address Resolution

 

A user has access to all messages where the SMTP email address they use when logging into Mimecast Personal Portal or Mimecast for Outlook is found within any sender or recipient field. Therefore, a message provided NPU that does not contain a user’s exact SMTP address is not available to that user. Common reasons for this are:

  • When the user was BCC’d.
  • When the user is a member of a distribution list.
  • If the email is still in the proprietary format of the system it was exported from (e.g. Exchange Addresses (LEDN), Lotus Notes, etc.).

 

Similarly, received messages are listed in the Online Inbox and sent messages in Online Sent Items, depending on whether the user’s SMTP address is found in the recipient or sender field, respectively, of each message.
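
The matching rule described in this section can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, addresses are compared case-insensitively, and the Exchange-style address in the usage note is a made-up example of an unresolved entry.

```python
def online_folders(user_smtp: str, sender: str, recipients: list) -> set:
    """Return the online folders in which a user would see a message.

    An empty set means the user's SMTP address was not found in the
    message, so an NPU message would not be available to that user.
    """
    user = user_smtp.lower()
    folders = set()
    if user == sender.lower():
        folders.add("Online Sent Items")
    if user in {r.lower() for r in recipients}:
        folders.add("Online Inbox")
    return folders
```

For example, `online_folders("joblogs@josco.com", "bettyblogs@josco.com", ["joblogs@josco.com"])` gives `{"Online Inbox"}`. If the recipient is still stored as an unresolved Exchange address such as `/o=JosCo/cn=Recipients/cn=joblogs`, the result is an empty set.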

 

Irrespective of which of the following data groups is received, all ingested messages are available to Mimecast administrators that have access to the Archive menu within the Administration Console. Review the considerations in the table below for more information:

 

Data Type | Description

PU

The PU method ensures that the mailbox owner’s account (indicated by the name of the PST) has access to at least all messages within their PSTs when logged in with that SMTP address, irrespective of whether their SMTP address exists as a recipient or sender within an email. If their SMTP address is not listed in a message as a recipient or sender, but the message is within that user’s PU PST, that user’s Mimecast account can still access the message. It is displayed in the Online Inbox (even if it was a Sent Item) and is available to the Archive Search.

NPU - SJF

NPU data in SJF format often does not contain the SMTP addresses of recipients or the sender. It never contains BCC or distribution list member recipients. Because the PST file does not indicate the mailbox owner as it does with PU data, this method relies solely on the address portion of the recipient and sender entries found within each message to ascertain which accounts have access to the message. To make this information as complete as possible, an additional Recipient Resolution step can be performed, which significantly extends the Legacy Archive Data Management process.

 

Typically, this begins with testing a small portion of the data to ascertain the number and percentage of recipient and sender addresses that do not have valid SMTP addresses, and of Exchange addresses that are not found within any Mimecast configured directory connection. The results of this testing are reported to you so that you can decide whether you would like this additional step completed for all data, which makes the Legacy Archive Data Management process take significantly longer to complete. If you decide to proceed, the resolution process is run on all data and lists all internal recipient and sender addresses that are not found in AD (i.e. ‘unresolved’). This list is submitted to the customer so that the SMTP address can be entered next to each internal recipient, and is then returned to Mimecast. Those addresses are then replaced with the relevant SMTP addresses and the messages are imported.

NPU - EJF

NPU data in EJF format always contains a list of all recipients/senders and their SMTP address. This includes recipients emailed as a BCC and internal distribution list members. Because this format provides complete information, its import is highly reliable and produces the best results. All items the user originally received that are provided for import are available within their Mimecast account.

This format does not provide the messages in folders within each mailbox.
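
The Recipient Resolution step described for NPU - SJF data can be sketched roughly as below. This is not Mimecast's implementation: the `directory` dictionary stands in for lookups against configured directory connections, the SMTP check is a deliberately loose pattern, and the Exchange-style addresses in the usage note are invented examples.

```python
import re

# Loose check for "something@domain.tld"; not a full RFC validation.
SMTP_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_unresolved(addresses, directory: dict) -> list:
    """Addresses that are neither valid SMTP nor known to the directory."""
    return sorted({a for a in addresses
                   if not SMTP_RE.match(a) and a not in directory})


def unresolved_percentage(addresses, directory: dict) -> float:
    """Percentage of unique addresses that remain unresolved."""
    unique = set(addresses)
    if not unique:
        return 0.0
    return 100.0 * len(find_unresolved(unique, directory)) / len(unique)


def apply_resolution(addresses, mapping: dict) -> list:
    """Replace non-SMTP addresses using the completed mapping list."""
    return [a if SMTP_RE.match(a) else mapping.get(a, a) for a in addresses]
```

A sample run would first call `unresolved_percentage` on a small portion of the data, then pass the list from `find_unresolved` to the customer, and finally feed the completed mapping into `apply_resolution`. Addresses that are still missing from the mapping are left unchanged.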

 

File Names

 

All file names must include the appropriate file extension of the accepted file formats (e.g. .pst). Additionally, PST files that contain PU data require a specific file name convention. NPU PST files do not require a specific file name convention, but must include the file extension.

 

Only the following three PU PST file name conventions are accepted. Data that does not follow them will be returned to be corrected. Each PST must be named with the mailbox owner’s full SMTP email address, which is typically the primary address they send and receive emails from and will therefore be used to log on to Mimecast to access their archive.

 

To enable you to store multiple PSTs for one user in the same folder, they can be numbered by any increment, but must follow either of the two convention variations below.

 


PU PST File Name Convention Syntax Options | Examples of Naming Conventions
{MimecastAccountFullSMTPEmailAddress}.pst | \pst\joblogs@josco.com.pst, \pst\bettyblogs@josco.com.pst
{MimecastAccountFullSMTPEmailAddress}.1.pst | \pst\broblogs@josco.com.1.pst, \pst\broblogs@josco.com.2.pst
{MimecastAccountFullSMTPEmailAddress}.pst.1 | \pst\bettyBlogs@josco.com.pst.57, \pst\bettyBlogs@josco.com.pst.58

To confirm the entire file name is correct, including the extension, configure Windows Explorer to show extensions by clicking the Tools menu | Folder Options | View tab. Uncheck the Hide extensions for known file types option, and click Apply to all folders. If extensions remain hidden, the .pst extension will not show, and adding .pst to the file name will incorrectly name the file with .pst.pst. You can revert this setting after you have checked the file names.
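
The three accepted PU naming conventions can be checked with a short script before submission. The sketch below is a hypothetical helper, not a Mimecast tool: it uses a loose address pattern rather than full SMTP validation, and it treats an address ending in .pst as a sign of the doubled .pst.pst extension.

```python
import re
from pathlib import PureWindowsPath

# Accepted suffixes: ".pst", ".<number>.pst" or ".pst.<number>".
_NAME_RE = re.compile(r"^(.+?)(?:\.\d+\.pst|\.pst\.\d+|\.pst)$", re.IGNORECASE)
# Loose check for "something@domain.tld"; not a full RFC validation.
_ADDRESS_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def mailbox_owner(path: str):
    """Return the SMTP address encoded in a PU PST file name, or None."""
    file_name = PureWindowsPath(path).name
    match = _NAME_RE.match(file_name)
    if not match:
        return None
    address = match.group(1)
    # Heuristic: an "address" ending in .pst suggests a doubled extension.
    if address.lower().endswith(".pst") or not _ADDRESS_RE.match(address):
        return None
    return address
```

With the examples from the table, `mailbox_owner(r"\pst\broblogs@josco.com.2.pst")` yields `broblogs@josco.com`, while a name such as `joblogs.pst` or `joblogs@josco.com.pst.pst` yields `None`.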

Windows Folder Names and Structures

 

The top few levels of the Windows folder structure must indicate the format and grouping of all or each portion of the provided data, as each one is processed by very different methods. Each folder tree must contain data of one grouping and format only, not a mix. Mimecast accepts a maximum of 60,000 items within any individual folder.

Folder structures and names beneath the base folders that contain PSTs are unimportant and are ignored during the Legacy Archive Data Management process.

Data Type | Data Grouping and Message Format | Example
PU | \{CompanyName_Assettag}\Pst\PerUser\{UserSMTP}.pst | E:\JosCoLtd_ing1234\pst\PerUser\joblogs@josco.com.pst
NPU - SJF | \{CompanyName_Assettag}\Pst\NoUser_Standard\{Anyname}.pst | E:\JosCoLtd_ing1234\pst\NotPerUser_SJF\nonameconvention.pst
NPU - EJF | \{CompanyName_Assettag}\Pst\NoUser_EJF\{Anyname}.pst | E:\JosCoLtd_ing1234\pst\NotPerUser_EJF\nonameconvention.pst
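
The 60,000 item limit per folder can be checked before the data is shipped. The following is a minimal sketch that assumes "items" means the files and subfolders directly inside a folder; the function name is hypothetical.

```python
import os

MAX_ITEMS_PER_FOLDER = 60_000


def overfull_folders(root: str, limit: int = MAX_ITEMS_PER_FOLDER) -> dict:
    """Map each folder under root that exceeds the limit to its item count."""
    result = {}
    for folder, subdirs, files in os.walk(root):
        # Assumption: both files and subfolders count towards the limit.
        count = len(subdirs) + len(files)
        if count > limit:
            result[folder] = count
    return result
```

Running `overfull_folders` on the top-level data folder returns an empty dictionary when every folder is within the limit. Any folder it does report can be split into several subfolders, since folder names below the base folders are ignored during processing.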
