Content Examination Definitions: Reference Dictionaries

Document created by user.oxriBaJeN4 Employee on Sep 21, 2015. Last modified by user.oxriBaJeN4 Employee on Apr 1, 2017.
Version 6

Content Examination Definitions can either be created as independent definitions or be linked to a Reference Dictionary. Reference Dictionaries are typically created by an Administrator and contain a list of words, phrases, or regular expressions that can be linked to a Content Examination Definition. Multiple Content Examination Definitions can point to the same dictionary.


For example, consider a customer implementing HIPAA compliance. The compliance policy may state that all emails that contain both a medical term and a Social Security Number (SSN) should be held and not delivered outbound. The Administrator would create one Reference Dictionary containing all the medical terms, and another containing the regular expression for SSNs. When creating the Content Examination Definition, both of these Reference Dictionaries can then be linked using the Insert menu.
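The "both must match" logic of this example can be sketched in Python. This is only an illustration of the policy rule, not how Mimecast evaluates it: the term list, the SSN pattern, and the name `should_hold` are all assumptions made for the sketch.

```python
import re

# Hypothetical stand-in for the medical terms Reference Dictionary.
MEDICAL_TERMS = {"diagnosis", "prescription", "oncology"}

# Assumed SSN pattern (three digits, two digits, four digits, hyphen separated).
# It is not the expression from any Mimecast dictionary.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def should_hold(body: str) -> bool:
    """Return True when the body has both a medical term and an SSN-like string."""
    words = set(re.findall(r"[a-z]+", body.lower()))
    has_medical_term = bool(words & MEDICAL_TERMS)
    has_ssn = SSN_PATTERN.search(body) is not None
    return has_medical_term and has_ssn
```

An email containing only a medical term, or only an SSN-like string, is not held in this sketch; only the combination is.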


Creating a Reference Dictionary


To create a reference dictionary:

  1. Open the Administration Console.
  2. Click on the Gateway | Policies menu item.
  3. Click on the Definitions drop down on the top toolbar.
  4. Select the Content Definitions menu item.
  5. Select a Folder in the navigator in which the definition is to be created.
    You cannot create a definition in the Root folder.
  6. Click on the New Content Definition button.
  7. Enter a meaningful Description. This name is logged against the email when a match is found, so a descriptive name is recommended.
  8. Select the Definition Type, which can be either an Independent Content Definition or a Reference Dictionary. For this article, select Reference Dictionary.
  9. Enter the search parameters in the Word/Phrase Match List. The formats for the search parameters are:

    Search Parameters                                        Example
    Weight [ :maxscore ] [ search text ]                     4:1 “Company Confidential”
    Weight [ :maxscore ] [ required ] [ search text ]        1 required “Project X”
    Weight [ :maxscore ] [ exclude ] [ search text ]         >1 exclude “Tax exemption”
    Weight [ :maxscore ] [ regex ] [ regular expression ]    10 regex 4[0-9]{12}(?:[0-9]{3})?
    Weight [ :maxscore ] [ hash ] [ MD5# ]                   1 hash 9EBD30E761ED4FF770A90DDBD5CB4190 Confidential.PDF
  10. Click on the Save and Exit button.
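To make the entry formats concrete, here is a minimal Python sketch that splits one Word/Phrase Match List line into its parts. The grammar it uses (`weight[:maxscore] [keyword] text`) is inferred from the formats listed in step 9, and the names `Entry`, `LINE_RE`, and `parse_line` are hypothetical; Mimecast's own parser may well behave differently.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed grammar: weight[:maxscore] [keyword] text
LINE_RE = re.compile(
    r"""^\s*
    (?P<weight>-?\d+)                 # weight that starts every entry
    (?P<colon>:(?P<max>\d*))?         # optional ':maxscore'; a bare ':' carries no number
    \s+
    (?:(?P<keyword>required|exclude|regex|hash)\s+)?
    (?P<text>.+?)\s*$
    """,
    re.VERBOSE,
)


@dataclass
class Entry:
    weight: int
    max_score: Optional[int]   # number after the colon, if one was given
    unlimited: bool            # True for a bare colon such as "1:"
    keyword: Optional[str]     # required, exclude, regex, hash, or None
    text: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one entry; return None for blank lines and '#' comment lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    m = LINE_RE.match(stripped)
    if m is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    text = m.group("text")
    if m.group("keyword") != "regex":
        # Drop straight or curly quotes around a phrase; leave regexes untouched.
        text = text.strip('"“”')
    max_raw = m.group("max")
    return Entry(
        weight=int(m.group("weight")),
        max_score=int(max_raw) if max_raw else None,
        unlimited=m.group("colon") == ":",
        keyword=m.group("keyword"),
        text=text,
    )
```

For instance, `parse_line('10 regex 4[0-9]{12}(?:[0-9]{3})?')` yields an entry with weight 10 and keyword `regex`, and that expression matches a number starting with 4 followed by 12 or 15 more digits. The `>1` prefix shown in the exclude example does not fit this assumed grammar, so the sketch raises `ValueError` for it rather than guessing at its meaning.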


Parameter Details


When specifying search parameters, the following rules must be followed:

  • Weight: Each line must begin with the score assigned to that particular word or phrase.
  • Maximum score: Optionally, set how many occurrences of the search term in the email count toward the score. If 1:10 precedes the search term, Mimecast matches up to ten instances of it. If 1: precedes the search term, there is no upper limit to the score. This scoring is only used if the Match Multiple Words option is enabled in the Content Examination Definition itself.
  • Conditions: The optional operators “required” and “exclude” can also be used. Add the word required after the weight if the match term must be present for the policy to trigger. If a required item is not found, the weight is set to zero and no further scoring takes place. If the word exclude is added after the weight and the match term is found, the weight is set to zero and no further scoring takes place. Required and exclude terms should be placed at the top of the search term list.
  • Search text/phrases: Enter single words or phrases, enclosing multiple words in quotation marks (e.g. “one two”).
  • Regular expressions: The expression must be preceded by the word “regex”. Regular expressions can be used to detect structured strings such as Social Security Numbers or credit card numbers in emails.
  • MD5#: Enter the word “hash” at the beginning of the line (or after the score, if one is given), followed by the MD5 code of the attachment. The MD5# is a unique reference given to specific file contents. If the attachment is known to Mimecast (i.e. Mimecast has previously processed the attachment), this checksum is shown in the Transmission Data when viewing the email delivery details.
  • Comments: Add a comment by placing a hash symbol (#) at the beginning of the line. Comment lines are ignored when examining the email for matches.
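The scoring rules above can be sketched as a small Python function. It rests on several assumptions that the article does not confirm: the number after the colon is treated as the maximum number of counted occurrences (the column name `maxscore` could equally mean a cap on the score itself), an entry without a colon is counted once, matching is case-insensitive, and a failed required or exclude check returns a total of zero. `Entry`, `count_matches`, and `score` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    weight: int
    text: str
    max_count: Optional[int] = 1     # assumed: None means no upper limit ("1:")
    condition: Optional[str] = None  # "required", "exclude", or None
    is_regex: bool = False


def count_matches(entry: Entry, body: str) -> int:
    """Count non-overlapping matches of the entry in the body."""
    pattern = entry.text if entry.is_regex else re.escape(entry.text)
    # Case-insensitive matching is an assumption of this sketch.
    return sum(1 for _ in re.finditer(pattern, body, flags=re.IGNORECASE))


def score(entries: List[Entry], body: str) -> int:
    """Total the weights, stopping at zero when a condition fails."""
    total = 0
    for entry in entries:
        hits = count_matches(entry, body)
        if entry.condition == "required" and hits == 0:
            return 0  # required term missing: no further scoring
        if entry.condition == "exclude":
            if hits > 0:
                return 0  # excluded term present: no further scoring
            continue      # an absent excluded term adds nothing
        if entry.max_count is not None:
            hits = min(hits, entry.max_count)
        total += entry.weight * hits
    return total
```

Listing the required and exclude entries first, as the rules recommend, lets a sketch like this stop before scoring any of the remaining terms.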


Mimecast Managed Reference Dictionaries (MMRDs)


Mimecast can provide the reference dictionaries listed below to customers using the latest gateway. These are added to a Content Examination Definition using the Insert menu:

  • Profanity Lists
  • Credit Card Regular Expressions

If some of the contents of these dictionaries generate false positives, a counter score with a negative value can be added as a line item when inserting the Reference Dictionary in a Content Examination Definition.
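As a rough illustration of how a negative counter score offsets a false positive, the Python sketch below sums weighted term counts. The terms, the weights, and the function `total_score` are hypothetical, and plain substring counting is used only to keep the arithmetic visible; it is not a description of Mimecast's matching.

```python
from typing import Dict


def total_score(body: str, weighted_terms: Dict[str, int]) -> int:
    """Sum weight * occurrences for every term, using simple substring counts."""
    text = body.lower()
    return sum(weight * text.count(term.lower())
               for term, weight in weighted_terms.items())


# Hypothetical entry from a managed dictionary, plus a counter score line.
terms = {
    "blast": 1,          # stands in for a term in the managed dictionary
    "blast radius": -1,  # negative counter score for a known harmless phrase
}
```

With these values, a body containing "blast radius" scores 1 for the dictionary term and -1 for the counter line, giving a total of zero, while a body containing only "blast" still scores 1.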