Using Fuzzy Hashing with a Content Examination Definition

Document created by user.3AEuBpAOr2 Expert on Feb 2, 2016. Last modified by user.oxriBaJeN4 on Mar 27, 2017.
Version 11

Fuzzy hashing compares two different items and determines how similar they are, expressed as a percentage. It can therefore be used to limit the flow of sensitive information out of your organization, by matching content similarities between a control document and email attachments passing through your Mimecast service.
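
To make the idea of a similarity percentage concrete, here is a minimal sketch in Python. It is an illustration only: the standard library's `difflib.SequenceMatcher` stands in for a real fuzzy hash comparison and is not the algorithm used by MFH or SSDEEP. The function name and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_percentage(control_text: str, candidate_text: str) -> float:
    """Return how similar two strings are, as a percentage from 0 to 100.

    Illustration only: difflib's ratio stands in for a real fuzzy hash
    comparison and is not the algorithm behind MFH or SSDEEP.
    """
    ratio = SequenceMatcher(None, control_text, candidate_text).ratio()
    return round(ratio * 100, 1)


# Hypothetical control document text and two candidate texts.
control = "Project Falcon budget: 1.2 million, approved by the board in March."
edited = "Project Falcon budget: 1.4 million, approved by the board in April."
unrelated = "Lunch menu for Friday: soup, salad, and sandwiches."

print(similarity_percentage(control, control))    # identical text scores 100.0
print(similarity_percentage(control, edited))     # a light edit still scores high
print(similarity_percentage(control, unrelated))  # unrelated text scores lower
```

The point of the sketch is that a lightly edited copy of the control document still scores close to the original, which an exact (cryptographic) hash comparison could not detect.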

 

The following types of fuzzy hashes can be used with content examination definitions.

  • Mimecast Fuzzy Hash (MFH) ignores any images in an attachment, basing its similarity score on the attachment's text.
  • SSDEEP uses the entire attachment (including text and images) to determine how similar one file is to another.
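
The practical difference between the two hash types can be sketched as follows. This is a toy model under stated assumptions, not either real algorithm: `difflib.SequenceMatcher` is used as a stand-in comparison, and the text and "image" bytes are hypothetical sample data.

```python
from difflib import SequenceMatcher


def percent_similar(a, b) -> float:
    """Similarity of two sequences (str or bytes) as a 0-100 percentage."""
    return round(SequenceMatcher(None, a, b).ratio() * 100, 1)


# Two hypothetical attachments: identical text, different embedded images.
text = "Quarterly forecast: revenue up 8 percent."
image_a = bytes(range(0, 40))      # stand-in for one image's bytes
image_b = bytes(range(100, 140))   # stand-in for a different image

# Text-only comparison (the MFH-style view): images are ignored.
text_only = percent_similar(text, text)

# Whole-content comparison (the SSDEEP-style view): image bytes count too.
whole_file = percent_similar(text.encode() + image_a, text.encode() + image_b)

print(text_only)   # 100.0, because only the text is compared
print(whole_file)  # lower, because the differing image bytes are included
```

In this model, swapping the images in a document leaves the text-only score unchanged but lowers the whole-content score.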

 

This guide explains how to upload a control document and specify it in a Content Examination definition. This is a two-step process that involves:

  • Generating a fuzzy hash.
  • Adding a fuzzy hash to a Content Examination definition.

 

What You'll Need

 

  • Access to the Administrator Console, with edit rights to the Administration | Gateway | Policies menu item.
  • A document to use as your control document.

 

Generating a Fuzzy Hash

 

To generate a fuzzy hash to use with a Content Examination definition:

  1. Open the Gateway Policy Editor.
  2. Select the Definitions drop down. A list of the definition types is displayed.
  3. Select the Content Examinations definition type from the list. The list of definitions is displayed.
  4. Click the Fuzzy Hash Definitions button.
  5. Click the Generate Fuzzy Hash button. The Fuzzy Hash Generation section is displayed.
  6. Complete the Fuzzy Hash Generation section as follows:

    • Description: Specify a description for the file to which the generated hash(es) belong. The description is visible to administrators when viewing the definition, or selecting entries from the list of previously generated hash values.
    • Fuzzy Hash Type: Specify the type of fuzzy hash you would like to generate. The options are Mimecast Fuzzy Hash (MFH), SSDEEP, or Both. When creating MFH hashes, the control file must meet the minimum file size limit of 4KB. If you select Mimecast Fuzzy Hash (MFH), we recommend all images are removed from the control document. This reduces the amount of time taken to generate the fuzzy hash.
    • New File Upload: Click the Browse button to select the control document file. Only one file can be selected.

  7. Click the Generate button.
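
Because MFH hashes require a control file of at least 4KB, it can save time to check the file size before uploading. The following is a minimal local sketch, not part of the Mimecast product: the function name is hypothetical, and it assumes "4KB" means 4 * 1024 bytes.

```python
import os
import tempfile

# Assumption: "4KB" is read here as 4 * 1024 bytes.
MFH_MIN_BYTES = 4 * 1024


def meets_mfh_minimum(path: str) -> bool:
    """Return True if the file at `path` is at least MFH_MIN_BYTES in size."""
    return os.path.getsize(path) >= MFH_MIN_BYTES


# Demonstration with two throwaway files of known size.
with tempfile.TemporaryDirectory() as tmp:
    too_small = os.path.join(tmp, "too_small.txt")
    large_enough = os.path.join(tmp, "large_enough.txt")
    with open(too_small, "wb") as f:
        f.write(b"x" * 1000)
    with open(large_enough, "wb") as f:
        f.write(b"x" * 5000)

    print(meets_mfh_minimum(too_small))     # False
    print(meets_mfh_minimum(large_enough))  # True
```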

 

Once the fuzzy hash has been generated, it can be added to a content examination definition.

 

Adding a Fuzzy Hash to a Content Examination Definition

 

After you have created a fuzzy hash, you'll have to add it to a content examination definition. This enables you to define the criteria that must be met before your configured actions take effect.

 

To add a fuzzy hash to a content examination definition:

  1. Either select the:
    • Definition to be changed.
    • New Content Definition button to create a definition.
  2. Select the Insert | Fuzzy Hash menu item. The Policy Definition dialog is displayed.
  3. Complete the Policy Definition dialog as follows:

    • Line Score: Specify a value to assign to the fuzzy hash. This is measured against the definition's activation score.
    • Append: Controls where a fuzzy hash is placed in the word / phrase match list. If enabled, the fuzzy hash is added to the bottom of the list. If disabled, the fuzzy hash is added to the top of the list.
    • New File Upload: Click the Lookup button to select the fuzzy hash file you wish to use.
  4. Click the Save and Exit button. The fuzzy hash and line score are displayed in the word / phrase match list.
  5. Select the Fuzzy Hash Setting drop down in the definition.
  6. Specify a Similarity Percentage value. This is applied to all the fuzzy hashes defined in the word / phrase match list.
  7. Click the Save and Exit button.
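
The interaction between the line score, the similarity percentage, and the activation score can be pictured with the sketch below. It is a simplified model and not Mimecast's implementation: it assumes that each fuzzy hash whose similarity reaches the shared Similarity Percentage contributes its line score, and that the definition triggers when the summed scores reach the activation score. All names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FuzzyHashLine:
    description: str  # label of the fuzzy hash in the match list
    line_score: int   # value assigned to this fuzzy hash


def definition_triggers(lines, similarities, similarity_percentage, activation_score):
    """Model of a definition: sum the line scores of every fuzzy hash whose
    measured similarity reaches the shared threshold, then compare the total
    with the activation score. Returns (triggered, total_score)."""
    total = sum(
        line.line_score
        for line in lines
        if similarities.get(line.description, 0) >= similarity_percentage
    )
    return total >= activation_score, total


# Hypothetical match list and similarity results for one attachment.
lines = [FuzzyHashLine("Pricing sheet", 10), FuzzyHashLine("Board minutes", 5)]
similarities = {"Pricing sheet": 82, "Board minutes": 40}

# Only "Pricing sheet" reaches the 75% threshold, so the total is 10.
print(definition_triggers(lines, similarities, 75, 10))  # (True, 10)
print(definition_triggers(lines, similarities, 75, 15))  # (False, 10)
```

Under this model, lowering the Similarity Percentage makes every fuzzy hash in the list easier to match, while the line scores decide how much each match counts.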

Fuzzy hashes can be used in conjunction with other search terms (e.g. regexes, words, or phrases). For examples that can be used in Content Examination definitions, see the Content Examination Definitions: Usage examples article.
