Channel: Symantec Connect - Blog Entries

EVDuplicateCleaner makes the big time...


Whilst IMAP, storage queue and the new Enterprise Vault Search are grabbing all the headlines in our latest release, Enterprise Vault 11, there is plenty of other less heralded work that you may not have noticed. For instance, if you had so much time on your hands that you could pay close attention to the contents of your ...\Enterprise Vault\... program directory, you may have spotted a new arrival called EVDuplicateCleaner.exe in Enterprise Vault 11. In addition, full details on using the tool can now be found in the Utilities guide.

This utility has been around for a few releases now, but until now it was only available via version-specific downloads from this technote - http://www.symantec.com/business/support/index?page=content&id=TECH193878.

It exists to target and resolve duplicates created when the same item, in the same folder of the same mailbox, is archived multiple times. This can sometimes occur due to various unique data characteristics or corruptions of the target items. Such duplicated archiving scenarios are rare but, in the unfortunate event that they do occur and are reported to us, we have two primary goals: first, to find the root cause of the duplication and fix it; second, to assist the customer in cleaning up any duplicates that have been created as a result.

The utility has three modes in which it can be run:

Summary – this mode runs a SQL query which groups items that were archived from the same mailbox and the same folder to the same archive, with the exact same item date/time. It provides a high-level report of the estimated number of items that may have duplicates in the archive, and the estimated number of duplicates that exist for each such item. It is a good mode for scanning a Vault Store for potential duplicates and flagging any archives that require further investigation.

Report – this mode dives deeper into individual archives, again running SQL queries for each estimated duplicate group to report the saveset details of the most recently archived item in each group and the number of estimated duplicates of that item. It is a good mode to run on a per-archive basis for affected archives, gathering per-saveset details so that you can investigate whether these estimated duplicates are real.

***You could not have missed how many times I used the word ‘estimate(d)’ in the first two modes. That is because these modes use only SQL to perform analysis based on item metadata, in order to provide quick results of potential duplicates, and so they can also produce false positives – i.e. however unlikely and obscure, multiple items may have been archived from the same mailbox and the same folder, with the exact same item date/time, that are not in fact duplicates.
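To make the metadata-only estimate concrete, here is a minimal Python sketch of the grouping idea. It is not the SQL the utility actually runs: the rows, the field names (`saveset`, `archive`, `mailbox`, `folder`, `item_date`) and both function names are hypothetical and do not reflect the Enterprise Vault database schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical metadata rows; the field names are illustrative only and do
# not reflect the real Enterprise Vault database schema.
ROWS = [
    {"saveset": "S1", "archive": "A1", "mailbox": "M1", "folder": "Inbox",
     "item_date": datetime(2014, 3, 1, 9, 30)},
    {"saveset": "S2", "archive": "A1", "mailbox": "M1", "folder": "Inbox",
     "item_date": datetime(2014, 3, 1, 9, 30)},
    {"saveset": "S3", "archive": "A1", "mailbox": "M1", "folder": "Inbox",
     "item_date": datetime(2014, 3, 1, 9, 30)},
    {"saveset": "S4", "archive": "A1", "mailbox": "M1", "folder": "Sent",
     "item_date": datetime(2014, 3, 1, 9, 30)},
    {"saveset": "S5", "archive": "A2", "mailbox": "M2", "folder": "Inbox",
     "item_date": datetime(2014, 3, 2, 14, 0)},
]


def estimate_duplicate_groups(rows):
    """Group savesets sharing archive, mailbox, folder and item date/time."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["archive"], row["mailbox"], row["folder"], row["item_date"])
        groups[key].append(row["saveset"])
    # Only groups with more than one member are potential duplicates.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def summarize(groups):
    """Return (items with duplicates, estimated number of duplicates)."""
    return len(groups), sum(len(ids) - 1 for ids in groups.values())


if __name__ == "__main__":
    items, duplicates = summarize(estimate_duplicate_groups(ROWS))
    print(f"{items} item(s) with an estimated {duplicates} duplicate(s)")
```

Nothing in this grouping looks at item content, which is exactly why the result is only an estimate: two genuinely different items with identical metadata would land in the same group.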

Execute – this mode works from the lists of duplicate groups provided by the SQL in Report mode, and performs the deduplication clean-up operation, deleting all duplicate savesets apart from the most recently archived one. Execute mode does not just rely on the estimation of the previous two modes, however. Prior to performing any deletion, it will compare the fingerprints (or, if a fingerprint is not available, extensive index properties) of the item to keep and the item to delete to guarantee that they are in fact duplicates at a binary level. This level of validation of the item’s duplicate status is obviously resource and time intensive, and is the reason why we decided in the original design of the utility to reserve it only for the mode where changes to items can occur.
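The verify-before-delete logic can be sketched in Python as follows. This is an illustration under stated assumptions, not the utility's implementation: the SHA-256 hash is a stand-in for whatever fingerprint Enterprise Vault stores, the `index_props` dictionary is a stand-in for the "extensive index properties", and all names are hypothetical. The sketch also assumes the fallback applies when either item lacks a fingerprint.

```python
import hashlib
from datetime import datetime


def fingerprint(content):
    """Stand-in fingerprint (an assumption): SHA-256 of the item bytes."""
    return hashlib.sha256(content).hexdigest()


def is_verified_duplicate(keeper, candidate):
    """Compare fingerprints, falling back to index properties if one is missing."""
    if keeper["fingerprint"] is not None and candidate["fingerprint"] is not None:
        return keeper["fingerprint"] == candidate["fingerprint"]
    return keeper["index_props"] == candidate["index_props"]


def plan_cleanup(group):
    """Keep the most recently archived item; delete only verified duplicates."""
    keeper = max(group, key=lambda item: item["archived"])
    delete, skip = [], []
    for item in group:
        if item is keeper:
            continue
        if is_verified_duplicate(keeper, item):
            delete.append(item["saveset"])
        else:
            skip.append(item["saveset"])  # false positive from the estimate
    return keeper["saveset"], delete, skip


# Hypothetical estimated duplicate group, as a metadata-only report might list it.
PROPS = {"subject": "Q1 report", "size": 16}
GROUP = [
    {"saveset": "S1", "archived": datetime(2014, 3, 2),
     "fingerprint": fingerprint(b"quarterly report"), "index_props": PROPS},
    {"saveset": "S2", "archived": datetime(2014, 3, 3),
     "fingerprint": None, "index_props": dict(PROPS)},
    {"saveset": "S3", "archived": datetime(2014, 3, 4),
     "fingerprint": fingerprint(b"quarterly report"), "index_props": PROPS},
    {"saveset": "S4", "archived": datetime(2014, 3, 1),
     "fingerprint": fingerprint(b"different item"), "index_props": PROPS},
]

if __name__ == "__main__":
    print(plan_cleanup(GROUP))
```

In this sample group, S3 is kept as the most recently archived item, S1 and S2 are confirmed as duplicates, and S4 is left alone because its fingerprint differs even though its metadata matched.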

So, I hope some of that helps you understand the raison d’être and architecture of this utility a little better. In reality, there should be no need for you to rush to your servers and run the utility right now, but it is at least now easily accessible should its services be required.

