Most data sources contain 3% to 5% duplication. Duplicate addresses can harm customer relationships and create unnecessary expenses.
Finding and removing duplicative information in large data sets requires a suitable software solution.
Dixendris provides d-doop, a high-speed, precise, effective and very affordable duplicate removal software solution. Our software identifies duplication in all types of data (addresses, material lists, etc.) and also works for very large data sets.
Please contact us for a free consultation. We are happy to help you.
Contact
Dixendris AG
Binningerstrasse 15
CH-4051 Basel
Switzerland
Phone +41 61 272 25 15
www.dixendris.com
info@dixendris.com
The powerful Dixendris d-doop solution detects duplicate records in a wide range of data sets and provides you with separate sets of clean and duplicate data.
As a rule of thumb, every source contains between 3% and 5% duplication. Such duplication can be costly. For example, in a marketing campaign, duplicate data means extra contact costs (mailing or phone). For the consumer, repeated contact leads to frustration and annoyance.
Eliminating such duplication by hand costs support time and money, and the effort multiplies in campaigns with multiple deliveries. In large data sources, such as a customer database, duplicate elimination is too cumbersome to do manually and requires a software solution.
The Dixendris d-doop solution identifies duplicates in a variety of sources. The powerful yet affordable d-doop system allows you to deduplicate, for example, 1,000,000 records in less than forty-five minutes on a standard PC! (see Run-time table)
The Dixendris d-doop system is easy to configure and customize to your needs. You can either use it stand-alone or incorporate it into your existing workflow.
Combine newly acquired address data with your current database, eliminating duplication and producing separate reports.
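The workflow described above can be illustrated with a minimal Python sketch. It is not the d-doop implementation: the field names, the `normalize` helper and the `merge_new_records` function are hypothetical, and the sketch only catches duplicates that become identical after simple normalization (lowercasing and trimming whitespace), not similar entries.

```python
# Hypothetical field names, chosen for illustration only.
FIELDS = ("first_name", "last_name", "street", "number", "city", "zip")


def normalize(record):
    """Build a crude comparison key: lowercase and collapse whitespace."""
    return tuple(" ".join(str(record.get(f, "")).lower().split()) for f in FIELDS)


def merge_new_records(existing, incoming):
    """Split incoming records into clean (new) records and duplicates.

    A record counts as a duplicate if its normalized key already occurs
    in the existing database or earlier in the incoming batch.
    """
    seen = {normalize(r) for r in existing}
    clean, duplicates = [], []
    for record in incoming:
        key = normalize(record)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            clean.append(record)
    return clean, duplicates
```

The two returned lists correspond to the two separate reports: one with records that can safely be added to the database, and one with records that were rejected as duplicates.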
The following table shows the approximate runtime for deduplicating address records with a standard configuration. Deduplication was based on these fields: Salutation, First Name, Last Name, Street, Number, City and Zip.
| Number of Addresses | Runtime |
|---|---|
| 50'000 | 1 minute |
| 100'000 | 2 minutes |
| 150'000 | 5 minutes |
| 300'000 | 15 minutes |
| 1'000'000 | 40 minutes |
| 2'000'000 | 2.5 hours |
| 3'000'000 | 4.5 hours |
Reference System: standard laptop with Intel T2400, 1.83GHz, 2.00GB RAM.
The Dixendris d-doop system is not only efficient but also very effective at identifying both exact and similar duplicates.
There are two types of duplicates: fully identical and similar. The second type can result from spelling and transposition errors, and from acquiring and introducing third-party addresses. Dixendris d-doop uses a fuzzy search to find both fully identical and similar entries.
Finding similar entries requires a complex search and, in contrast to finding identical records, is very challenging for a computer system. A similarity measure must be as meaningful as possible in order to assign a similarity percentage value.
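To make the idea of a similarity percentage concrete, here is a minimal sketch using `difflib.SequenceMatcher` from the Python standard library. This is a generic string similarity measure chosen for illustration; it is an assumption of this example and not the measure d-doop uses. The `similarity_percent` function name is hypothetical.

```python
from difflib import SequenceMatcher


def similarity_percent(a, b):
    """Return a similarity score between 0 and 100 for two strings.

    Both strings are lowercased and whitespace is collapsed first, so
    that differences in case and spacing do not lower the score.
    """
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return 100.0 * SequenceMatcher(None, a_norm, b_norm).ratio()
```

With such a measure, two records could be treated as similar duplicates when their score exceeds a chosen threshold. An abbreviated street such as "Binningerstr. 15" still scores high against "Binningerstrasse 15", while unrelated strings score low.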
When using a fuzzy search to deduplicate similar entries, search times grow much faster than the number of records, because in a naive approach every record must be compared with every other record. Dixendris d-doop uses clever algorithms to optimize search speed, thereby efficiently processing your records within a useful timeframe.
While there are many systems capable of deduplicating a few thousand records within a few minutes, such systems require tens of hours, or fail entirely, on data sets of more than 300,000 records.
Dixendris d-doop is a high-performance deduplication system and, thanks to its unique FAME (Fingerprint Accelerated Matching Engine) technology, can deduplicate large data sets within relatively short periods of time (85,000,000 comparisons per second).
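The growth in search effort can be sketched as follows. Comparing every record with every other record needs n(n-1)/2 comparisons for n records. One common general technique for reducing this is blocking, where only records that share a cheap key (for example the zip code) are compared with each other. The sketch below illustrates blocking as an assumed example; it does not describe how d-doop's own algorithms work, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def naive_pair_count(n):
    """Number of comparisons when every record is compared with every other."""
    return n * (n - 1) // 2


def candidate_pairs(records, block_key):
    """Yield index pairs of records that share the same blocking key.

    Only pairs inside the same block are produced, which is usually far
    fewer than all possible pairs. The trade-off is that duplicates with
    different keys (e.g. a mistyped zip code) are never compared.
    """
    blocks = defaultdict(list)
    for index, record in enumerate(records):
        blocks[block_key(record)].append(index)
    for members in blocks.values():
        yield from combinations(members, 2)
```

For 1,000,000 records, `naive_pair_count` gives roughly 500 billion comparisons, which shows why a fuzzy search over large data sets needs some way to avoid comparing all pairs.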
Thus it is truly possible to deduplicate large data sets within a reasonable and useful amount of time.