1. Database size
The more records in your database, the more records Duplicate Check needs to compare every time you run a DC job. Pretty obvious. If you only need to check a specific part of your database, you can apply a filter to your job so it runs on a subset of your data. Ask yourself: is it really necessary to run this job on my entire database?
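To get a feel for why this matters, here is a small back-of-the-envelope sketch in Python with made-up record counts. It assumes the crudest possible model, in which every record is compared with every other record; Duplicate Check is smarter than that (see the index search below), but the point stands that the work in a job grows fast with the number of records it covers.

```python
# Illustration only: why running a job on a filtered subset pays off.
# Assumes a naive model in which every record is compared with every other record.

def pairwise_comparisons(record_count: int) -> int:
    """Number of unique record pairs: n * (n - 1) / 2."""
    return record_count * (record_count - 1) // 2

full_database = 100_000     # hypothetical org size
filtered_subset = 10_000    # e.g. only Leads created in the last 90 days

print(pairwise_comparisons(full_database))    # 4,999,950,000 pairs
print(pairwise_comparisons(filtered_subset))  # 49,995,000 pairs, roughly 100x fewer
```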
2. Scenarios
Duplicate Check identifies duplicate records based on the applied scenario. A scenario defines which fields records are compared on to find and identify duplicate records. Our default scenario for Leads defines 5 fields, so it will compare records based on those 5 fields. If you apply a more extensive scenario, or multiple scenarios, the runtime of your job will be extended as well (see the sketch below).
These tests were executed in a test environment with dummy data (1,275 Lead records, 20% duplicates). No rights can be derived from this information.
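To make the idea of a scenario a bit more concrete, here is a simplified sketch in Python. This is not Duplicate Check's actual matching engine; the field names, the similarity measure and the averaging are purely illustrative. The point is that every extra field in a scenario, and every extra scenario, means more comparison work for every pair of records.

```python
# Conceptual sketch only, not Duplicate Check's matching logic.
from difflib import SequenceMatcher

def field_similarity(a: str, b: str) -> float:
    """Crude string similarity between two field values (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pair_score(record_a: dict, record_b: dict, scenario_fields: list) -> float:
    """Average similarity over the fields defined by the scenario.
    More fields in the scenario means more work per record pair."""
    scores = [field_similarity(str(record_a.get(f, "")), str(record_b.get(f, "")))
              for f in scenario_fields]
    return sum(scores) / len(scores)

# A hypothetical five-field Lead scenario and two dummy Lead records.
lead_scenario = ["FirstName", "LastName", "Company", "Email", "Phone"]
a = {"FirstName": "Jon", "LastName": "Smith", "Company": "Acme",
     "Email": "jon@acme.com", "Phone": "555-0100"}
b = {"FirstName": "John", "LastName": "Smith", "Company": "Acme Inc",
     "Email": "j.smith@acme.com", "Phone": "555-0100"}

print(round(pair_score(a, b, lead_scenario), 2))  # 0.85 with these example values
```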
3. Number of records returned in index search
When running a DC Job, Duplicate Check's duplicate detection process returns a number of potential duplicate records for every record in your Object. This process runs in the background and is not visible to the user. Out of those returned potential duplicates, Duplicate Check then determines which records reach the threshold level. The number of records returned in that background process is defined in the DC Setup. Generally speaking, the more records you return in the index search, the better the duplicate results. However, returning more potential duplicates also extends the runtime of your DC job, as the sketch below illustrates.
These tests were executed in a test environment with dummy data (1,275 Lead records, 20% duplicates). No rights can be derived from this information.
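Below is a simplified sketch of that two-step process in Python. Again, this is not Duplicate Check's internal implementation, and the function and parameter names are made up for illustration; it only shows why the number of candidates returned by the index search directly drives how much scoring work the job has to do.

```python
# Conceptual sketch: the index search returns candidates, the threshold decides duplicates.
def find_duplicates(records, search_index, score, candidate_limit, threshold):
    """For each record, score up to `candidate_limit` index candidates
    and keep the ones that reach `threshold`."""
    duplicates = []
    for record in records:
        candidates = search_index(record)[:candidate_limit]   # background index search
        for candidate in candidates:
            if score(record, candidate) >= threshold:          # threshold check
                duplicates.append((record, candidate))
    return duplicates

# The scoring work is roughly len(records) * candidate_limit comparisons,
# so raising the candidate limit can surface more duplicates but makes the job run longer.
```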
4. Duplicate Check Local
Duplicate Check is a native Force.com application. That's pretty awesome, since we're the only deduplication app that is native! The advantage of running entirely on the Salesforce cloud is that we can analyze your data right where it is. The 'downside' is the fact that we depend on the Salesforce servers. Even though the uptime is great, the server speed is a bit variable. If the Salesforce servers have a bad day, Duplicate Check is affected by that as well. Running a job could take a little longer than usual, but it will always deliver results. That is why we created Duplicate Check Local, which lets you process your data on a local machine and returns the results to the Salesforce cloud.