I'm not sure whether dedupit has the functionality I need. I had planned to have multiple rule sets per dedupe run. For example, first it tries an exact email match (maybe 20% of the records get matched). Next it might try an exact mobile phone match AND the first 3 characters of the first name, and maybe another 10% of the records get matched. It would keep going through all the matching rules, retrying the remaining unmatched records each time.
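To make the multi-pass idea concrete, here is a rough sketch in plain Python. It is not dedupit's API (I don't know whether the module exposes anything like this); the field names `email`, `mobile` and `first_name` and every function name are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rule passes, tried in order. Each returns a match key,
# or None when the record lacks the data that rule needs.
def email_key(rec):
    email = (rec.get("email") or "").strip().lower()
    return ("email", email) if email else None

def mobile_name_key(rec):
    mobile = "".join(ch for ch in (rec.get("mobile") or "") if ch.isdigit())
    first = (rec.get("first_name") or "").strip().lower()
    if mobile and len(first) >= 3:
        return ("mobile+name3", mobile, first[:3])
    return None

def cascade_dedupe(records, passes):
    """Return (groups, remaining): lists of matched ids, plus unmatched records."""
    remaining = list(records)
    groups = []
    for key_fn in passes:
        buckets = defaultdict(list)
        unkeyed = []
        for rec in remaining:
            key = key_fn(rec)
            if key is None:
                unkeyed.append(rec)
            else:
                buckets[key].append(rec)
        # Records that matched nothing in this pass go on to the next one.
        remaining = unkeyed
        for bucket in buckets.values():
            if len(bucket) > 1:
                groups.append([r["id"] for r in bucket])
            else:
                remaining.extend(bucket)
    return groups, remaining

# Made-up sample data.
records = [
    {"id": 1, "email": "a@x.com", "mobile": "0400 111 222", "first_name": "Robert"},
    {"id": 2, "email": "A@X.com ", "mobile": "", "first_name": "Bob"},
    {"id": 3, "email": "", "mobile": "0400-333-444", "first_name": "Joanne"},
    {"id": 4, "email": "jo@y.com", "mobile": "0400333444", "first_name": "Joanna"},
    {"id": 5, "email": "z@z.com", "mobile": "", "first_name": "Zed"},
]
groups, remaining = cascade_dedupe(records, [email_key, mobile_name_key])
```

In this sample the email pass groups records 1 and 2, the mobile-plus-name pass then groups 3 and 4, and record 5 is left over as unmatched.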
I also want more flexibility in defining the match rules. Instead of requiring exact matches, I'd like options such as:
- substring match
- date between two other dates, or less/greater than another date
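As an illustration of what I mean by these rule types, here is a small sketch of pairwise rule functions. Again this is just plain Python with hypothetical names (`substring_match`, `date_within`, `date_before`, `all_of`) and hypothetical fields, not anything I know dedupit to provide.

```python
from datetime import date

def substring_match(field):
    """True when one record's value is contained in the other's (case-insensitive)."""
    def rule(a, b):
        x = (a.get(field) or "").strip().lower()
        y = (b.get(field) or "").strip().lower()
        return bool(x) and bool(y) and (x in y or y in x)
    return rule

def date_within(field, start_field, end_field):
    """True when a[field] lies between b[start_field] and b[end_field], inclusive."""
    def rule(a, b):
        d, lo, hi = a.get(field), b.get(start_field), b.get(end_field)
        if d is None or lo is None or hi is None:
            return False
        return lo <= d <= hi
    return rule

def date_before(field, other_field):
    """True when a[field] is strictly earlier than b[other_field]."""
    def rule(a, b):
        d, other = a.get(field), b.get(other_field)
        return d is not None and other is not None and d < other
    return rule

def all_of(*rules):
    """Combine rules with AND."""
    return lambda a, b: all(r(a, b) for r in rules)

# Made-up sample records.
a = {"company": "Acme", "signup": date(2020, 5, 1)}
b = {"company": "ACME Pty Ltd",
     "valid_from": date(2020, 1, 1), "valid_to": date(2020, 12, 31)}

rule = all_of(substring_match("company"),
              date_within("signup", "valid_from", "valid_to"))
is_match = rule(a, b)
```

Here `is_match` is true because "acme" is a substring of "acme pty ltd" and the signup date falls inside the validity range.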
Does it need to check every contact against every other contact on each run? Suppose I've just run a batch load and 500 new records have been added. Can I specify that dedupit only needs to check the newly added records (e.g. based on a date-created field)? Otherwise this will lead to a lot of unnecessary processing, since we will have around 20 feeds coming in daily and we would like to run the dedupe process after each load.
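What I have in mind is roughly the following: compare each new record against the existing ones and against the other new ones, but never re-compare existing records with each other. This is a sketch only, assuming a hypothetical `date_created` field and a cutoff timestamp taken from the last load; it says nothing about how dedupit actually works internally.

```python
from datetime import datetime
from itertools import combinations

def candidate_pairs(records, since):
    """Yield id pairs that involve at least one record created at or after `since`."""
    new = [r for r in records if r["date_created"] >= since]
    old = [r for r in records if r["date_created"] < since]
    # New records against the existing database.
    for n in new:
        for o in old:
            yield n["id"], o["id"]
    # New records against each other (a feed can contain its own dupes).
    for a, b in combinations(new, 2):
        yield a["id"], b["id"]

# Made-up sample: three existing records and two from the latest load.
cutoff = datetime(2024, 1, 10)
records = [
    {"id": 1, "date_created": datetime(2024, 1, 1)},
    {"id": 2, "date_created": datetime(2024, 1, 2)},
    {"id": 3, "date_created": datetime(2024, 1, 3)},
    {"id": 4, "date_created": datetime(2024, 1, 10)},
    {"id": 5, "date_created": datetime(2024, 1, 11)},
]
pairs = list(candidate_pairs(records, cutoff))
```

With 500 new records and, say, 10,000 existing ones (an illustrative figure), this gives about 5.1 million comparisons per rule instead of roughly 55 million for a full all-against-all run.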
Anyway, whether we use dedupit or our own external process, we still need a way to hide the records from normal users until they have been merged or declared unique.
With this module, is it possible to hide all records that have a particular field value set from all users except one group? I want to automatically import a list of contacts from other systems, which will contain dupes, and I want them to be hidden from all users until a data steward has reviewed them all and merged them with existing records where needed.
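In pseudo-terms, the visibility rule I'm after looks like the sketch below. The field name `review_status`, the value `pending` and the group name `data_stewards` are placeholders I made up, and this is plain Python rather than anything the module is known to offer.

```python
def visible_records(records, user_groups,
                    flag_field="review_status",
                    hidden_value="pending",
                    steward_group="data_stewards"):
    """Return the records a user may see, given the groups they belong to."""
    # Data stewards see everything, including records awaiting review.
    if steward_group in user_groups:
        return list(records)
    # Everyone else only sees records that are not flagged as pending.
    return [r for r in records if r.get(flag_field) != hidden_value]

# Made-up sample data.
contacts = [
    {"id": 1, "review_status": "unique"},
    {"id": 2, "review_status": "pending"},
    {"id": 3},  # older record with no flag set
]
for_sales = visible_records(contacts, {"sales"})
for_steward = visible_records(contacts, {"data_stewards", "sales"})
```

Once the steward merges a record or marks it unique, the flag changes and the record becomes visible to everyone without any further step.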