How does VEC bulk upload deduplicate addresses?

Question

Valid Email Checker · Accepted Answer

Deduplication in Valid Email Checker bulk uploads is a single client-side pass that runs before we count credits or send the task to verification. The Remove Duplicates checkbox on the [bulk upload page](how-do-i-find-the-bulk-upload-page) is on by default, and most customers should leave it on. Here is what it actually does and how to read the result.

How the comparison works

Each address is trimmed of surrounding whitespace, lowercased, and then compared against the set of addresses already seen. The first occurrence stays in the list; subsequent matches are dropped. The comparison is purely on the trimmed, lowercased string, so `John@example.com` and `john@example.com` are treated as duplicates and the first one is kept.

- `john@example.com` and `John@example.com` collapse to one entry (case-insensitive).
- `  john@example.com  ` and `john@example.com` collapse to one entry (whitespace trimmed).
- `john@example.com` and `john.smith@example.com` are not duplicates (different local parts).
- `john@gmail.com` and `j.o.h.n@gmail.com` are not deduplicated by us, even though Gmail treats them as one mailbox. Our dedupe is string-based, not provider-aware.
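The rules above can be sketched in a few lines of Python. This is an illustration of the described behavior, not our production code: the function name `dedupe_emails` is hypothetical, and the sketch assumes the kept entry is the trimmed first occurrence with its original casing.

```python
def dedupe_emails(addresses):
    """Return (unique, removed_count), keeping the first occurrence of each address.

    Hypothetical helper: mirrors the trim + lowercase + first-wins rules
    described above. It is purely string-based, with no provider-specific
    handling (Gmail dots are NOT collapsed).
    """
    seen = set()
    unique = []
    for raw in addresses:
        trimmed = raw.strip()      # whitespace is trimmed before comparing
        key = trimmed.lower()      # comparison is case-insensitive
        if key in seen:
            continue               # later matches are dropped
        seen.add(key)
        unique.append(trimmed)     # assumption: first occurrence keeps its casing
    return unique, len(addresses) - len(unique)


rows = [
    "John@example.com",
    "john@example.com",
    "  john@example.com  ",
    "john.smith@example.com",
    "john@gmail.com",
    "j.o.h.n@gmail.com",
]
unique, removed = dedupe_emails(rows)
print(unique)
print(f"Removed {removed} duplicate emails")
```

Here the first three rows collapse to one entry, while the two Gmail addresses both survive because the comparison never looks at provider rules.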

What you see after dedupe

When duplicates are removed, the upload form shows a toast: "Removed N duplicate emails". The unique count is what gets charged and what gets verified. If your file had 12,000 rows and 2,000 were duplicates, you spend 10,000 credits and get back 10,000 verification results. The 2,000 duplicates are dropped and are not separately reported in the output.

When to turn the toggle off

There is one scenario where leaving duplicates in makes sense: a small list where the duplicates carry meaningful context in other columns. For example, a CSV that has multiple rows per email because each row is a different signup form submission. Even there, the cleaner approach is to dedupe in your spreadsheet first, keep the row you care about, and then upload. Letting our verifier process duplicates wastes credits since the answer for an address is the same regardless of how many times it appears.

Dedupe runs before the 1M cap check

The unique-count check against the [1,000,000-address cap](what-is-the-max-number-of-emails-in-one-bulk-job) happens after dedupe. A 1.2-million-row file with 250,000 duplicates lands at 950,000 unique addresses and the task proceeds.
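The order of operations can be checked with a little arithmetic. The sketch below is illustrative only: `plan_upload` is a hypothetical name, it assumes one credit per unique address (as the 12,000-row example implies), and it assumes a list of exactly 1,000,000 unique addresses is still within the cap.

```python
MAX_UNIQUE_ADDRESSES = 1_000_000  # cap stated in this article


def plan_upload(total_rows, duplicate_rows, cap=MAX_UNIQUE_ADDRESSES):
    """Hypothetical helper: apply dedupe first, then the cap check."""
    unique = total_rows - duplicate_rows
    return {
        "unique": unique,
        "credits_needed": unique,       # assumption: one credit per unique address
        "within_cap": unique <= cap,    # assumption: the cap is inclusive
    }


print(plan_upload(1_200_000, 250_000))
print(plan_upload(12_000, 2_000))
```

The first call corresponds to the 1.2-million-row file above: the raw row count exceeds the cap, but the post-dedupe count of 950,000 does not.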
