How do I clean my company data so I can win with email?

So your team has never really captured email addresses, or the person doing the data capture was a student who, half the time, was either hungover or didn’t know their elbow from their “digestive tract”.

You are not alone.

We get this comment in meetings and emails all the time. There is hope, and you can do something about it.

Let’s split the problem into parts and find a solution for each one.

Our first split is past and present.

The past:

Can’t do anything about it? Or, can you?
If you have all the hard copies of the opt-ins, you could recapture the data, but think about how much time has passed since those people last received something from you.

The cost of doing this might be a bit high for the reward you could get.

Unless a strong business case exists, it’s usually a case of “let the past be the past.”

OK, so what about the data we do have? Can we validate it?


List validation

We have a list validation service, where we check a few factors about the mailbox. We return a rich set of metrics on each email.

Here is a snippet of the results (a little technical, but worth a skim read):

The verification result:

  1. deliverable
  2. undeliverable
  3. risky
  4. unknown

And the reason:

  • invalid_email - Specified email is not a valid email address syntax
  • invalid_domain - Domain for email does not exist
  • rejected_email - Email address was rejected by the SMTP server, email address does not exist
  • accepted_email - The SMTP server accepted email address
  • low_quality - Email address has quality issues that may make it a risky or low-value address
  • low_deliverability - Email address appears to be deliverable, but deliverability cannot be guaranteed
  • no_connect - Could not connect to SMTP server
  • timeout - SMTP session timed out
  • invalid_smtp - SMTP server returned an unexpected/invalid response
  • unavailable_smtp - SMTP server was unavailable to process our request
  • unexpected_error - An unexpected error has occurred

And the other fields returned for each email:

  • role true | false - true if the email address is a role address
  • free true | false - true if the email address uses a free email service
  • disposable true | false - true if the email address uses a disposable domain
  • accept_all true | false - true if the email was accepted, but the domain appears to accept all emails addressed to it
  • did_you_mean null | string - Returns a suggested email if a possible spelling error was detected
  • quality float - A quality score for the provided email address, ranging from 0 (no quality) to 1 (perfect quality). More information on the quality scores is listed below
  • email string - Returns a normalised version of the provided email address
  • user string - The user (a.k.a. local part) of the provided email address (for example, bob)
  • domain string - The domain of the provided email address
  • success true | false - true if the API request was successful (i.e., no authentication or unexpected errors occurred)

We then remove the addresses that don’t work anymore, and fix the spelling errors.
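As a rough illustration of that clean-up step, here is a minimal Python sketch. It assumes each validation result arrives as a dictionary using the field names listed above (result, email, did_you_mean). The function name clean_list and the sample records are made up for the example and are not part of our service.

```python
def clean_list(results):
    """Split validation results into addresses to keep, drop, and correct.

    Each item is assumed to be a dict using the field names described above.
    """
    keep, dropped, corrected = [], [], []
    for record in results:
        suggestion = record.get("did_you_mean")
        if suggestion:
            # A likely typo was detected: swap in the suggested address.
            corrected.append((record["email"], suggestion))
            keep.append(suggestion)
        elif record.get("result") == "undeliverable":
            # The mailbox or domain no longer works, so remove it.
            dropped.append(record["email"])
        else:
            keep.append(record["email"])
    return keep, dropped, corrected


# Made-up sample records, for illustration only.
sample = [
    {"email": "ann@example.com", "result": "deliverable", "did_you_mean": None},
    {"email": "bob@exmaple.com", "result": "undeliverable", "did_you_mean": "bob@example.com"},
    {"email": "old@example.org", "result": "undeliverable", "did_you_mean": None},
]

keep, dropped, corrected = clean_list(sample)
print(keep)
print(dropped)
print(corrected)
```

In this sketch, risky and unknown addresses stay on the list; whether to keep them is a judgment call for each sender. A corrected address is only a suggestion, so it is worth validating it again before mailing it.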

We can use the above metrics to further target our communication to just the best of the best. How you read the quality score depends on the type of email you are sending.

For Transactional Email

Quality Score
1.00-0.55: Good
0.54-0.20: Fair
0.19-0.00: Poor

For Marketing Email

Quality Score
1.00-0.70: Good
0.69-0.40: Fair
0.39-0.00: Poor
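To show how the two sets of bands differ in practice, here is a small Python sketch that turns a quality score into Good, Fair or Poor. The cut-off values come from the tables above; the names THRESHOLDS and quality_band are hypothetical. The tables leave small gaps (for example between 0.54 and 0.55), so the sketch assumes anything below the "Good" cut-off falls into the next band down.

```python
# Lower bounds for "Good" and "Fair", taken from the tables above.
THRESHOLDS = {
    "transactional": (0.55, 0.20),
    "marketing": (0.70, 0.40),
}


def quality_band(score, email_type):
    """Return "Good", "Fair" or "Poor" for a score between 0 and 1."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("quality score must be between 0 and 1")
    good, fair = THRESHOLDS[email_type]
    if score >= good:
        return "Good"
    if score >= fair:
        return "Fair"
    return "Poor"


# The same address can be fine for a receipt but marginal for a newsletter.
print(quality_band(0.60, "transactional"))
print(quality_band(0.60, "marketing"))
```

The point of the example is that the bar for marketing email is higher: a score of 0.60 counts as Good for transactional email but only Fair for marketing email.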


Great, we now have a higher level of confidence that the people on your list actually exist.

The future:

Now let’s look at the future segment of our problem.

What are we going to change about the way we capture data?

Opt-In opportunities

Have a look at these touch points in your company and see if you can improve on the operations or systems to help capture a person’s details:

  1. Website opt-in
  2. Website registrations
  3. Customer walk-ins (are you asking for their email?)
  4. Wi-Fi access (ask for an email [and permission])
  5. Booking forms (online and offline)
  6. Application forms (manual and online)
  7. Leads from marketing (conversion pages, lead ads)
  8. Opt-in apps on tablets at expos or marketing events

How you go about changing or implementing these things is generally up to each company [one size does not fit all].

Data enrichment

The third thing is to look at how rich the data is. How much do you know about the person on the database?

We have a data enrichment service that can find a lot of other publicly available information about a person based on their email address.

The process works like this:

  1. You upload your customer list to our system.
  2. You tell your account manager you would like this database enriched.
  3. We configure things on our side and enhance the email addresses on that list.
  4. You get a list that you can segment based on topics and interests.

Here is my report, all taken from just my email address:

[Image: Greg Phillips email data enrichment]

It’s excellent having just one person’s details, but we need this at scale so we can segment and communicate.

Email stats show a 44:1 return on investment (sounds great, right?).

The crucial part of the 44:1 ratio is that 77% of it comes from automated and targeted communication. Data enrichment helps you do this.
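To put numbers on that split, here is a quick Python sketch of the arithmetic. It assumes the 44:1 ratio means 44 returned for every 1 spent, and that the 77% applies to that return; the function name split_return is made up for the example.

```python
# Figures quoted in the text above.
RETURN_PER_UNIT_SPENT = 44  # the 44:1 ratio
AUTOMATED_SHARE = 0.77      # 77% from automated and targeted email


def split_return(spend=1.0):
    """Return (automated_part, other_part) of the return on a given spend."""
    total = spend * RETURN_PER_UNIT_SPENT
    automated = round(total * AUTOMATED_SHARE, 2)
    other = round(total - automated, 2)
    return automated, other


automated, other = split_return()
print(automated, other)
```

Under those assumptions, roughly 34 of every 44 returned comes from automated and targeted email, and only about 10 comes from everything else.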

Here are some screenshots of the types of data you can expect to find out about your database by using our enrichment services:

Topics cover the first level, and you would see things like Art or Hobbies or Business in here.

Here is an incomplete list of topics you can expect to see:

  • Travel destinations
  • News
  • Books and publications
  • Television and film
  • Popular culture
  • Business and careers
  • Government and politics
  • Hobbies and interests
  • Sports
  • Technology
  • Social media
  • Music
  • Causes and activism
  • Food and drink
  • Travel
  • Science
  • Faith and religion
  • Style and fashion
  • Personal finance
  • Society
  • World cultures
  • Automotive
  • Art
  • Education
  • Nature and the outdoors
  • Health
  • Life stages
  • Parenting
  • Gaming
  • Home and garden
  • Beauty
  • Pets
  • Adult entertainment
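Once each contact carries a set of topics, segmenting becomes a simple grouping exercise. The Python sketch below assumes a hypothetical shape for the enriched list, where each contact is a dictionary with an "email" and a list of "topics"; the actual export format may differ, and the sample contacts are invented.

```python
from collections import defaultdict


def segment_by_topic(contacts):
    """Group email addresses by each topic attached to the contact."""
    segments = defaultdict(list)
    for contact in contacts:
        # A contact with no topics simply lands in no segment.
        for topic in contact.get("topics", []):
            segments[topic].append(contact["email"])
    return dict(segments)


# Invented sample contacts, for illustration only.
enriched = [
    {"email": "ann@example.com", "topics": ["Travel", "Food and drink"]},
    {"email": "bob@example.com", "topics": ["Travel", "Technology"]},
    {"email": "cat@example.com", "topics": []},
]

segments = segment_by_topic(enriched)
print(segments["Travel"])
```

A contact can appear in more than one segment, which is usually what you want when sending topic-specific campaigns.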

Let’s take the data a level lower, and this is the detail that emerges.

Art divides up into visual or performance, plus a few randoms.

Business segments into many areas. Many. As you can see, business covers many things.

Hobbies capture many of our relaxing activities, but I am not too sure if shopping should be a hobby 😉


So to recap: yes, we can do something.

  1. Clean up the old database (List validation).

  2. Fix the business processes to capture new clean data.

  3. Think of places in your company where you could be capturing data.

  4. Enrich the data to know even more.

Feel free to drop me an email to chat more about this. Always happy to help.