Originally published in 2015.
Email addresses in freshly registered, short-lived domains are increasingly used to send spam and malware. They are also used in spear-phishing campaigns, often combined with bitsquatting/typosquatting techniques, to fool users into trusting the message content.
The same applies to websites serving malicious content that are linked by the phishing emails.
While spam block lists are an efficient technique for limiting the amount of spam originating from well-known spammers, they provide no protection against fresh or targeted campaigns.
The idea described below is built around dynamic block/watch list generation based on incoming email logs and the age of the sender's address domain. It assumes parsing the mail log or incoming mail queue in search of freshly registered domain names. In this PoC I'm using the domaintools.com API to avoid issues with parsing the different creation date formats coming directly from whois. The obvious bottleneck here is the number of API queries that can be run in a short time. To work around this you should first create a baseline file or db table with senders' domains that your server has already seen and accepted in the past. If you have a central log repository then it shouldn't be an issue. Obviously all the whitelists you might be using should be included as well. The whole flow should look like this:
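Building that baseline can be as simple as a one-off pass over your archived logs. A minimal sketch (the exact log format, the `build_baseline`/`write_baseline` names and the baseline.txt filename are my assumptions, not part of the original script):

```python
import re

# Same address pattern as used in the extraction step below.
DOMAIN_RE = re.compile(r'[a-zA-Z0-9_.+-]+@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)')

def build_baseline(log_lines):
    """Return the unique sender domains found in historical log lines."""
    seen = set()
    for line in log_lines:
        match = DOMAIN_RE.search(line)
        if match:
            seen.add(match.group(1).lower())
    return seen

def write_baseline(domains, path='/home/exclusions/baseline.txt'):
    """Dump the collected domains, one per line, into an exclusion file."""
    with open(path, 'w') as out:
        for domain in sorted(domains):
            out.write(domain + '\n')
```

Running this once over the archived maillogs before enabling live checks means known-good domains never trigger an API lookup.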
1. Extract domain names from your mail log. A simple regex should get the job done. Note that it is a good idea to limit this effort to incoming messages that are actually accepted into the queue.
for line in file.readlines():
    try:
        email = re.search(r'([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)', line)
        domain = email.group(2)
    except AttributeError:
        continue  # no email address found on this line
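Whether a message was accepted can often be inferred from which daemon logged the line; with Postfix, for example, messages accepted into the queue show up in queue-manager entries. A sketch combining that filter with the same regex (the `qmgr`/`from=<` markers are Postfix conventions and an assumption here; adjust for your MTA):

```python
import re

EMAIL_RE = re.compile(r'([a-zA-Z0-9_.+-]+)@([a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)')

def sender_domain(line):
    """Extract the sender domain from a queue-manager log line, or None."""
    # Only consider lines for messages accepted into the queue
    # (Postfix logs these via qmgr with a from=<...> field).
    if 'qmgr' not in line or 'from=<' not in line:
        return None
    match = EMAIL_RE.search(line)
    return match.group(2) if match else None
```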
2. Verify whether the extracted domain is already present on any of your lists. If yes, skip processing.
def search_onlists(domain, exclusions):
    found = False
    print "search_onlists got "+domain+" and "+exclusions
    try:
        files = os.listdir(exclusions)
        for file in files:
            if file.endswith(".txt"):
                print "Checking in file: "+file
                r_file = open(exclusions+file, 'r')
                for line in r_file.readlines():
                    if domain in line:
                        found = True
                        print "Found! Skipping "+domain
                        r_file.close()
                        return found
                r_file.close()
        return found
    except Exception as e:
        print "Error in search_onlists"
        print e
        return -1
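Note that search_onlists re-reads every exclusion file for each sender and matches by substring, so a short domain like a.com would also match spama.com. If the lists grow, loading them once into a set with exact matching is cheaper; a possible sketch (the `load_exclusions` helper is my addition):

```python
import os

def load_exclusions(exclusions_dir):
    """Read every .txt file in the exclusions directory into one set."""
    known = set()
    for name in os.listdir(exclusions_dir):
        if name.endswith('.txt'):
            with open(os.path.join(exclusions_dir, name)) as handle:
                for line in handle:
                    entry = line.strip().lower()
                    if entry:
                        known.add(entry)
    return known
```

Built once at startup, a `domain in known` membership test then replaces a full directory scan per message and avoids substring false positives.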
3. Obtain the registration time using the domaintools API.
def get_created(domain):
    request = Request().service("whois").withType("xml").domain(domain)
    result = request.execute()
    root = ET.fromstring(result)
    created = root.find("./response/registration/created")
    print "Domain "+domain+" was created "+created.text
    return datetime.strptime(created.text, "%Y-%m-%d")

def get_age(created):
    try:
        age = abs(datetime.utcnow() - created).days
        return age
    except ValueError:
        print "Data type error in get_age"
    except:
        print "Unexpected error in get_age"
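Stripped of the API call, the age check itself is plain datetime arithmetic. A self-contained variant that accepts a reference date, so it can be tested deterministically (the `now` parameter is my addition), could be:

```python
from datetime import datetime

def domain_age_days(created_str, now=None, fmt="%Y-%m-%d"):
    """Days elapsed since the whois creation date string."""
    created = datetime.strptime(created_str, fmt)
    if now is None:
        now = datetime.utcnow()
    return abs(now - created).days
```

Keep in mind that creation-date formats coming straight from whois vary per registry, which is exactly why the PoC leans on the API's normalized XML instead.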
4. Verify your age threshold. If the domain is older than the threshold, skip further processing and add it to the already-checked list to avoid unnecessary lookups in the future.
exists = search_onlists(domain, exclusions)
if not exists:
    try:
        age = get_age(get_created(email.group(2)))
        print domain + ' is ' + str(age) + ' days old'
        if int(age) < int(domain_age):
            ...
        else:
            output = open(exclusions+'processed.txt', 'a')
            output.write(email.group(2))
            output.write("\n")
            output.close()
    except Exception as e:
        print "Error while checking "+domain
        print e
5. If the checked domain is younger than the threshold, add it to the blacklist together with the current date so the entry can be used for cleanup purposes later.
if int(age) < int(domain_age):
    print "Adding "+domain+" to blacklist"
    output = open(exclusions+'output.txt', 'a')
    output.write(email.group(2)+","+str(datetime.utcnow()))
    output.write("\n")
    output.close()
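The timestamp written above makes periodic cleanup straightforward. A hedged sketch of a pruning pass over output.txt lines, assuming the `str(datetime.utcnow())` format produced by the writer (the `prune_blacklist` helper is my addition):

```python
from datetime import datetime

def prune_blacklist(lines, max_age_days, now=None):
    """Keep only entries whose addition date is within max_age_days."""
    if now is None:
        now = datetime.utcnow()
    kept = []
    for line in lines:
        try:
            domain, added = line.strip().split(',', 1)
            # str(datetime.utcnow()) starts with "YYYY-MM-DD HH:MM:SS"
            added_dt = datetime.strptime(added[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip malformed lines
        if (now - added_dt).days < max_age_days:
            kept.append(line.strip())
    return kept
```

Run from cron, this would keep the block list limited to domains that are still inside the age window.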
Example execution checking whether sender domains are younger than 10 days. Domain names listed in txt files under /home/exclusions are excluded from checks. Two files will be created/updated in /home/exclusions:

output.txt (CSV with domain names for blocking and their addition dates)
processed.txt (list of domains that were already checked)

python mailchk.py /var/log/maillog.log 10 /home/exclusions/
Column 1 in output.txt can be used as your age-based spam block list. It could also be used on a proxy server to block users from visiting web pages hosted on the identified domains.
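As one illustration, the blacklist could be converted into a Postfix sender access map (check_sender_access and the REJECT action are standard Postfix; the conversion helper and file paths here are assumptions):

```python
def to_postfix_access(csv_lines):
    """Turn output.txt lines ("domain,date") into Postfix access-map lines."""
    rules = []
    for line in csv_lines:
        domain = line.split(',', 1)[0].strip()
        if domain:
            rules.append(domain + "\tREJECT")
    return rules
```

The result would be written to e.g. /etc/postfix/fresh_domains, indexed with postmap, and wired in via smtpd_sender_restrictions = check_sender_access hash:/etc/postfix/fresh_domains.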
The above technique could also be applied to forward proxy logs to identify malicious websites visited by users, or C&C communication.