Additional protection with an expanding CSS dataset

Posted by The Spamhaus Team on 2 Nov 2022

As of Wednesday, November 9th, the eagle-eyed amongst you will notice the CSS dataset starting to swell. Imperceptibly at first, but slowly and surely: we anticipate the addition of 1.5 million listings over the next 4-6 months, which is approximately a 100% increase! The goal? Increased protection and insight for all users of this dataset. Whether you use it to filter email via the Data Query Service (DQS) or for intelligence on IPs through the Spamhaus Intelligence API, you will benefit. And ultimately, so will the broader internet community.

Meet Robert

I could bore you with all the blurb about “continuous improvement,” but you know that Spamhaus researchers aren’t going to rest on their laurels while miscreants constantly change their modus operandi to evade detection. So, I’ll quickly move on and introduce you to Robert.

Robert is one of the Spamhaus Project’s data scientists; boy, he “gets” data. Recently, Robert’s been beavering away and has identified additional areas of malicious activity. How? Good question – but not one we’ll be answering. After all, we don’t want to tell the bad guys how we identify them!

Are these “bad guys” really bad?

We know that some individuals and organizations get listed due to naivety relating to domain and IP reputation. For example, some domain owners don’t realize that if you unwittingly host your domain on infrastructure shared with cybercriminals, you may suddenly find yourself listed due to the mess they’re making of your shared IP space.

However, almost all this new intelligence is focused on those who are purposefully abusing the internet. The Project’s researchers will list IP addresses spewing out spam, not because of a compromised device or a proxy but because of outright malevolent behavior.

Why the “slowly but surely” approach?

The research team will introduce these listings at a rate of approximately 50,000 per day. Understandably, some of you may still be wondering why we’re going with a “slowly but surely” methodology. If we’re seeing badness, why not list it all immediately?

The answer is that it isn’t always possible to test IP and domain reputation to the point of being certain that no false positives will arise. Of course, due diligence is undertaken, and the analysis, signals, and rules used to curate these listings are tested, checked, and retested. But threat hunting isn’t black and white, and until the data is used in the wild, researchers can’t be 100% sure.

Therefore, it’s vital to release in small bursts, monitor, and then continue with the next release. On the occasions when The Project’s threat hunters and researchers haven’t followed this process, the feedback from the wider internet community has been a loud and resounding “Please, don’t do that” (or sometimes language with a little more color to it!).
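For a rough sense of what that pacing implies, here is a back-of-the-envelope sketch using only the figures quoted in this post. The 30-day month is an assumption, and the calculation ignores any listings that expire or are removed along the way, which this post doesn't quantify.

```python
# Back-of-the-envelope rollout arithmetic using the figures quoted in this post.
TOTAL_NEW_LISTINGS = 1_500_000  # "1.5 million listings"
LISTINGS_PER_DAY = 50_000       # "approximately 50,000 per day"
DAYS_PER_MONTH = 30             # assumption: a 30-day month


def release_days(total: int, per_day: int) -> int:
    """Days of releasing needed at a fixed daily rate (rounded up)."""
    return -(-total // per_day)  # ceiling division without floats


def share_of_days_releasing(total: int, per_day: int, months: int) -> float:
    """Fraction of days in the window on which a release would be needed."""
    return release_days(total, per_day) / (months * DAYS_PER_MONTH)


if __name__ == "__main__":
    days = release_days(TOTAL_NEW_LISTINGS, LISTINGS_PER_DAY)
    print(f"Release days needed at the quoted rate: {days}")
    for months in (4, 6):
        share = share_of_days_releasing(TOTAL_NEW_LISTINGS, LISTINGS_PER_DAY, months)
        print(f"Spread over {months} months: releasing on about {share:.0%} of days")
```

Under these assumptions, the quoted daily rate would only need to run on a minority of days in a 4-6 month window, which leaves room for monitoring between bursts.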

We’d like you to share your CSS-related observations

It won’t come as a great surprise that the data scientist legend, Robert, will be able to measure the effectiveness of this new intelligence, but we’d like to see what you’re experiencing. So please, let’s keep Robert honest: take a snapshot of how much the CSS is catching today. Then take one each month for the next six months, and share the results with us. We’d love to compare!
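If you'd like a starting point for that monthly comparison, here is a minimal sketch. The `Snapshot` structure and its field names are hypothetical, and the numbers in the usage example are placeholders, not real measurements. How you count CSS hits will depend on your own mail system's logs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One monthly measurement (hypothetical structure, for illustration)."""
    taken_on: date
    css_hits: int        # rejections attributed to a CSS listing
    total_rejected: int  # all rejections in the same period


def css_share(snapshot: Snapshot) -> float:
    """Fraction of all rejections that were attributed to CSS."""
    if snapshot.total_rejected == 0:
        return 0.0
    return snapshot.css_hits / snapshot.total_rejected


def change_vs_baseline(baseline: Snapshot, later: Snapshot) -> float:
    """Relative change in CSS hits versus the baseline (0.5 means +50%)."""
    if baseline.css_hits == 0:
        raise ValueError("baseline has no CSS hits to compare against")
    return (later.css_hits - baseline.css_hits) / baseline.css_hits


if __name__ == "__main__":
    # Placeholder numbers only; substitute counts from your own logs.
    baseline = Snapshot(date(2022, 11, 8), css_hits=1000, total_rejected=4000)
    month_1 = Snapshot(date(2022, 12, 8), css_hits=1500, total_rejected=5000)
    print(f"Baseline CSS share: {css_share(baseline):.0%}")
    print(f"Month 1 CSS share:  {css_share(month_1):.0%}")
    print(f"Change in CSS hits: {change_vs_baseline(baseline, month_1):+.0%}")
```

Tracking the share of rejections as well as the raw count helps separate a genuine change in CSS coverage from a simple change in mail volume.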

You spoke. We stopped listing. Now we are back on!

Following the introduction of the long-term behaviour rules to the CSS list, as promised, we started introducing new listings…slowly. However, as we reached approximately 1.4 million listings, some of you reached out (thank you!) to say, “hey, we are observing a few false positives”. In early February, we stopped.

Over the next month, our data scientist wizard, Robert, meticulously reviewed, researched, and refined the long-term behaviour rules to eradicate the false positives you were seeing. Today, the rules are back on, listing over 900,000 IPs and increasing daily!

Keep sharing your CSS list experiences – your insight is invaluable!

Blog News

Spamhaus Intelligence API (SIA)

Spamhaus Intelligence API (SIA) contains context-rich metadata relating to IP and domain reputation. Integrate this data with your applications to enhance existing data feeds, or consume as an independent data source.

In this easy-to-consume format, SIA can be used for threat detection and investigation, risk scoring, customer vetting, validation and much more.

Save valuable time investigating and reporting
Simple and quick to access
Data you can trust

Data Query Service (DQS)

Spamhaus’ Data Query Service (DQS) is an affordable and effective solution to protect your email infrastructure and users.

Using your existing email protection solution, you will be able to block spam and other related threats including malware, ransomware, and phishing emails.

The service has never failed and utilizes the longest established DNSBLs in the industry.

Proactive & preventative
Save on email infrastructure & management costs
Actionable
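For readers curious how a DNSBL lookup works under the hood, here is a minimal sketch of the general technique: reverse the IPv4 octets, append the blocklist zone, and resolve the resulting name. The zone shown in the usage comment (`YOUR_DQS_KEY.zen.dq.spamhaus.net`) and the `127.0.0.3` return code for CSS are assumptions based on commonly published conventions; check your own DQS account documentation before relying on either. In practice, your mail server or filter performs these lookups for you.

```python
import ipaddress
import socket

# Assumption: 127.0.0.3 is the return code that indicates a CSS listing.
CSS_RETURN_CODE = "127.0.0.3"


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 addresses only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def lookup(ip: str, zone: str) -> list[str]:
    """Return the A records for the query; an empty list means no answer."""
    try:
        _, _, answers = socket.gethostbyname_ex(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return []  # NXDOMAIN (not listed) or a resolution failure
    return answers


def is_css_listed(answers: list[str]) -> bool:
    """True if the assumed CSS return code is among the answers."""
    return CSS_RETURN_CODE in answers


if __name__ == "__main__":
    # Hypothetical zone; replace YOUR_DQS_KEY with your own key.
    zone = "YOUR_DQS_KEY.zen.dq.spamhaus.net"
    print(dnsbl_query_name("192.0.2.1", zone))
    # To perform a real lookup (requires network access and a valid key):
    # print(is_css_listed(lookup("192.0.2.1", zone)))
```

Note that this sketch treats a failed lookup the same as "not listed"; a production setup would want to distinguish the two.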

Resources

Increased performance and search capabilities for users of IP reputation data via API

28 October 2022

Blog News

Commercial or developer subscribers to any IP datasets via Spamhaus Intelligence API (SIA) will experience improved performance and search capabilities for this service.

The holiday hack – a reminder of why you shouldn’t always trust emails

28 April 2022

Blog

Here’s a cautionary tale for anyone and everyone who uses email. The lesson is simple: always be vigilant, especially if an email asks you to provide personal information, click on links, or download files.

A new dataset is available via the Spamhaus Intelligence API

30 June 2021

News

Spamhaus has released the extended CSS Blocklist (CSS) and made it available via our API service. This provides users with additional insights relating to compromised and malicious IP addresses.
