Re-Imagining Cyber Security


Names…Names Everywhere! The Problem, and Non-Problem, of Name Pollution

Naming pollution is real.  It’s a real problem.  First anti-malware/AV malware detection names, now APT group names – and their campaigns – and their malware.  Analysts are in love with names – and marketing is in love with their names.

You see, naming is powerful.  It’s why we agonize over a child’s name.  It’s why (in the Judeo-Christian tradition) God’s name was truncated and not to be uttered.  At about 2 years old we start learning the names of things and are able to start uttering them back.  This gives us power, because when the 2-year-old is able to communicate a thing’s name – we give it to them!  It’s powerful to a 2-year-old and that same power follows us throughout life – see “name dropping” – or the honor of naming a new geographic/astronomical feature.


It’s followed us into the information security space – for both good and bad.  You see, we need names.  Names are important.  It’s part of how we organize cognitive information and make sense of our world – through abstraction.  It’s important to how we communicate.  But, like any power, it can be misused and misappropriated.  Every organization now loves to name “adversaries,” “actors,” “activity groups,” or whatever you call them.  They can blog about it, tweet about it, produce nice glossy materials and presentations.  It gives them power – because that’s what names do.

The problem isn’t names, it’s the power we attribute to them and their use in our analysis.  When ThreatToe calls something BRUCESPRINGSTEEN and CyberCoffin identifies a similar activity and names it PEARLJAM, everyone else starts updating their “Rosetta Stone” and makes the association BRUCESPRINGSTEEN = PEARLJAM.  Everyone else now starts attributing their intelligence to these two named groups.  But, nobody actually knows what the heck these things are aside from a few properties (e.g. IPs/domains/capabilities/etc).  That is not enough to understand.

I can’t tell you how many times I’ve heard: “Did you see the recent report from CyberVendor – can you believe they attributed that activity to PEARLJAM?!  That is clearly STEVIEWONDER – those guys don’t know what they’re talking about.”  The problem with that statement is that it assumes: (1) you actually know what you’re talking about (you’ve correctly correlated activity) and (2) you understand their definition of PEARLJAM.  Within their own analytic definition the correlation could be absolutely correct.  The real issue is that we’ve made unfounded assumptions and assigned too much power to the names.

But, WHY CAN’T WE JUST ALL AGREE ON NAMES!!!!! (as this is usually said in an elevated tone and usually while slightly intoxicated)  Because we can’t.  That’s why.  It’s not about the names.  The names are just crutches – simple monikers for what is very complex activity and analytic associations which we still don’t know how to define properly.  To understand this, you need to understand how we’re actually defining, correlating, and classifying these into groups – read the Diamond Model section 9 for this information.

The simple answer: it’s hard enough to correlate activity consistently within a 10-person team, let alone across a variety of organizations.  The complex answer: correlation and classification are complex analytic problems which require us to share the same grouping function and feature vector.
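To make “grouping function” and “feature vector” a little more concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the events, the feature names, and the very simple “link two events if they match on enough features” rule are all made up, and real correlation logic is far messier.  The point is only that two teams looking at the same three events can each be internally consistent and still draw different group boundaries.

```python
from itertools import combinations

# Hypothetical observed events; every value here is invented for illustration.
EVENTS = {
    "e1": {"c2_domain": "a.example", "malware_family": "dropperX", "target_sector": "energy"},
    "e2": {"c2_domain": "a.example", "malware_family": "dropperY", "target_sector": "finance"},
    "e3": {"c2_domain": "b.example", "malware_family": "dropperY", "target_sector": "finance"},
}


def group_events(events, feature_vector, min_shared=1):
    """Link two events when they match on at least `min_shared` features
    from `feature_vector`, then return the connected groups."""
    parent = {name: name for name in events}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(events, 2):
        shared = sum(
            1
            for f in feature_vector
            if events[a].get(f) is not None and events[a].get(f) == events[b].get(f)
        )
        if shared >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for name in events:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())


# Team A correlates on infrastructure only.
team_a = group_events(EVENTS, ["c2_domain"])
# Team B requires the same malware AND the same victim sector.
team_b = group_events(EVENTS, ["malware_family", "target_sector"], min_shared=2)

print(team_a)  # [['e1', 'e2'], ['e3']]
print(team_b)  # [['e1'], ['e2', 'e3']]
```

Team A puts e2 with e1; Team B puts e2 with e3.  Whatever names they each assign, a “Rosetta Stone” mapping between them can’t be right for both groups at once.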

What we shouldn’t do is to start using each other’s names – because, again, it’s not about the names.  If you begin to use the names of others you start to take on their “analytic baggage” as well since you are now intimately associating your analysis with theirs.  This means you may also take on their errors and mis-associations.  Further, it may mean that you agree with their attribution.  It’s highly unlikely that you’ll want to intertwine your analysis with that of others whose analysis you don’t really understand.

Instead, we need to rely on definitions.  We need to openly share our correlation and classification logic and the feature vectors which we’re applying.  But to those who are now saying, “Finally! An answer!  Let’s just share this!” sorry, it’s not a silver bullet.  Because, the feature vector is highly dependent on visibility.  For instance, some organizations have excellent network visibility, some have outstanding host visibility, others may have great capability/malware visibility, etc.  It means that generally, I need the same visibility as another organization to effectively use the shared functions to produce accurate output.
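The visibility problem can be sketched the same way.  Below, a hypothetical shared definition of an activity group is applied to the same intrusion as seen by two organizations, one with host and network visibility and one with network visibility only.  The definition, the feature names, the values, and the “at least two matching features” threshold are all assumptions for illustration, not anyone’s real classification logic.

```python
# A hypothetical shared definition: feature name -> values associated with the group.
SHARED_DEFINITION = {
    "c2_domain": {"a.example", "b.example"},
    "malware_family": {"dropperY"},
    "persistence": {"run_key_updater"},
}


def matches(definition, observation, min_features=2):
    """Return True when the observation matches the definition on at least
    `min_features` features.  A missing or None value means "not visible"
    and never counts as a match."""
    hits = sum(
        1
        for feature, known_values in definition.items()
        if observation.get(feature) in known_values
    )
    return hits >= min_features


# The same intrusion, seen with different visibility.
full_visibility = {
    "c2_domain": "a.example",
    "malware_family": "dropperY",
    "persistence": "run_key_updater",
}
network_only = {
    "c2_domain": "a.example",
    "malware_family": None,  # no host or malware visibility
    "persistence": None,
}

print(matches(SHARED_DEFINITION, full_visibility))  # True
print(matches(SHARED_DEFINITION, network_only))     # False
```

Same function, same activity, different answer.  The second organization didn’t do anything wrong; it simply couldn’t see enough of the feature vector for the shared definition to work.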

So, reader, here I am, telling you about a problem that forces poor analytic practices on a daily basis and causes us all these issues, but without a real solution in sight.  Yes, I think that sharing our definitions will go a LONG way towards improving correlation across organizations and giving those names real value – but it is by no means a silver bullet.  I’m a proponent of this approach (over pure name/Rosetta Stone work) but I know we’ll still spend hours on the phone or in a side conversation at a conference hashing all of this out anyway.  But maybe, just maybe, it will reduce some analytic errors – and if that is the case it is better than what we have today.

Questions for Evaluating an External Threat Intelligence Source

I’ve spoken before on the cost of poor threat intelligence and its risk to an organization.  I’ve also spoken about the 4 qualities of good intelligence: relevance, timeliness, accuracy, and completeness. To better evaluate threat intelligence sources – DRIVE FOR TRANSPARENCY!  If you treat threat intelligence like a black box you’re going to lose.

Here are questions to use when evaluating an external source. These are just a starting point or additions to your own list based on your unique needs.

[Relevance] Why do I need threat intelligence?

Before you go out evaluating threat intelligence sources, you need to know what you’re looking for.  This is best done using a threat model for your organization and asking where threat intelligence supports visibility and decision making within that model.  Remember, your own threat intelligence is almost ALWAYS better than that produced by an external source.  External intelligence should complement your own visibility and reduce gaps.

Kudos: Thanks to Stephen Ramage for his comment highlighting the exclusion of such a critical question.

[Relevance] What types of intelligence are available?

Strategic country-level reporting? Cyber threats mixed with political threats?  Technical indicators?  Campaign behaviors?  Written context?  These all determine how useful, actionable, and relevant the intelligence will be for your organization.

[Relevance] Give me your context!

Make sure you understand the context provided with any data.  There is a difference between threat data and threat intelligence.  Intelligence helps drive effective decision-making.  Context makes data relevant.

[Relevance] Which threat types?

Is it limited to botnet C2 nodes?  Commodity threats in general?  Does it cover targeted threats?  Does the threat intelligence provide insight into your threat model?

Related question: How many unique threats are distinguishable in the intelligence?

[Relevance] How many direct threats to my organization or those in my industry has your intelligence identified?

Has the source ever shown direct success in highlighting threats in your industry?

[Relevance] How is the intelligence made available to consumers?

If the intelligence is not provided in a usable form, it will not be successful.

[Relevance] What types of use-cases produce the best experience/feedback?  In which use cases has your intelligence failed?

This is a softball question, but one which should provoke a good question-and-answer session.  The answers will illuminate the decisions they made in developing the intelligence and highlight where the intelligence may fit best (or not fit at all).

Related question: What threat model is this intelligence attempting to address?

[Completeness/Relevance] What is the source of the intelligence?

Is this intelligence derived from human sources crawling the dark-web?  Global network apertures?  VirusTotal diving?  This question should frame their visibility into threats and inform the types of intelligence expected.  This also highlights any natural biases in the collection.  Look for sources of external intelligence which complement your own internal threat intelligence capabilities.

[Completeness] What phases of the kill-chain does the intelligence illuminate?

Understand how wide the intelligence goes against any single threat.  Does it only show C2, or will it also illuminate pre-exploitation activities?  The wider the intelligence, the greater the likelihood of it being useful.

[Completeness] What is the volume and velocity of the intelligence?

“How much” intelligence is actually produced?  Numbers don’t matter that much – but if the number is ridiculously small or ridiculously large, it is an indicator of possible issues.

[Accuracy] How is the intelligence classified and curated?

Drive for transparency in their process, which helps improve your evaluation of accuracy.  Be wary of “silver bullet” buzz-word answers such as “machine learning” or “cloud.”

[Accuracy] How is the intelligence validated?

Do you want to track down false positives all day?  No!  Do you want to rely on poor analysis? No! Make sure this question gets enough attention.

Related questions: How often is it re-validated?  How are false positives handled?  How can customers report false positives?  What is your false positive rate?  How many times in the last month have you had to recall or revise an intelligence report?

[Accuracy] Does the intelligence expire?

Expiration of intelligence is key.  Is there a process which continuously validates the intelligence?
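As a rough illustration of what expiration can look like on the consumer side, here is a minimal sketch that drops any indicator not re-validated recently.  The indicator records, the `last_validated` field, and the 30-day window are all assumptions; a real source may expire different indicator types on very different schedules.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical indicators; the values and dates are invented for illustration.
INDICATORS = [
    {"value": "a.example", "last_validated": datetime(2024, 1, 2, tzinfo=timezone.utc)},
    {"value": "b.example", "last_validated": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]


def still_valid(indicators, now, max_age=timedelta(days=30)):
    """Keep only indicators that were re-validated within `max_age` of `now`."""
    return [i for i in indicators if now - i["last_validated"] <= max_age]


now = datetime(2024, 3, 15, tzinfo=timezone.utc)
active = [i["value"] for i in still_valid(INDICATORS, now)]
print(active)  # ['b.example']
```

If a source can’t tell you when each item was last validated, you can’t do even this much, and that is worth knowing before you buy.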

[Timeliness] How quickly is the intelligence made available to customers after detection?

Related questions: What part of your process delays intelligence availability?  What is the slowest time to availability from initial detection?
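If the source will share detection and publication times, you can check these answers yourself.  The sketch below computes the typical and the slowest time-to-availability from a handful of hypothetical (detected, published) pairs; the timestamps are invented and the choice of the median as “typical” is an assumption.

```python
from datetime import datetime
from statistics import median

# Hypothetical (detected, published) timestamp pairs for three reports.
REPORTS = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 13, 0)),
    (datetime(2024, 3, 2, 8, 0), datetime(2024, 3, 4, 8, 0)),
    (datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 5, 22, 0)),
]

# Delay in hours between detection and availability to customers.
delays_h = [(published - detected).total_seconds() / 3600 for detected, published in REPORTS]

typical_h = median(delays_h)
slowest_h = max(delays_h)
print(typical_h, slowest_h)  # 12.0 48.0
```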
