SCARS|RSN™ Guide: What Is Big Data?

Big Data – Big Scams Databases

If the pile of manure is big enough, you will find a gold coin in it eventually. This saying is often used to explain why anyone would use big data. Today the piles of data are so big that you might end up finding a pirate’s treasure. In our case, we look for scammers: the real scammers in the noise of all the reports filed.

How Big Is The Pile?

But when is the pile big enough to consider it big data? Per Wikipedia:

“Big data is data sets that are so big and complex that traditional data-processing application software is inadequate to deal with them.”

As a consequence, we can say that it’s not just the size that matters, but also the complexity of a dataset. The draw of big data to researchers and scientists, however, is not in its size or complexity, but in how it may be computationally analyzed to reveal patterns, trends, and associations.

When it comes to big data, no mountain is too high or too difficult to climb. The more data we have to analyze, the more relevant conclusions we may be able to derive. If a dataset is large enough, we can start making predictions about how certain relationships will develop in the future and even find relationships we never suspected to exist. For example, we might find a financial manager in Malaysia who is managing the proceeds from scams in the Ivory Coast!

The Treasure

We mentioned predicting the future or finding advantageous correlations as possible reasons for using big data analysis. Just to name a few examples, big data could be used to set up profiles and processes for the following:

  • Stop terrorist attacks by creating profiles of likely attackers and their methods.
  • More accurately target customers for marketing initiatives using individual personas.
  • Calculate insurance rates by building risk profiles.
  • Optimize website user experiences by creating and monitoring visitor behavior profiles.
  • Analyze workflow charts and processes to improve business efficiency.
  • Improve city planning by analyzing and understanding traffic patterns.
  • Discover how a local African university has been co-opted by scammers, or how a new corporate scammer group has begun operations in Mauritius.

Beware of Apophenia

Apophenia is the tendency to perceive connections and meaning between unrelated things. What statistical analysis might show to be a correlation between two facts or data streams could simply be a coincidence. There could be a third factor at play that was missed, or the data set might be skewed. This can lead to false conclusions and to actions being undertaken for the wrong reasons.

For example, analysis of data collected about medical patients could lead to the conclusion that those with arthritis also tend to have high blood pressure, when in reality the most popular medication to treat arthritis lists high blood pressure as a side effect. Remember the old research edict: correlation does not equal causation.

In statistics, we call this a Type I error (a false positive), and it’s the breeding ground for many myths, superstitions, and fallacies.

Apophenia is a word most victims should remember, since the tendency to make false assumptions is strong.
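The arthritis example can be made concrete with a small simulation. The sketch below is only an illustration: the patients are synthetic, every probability is invented, and the names `simulate_patients` and `high_bp_rate` are hypothetical. It assumes that high blood pressure depends on the medication alone, and then shows how arthritis still appears to be linked to it until the medication is taken into account.

```python
import random

def simulate_patients(n=20_000, seed=0):
    """Generate synthetic patients; every probability here is invented."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        arthritis = rng.random() < 0.20
        # Assumption: only arthritis patients take the (hypothetical) medication.
        medicated = arthritis and rng.random() < 0.70
        # Assumption: blood pressure depends on the medication, not on arthritis.
        high_bp = rng.random() < (0.40 if medicated else 0.15)
        patients.append((arthritis, medicated, high_bp))
    return patients

def high_bp_rate(patients, keep):
    """Share of patients with high blood pressure among those matching `keep`."""
    selected = [p for p in patients if keep(p)]
    return sum(p[2] for p in selected) / len(selected)

patients = simulate_patients()
rate_arthritis = high_bp_rate(patients, lambda p: p[0])
rate_no_arthritis = high_bp_rate(patients, lambda p: not p[0])
# Control for the third factor: arthritis patients who do NOT take the medication.
rate_arthritis_unmedicated = high_bp_rate(patients, lambda p: p[0] and not p[1])

print(f"High BP with arthritis:            {rate_arthritis:.2f}")
print(f"High BP without arthritis:         {rate_no_arthritis:.2f}")
print(f"High BP, arthritis, no medication: {rate_arthritis_unmedicated:.2f}")
```

Looking only at the first two rates suggests that arthritis and high blood pressure go together. The third rate, which removes the medication from the picture, should sit close to the rate for patients without arthritis, which is what a missed third factor looks like in practice.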

The Researchers

As more and more data becomes digitized and stored, the need for big data analysts grows. A recent study showed that 53 percent of the companies interviewed were using big data in one way or another. Some examples of use cases for big data include:

  • Data warehouse optimization (considered the top use case for big data), such as for scam reports numbering in the millions
  • Analyzing patterns in scammer language to identify scriptwriters
  • Sports statistics and analysis; sometimes the difference between being the champion and coming in second comes down to the tiniest detail
  • Prognosis statistics or success rates of particular medications, which can influence a doctor’s recommended course of treatment; an accurate assessment could be the difference between life and death
  • Selecting stocks for purchase and trade; quick decision-making based on analytical algorithms gives traders the edge
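As an illustration of the scammer-language item in the list above, one simple way to spot messages that may come from the same script is to compare the word sequences they share. The sketch below is a minimal example and not a description of any tool actually in use: the messages are invented, the helper names `shingles` and `jaccard` are hypothetical, and Jaccard similarity over three-word sequences is just one possible technique.

```python
import re

def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in a message."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical messages, invented for this example.
msg_a = "My dear, I am an engineer working on an oil rig and I need your help"
msg_b = "Hello my dear, I am an engineer working on an oil rig far from home"
msg_c = "Your package could not be delivered, please confirm your address"

sim_ab = jaccard(shingles(msg_a), shingles(msg_b))
sim_ac = jaccard(shingles(msg_a), shingles(msg_c))

print(f"Similarity of A and B: {sim_ab:.2f}")
print(f"Similarity of A and C: {sim_ac:.2f}")
```

A high score between two reports hints that they may share a script, while a score near zero suggests unrelated wording. Across millions of reports, pairwise comparison like this would be too slow on its own, which is part of why specialized big data tools are needed.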

We use big data in the form of anonymous data gathered from reports to monitor active threats. Viewing these data sets allows us to see trends in scam development, from the types of scams that are being used in the wild to the geographic locations of attacks.

From these data, we’re able to draw conclusions and share valuable information with the public, with website operators, and with our data feed recipients in government and law enforcement. We share it in report form, such as our annual Cybercrime & Scam Tactics and Techniques report, and in heat maps showing where scammers operate or receive money.
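The kind of trend counting behind such reports and heat maps can be sketched in a few lines. The example below is an assumption-laden illustration, not the actual pipeline: the records, the field names `scam_type` and `payout_country`, and the placeholder country labels are all invented.

```python
from collections import Counter

# Hypothetical anonymized reports; field names and values are invented.
reports = [
    {"scam_type": "romance", "payout_country": "Country A"},
    {"scam_type": "romance", "payout_country": "Country B"},
    {"scam_type": "investment", "payout_country": "Country A"},
    {"scam_type": "romance", "payout_country": "Country A"},
    {"scam_type": "lottery", "payout_country": "Country C"},
]

# Trend by scam type: how often each type is reported.
by_type = Counter(r["scam_type"] for r in reports)

# Heat map input: how many reports name each country as the payout location.
by_country = Counter(r["payout_country"] for r in reports)

top_type, top_type_count = by_type.most_common(1)[0]
top_country, top_country_count = by_country.most_common(1)[0]

print(f"Most reported scam type: {top_type} ({top_type_count} reports)")
print(f"Most common payout location: {top_country} ({top_country_count} reports)")
```

The per-country counts are what a heat map would color. With real data the same counting would run over millions of records, usually inside a database or a distributed processing tool rather than a single script.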

The Tools

Technologically, the tools you will need to analyze big data depend on a few variables:

  • How is the data organized?
  • How big is big?
  • How complex is the data?

When we look at the organization of data, we are focusing not just on the structure and uniformity of the data, but also on its location. Is it spread over several servers, completely or partially in the cloud, or is it all in one place?

Obviously, uniformity makes data easier to compare and manipulate, but we don’t always have that luxury. It takes powerful and smart statistical tools to make sense of polymorphous or differently structured datasets.

As we have seen before, the complexity of the data can be another reason why we need special big data tools, even if the sheer volume is not that large.

Although big data tools are becoming available, many are still in the early stages of development, and not all of them are ready for intuitive use. It takes knowledge and familiarity to use them effectively.

Your Personal Data

When we go online, we leave a trail of data behind that can be used by marketers (and criminals) to profile us and our environment. This makes us predictable to a certain extent. Marketers love this type of predictability, as it enables them to figure out what they can sell us, how much of it, and at which price. If you’ve ever wondered how you saw an ad for vintage sunglasses on Facebook when you were only searching on Google, the answer is big data.

Scammers are also beginning to understand the value of big data, as they are developing automated tools to scan social media profiles and predict who will make the best targets.

Imagine a virtual assistant that retrieves travel arrangement information the moment you first consider a vacation. Hotels, flights, activities, food and drink: all could be listed to your liking, in your favorite locations, and in your price range in the blink of an eye. Some may find this scary; others would consider it convenient. However you feel, the virtual assistant is able to do this because of the big data it collects on you and your behavior online.

Scammers are testing this kind of data, which they buy from legitimate sources (though the sources do not know it will be used by scammers).

The Data-Driven Society

One of the major contributions of big data to our society will be through the Internet of Things (IoT). IoT represents the most direct link between the phys