Anomaly Detection in Telecommunications Using Complex Streaming Data | Whiteboard Walkthrough

In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains in detail how to use streaming IoT sensor data from handsets and devices, as well as cell tower data, to detect strange anomalies. He takes us from best practices for data architecture, including the advantages of multi-master writes with MapR Streams, through analysis of the telecom data using clustering methods to discover normal and anomalous behaviors.

The full video transcription follows:

Hi. I'm Ted Dunning from MapR. I'd like to talk a little bit about data processing in the context of telecom. In particular, what I'll be talking about is how to extend some of the ideas that we've talked about in other videos about anomaly detection to the particular case of telecom data. This is very different than some of the other data that we've been talking about. It's exciting, interesting, and different in both the quantity and the kind of data.

Let's talk a little bit about where the data we're talking about comes from. Here's kind of a schematic. We've got towers, both near and far. We have a handset, or a device, or a car traveling through the field of these towers. Now, at any given point, at some particular point, that's the circle there, the device is going to be receiving from multiple towers. Those are the red lines. It's going to see multiple towers. It's going to have some sort of propagation delay from those towers and a signal strength from each of those towers. But, the handset or device will typically be talking to one of the towers. It will send signal reception reports back to that tower. Of course, this tower's report is going here, and we'd like to figure out how to make sense of this data and discover things about both the static and the dynamic situation from all of the different devices that are wandering around these towers.

Let's talk about how to do that. The idea that I'm going to talk about today, the architecture, is that each of these towers, and I've drawn the towers just out here in a line to make the diagram simpler, each of the towers would have locally a very small data asset. Essentially, a cluster. Possibly a very small cluster that would have a MapR stream replica in it. Each of the towers has one of these streams. All of the data that the tower is capturing from these call detail reports about what we're interested in (today that is signal strength, tomorrow it might be something else) will be pushed into one of these simple streams. Now, these are smaller, but they are replicated then to, well I say galactic headquarters, but it really could just be a local point of presence that's not tower local. It could just be centralized across the towers, or one special tower might be in a facility that could support it.

The basic idea is that all of these small streams are replicated into the big stream. The big stream, then, is an amalgam of all of the data captured from all of the towers. The topics in question might be labeled by the tower in question or they might be labeled by the handset itself. In either case, we collect it all here. Many, many topics can reside in a single stream, and as long as each topic is updated in only one place at a time, we can do multi-master writes to MapR streams.
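As a rough illustration of why one-writer-per-topic makes the merge simple, here is a toy in-memory sketch. It does not use any MapR API; the `Stream` class, the `replicate` function, and the topic names are all hypothetical stand-ins. Because each topic is only ever appended to at one tower, copying new messages into the central stream never has to resolve a conflict.

```python
from collections import defaultdict


class Stream:
    """Toy stand-in for a stream: topic name -> ordered list of messages."""

    def __init__(self):
        self.topics = defaultdict(list)

    def append(self, topic, message):
        self.topics[topic].append(message)


def replicate(source, target, offsets):
    """Copy messages not yet replicated from source to target.

    offsets maps topic -> number of messages already copied, so repeated
    calls only move the new tail of each topic.
    """
    copied = 0
    for topic, messages in source.topics.items():
        start = offsets.get(topic, 0)
        for message in messages[start:]:
            target.append(topic, message)
            copied += 1
        offsets[topic] = len(messages)
    return copied


# Two tower-local streams, each the only writer of its own topic.
tower_a, tower_b, central = Stream(), Stream(), Stream()
offsets_a, offsets_b = {}, {}

tower_a.append("tower-a", {"device": "d1", "strength_dbm": -71})
tower_b.append("tower-b", {"device": "d2", "strength_dbm": -88})
replicate(tower_a, central, offsets_a)
replicate(tower_b, central, offsets_b)

# A later report at tower A is picked up by the next replication pass.
tower_a.append("tower-a", {"device": "d1", "strength_dbm": -69})
replicate(tower_a, central, offsets_a)
```

After these calls the central stream holds both topics, with the messages of each topic in the order the owning tower wrote them.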

We've got the physical situation. We can see where the data comes from, in terms of the physics of propagation and things like that. We can see the data architecture, how the data transports. Let's talk about how we actually analyze it. Now, the data itself is going to be fielded in some sense. The fields may be complex, but they still are somewhat notionally a table thing. We have our device or handset or some designator. We have the tower that is reporting the data to us. How did the data come to us via which tower? We have some sort of locational information. I've put them as an X and a Y there.

Now, of course, we don't really have an exact XY location. That's why I put the little asterisks up there. We might, instead, have some signal strength, estimated position, or propagation delay. It might be a GPS location. This field would be an excellent candidate for a complex data structure with alternative possibilities in there. We might not even represent location in two dimensions. It might be much more convenient to embed the signal strength measurements that we're getting into some high dimensional space where all of the infelicities of the reality of the physics and propagation can be normalized away by the structure of this high dimensional space. There's asterisks there, I don't want to go into it too much. There's asterisks.

Now, the other thing we get, of course, are tower and signal. These are another candidate for, a very good candidate for structuring. We might say, "Oh, let's just make that a list of objects," which we might have many of them. That's a great example where structured data might be less convenient than complex data. That's roughly the data that we're going to get. Now, looking at this, we can now group by the tower that's getting a report. Many handsets, many paths that the data will come to us, and many locations then. We will get tower, location, and signal strength reports from all of these different things.
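One way to picture a record like this is as a nested structure rather than a flat row. The sketch below is only an illustration: the class and field names (`Location`, `TowerSignal`, `SignalReport`, `strength_dbm`, and so on) are invented here, not taken from any real call detail record format. The location carries alternative representations, and the tower and signal observations are a list of objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Location:
    """Alternative ways a position might be known; any of them may be absent."""
    gps: Optional[Tuple[float, float]] = None           # (latitude, longitude)
    estimated_xy: Optional[Tuple[float, float]] = None  # estimated position
    propagation_delay_us: Optional[float] = None        # delay to serving tower


@dataclass
class TowerSignal:
    """One tower as seen by the device, with the strength it measured."""
    tower_id: str
    strength_dbm: float


@dataclass
class SignalReport:
    device_id: str
    reporting_tower: str   # the tower the report came to us through
    location: Location
    signals: List[TowerSignal] = field(default_factory=list)


report = SignalReport(
    device_id="handset-1",
    reporting_tower="tower-a",
    location=Location(estimated_xy=(120.0, 45.5)),
    signals=[TowerSignal("tower-a", -70.0), TowerSignal("tower-b", -95.0)],
)
```

A single report here mentions two towers even though it arrived through only one of them, which is what makes the later regrouping by observed tower possible.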

We might want to normalize away some of the characteristics of the device receiving it. Some handsets are not very good at receiving data. Some are very good at it. In any case, what we're going to wind up with is many, many reports. You can see how this handset traveled through the field. This one traveled this way. This one traveled across here. They're going to report many times about the signal strength of this one tower. This is not going to be just in the area where individual handsets have talked to this tower. For instance, this tower is far away, and we're going to report this way, but we still might have received a signal strength from that tower. Once we rearrange the data after receiving it in a large stream at the galactic headquarters, we can now draw this picture of all of the reports about a single tower, not from a single tower.
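The rearrangement from "reports that came through a tower" to "reports about a tower" can be sketched as a simple regrouping. The record layout below (dictionaries with `device`, `via`, `xy`, and `signals` keys) is a made-up example, not a real format.

```python
from collections import defaultdict

# Hypothetical reports: each arrived via one tower but lists every tower heard.
reports = [
    {"device": "d1", "via": "tower-a", "xy": (1.0, 2.0),
     "signals": [("tower-a", -70.0), ("tower-b", -95.0)]},
    {"device": "d2", "via": "tower-b", "xy": (4.0, 1.0),
     "signals": [("tower-b", -65.0), ("tower-a", -99.0)]},
]


def observations_by_tower(reports):
    """Group (location, strength, device) observations by the tower observed."""
    by_tower = defaultdict(list)
    for r in reports:
        for tower, strength in r["signals"]:
            by_tower[tower].append((r["xy"], strength, r["device"]))
    return by_tower


about = observations_by_tower(reports)
```

Here `about["tower-a"]` includes the weak reading from device `d2`, even though that device was talking to tower B at the time.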

Now, once we have these signal strengths, we can draw ... What would you call it? A contour map. An iso-strength. An isobel, sort of, map here. It's going to have a complex shape. If there's something here that is causing a shadow, we would see that in a disruption in the isobel curve.

Now, the cool question is, can we detect strange anomalies and changes in this structure?

This is a very complex set of data. It's very dynamic. Measurements here will be at a different time than measurements here, but can we actually do that? In particular, one of the tricks that we can use is we can take all of the measurements for all past time, cluster them physically by whatever locational information we have. We will get what would be called a Voronoi tiling. That's "Voronoi." A Russian dude from a long time ago. We might get this using K means clustering or something like that. It's a tiling of the space after we've clustered it into regions. Just based on the characteristics of K means clustering, these regions will tend, not exactly, to have roughly the same data volume in each one.
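To make the tiling idea concrete, here is a minimal sketch of plain k-means (Lloyd's algorithm) in pure Python. The function names, the fixed iteration count, and the sample points are all assumptions for illustration; a production system would more likely use a library implementation. Assigning a point to its nearest centroid is exactly what places it in a Voronoi cell.

```python
import random


def nearest(point, centroids):
    """Index of the closest centroid, i.e. the Voronoi cell the point falls in."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm over tuples of coordinates."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        cells = [[] for _ in range(k)]
        for p in points:
            cells[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cell; keep it if the cell is empty.
        centroids = [
            tuple(sum(axis) / len(cell) for axis in zip(*cell)) if cell else centroids[i]
            for i, cell in enumerate(cells)
        ]
    return centroids


# Hypothetical report locations forming two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
centroids = kmeans(points, k=2)
cell_of_new_report = nearest((9.0, 9.5), centroids)
```

The same code works unchanged if signal strength is appended as an extra coordinate, which is one way to cluster on geometry plus signal strength.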

Now, we can treat each one of these regions as a single entity. We can make these regions small if we have lots of data or large if we don't have a lot of data. We can make them small if we're willing to wait a long time to gather enough data. We can make them large if we want to respond very quickly. Meaning, that in a short time, we'll actually get enough data. We can even look at multiple resolutions at the same time. Coarse for fast response. Very fine for very detailed response. Okay.

Now, given that we have this tiling, given that we now have data, in particular, just the number of reports at a particular strength coming in for a particular tile or K means cluster, we can do something very exciting. This is when we can really do some serious anomaly detection. In another video, I talked about how to model the rate of things, like visits to a website. The idea then is if we have the rate of events coming in, historically, we could use the historical data. We could use things like diurnal variations, morning to evening to night, to predict what the rate should be right now. Then, with that rate, right now, we could look at the actual arriving events, and we can decide when there's an anomaly in the form of missing events. We can look at the time to the last report and decide when something has gone awry.
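The "time since the last report" idea can be sketched as follows, under the assumption (made here, not stated in the talk) that reports arrive roughly as a Poisson process. In that case the probability of seeing no report for t seconds at rate r is exp(-r t). The hourly rate table, the function names, and the threshold are invented for illustration.

```python
import math

# Hypothetical historical rates (reports per second) by hour of day,
# standing in for a model of diurnal variation.
historical_rate_by_hour = {3: 0.2, 12: 2.0, 18: 3.5}


def silence_probability(rate_per_s, seconds_since_last):
    """Probability of no events for this long, assuming Poisson arrivals."""
    return math.exp(-rate_per_s * seconds_since_last)


def is_anomalous_silence(hour, seconds_since_last, threshold=1e-3):
    """Flag a gap that would be very unlikely at the rate expected for this hour."""
    rate = historical_rate_by_hour[hour]
    return silence_probability(rate, seconds_since_last) < threshold


quiet_at_noon = is_anomalous_silence(12, 5.0)   # busy hour, 5 s of silence
quiet_at_night = is_anomalous_silence(3, 5.0)   # slow hour, 5 s of silence
```

The same five-second gap is flagged at noon but not at 3 a.m., because the expected rate differs by a factor of ten.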

Now, we have a different kind of data here, and it's really kind of exciting. What we have is not only the signal strength reports or the number of reports from a particular tower in this particular cell, we also have the rates and strengths from all of the neighboring cluster cells. Remember we clustered on either geometry or geometry plus signal strength. Neighboring cells should have very, very close association. I can predict what should be happening, say here, based on what is happening, and what has happened recently in the neighboring clusters to that. Even the not-so-neighboring clusters, they may provide information as well.

Using the neighboring information and the recent historical information, I can get an expectation of what should be happening within that cell. That can then tell me whether or not I should be seeing a report at a particular moment or when it has been too long. If we cluster in geometry and signal strength at the same time, then if something comes in and disrupts transmission, then in a particular geometry the number of reports at a particular signal strength will suddenly change. Some strong signal strengths will stop being reported and weak ones will start being reported. These will have different cells in the K means clustering. We will see one cell drop suddenly, and another one rise. We should be able to detect that anomaly based on our expectation, and the observations, and the time until these events.
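One simple way to turn this into code is to predict a cell's count from its historical share of the neighbors' traffic, then ask how surprising the observed count is. Both the ratio-based predictor and the Poisson tail test below are assumptions chosen for the sketch, not the specific model from the talk, and all names and numbers are hypothetical.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean), summed term by term (k >= 0)."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return min(total, 1.0)


def expected_from_neighbors(history_cell, history_neighbors, current_neighbors):
    """Scale the neighbors' current count by this cell's historical share."""
    return (history_cell / history_neighbors) * current_neighbors


def count_anomaly(observed, expected, alpha=1e-3):
    """Return 'drop', 'rise', or None depending on which Poisson tail is tiny."""
    low_tail = poisson_cdf(observed, expected)  # P(X <= observed)
    high_tail = 1.0 - poisson_cdf(observed - 1, expected) if observed > 0 else 1.0
    if low_tail < alpha:
        return "drop"
    if high_tail < alpha:
        return "rise"
    return None


# Historically this cell saw a quarter as many reports as its neighbors combined.
expected = expected_from_neighbors(history_cell=1000, history_neighbors=4000,
                                   current_neighbors=200)
strong_signal_cell = count_anomaly(5, expected)    # far fewer reports than expected
weak_signal_cell = count_anomaly(100, expected)    # far more reports than expected
ordinary_cell = count_anomaly(52, expected)        # close to expectation
```

A blocked signal path would show up as a "drop" in the cell for strong readings at that location together with a "rise" in the cell for weak readings.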

What we've done here is we've taken the physical situation, the logical situations, the data, and then driven it all the way to a useful anomaly detection on this telecom data. Now, this is only one of the kinds of analyses that you can do, but one of the key factors here is we were able to drive the data into a single place where the analysis could be made in a somewhat global sense. This doesn't have to be absolutely global or galactic scale, it might be on a city scale, a neighborhood scale, but the key thing is we got data from multiple sources without major coding at the data sources. We have very stable data sources, very hands-off data collection, and very global and perspicuous data in the central site.

That's the benefit of having a good platform plus some cool algorithms. I hope this was fun. Thank you very much.

This blog was originally published on October 19, 2016.


This blog post was published May 17, 2017.