This blog post is an excerpt from the MapR Guide for Big Data in Federal Agencies and the Public Sector. To read more, you can download it here.
I should start by saying that MapR views the public sector as three separate segments: defense, intelligence, and civil services. This post, though, focuses on civil services and how government agencies can benefit from data analytics and machine learning technologies.
First, let’s compare the public and private sectors, purely from a data standpoint, and draw out some similarities between the two. Unlike private corporations, which exist primarily to generate profits, the public sector has a mandate to respond to its constituents, provide services, and maintain infrastructure. In doing so, it is fair to say that the public sector often trails the private sector in adopting the latest technologies. That said, the sheer volume of data the public sector handles is very similar to what the private sector experiences. So the public sector should benefit just as much from a cutting-edge data platform that helps to acquire, store, process, and analyze data as it happens. It is also important to note that the public sector, again very much like the private sector, deals with a wide variety of datasets – images, text, videos, voice, social media – that it has to store, process, analyze, and derive meaningful insights from.
The picture below provides an overview of a few key federal agencies and some high-level directives given to these agencies.
Note that the directives do not each correspond to a single agency or government institution, so the picture above is not meant to provide a one-to-one mapping; think of them as higher-level outcomes that, in many cases, require multiple agencies to work together.
Now let us take some of the directives, think about some use-cases that most of us have witnessed in recent years as they relate to governments and federal agencies, and try to break them down into the following 4 categories:
While there are many more use cases in the public sector, let us consider a few examples and the key requirements for building a robust solution to address them.
According to a UNODC (United Nations Office on Drugs and Crime) report, criminals laundered close to $1.6 trillion – or 2.7% of the global GDP – in 2009. The Financial Crimes Enforcement Network (FinCEN), a bureau of the U.S. Treasury Department, uses an analytics tool that can collect and analyze large numbers of bank transactions in order to combat domestic and international money laundering, terrorist financing, and other financial crimes. In addition, local agencies, such as police departments, can leverage advanced, real-time analytics to produce actionable intelligence for understanding criminal behavior, identifying crime and incident patterns, and uncovering location-based threats. Electronic surveillance requires video feed analysis, real-time theft detection, and alerting, all while protecting the privacy of citizens’ personal data. The MapR Data Platform provides capabilities such as machine learning and anomaly detection that help identify patterns and, in turn, reduce crime.
The McKinsey Global Institute estimates that applying big data strategies to better inform decision-making could generate up to $100 billion in value annually across the U.S. healthcare system by optimizing innovation, improving the efficiency of research and clinical trials, and building new tools for physicians, consumers, insurers, and regulators to meet the promise of more individualized approaches. In addition, researchers can use the MapR Data Platform to analyze a much larger patient population, decide what treatments are most effective, and identify patterns in side effects of drugs.
Public sector agencies need to have the ability to analyze traffic flow data on different roads or in different parts of the city. Reducing traffic congestion requires understanding busy routes, toll plazas, and volume distribution of traffic tickets handed out as well as encouraging citizens to use public transport – all of this while managing the cost of additional infrastructure required. The MapR Data Platform helps in aggregating real-time traffic data, gathered from road sensors, GPS devices, and video cameras, and provides traffic managers with the ability to identify potential problems in a public bus network. Adjusting public transportation routes in real time can prevent these potential traffic problems in dense urban areas.
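To make the aggregation step concrete, here is a minimal sketch in plain Python. It is not MapR code: the function name `congested_segments`, the reading format `(timestamp_s, segment_id, speed_kmh)`, the 5-minute window, and the 20 km/h threshold are all assumptions chosen for illustration. It groups sensor readings into fixed time windows per road segment and reports the windows whose average speed falls below the threshold.

```python
from collections import defaultdict
from statistics import mean


def congested_segments(readings, window_s=300, speed_threshold_kmh=20.0):
    """Return {(window_start, segment_id): avg_speed} for slow windows.

    `readings` is an iterable of (timestamp_s, segment_id, speed_kmh)
    tuples; this layout is a hypothetical one, not a MapR schema.
    """
    buckets = defaultdict(list)
    for ts, segment, speed in readings:
        # Align each reading to the start of its fixed-size window.
        window_start = int(ts // window_s) * window_s
        buckets[(window_start, segment)].append(speed)

    averages = {key: mean(speeds) for key, speeds in buckets.items()}
    # Keep only the windows whose average speed suggests congestion.
    return {k: v for k, v in averages.items() if v < speed_threshold_kmh}


# Made-up sample readings, for illustration only.
readings = [
    (0, "bridge", 12.0),
    (60, "bridge", 18.0),
    (120, "main_st", 45.0),
    (310, "bridge", 50.0),
]
slow = congested_segments(readings)
print(slow)
```

In a real deployment the readings would arrive continuously from a stream rather than a list, but the windowing logic would look much the same.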
City, state, and other local government entities have been swift to improve constituent services and control costs. Specific use cases include automated responses to FAQs that previously required human intervention; early identification of infectious diseases to better contain broader outbreaks; prediction of criminal activity to optimize police patrol presence; analysis of citizens’ feedback on city, state, and federal budgets as well as on ballot measures as they come up for a vote; and anticipation of water, electric grid, and gas infrastructure failures so that support staff can be kept on high alert.
Fraud is expensive and wasteful. The U.S. Department of Health and Human Services (HHS) is using predictive analytics techniques to spot anomalies in various entitlement programs. Meanwhile, the IRS is constantly looking to combat tax fraud by deploying big data analytics to comb through structured and unstructured data, identify suspicious behavior, and find fraudulent tax filings.
The government’s various disaster relief agencies are using big data analytics and visualization solutions to speed relief to victims so they can rebuild homes and businesses. These tools permit faster, more efficient analysis of loan applications and performance data, so funds are distributed more quickly while improved, real-time scrutiny of loan applications minimizes fraud.
It can be argued that every use case stated above is complex, given the volume and variety of data involved. If I had to break down the solutions, though, they would probably come down to the following essential technologies:
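As a rough illustration of what "spotting anomalies" can mean in practice, the sketch below flags unusually large or small amounts using a robust z-score based on the median absolute deviation (MAD). This is a generic statistical technique swapped in for illustration; it is not the method used by HHS, the IRS, or MapR, and the function name `flag_anomalies`, the 3.5 cutoff, and the sample amounts are assumptions.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose robust z-score exceeds `threshold`.

    Uses the modified z-score 0.6745 * |x - median| / MAD, which is less
    distorted by extreme values than a mean/standard-deviation z-score.
    """
    amounts = list(amounts)
    if not amounts:
        return []
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # At least half the values are identical; the score is undefined.
        return []
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Made-up claim amounts, for illustration only.
claims = [120.0, 95.0, 110.0, 105.0, 9800.0, 100.0, 115.0]
suspicious = flag_anomalies(claims)
print(suspicious)
```

A flagged index only marks a record for human review. Production fraud systems typically combine many such signals with trained models.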
A combination of MapR Database, MapR XD, MapR Analytics and ML Engines, MapR Event Store, and MapR-Edge – all part of the MapR Data Platform – is helping our customers implement many of these use cases and achieve real results for their constituents.
Lastly, the diagram below describes the architecture of the MapR Platform, tailored to use cases pertaining to federal agencies. This architecture shows how MapR helps civilian agencies achieve 3 important objectives: (i) augment existing applications with newer data-centric capabilities, (ii) build interactive analytical applications to improve decision-making, and (iii) future-proof their IT investments by being able to continually deploy newer intelligent applications – all of these simultaneously, by using the best platform for AI and analytics.