October 03, 2016 | BY Dr. Kirk Borne
One of the most significant characteristics of the evolving digital age is the convergence of technologies. That includes information management (structured and unstructured databases: e.g., NoSQL), data collection (big data), data storage (cloud and distributed data: e.g., Hadoop), data applications (analytics), knowledge discovery (data science), algorithms (machine learning), transparency (open data), computation (distributed data processing: e.g., MapReduce and Spark), sensors (Internet of Things: IoT), and API services (microservices, containerization). One more dimension in this pantheon of tech, which is the most important, is the human dimension. We see the human interaction with technology explicitly among the latest developments in digital marketing, behavioral analytics, user experience, customer experience, design thinking, cognitive computing, social analytics, and (last, but not least) citizen science.
Citizen Scientists are trained volunteers who work on authentic science projects with scientific researchers to answer real-world questions and to address real-world challenges (see discussion here). Citizen Science projects are popular in astronomy, medicine, botany, ecology, ocean science, meteorology, zoology, digital humanities, history, and much more. Check out (and participate) in the wonderful universe of projects at the Zooniverse (zooniverse.org) and at scistarter.com.
In the data science community, we have seen activities and projects that are similar to citizen science in that volunteers step forward to use their skills and knowledge to solve real-world problems and to address real-world challenges. Examples of this include Kaggle.com machine learning competitions and the Data Science Bowl (sponsored each year since 2014 by Booz Allen Hamilton, and hosted by Kaggle). These “citizen science” projects are not just for citizens who are untrained in scientific disciplines, but they are dominated by professional and/or deeply skilled data scientists, who volunteer their time and talents to help solve hard data challenges.
The convergence of data technologies is leading to the development of numerous “smart paradigms,” including smart highways, smart farms, smart grid, and smart cities, just to name a few. By combining this technology convergence (data science, IoT, sensors, services, open data) with a difficult societal challenge (air quality in urban areas) in conjunction with community engagement (volunteer citizen scientists, whether professional or non-professional), the U.S. Environmental Protection Agency (EPA) has knitted the complex fabric of smart people, smart technologies, and smart problems into a significant open competition: the EPA Smart City Air Challenge.
The EPA Smart City Air Challenge launched on August 30, 2016. The challenge is open for about 8 weeks. This is an unusually important and rare project that sits at that nexus of IoT, big data analytics, and citizen science. It calls upon clever design thinking at the intersection of sensor technologies, open data, data science, environment science, and social good.
Open data is fast becoming a standard for public institutions, encouraging partnerships between governments and their constituents. The EPA Smart City Air Challenge is a great positive step in that direction. By bringing together expertise across a variety of domains, we can hope to address and fix some hard social, environmental, energy, transportation, and sustainability challenges facing the current age. The challenge competition will bring forward best practices for managing big data at the community level. The challenge encourages communities to deploy hundreds of air quality sensors and to make their data public. The resulting data sets will help communities to understand real-time environmental quality, what are the driving factors in air quality change (including geographic features, urban features, and human factors), to assess which changes will lead to better outcomes (including social, mobile, transportation, energy use, education, human health, etc.), and to motivate those changes at the grass roots local community level.
The EPA Smart City Air Challenge encourages local governments to form partnerships with sensor manufacturers, data management companies, citizen scientists, data scientists, and others. Together, they’ll create strategies for collecting and using the data. EPA will award prizes of up to $40,000 to two communities based on their strategies, including their plans to share their data management methods so others can benefit. The prizes are intended to be seed money, so the partnerships are essential.
After receiving awards for their partnerships, strategies and designs, the two winning communities will have a year to start developing and implementing their solutions based on those winning designs. After that year, EPA will then evaluate the accomplishments and collaboration of the two winning communities. Based upon that evaluation, EPA may then award up to an additional $10,000 to each of the two winning communities.
The EPA Smart City Air Challenge is open until October 28, 2016. The competition is for developers and scientists, for data lovers and technology lovers, for startups and for established organizations, for society and for you. Join the competition now! For more information, visit the Smart City Air Challenge website.
Spread the word about EPA’s Smart City Air Challenge — big data, data science, and IoT for societal good in your communities!
We send special thanks to Ethan McMahon @mcmahoneth at the EPA for his contributions to this article and to the EPA Smart City Air Challenge.
For this challenge and others that you might have to tackle in your organization, look to MapR for its continuing support of the data analytics community with its Converged Data Platform and its suite of big data training and analytics services.
An earlier version of this post first appeared at rocketdatascience.org