CovidInsights: A Data Exploration System for Providing Covid-19 Insights by Similar Examples

To confront the difficult Covid-19 crisis, it is critical, for ordinary citizens and policy makers alike, to be able to get insightful information about the current disease situation at any geographical location, as well as insights into how the situation is likely to evolve in the near future. Such information can help people make smart decisions, be prepared, and stay strong during this difficult time. There have been extensive recent efforts in data collection and in developing exploratory tools for Covid-19 that provide statistics at different aggregation levels of location, such as country, state, or city. These include well-known dashboards such as Covid-19 map, CDC COVID Data Tracker, and NYTimes Coronavirus map. Alongside these, machine learning (ML), simulation, and mathematical models have been built to predict how the situation will evolve. These models are helpful, especially in showing high-level trends of the disease.

We propose CovidInsights, a system that is complementary and orthogonal to the above tools and is based on the key observation that not all places across the globe are at the same stage of the crisis. In other words, Covid-19 (like similar contagious diseases) reaches different places with a delay. For example, while places like China have passed their peaks and are experiencing more stable, early post-Covid-19 times, countries like India, Brazil, and the US seem to be experiencing their peaks, and many other countries are still in the earlier stages of the pandemic.

This time shift in the spread of the disease offers a unique opportunity to provide insightful information by searching and exploring the available data for "similar historical examples" to the current state of a given location. In other words, we propose to build a system that searches for specific locations at specific times in the past (i.e., (location, time) instances) that are similar to the current state of a queried location. For example, a result for the query location "Chicago" could be (Milan-Italy, 03/19/2020), meaning that on March 19 the city of Milan experienced a situation similar to what Chicago is experiencing now.
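The search described above can be illustrated with a minimal sketch. It is not the CovidInsights implementation: it assumes that the "state" of a location is simply its last few daily case counts, that similarity is plain Euclidean distance, and that a brute-force scan is acceptable. The function names, the window length, and the toy numbers are all hypothetical.

```python
from math import sqrt

def window_distance(a, b):
    """Euclidean distance between two equal-length windows of daily counts."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similar_examples(series, query_location, window=3, k=1):
    """Return the k historical (location, day_index, distance) instances
    whose window is closest to the latest window of query_location.

    `series` maps a location name to a list of daily counts (assumed format).
    `day_index` is the index of the last day of the matched window.
    """
    query = series[query_location][-window:]
    candidates = []
    for location, values in series.items():
        # Exclude the query's own latest window; it would trivially match itself.
        last_end = len(values) - 1 if location == query_location else len(values)
        for end in range(window, last_end + 1):
            dist = window_distance(query, values[end - window:end])
            candidates.append((dist, location, end - 1))
    candidates.sort()
    return [(loc, day, dist) for dist, loc, day in candidates[:k]]

# Toy, made-up daily counts (not real data).
toy_series = {
    "Chicago": [10, 20, 40, 80],
    "Milan": [12, 21, 39, 79, 150, 160],
    "Wuhan": [100, 90, 50, 20, 5, 1],
}
best = similar_examples(toy_series, "Chicago", window=3, k=1)
print(best)  # the closest historical (location, day_index, distance) instance
```

On this toy data the closest instance is a window of the Milan series, since it tracks Chicago's latest three values almost exactly. A real system would need a richer notion of state and an index instead of a full scan.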

We believe such information has multiple advantages. First, what happened in the near future of similar historical examples is already available in the data, i.e., as "real experiments" that have already happened in the real world. Using it does not require complex computational modeling or simulations. Such models depend on many factors (such as social distancing) that play a significant role in what will happen next, yet identifying, measuring, and collecting information about these factors is usually difficult. Second, complex computational models are often difficult for humans to interpret and understand, whereas similar historical examples are intuitive and well received by people, and can provide insights about the current situation as well as how it might evolve in the near future.
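To make the first advantage concrete: once a similar (location, time) instance is known, its "near future" is a direct lookup in the recorded data, with no model involved. The sketch below assumes the same hypothetical data layout as a mapping from location to a list of daily counts; the function name, the horizon parameter, and the numbers are illustrative only.

```python
def what_happened_next(history, location, day_index, horizon=7):
    """Return the recorded counts for up to `horizon` days after `day_index`.

    `history` maps a location name to a list of daily counts (assumed format).
    Fewer than `horizon` values are returned if the record ends earlier.
    """
    values = history[location]
    return values[day_index + 1:day_index + 1 + horizon]

# Toy, made-up daily counts (not real data).
toy_history = {"Milan": [12, 21, 39, 79, 150, 160]}
print(what_happened_next(toy_history, "Milan", 3, horizon=2))  # [150, 160]
```

The returned values describe what followed in the matched location; how far they carry over to the queried location depends on how similar the two situations really are.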

To the best of our knowledge, CovidInsights is the first system of its kind. It initiates a data exploration framework that provides novel insights through similar (location, time) examples for monitoring the different states of epidemics. CovidInsights is an online interactive tool that requires efficient and effective algorithmic solutions from different areas. It leverages techniques from the database and machine learning research communities, including k-nearest neighbor (k-NN) query processing and indexing, efficient processing of streaming data, and similarity computation over temporal (time-series) as well as non-temporal data.