Tackling COVID 19 Pandemic with Data Science and Machine Learning

Starting as a pneumonia outbreak in the city of Wuhan, China, in December 2019, COVID 19 has become a global pandemic, with over 1.1 million confirmed positive cases in India. By July 2020, more than 597,583 deaths had been reported worldwide, with around 27,428 deaths in India. In India, around 35,000 new positive cases are seen daily, while recoveries are yet to catch up with the pace of new cases. States like Maharashtra, Tamil Nadu, Delhi and Gujarat have shown the highest numbers of cases. With these heart-wrenching statistics, COVID 19 has taken the shape of a global disaster.

This crisis and the subsequent lockdowns have pushed social life further towards technology and the internet. This amalgamation of statistics and technology gives data science immense potential in disaster management and in creating scientifically robust post-corona policies. The crisis has brought with it huge volumes of data, including health data, corona-induced migration data, time series data and biological data on the virus. In fact, government apps like Aarogya Setu are a gold mine of geospatial data which, if used ethically, can lead to major breakthroughs in tackling the pandemic.

Data science, in a nutshell, is the collective term for collecting, cleaning and exploring data, followed by modelling, analysis and data visualization. Machine learning is one of the core techniques used in data science. Arthur Samuel described machine learning as the ability of a computer to learn without being explicitly programmed. Machine learning has already been successful in forecasting and predicting disasters including forest fires, cyclones and earthquakes. And as we speak, in the fast-paced world of data analytics, data scientists have already achieved new milestones with COVID 19 data.

One of the major challenges in fighting COVID 19 has been the limited number of tests conducted so far. Because the disease is new and medical research on it is limited, large-scale testing is expensive as well as logistically difficult. Instead of waiting for slow and expensive lab reports, machine learning models built on medical data could provide simpler, faster and cheaper ways to screen patients and support the diagnosis of COVID 19. Wearable technology such as the Apple Watch, Fitbit and Google Fit can provide data to identify patterns and detect symptoms at a preliminary stage. Tampa General Hospital in Florida was one of the first hospitals to use AI-based face scanning to monitor quarantined patients and detect feverish visitors. This is extremely helpful when dealing with hundreds or thousands of new patients daily. However, the most successful attempt in this regard has been by the Chinese tech giant Alibaba, which developed a machine learning model that diagnoses COVID 19 with a reported 96% accuracy. The research team used CT scan images and deep learning algorithms to predict the probability of COVID 19. In developing countries like India, which have poor health services and limited health workers, cost-effective preliminary diagnosis technology can go a long way in scaling up testing.
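To make the idea of pattern detection from wearable data concrete, here is a minimal sketch that flags days on which resting heart rate is unusually high compared with a person's own recent baseline. It is only an illustration: the function name flag_elevated_days, the 7-day baseline, the z-score threshold and the sample readings are all assumptions made for this example, and none of it comes from any vendor's device or from a validated screening method.

```python
from statistics import mean, stdev

def flag_elevated_days(resting_hr, baseline_days=7, z_threshold=2.0):
    """Return indices of days whose resting heart rate is unusually high
    relative to the preceding `baseline_days` days (hypothetical rule)."""
    flagged = []
    for day in range(baseline_days, len(resting_hr)):
        baseline = resting_hr[day - baseline_days:day]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: the z-score is undefined, so skip the day
        z = (resting_hr[day] - mu) / sigma
        if z > z_threshold:
            flagged.append(day)
    return flagged

# Hypothetical daily resting heart rates in beats per minute, not real data.
readings = [61, 60, 62, 61, 60, 62, 61, 61, 70, 72]
print(flag_elevated_days(readings))
```

A flag like this is at most a prompt for a proper test, not a diagnosis. A real system would need many more signals and clinical validation.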

Another important branch of machine learning is Natural Language Processing (NLP), which deals with the analysis of text data. Building chatbots is a major use of NLP in COVID 19 disaster response. Tech giant IBM is offering services based on NLP models to help governments and health organizations handle the large influx of coronavirus-related phone calls, messages and inquiries. Other tech companies like Microsoft have released chatbots that help people identify their best course of action, given their specific symptoms. Chatbots also help healthcare professionals automate their interaction with quarantined COVID patients without the need for physical contact. Through NLP-based models we can also process millions of text documents from online information platforms to detect fake news and reduce misinformation in the public domain.
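As a rough illustration of how text can be routed automatically, the sketch below trains a tiny naive Bayes classifier that sends an incoming message to a "testing" or "symptoms" queue. This is a swapped-in textbook technique, not the method used by IBM or Microsoft. The training messages, labels and function names are hypothetical, and a production chatbot would need far more data and a much richer model.

```python
import math
from collections import Counter

# Hypothetical training messages and queue labels, for illustration only.
EXAMPLES = [
    ("where can i get tested", "testing"),
    ("nearest testing centre timings", "testing"),
    ("i have fever and cough", "symptoms"),
    ("dry cough and sore throat", "symptoms"),
]

def train(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest log-probability for `text`."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    # Words never seen in training carry no information, so they are ignored.
    tokens = [t for t in text.lower().split() if t in vocab]
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_messages)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(EXAMPLES)
print(classify(model, "is there a testing centre near me"))
print(classify(model, "i have a cough"))
```

The same counting approach could in principle be pointed at labelled examples of reliable and unreliable claims, although detecting fake news well is a much harder problem than this toy suggests.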

Because the virus is new, it is critical for medical science to come up with a reliable vaccine or drug for its treatment at the earliest. Data science can expedite this process without sacrificing quality. ML models have identified molecular inhibitors for the Ebola virus and the H7N9 avian flu virus in the past, and similar methodologies can be applied to COVID 19, which is spreading rapidly. While medical researchers work on a new vaccine, data science techniques can also be used to identify existing drugs that may be effective in treating the disease. This can be done by building biomedical knowledge graphs from large datasets of existing drugs and vaccines.
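A knowledge graph can be pictured as a set of (subject, relation, object) facts. The sketch below, written under heavy simplifying assumptions, stores a few such facts and ranks drugs by how many disease-related processes their protein targets are linked to. Every name in it (drug_A, protein_X, the relation labels and the helper functions) is a placeholder invented for this example and does not refer to real drugs, proteins or findings.

```python
# Hypothetical (subject, relation, object) triples with placeholder names.
TRIPLES = [
    ("drug_A", "targets", "protein_X"),
    ("drug_B", "targets", "protein_Y"),
    ("drug_C", "targets", "protein_X"),
    ("drug_C", "targets", "protein_Z"),
    ("protein_X", "involved_in", "virus_replication"),
    ("protein_Z", "involved_in", "virus_entry"),
    ("protein_Y", "involved_in", "unrelated_pathway"),
]

def build_graph(triples):
    """Index triples as graph[subject][relation] -> set of objects."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, {}).setdefault(relation, set()).add(obj)
    return graph

def rank_candidates(graph, disease_processes):
    """Rank drugs by the number of disease processes reachable through
    a drug -> targets -> protein -> involved_in -> process path."""
    scores = {}
    for node, relations in graph.items():
        processes_hit = set()
        for protein in relations.get("targets", set()):
            linked = graph.get(protein, {}).get("involved_in", set())
            processes_hit |= linked & disease_processes
        if processes_hit:
            scores[node] = len(processes_hit)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

graph = build_graph(TRIPLES)
print(rank_candidates(graph, {"virus_replication", "virus_entry"}))
```

Real biomedical knowledge graphs hold millions of such facts and typically use learned models rather than simple path counts, but the underlying question is the same: which existing drugs are connected to the biology of the disease.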

In response to the outbreak, the Indian government and governments across the globe have initiated large-scale lockdowns and social distancing measures. However, the effective planning of these lockdowns and quarantine facilities has to be backed by data analytics. Machine learning researchers are now trying to quantify the effects of these measures in different parts of the world. A team led by researchers from MIT has leveraged COVID 19 data and an artificial neural network to determine the effectiveness of lockdowns and quarantine measures and to predict the spread of the virus. Models like this help governments decide on the extension of lockdowns and the gradual withdrawal of restrictions in a quick and robust manner.
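To show how a model can express the effect of a lockdown at all, here is a minimal sketch of a classic SIR (susceptible, infected, recovered) epidemic simulation in which the contact rate drops after a chosen day. It is a textbook compartmental model, not the neural network approach of the MIT team, and the parameter values (beta, gamma, population size, lockdown day, contact reduction) are illustrative assumptions, not estimates fitted to any real data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days,
                 lockdown_day=None, contact_reduction=0.0):
    """Simulate a discrete-time SIR model with one-day steps.

    From `lockdown_day` onwards the transmission rate `beta` is scaled by
    (1 - contact_reduction). Returns the number infected on each day.
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    infected_curve = [infected]
    for day in range(days):
        effective_beta = beta
        if lockdown_day is not None and day >= lockdown_day:
            effective_beta = beta * (1.0 - contact_reduction)
        new_infections = effective_beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        infected_curve.append(infected)
    return infected_curve

# Illustrative parameters only, not fitted to any real outbreak.
baseline = simulate_sir(1_000_000, 100, beta=0.3, gamma=0.1, days=200)
locked = simulate_sir(1_000_000, 100, beta=0.3, gamma=0.1, days=200,
                      lockdown_day=30, contact_reduction=0.5)
print("peak infected without lockdown:", round(max(baseline)))
print("peak infected with lockdown:   ", round(max(locked)))
```

Comparing the two peaks gives a feel for the kind of question such models answer. Data-driven approaches go further by learning quantities like the contact reduction from observed case counts instead of assuming them.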

On the other hand, statistical machine learning can be a tool to map vulnerability and identify containment zones. It can also be used to plan the allotment of resources for red zones and financial budgeting for the affected population. Advanced classification algorithms can streamline local administrative planning and automate parts of disaster mitigation. They can also monitor and forecast mass migration of people so that policy measures can be planned in advance. In times of great crisis, the supply chain for aid, food and other resources is delayed and disrupted, but these hurdles can be smoothed out with data science and automation.

COVID 19 has created a crisis unparalleled by any calamity seen in independent India. It has altered our social life and devastated our economy. In these difficult times, it is extremely necessary to switch to a scientific and technology-based approach to mitigate and adapt. As a nation we must use the tools of data science to become self-reliant and independent (atmanirbhar) in our response strategy. As we walk into the uncertain darkness of the post-corona world, a data-driven approach is our only light to socio-economic prosperity and our best hope of regaining normalcy.
