
The impact of covid-19: to predict the breaking point of the disease from big data by neural networks


by Woohyun SHIN
Paris School of Business - MSc Data Management 2001

PSB PARIS SCHOOL OF BUSINESS

Thesis

Submitted for the degree of

Master of Science

in Data Management

Presented by:
SHIN Woohyun

TOPIC

The impact of COVID-19: To predict the breaking point of the disease from Big Data by Neural Networks

Thesis Supervisor: OMRANI Nessrine

Academic Year: 2019-2020

Abstract

Weather data is generated every second and constitutes Big Data, which is difficult to process on an ordinary home computer. Supercomputers such as those used by the National Weather Service can handle it, but they are far more expensive than clusters that connect multiple computers in parallel. To overcome the limitations of a single machine, a cluster environment was built using the Big Data frameworks Hadoop and Spark. A deep learning prediction model was then built on temperature data to predict the point at which COVID-19 begins to decline. The model takes the maximum temperature recorded on each day over the past decade as its input and predicts the 2020 weather, in the hope of an early end to COVID-19. The predicted reduction point of COVID-19 was consistent with the actual breaking point.

Keywords : BigData, Hadoop, Spark, Deep-Learning



Table of Contents

I. INTRODUCTION

II. LITERATURE REVIEW

1. Coronavirus

1.1. SARS and MERS

1.2. Seasonal Virus

1.3. 2019 Novel Coronavirus

2. Weather Prediction Model

2.1. Numerical Weather Prediction (NWP)

2.2. Deep Learning Model

3. Applied Technologies

3.1. Hadoop

3.1.1. HDFS

3.1.2. MapReduce

3.1.3. Yarn

3.2. Spark

3.2.1. Low-Level API

3.2.2. Structured API

3.2.3. Machine Learning on Spark

3.3. Docker

3.3.1. Micro-Service

3.3.2. Image and Container

3.3.3. Networks

3.3.4. Kubernetes

4. Conclusion

III. METHOD AND DATA

1. Development Environment

2. Data

2.1. Daily COVID-19 confirmed cases

2.2. Max Temperature Data

3. EDA, Exploratory data analysis

4. Prediction Modeling

4.1. Data Preprocessing

4.2. Multi-Layer Perceptron

IV. Result

V. Discussion

VI. Conclusion


I. INTRODUCTION

Coronavirus, which began to infect humans through bats, has appeared in various forms since 2003. The successive coronaviruses, which emerged as SARS in 2003, MERS in 2012 and COVID-19 in 2019, have had social and economic consequences as well as consequences for human health. COVID-19 in particular is a deadly situation, and the WHO has declared it a pandemic. Since the first confirmed case of COVID-19 appeared in Wuhan, China on December 8, 2019, data from the World Health Organization (WHO) show that 2,314,621 confirmed cases had been reported worldwide by April 20, 2020. Of these, 157,847 people died, a fatality rate of around 7 percent [19]. Fortunately, in some countries with a well-regarded medical system, the number of new confirmed cases has stayed in double digits each day and continues to decline. However, COVID-19 has set back global economic and social infrastructure, and a tentative recovery period is expected, as after the 2008 global financial crisis. Economically, the U.S. stock market plunged due to the outbreak of the new coronavirus, and massive unemployment followed within a week. In Europe, many countries that share borders, from Italy to Spain, France and Germany, closed them because of COVID-19. This has slowed this year's growth in many countries tied to the euro.
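The fatality rate quoted above follows directly from the two WHO figures. As a minimal sketch (the function name `case_fatality_rate` is an illustrative choice, not something from the thesis), the arithmetic can be checked as follows. This is a crude ratio of deaths to confirmed cases and is not adjusted for reporting delays or undetected infections.

```python
# Figures quoted in the text (WHO, as of April 20, 2020).
confirmed_cases = 2_314_621
deaths = 157_847


def case_fatality_rate(deaths: int, confirmed: int) -> float:
    """Return deaths as a percentage of confirmed cases (crude ratio)."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return 100.0 * deaths / confirmed


cfr = case_fatality_rate(deaths, confirmed_cases)
print(f"Crude case fatality rate: {cfr:.1f}%")  # prints 6.8%
```

The result, about 6.8 percent, matches the "around 7 percent" stated in the text.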

Despite this disastrous reality, I believe that these negative effects will create many new opportunities in the long term. Historically, many developments and advances have been driven by the constraints of their time. For example, World War II brought many advances in the communication technologies we use today, and the commercialization of penicillin was one of the medical advances that changed the world. We are now adopting and developing technologies that already existed but were rarely used, to fit the times. Among them are remote education and telecommuting, online grocery shopping, and food delivery. These technologies existed before, but there was little felt need for them, and they were not mature enough to be commercialized. Now they are growing and adapting to the new environment created by the current situation. Therefore, through neural network learning based on temperature and humidity, we will predict when the spread of COVID-19 will end. By predicting this timing, we try to seize the economic opportunities that the new social culture brought about by COVID-19 will create.

I conducted this research to answer the question of when we will be free from the coronavirus, and when economic recovery will begin.

First, I will discuss what the coronavirus is and how the weather can be predicted. I will also discuss the technologies needed to make climate predictions. These technologies and data will then be used to predict the climate and to find the point in time at which the coronavirus will begin to decline.


II. LITERATURE REVIEW

In a 2015 TED talk, Bill Gates, co-founder of Microsoft and co-chair of the Bill & Melinda Gates Foundation, warned of widespread infection and economic decline caused by a virus. He argued that such a disease could kill more than 10 million people within a few decades, and that infection could spread anywhere, not just on planes and in markets. The World Bank estimates that if we have a worldwide flu epidemic, global wealth will go down by over three trillion dollars and we'd have millions and millions of deaths [1].

1. CORONAVIRUS

Coronavirus (CoV) is an RNA virus with a genome size of 27 to 32 kb that can infect humans and various animals [8]. There are four genera of coronavirus (Alpha, Beta, Gamma, Delta). Alpha and Beta can infect both humans and animals, while Gamma and Delta infect animals. So far, a total of six types of human-infecting coronavirus have been known: types that cause colds (229E, OC43, NL63, HKU1) and types that can cause severe pneumonia (SARS-CoV, MERS-CoV) [8].

1.1. SARS and MERS

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which emerged in China at the end of 2019, is one of the coronaviruses that originated in bats. Phylogenetic analysis revealed that SARS-CoV-2 is 79% similar to SARS-CoV, which occurred in China in 2003, and 50% similar to MERS-CoV [4].
