# COVID-Net and COVIDx Dataset
<p align="center">
	<img src="assets/covid-2p-rca.png" alt="photo not available" width="70%" height="70%">
	<br>
	<em>Example chest radiography images of COVID-19 cases from 2 different patients and their associated critical factors (highlighted in red) as identified by GSInquire.</em>
</p>

[Linda Wang and Alexander Wong, "COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images", 2020.](https://arxiv.org/abs/2003.09871)

The COVID-19 pandemic continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, with one of the key screening approaches being radiological imaging using chest radiography. It was found in early studies that patients present abnormalities in chest radiography images that are characteristic of those infected with COVID-19. Motivated by this, a number of artificial intelligence (AI) systems based on deep learning have been proposed, and results have been shown to be quite promising in terms of accuracy in detecting patients infected with COVID-19 using chest radiography images. However, to the best of the authors' knowledge, these developed AI systems have been closed source and unavailable to the research community for deeper understanding and extension, and unavailable for public access and use. Therefore, in this study we introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest radiography images that is open source and available to the general public. We also describe the chest radiography dataset leveraged to train COVID-Net, which we will refer to as COVIDx and which is comprised of 5941 posteroanterior chest radiography images across 2839 patient cases from two open access data repositories. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt to gain deeper insights into critical factors associated with COVID-19 cases, which can aid clinicians in improved screening.
COVID-Net is by no means a production-ready solution; the hope is that the open access COVID-Net, along with the description of how the open source COVIDx dataset is constructed, will be leveraged and built upon by researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases and to accelerate treatment of those who need it the most.

If you would like to contribute COVID-19 x-ray images, please contact us at linda.wang513@gmail.com and a28wong@uwaterloo.ca or alex@darwinai.ca. Let's all work together to stop the spread of COVID-19!

If you are a researcher or healthcare worker and would like access to the GSInquire tool to interpret COVID-Net results on your own data or existing data, please reach out to a28wong@uwaterloo.ca or alex@darwinai.ca.

Our desire is to encourage broad adoption of and contribution to this project. Accordingly, this project has been licensed under the GNU Affero General Public License 3.0. Please see the [license file](LICENSE.md) for terms. If you would like to discuss alternative licensing models, please reach out to us at linda.wang513@gmail.com and a28wong@uwaterloo.ca or alex@darwinai.ca.

If you find our work useful, you can cite our paper using:

```
@misc{wang2020covidnet,
    title={COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images},
    author={Linda Wang and Alexander Wong},
    year={2020},
    eprint={2003.09871},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

## Requirements
* TensorFlow (tested with 1.13 and 1.15)
* Keras 2.3.1
* OpenCV 4.2.0
* Python 3.6

## COVIDx Dataset
Currently, the COVIDx dataset is constructed from the following open source chest radiography datasets:
* https://github.com/ieee8023/covid-chestxray-dataset
* https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

We provide Jupyter notebooks for [creating the COVIDx dataset](create_COVIDx.ipynb) and the [preprocessing](preprocessing.ipynb) used for training.
This project is still a work in progress, and we will continue to update these files.

The COVIDx dataset can be downloaded [here](https://drive.google.com/file/d/1-T26bHP7MCwB8vWeKufjGmPKl8pesM1J/view?usp=sharing).
The preprocessed, ready-for-training COVIDx dataset can be downloaded [here](https://drive.google.com/file/d/1zCnmcMxSRZTqJywur7jCqZk0z__Mevxp/view?usp=sharing). Note: for the most up-to-date train/test data,
generate it using the [preprocessing notebook](preprocessing.ipynb).

Chest radiography images distribution
|  Type | Normal | Bacterial| Non-COVID19 Viral | COVID-19 Viral | Total |
|:-----:|:------:|:--------:|:-----------------:|:--------------:|:-----:|
| train |  1349  |   2540   |       1355        |        66      |  5310 |
|  test |   234  |    246   |        149        |        10      |   639 |

Patients distribution
|  Type | Normal | Bacterial | Non-COVID19 Viral| COVID-19 Viral | Total |
|:-----:|:------:|:---------:|:----------------:|:--------------:|:-----:|
| train |  1001  |     853   |        534       |       47       | 2435  |
|  test |   202  |      78   |        126       |        5       |  411  |

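The image counts above show how strongly the classes are imbalanced. As a rough illustration, the sketch below computes each class's share of the training images and a set of inverse-frequency class weights. The weighting heuristic and the names `IMAGE_COUNTS`, `class_shares`, and `inverse_frequency_weights` are assumptions made for this example; they are not taken from the COVID-Net training code.

```python
# Image counts copied from the "Chest radiography images distribution" table.
IMAGE_COUNTS = {
    "train": {"Normal": 1349, "Bacterial": 2540,
              "Non-COVID19 Viral": 1355, "COVID-19 Viral": 66},
    "test": {"Normal": 234, "Bacterial": 246,
             "Non-COVID19 Viral": 149, "COVID-19 Viral": 10},
}


def class_shares(counts):
    """Return each class's fraction of the total number of images."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def inverse_frequency_weights(counts):
    """Return total / (num_classes * count) per class.

    This is a common heuristic for imbalanced data, shown here as an
    assumption; it is not necessarily what COVID-Net training uses.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


if __name__ == "__main__":
    train = IMAGE_COUNTS["train"]
    shares = class_shares(train)
    weights = inverse_frequency_weights(train)
    for label in train:
        print("{:>18}: {:6.2%} of train images, weight {:.2f}".format(
            label, shares[label], weights[label]))
```

With these counts, COVID-19 images make up about 1.2% of the training set (66 of 5310), so per-class metrics are more informative here than overall accuracy.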
## Training and Evaluation
Coming soon. In the meantime, you can download COVID-Net and start training or running inference [here](https://drive.google.com/file/d/1FyfcAkRf-0gQ1nOrDJ9ccGVSZAO9VFP1/view?usp=sharing).

Input tensor (N, 224, 224, 3): `input_1:0`

Output tensor (N, 4): `dense_3/Softmax:0`

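The tensor names above are enough to run inference once the graph has been loaded into a TensorFlow 1.x session. The following is a minimal sketch, not the project's inference script. The helper names `check_batch_shape` and `predict` are hypothetical, and because this README does not state how the downloaded model is serialized or how images should be scaled, loading the graph and preprocessing the images are left to the caller.

```python
INPUT_TENSOR = "input_1:0"            # from this README: shape (N, 224, 224, 3)
OUTPUT_TENSOR = "dense_3/Softmax:0"   # from this README: shape (N, 4)


def check_batch_shape(shape):
    """Raise ValueError unless shape is (N, 224, 224, 3)."""
    if len(shape) != 4 or tuple(shape[1:]) != (224, 224, 3):
        raise ValueError(
            "expected (N, 224, 224, 3), got {}".format(tuple(shape)))


def predict(sess, batch):
    """Run a forward pass and return the (N, 4) softmax scores.

    sess  -- a tf.Session (TensorFlow 1.x) whose graph already holds COVID-Net
    batch -- array with a .shape attribute, already resized and preprocessed
    """
    check_batch_shape(batch.shape)
    # Session.run accepts tensor names both as fetches and as feed_dict keys.
    return sess.run(OUTPUT_TENSOR, feed_dict={INPUT_TENSOR: batch})
```

For a preprocessed NumPy array `batch`, `predict(sess, batch).argmax(axis=1)` gives the predicted class index for each image. The mapping from index to class label is not stated here, so check it against the training code before interpreting the output.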
## Results
<p align="center">
	<img src="assets/confusion.png" alt="photo not available" width="50%" height="50%">
	<br>
	<em>Confusion matrix for COVID-Net on the COVIDx test dataset.</em>
</p>

<div class="tg-wrap" align="center"><table class="tg">
  <tr>
    <th class="tg-7btt" colspan="4">Sensitivity (%)</th>
  </tr>
  <tr>
    <td class="tg-7btt">Normal</td>
    <td class="tg-7btt">Bacterial</td>
    <td class="tg-7btt">Non-COVID19 Viral</td>
    <td class="tg-7btt">COVID-19 Viral</td>
  </tr>
  <tr>
    <td class="tg-c3ow">73.9</td>
    <td class="tg-c3ow">93.1</td>
    <td class="tg-c3ow">81.9</td>
    <td class="tg-c3ow">100.0</td>
  </tr>
</table></div>

<div class="tg-wrap"><table class="tg">
  <tr>
    <th class="tg-7btt" colspan="4">Positive Predictive Value (%)</th>
  </tr>
  <tr>
    <td class="tg-7btt">Normal</td>
    <td class="tg-7btt">Bacterial</td>
    <td class="tg-7btt">Non-COVID19 Viral</td>
    <td class="tg-7btt">COVID-19 Viral</td>
  </tr>
  <tr>
    <td class="tg-c3ow">95.1</td>
    <td class="tg-c3ow">87.1</td>
    <td class="tg-c3ow">67.0</td>
    <td class="tg-c3ow">80.0</td>
  </tr>
</table></div>

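Sensitivity and positive predictive value (PPV) are both read off a confusion matrix: sensitivity divides each diagonal entry by its row total (all true cases of that class), and PPV divides it by its column total (all predictions of that class). The sketch below illustrates this with made-up counts. It assumes rows are true classes and columns are predicted classes; the function name `sensitivity_and_ppv` and the toy matrix are illustrative only and are not COVID-Net results.

```python
def sensitivity_and_ppv(confusion, labels):
    """Return {label: (sensitivity %, PPV %)} for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j (an assumed orientation).
    """
    results = {}
    for k, label in enumerate(labels):
        true_positives = confusion[k][k]
        actual = sum(confusion[k])                    # row total: TP + FN
        predicted = sum(row[k] for row in confusion)  # column total: TP + FP
        sens = 100.0 * true_positives / actual if actual else float("nan")
        ppv = 100.0 * true_positives / predicted if predicted else float("nan")
        results[label] = (sens, ppv)
    return results


if __name__ == "__main__":
    # Made-up two-class counts, for illustration only.
    toy_matrix = [[8, 2],
                  [1, 9]]
    for label, (sens, ppv) in sensitivity_and_ppv(
            toy_matrix, ["class A", "class B"]).items():
        print("{}: sensitivity {:.1f}%, PPV {:.1f}%".format(label, sens, ppv))
```

A class can reach 100% sensitivity and still have a lower PPV whenever samples from other classes are predicted as that class, which is why both tables are reported.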
## Pretrained Models
The COVID-Net TensorFlow model can be downloaded [here](https://drive.google.com/file/d/1FyfcAkRf-0gQ1nOrDJ9ccGVSZAO9VFP1/view?usp=sharing).