# COVID-Net Open Source Initiative

**Note: The COVID-Net models provided here are intended to be used as reference models that can be built upon and enhanced as new data becomes available. They are currently at a research stage and not yet intended as production-ready models (not meant for direct clinical diagnosis), and we are working continuously to improve them as new data becomes available. Please do not use COVID-Net for self-diagnosis; seek help from your local health authorities.**

<p align="center">
	<img src="assets/covidnet-small-exp.png" alt="photo not available" width="70%" height="70%">
	<br>
	<em>Example chest radiography images of COVID-19 cases from 2 different patients and their associated critical factors (highlighted in red) as identified by GSInquire.</em>
</p>

**Core COVID-Net team: Linda Wang, Alexander Wong, Zhong Qiu Lin, James Lee, Paul McInnis**

The COVID-19 pandemic continues to have a devastating effect on the health and well-being of the global population. A critical step in the fight against COVID-19 is effective screening of infected patients, with one of the key screening approaches being radiological imaging using chest radiography. Early studies found that patients infected with COVID-19 present characteristic abnormalities in chest radiography images. Motivated by this, a number of artificial intelligence (AI) systems based on deep learning have been proposed, and results have been shown to be quite promising in terms of accuracy in detecting patients infected with COVID-19 using chest radiography images. However, to the best of the authors' knowledge, these AI systems have been closed source and unavailable to the research community for deeper understanding and extension, and unavailable for public access and use. Therefore, in this study we introduce COVID-Net, a deep convolutional neural network design tailored for the detection of COVID-19 cases from chest radiography images that is open source and available to the general public. We also describe the chest radiography dataset leveraged to train COVID-Net, which we refer to as COVIDx and which comprises 16,756 chest radiography images across 13,645 patient cases from two open access data repositories. Furthermore, we investigate how COVID-Net makes predictions using an explainability method in an attempt to gain deeper insights into critical factors associated with COVID-19 cases, which can aid clinicians in improved screening.
**By no means a production-ready solution**, the hope is that the open access COVID-Net, along with the description of how the open source COVIDx dataset is constructed, will be leveraged and built upon by both researchers and citizen data scientists alike to accelerate the development of highly accurate yet practical deep learning solutions for detecting COVID-19 cases and accelerate treatment of those who need it the most.

For a detailed description of the methodology behind COVID-Net and a full description of the COVIDx dataset, please click [here](assets/COVID_Netv2.pdf).

Currently, the COVID-Net team is working on COVID-RiskNet, a deep neural network tailored for COVID-19 risk stratification.  Stay tuned as we make it available soon.

If you would like to contribute COVID-19 x-ray images, please contact us at linda.wang513@gmail.com and a28wong@uwaterloo.ca or alex@darwinai.ca. Let's all work together to stop the spread of COVID-19!

If you are a researcher or healthcare worker and would like access to the GSInquire tool to interpret COVID-Net results on your data or existing data, please reach out to a28wong@uwaterloo.ca or alex@darwinai.ca.

Our desire is to encourage broad adoption of and contribution to this project. Accordingly, this project has been licensed under the GNU Affero General Public License 3.0. Please see the [license file](LICENSE.md) for terms. If you would like to discuss alternative licensing models, please reach out to us at: linda.wang513@gmail.com and a28wong@uwaterloo.ca or alex@darwinai.ca.

If there are any technical questions, please contact:
* desmond.zq.lin@gmail.com
* paul@darwinai.ca
* jamesrenhoulee@gmail.com

If you find our work useful, you can cite our paper using:

```
@misc{wang2020covidnet,
    title={COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images},
    author={Linda Wang and Alexander Wong},
    year={2020},
    eprint={2003.09871},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

## Requirements

Install requirements using `pip install -r requirements.txt`

The main requirements are listed below:

* Tested with TensorFlow 1.13 and 1.15
* OpenCV 4.2.0
* Python 3.6
* PyDicom

## COVIDx Dataset

**Update: we have released the brand-new COVIDx dataset with 16,756 chest radiography images across 13,645 patient cases.**

The current COVIDx dataset is constructed from the following open source chest radiography datasets:
* https://github.com/ieee8023/covid-chestxray-dataset
* https://www.kaggle.com/c/rsna-pneumonia-detection-challenge

We especially thank the Radiological Society of North America and others involved in the RSNA Pneumonia Detection Challenge, and Dr. Joseph Paul Cohen and the team at MILA involved in the COVID-19 image data collection project, for making data available to the global community.

### Steps to generate the dataset

1. Download the datasets listed above:
   * `git clone https://github.com/ieee8023/covid-chestxray-dataset.git`
   * Go to this [link](https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data) to download the RSNA pneumonia dataset.
2. Create a `data` directory and, within it, create `train` and `test` directories.
3. Use [create\_COVIDx\_v2.ipynb](create_COVIDx_v2.ipynb) to combine the two datasets into COVIDx. Remember to update the file paths in the notebook.
4. We provide the train and test txt files with patientId, image path and label (normal, pneumonia or COVID-19). Each file is described below:
   * [train\_COVIDx.txt](train_COVIDx.txt): This file contains the samples used for training.
   * [test\_COVIDx.txt](test_COVIDx.txt): This file contains the samples used for testing.
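
As a quick sanity check after generating the dataset, the split files can be tallied per label. The sketch below assumes each line holds whitespace-separated fields in the order patientId, image path, label, as described in step 4; the exact delimiter is an assumption, so adjust the parsing if your files differ. The function name `count_labels` is illustrative and not part of this repository.

```python
from collections import Counter


def count_labels(lines):
    """Count samples per label in a COVIDx split file.

    Assumes each non-empty line looks like
    "<patientId> <image path> <label>", separated by whitespace
    (an assumption about the file layout).
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 3:
            raise ValueError(f"unexpected line format: {line!r}")
        # The label is taken as the last field, so an image path that
        # contains spaces does not break the count.
        counts[fields[-1]] += 1
    return counts
```

For example, `count_labels(open("train_COVIDx.txt"))` should reproduce the per-class training image counts reported for COVIDx, provided the layout assumption holds.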

### COVIDx data distribution

Chest radiography image distribution

|  Type | Normal | Pneumonia | COVID-19 | Total |
|:-----:|:------:|:---------:|:--------:|:-----:|
| train |  7966  |    8514   |    66    | 16546 |
|  test |   100  |     100   |    10    |   210 |

Patient distribution

|  Type | Normal | Pneumonia | COVID-19 |  Total |
|:-----:|:------:|:---------:|:--------:|:------:|
| train |  7966  |    5429   |    48    |  13443 |
|  test |   100  |      97   |     5    |    202 |
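
These counts show how imbalanced COVIDx is. The following sketch works through the arithmetic for the training images. The inverse-frequency weights are a common heuristic shown for illustration only; they are not necessarily the class weighting used to train COVID-Net.

```python
# Training image counts copied from the image distribution table.
train_counts = {"normal": 7966, "pneumonia": 8514, "COVID-19": 66}

total = sum(train_counts.values())  # 16546 training images

# Fraction of the training images that belongs to each class.
shares = {label: n / total for label, n in train_counts.items()}

# Inverse-frequency weights, total / (num_classes * count).
# Illustrative heuristic, NOT the authors' confirmed scheme.
weights = {
    label: total / (len(train_counts) * n)
    for label, n in train_counts.items()
}

for label in train_counts:
    print(f"{label:>9}: {shares[label]:.2%} of images, weight {weights[label]:.2f}")
```

COVID-19 images make up roughly 0.4% of the training set, so some form of rebalancing or class weighting is usually worth considering when training on this data.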

## Training and Evaluation
The network takes as input a batch of images of shape (N, 224, 224, 3) and outputs the softmax probabilities with shape (N, 3), where N is the batch size (the number of images in the batch).
If using the TF checkpoints, here are some useful tensors:

* input tensor: `input_1:0`
* output tensor: `dense_3/Softmax:0`
* label tensor: `dense_3_target:0`
* class weights tensor: `dense_3_sample_weights:0`
* loss tensor: `loss/mul:0`
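
A minimal sketch of how these tensors might be fetched from a checkpoint with the TensorFlow 1.x API, using the tensor names listed above. The helper name `load_covidnet_tensors` is hypothetical and not part of this repository. TensorFlow is imported inside the function so the snippet can be read and imported without it installed. By default only the input and output tensors are looked up, since an evaluation graph may not contain the training-related ones (an assumption).

```python
import os

# Tensor names as listed above.
TENSOR_NAMES = {
    "input": "input_1:0",
    "output": "dense_3/Softmax:0",
    "label": "dense_3_target:0",
    "class_weights": "dense_3_sample_weights:0",
    "loss": "loss/mul:0",
}


def load_covidnet_tensors(weights_dir, meta_name, ckpt_name,
                          keys=("input", "output")):
    """Restore a checkpoint and return (session, dict of tensors).

    Hypothetical helper written against the TensorFlow 1.x API.
    """
    import tensorflow as tf  # deferred: importing this module needs no TF

    sess = tf.Session()
    # Rebuild the graph from the meta file, then load the weights.
    saver = tf.train.import_meta_graph(os.path.join(weights_dir, meta_name))
    saver.restore(sess, os.path.join(weights_dir, ckpt_name))

    graph = tf.get_default_graph()
    tensors = {key: graph.get_tensor_by_name(TENSOR_NAMES[key]) for key in keys}
    return sess, tensors
```

With the checkpoint names used elsewhere in this README, a call could look like `sess, t = load_covidnet_tensors("models/COVID-Netv2", "model.meta_eval", "model-2069")`, followed by `sess.run(t["output"], feed_dict={t["input"]: batch})` where `batch` has shape (N, 224, 224, 3).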

### Steps for training
A TF training script for training from a pretrained model will be released soon.
<!--1. To train from scratch, `python train.py`
2. To train from an existing hdf5 file, `python train.py --checkpoint output/example/cp-0.hdf5`
3. For more options and information, `python train.py --help`
4. If you have a GenSynth account, to convert hdf5 file to TF checkpoints,
`python export_to_meta.py --weightspath output/example --weightspath cp-0.hdf5`-->  

### Steps for evaluation

1. We provide a TensorFlow evaluation script, [eval.py](eval.py).
2. Locate the TensorFlow checkpoint files.
3. To evaluate a TF checkpoint, run `python eval.py --weightspath models/COVID-Netv2 --metaname model.meta_eval --ckptname model-2069`
4. For more options and information, run `python eval.py --help`

### Steps for inference
**DISCLAIMER: Do not use this prediction for self-diagnosis. You should check with your local authorities for the latest advice on seeking medical assistance.**

1. Download a model from the [pretrained models section](#pretrained-models)
2. Locate the model files and the x-ray image to run inference on
3. To run inference, `python inference.py --weightspath models/COVID-Netv2 --metaname model.meta_eval --ckptname model-2069 --imagepath assets/ex-covid.jpeg`
4. For more options and information, `python inference.py --help`
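
For a single image, the model output is a vector of three softmax probabilities. The sketch below turns such a vector into a label. The class order (normal, pneumonia, COVID-19) is an assumption and should be checked against `inference.py` before relying on it; `interpret_prediction` is an illustrative name, not a function from this repository.

```python
# ASSUMED class order of the 3-way softmax output; verify in inference.py.
CLASS_NAMES = ("normal", "pneumonia", "COVID-19")


def interpret_prediction(probs, class_names=CLASS_NAMES):
    """Return (label, probability) for one softmax vector."""
    if len(probs) != len(class_names):
        raise ValueError("expected one probability per class")
    # Index of the largest probability.
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best], probs[best]
```

Under that assumed ordering, `interpret_prediction([0.05, 0.15, 0.80])` yields the COVID-19 label with probability 0.80. The disclaimer above applies to any such output.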

## Results
These are the final results for COVID-Net Small and COVID-Net Large.   

### COVID-Net Small
<p align="center">
	<img src="assets/cm-covidnet-small.png" alt="photo not available" width="50%" height="50%">
	<br>
	<em>Confusion matrix for COVID-Net Small on the COVIDx test dataset.</em>
</p>

<div class="tg-wrap" align="center"><table class="tg">
  <tr>
    <th class="tg-7btt" colspan="3">Sensitivity (%)</th>
  </tr>
  <tr>
    <td class="tg-7btt">Normal</td>
    <td class="tg-7btt">Pneumonia</td>
    <td class="tg-7btt">COVID-19</td>
  </tr>
  <tr>
    <td class="tg-c3ow">95.0</td>
    <td class="tg-c3ow">91.0</td>
    <td class="tg-c3ow">80.0</td>
  </tr>
</table></div>

<div class="tg-wrap"><table class="tg">
  <tr>
    <th class="tg-7btt" colspan="3">Positive Predictive Value (%)</th>
  </tr>
  <tr>
    <td class="tg-7btt">Normal</td>
    <td class="tg-7btt">Pneumonia</td>
    <td class="tg-7btt">COVID-19</td>
  </tr>
  <tr>
    <td class="tg-c3ow">91.3</td>
    <td class="tg-c3ow">93.8</td>
    <td class="tg-c3ow">88.9</td>
  </tr>
</table></div>

### COVID-Net Large
<p align="center">
	<img src="assets/cm-covidnet-large.png" alt="photo not available" width="50%" height="50%">
	<br>
	<em>Confusion matrix for COVID-Net Large on the COVIDx test dataset.</em>
</p>

<div class="tg-wrap" align="center"><table class="tg">
  <tr>
    <th class="tg-7btt" colspan="3">Sensitivity (%)</th>
  </tr>
  <tr>
    <td class="tg-7btt">Normal</td>
    <td class="tg-7btt">Pneumonia</td>
    <td class="tg-7btt">COVID-19</td>
  </tr>
  <tr>
    <td class="tg-c3ow">94.0</td>
    <td class="tg-c3ow">90.0</td>
    <td class="tg-c3ow">90.0</td>
  </tr>
</table></div>

<div class="tg-wrap"><table class="tg">
  <tr>
    <th class="tg-7btt" colspan="3">Positive Predictive Value (%)</th>
  </tr>
  <tr>
    <td class="tg-7btt">Normal</td>
    <td class="tg-7btt">Pneumonia</td>
    <td class="tg-7btt">COVID-19</td>
  </tr>
  <tr>
    <td class="tg-c3ow">90.4</td>
    <td class="tg-c3ow">93.8</td>
    <td class="tg-c3ow">90.0</td>
  </tr>
</table></div>
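
For reference, sensitivity and positive predictive value (PPV) can both be read off a confusion matrix: sensitivity divides each diagonal entry by its row total (all true cases of that class), and PPV divides it by its column total (all predictions of that class). The sketch below assumes rows are true classes and columns are predicted classes, and it uses a made-up toy matrix, not the COVID-Net results.

```python
def sensitivity_and_ppv(confusion):
    """Per-class sensitivity and PPV from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as
    class j (this row/column convention is an assumption).
    """
    sensitivity, ppv = [], []
    for k in range(len(confusion)):
        true_positives = confusion[k][k]
        actual_total = sum(confusion[k])  # row sum: true class-k samples
        predicted_total = sum(row[k] for row in confusion)  # column sum
        sensitivity.append(true_positives / actual_total)
        ppv.append(true_positives / predicted_total)
    return sensitivity, ppv


# Toy matrix with invented counts, for illustration only.
toy = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 4],
]
sens, ppv = sensitivity_and_ppv(toy)
print("sensitivity:", sens)
print("PPV:", ppv)
```

Assuming the reported figures were computed on the 210-image COVIDx test split, which contains 10 COVID-19 images, the 80.0% and 90.0% COVID-19 sensitivities correspond to 8 and 9 correctly detected cases respectively.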

## Pretrained Models

|  Type | COVID-19 Sensitivity (%) | # Params (M) | MACs (G) |        Model        |
|:-----:|:------------------------:|:------------:|:--------:|:-------------------:|
|  ckpt |           80.0           |     116.6    |   2.26   |[COVID-Net Small](https://drive.google.com/file/d/1xrxK9swFVlFI-WAYcccIgm0tt9RgawXD/view?usp=sharing)|
|  ckpt |           90.0           |     126.6    |   3.59   |[COVID-Net Large](https://drive.google.com/file/d/1djqWcxzRehtyJV9EQsppj1YdgsP2JRQy/view?usp=sharing)|