The National Institutes of Health (NIH) has made genomic data about COVID-19 publicly accessible in the cloud. The published dataset can help researchers understand the virus that causes COVID-19 by revealing differences among the genetic sequences sampled from infected patients. With this information, researchers can determine how quickly the virus is evolving and how patients react to it.
The Coronavirus Genome Sequence Dataset was created by the National Center for Biotechnology Information and consists of researcher-submitted data, including files in the normalised Sequence Read Archive (SRA) format. The SRA is a bioinformatics repository of DNA sequencing data. According to the NIH, the Coronavirus Genome Sequence Dataset contains more than 13,000 SRA files. The project is part of the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) initiative.