The public cloud has transformed the way software applications run. It gives businesses and institutions of all types on-demand access to computing, storage, and more. In genomics, it has given even the smallest lab access to the same infrastructure used by the largest research centers, and in many cases to even more. It has also made collaboration more seamless by enabling shared data sources and easier sharing of results. One point still up for debate is whether the cloud is secure enough for life science applications.
To start the debate, let’s use the evolution of Amazon Web Services (AWS) as an example of how far cloud security has come. It has now been 12 years since the launch of the AWS public cloud, one of the most popular cloud service providers. Over the course of 2006, AWS launched three core services: storage (S3) in March, followed by message queuing (SQS) and compute (EC2). Today, AWS comprises more than 90 services that span many applications. Access to all of these services is controlled through Identity and Access Management (IAM), which account administrators use to manage user credentials and permissions. This isn’t just an AWS story; every other cloud provider (e.g. Microsoft Azure, Google Cloud Platform) has a parallel story of how it has built its services around security. Security became an even larger priority for cloud providers serving life sciences users when the HIPAA Omnibus Rule took effect in 2013. Since then, cloud providers have developed quick deployment guides, published architecture recommendations, and trained third-party solution providers so that users have readily available information about maintaining a secure cloud.
As for the next advances in data security, we can only speculate, but new technologies such as blockchain (not to be confused with Bitcoin) may start to play a bigger role in genomics data security. Blockchain would provide better record keeping, since its core strength is being a ledger that cannot be manipulated. In this realm, patients themselves would have more visibility into, and control over, how and where their DNA data is being used. As a result, we could expect more individuals to voluntarily submit genomic and health data, as they would be able to see the effect of their contribution to breakthrough studies without worrying about data misuse. The Harvard Personal Genome Project started such an effort without blockchain, but contributions have not reached a large scale; from one study (1) we know that most people (86% in the study) fear misuse of their personal information. Genomics is already a Big Data problem, but answering many of the toughest questions in medicine will require even more data, and giving power to the individual will be key regardless of the technology being used.
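To make the "ledger that cannot be manipulated" idea concrete, here is a minimal sketch of the hash-chaining technique that underlies blockchains. It is not a full blockchain (there is no network or consensus), and it does not describe any real genomics platform. The `AccessLedger` class, the record fields, and the donor and study names are all hypothetical, chosen only for illustration.

```python
import hashlib
import json


def entry_hash(record: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AccessLedger:
    """Hypothetical append-only log of who accessed a donor's data."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> None:
        # Each new entry commits to the hash of the previous one.
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append({
            "record": dict(record),
            "prev": prev_hash,
            "hash": entry_hash(record, prev_hash),
        })

    def is_valid(self) -> bool:
        # Recompute every hash; any edited entry breaks the chain.
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != entry_hash(entry["record"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


# Illustrative data only: made-up donor and study identifiers.
ledger = AccessLedger()
ledger.append({"donor": "donor-001", "study": "study-A", "action": "read"})
ledger.append({"donor": "donor-001", "study": "study-B", "action": "read"})
```

Because each entry's hash covers the previous entry's hash, quietly editing an earlier record makes `is_valid()` return `False`. In a real blockchain, many independent parties hold copies of the chain, which is what makes such tampering impractical rather than merely detectable.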
With or without blockchain, DNA data will continue to grow at an exponential rate. Genomics researchers and other bioinformaticians should not fear the cloud but rather embrace it, as its scalability and flexibility open an accelerated path to insights.