Howard Vindin — https://en.wikipedia.org/wiki/Confocal_microscopy#/media/File:Depth_Coded_Phalloidin_Stained_Actin_Filaments_Cancer_Cell.png

The healthcare space is full of outdated, non-interoperable technology and the problems that come with it. The electronic medical record (EMR) systems that serve as its technical backbone can be decades old and are fraught with standardization and interoperability problems, and they pose security and operational risks that can compromise patient privacy and quality of care. This article examines possible applications of blockchain technology in the pursuit of global, secure, distributed, and decentralized patient health records, a goal the current healthcare system has yet to achieve.

The problem should first be examined on the basis of needs. The core focus of healthcare decision-making is the patient and the corresponding information: admit, discharge, and transfer information, lab test information, diagnostic information, and so on. This information carries specific requirements as well. To name the vital ones: there has to be robust storage for complex data (e.g. DICOM MRI data); there has to be unerring security (ideally quantum-resistant) to safeguard protected health information (PHI); access to the data has to be guaranteed on patient consent, on caregiver consent, or in emergency situations, conditional on nothing except network connectivity; and such access, both read and write, has to be rapid and persistent.

The basis for data models in healthcare has been worked on exhaustively, with dozens of formalized standards (e.g. the HL7 v2 data model, HL7 FHIR, ICD coding, SNOMED-CT nomenclature, and various ISO standards). Most of this data can be treated as part of an all-inclusive, subdivided medical record, which makes that record a strong candidate for a first-class object in a blockchain system built to accommodate healthcare data.

For the sake of explication, here is a very brief overview of the fundamentals of blockchain technology. A single, continuous blockchain is a sequence of data structures called “blocks”, each linked to its predecessor by a cryptographic hash, which together encode some kind of persistent data as a decentralized ledger. The first major implementation was Bitcoin, designed for the use case of currency, but the surrounding ecosystem has rapidly expanded to include a vast number of applications, notably shared file storage (Siacoin) and smart contracts and arbitrary code execution (Ethereum, etc.). The takeaway from the greater set of examples is that blockchain technology is a strong foundation for global, persistent, publicly available, censorship-resistant, and private technologies.
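
To make the "sequence of blocks" idea concrete, here is a minimal sketch in Python. It shows only the hash linking that makes history tamper-evident and leaves out consensus, mining, and networking entirely. The block layout and the function names (`block_hash`, `append_block`, `is_valid`) are my own illustrative assumptions, not the format of Bitcoin or any other real chain.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, data: str) -> None:
    """Append a block that commits to the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "data": data})


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain: list = []
for entry in ["genesis", "record A announced", "record A updated"]:
    append_block(chain, entry)

valid_before = is_valid(chain)
chain[1]["data"] = "tampered"   # rewrite history in the middle of the chain
valid_after = is_valid(chain)
print(valid_before, valid_after)   # True False
```

Changing an old block changes its hash, so the next block's `prev_hash` no longer matches. That is the property the rest of this article leans on.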

Developing a protocol model

Taking into account the requirements in the healthcare space, a scheme for implementing an EMR system on the blockchain can begin to be roughed out.

Let’s consider privacy and high-availability (HA) access to be the first major considerations. We are assuming that the medical record exists in some decentralized manner, probably with some percentage of full nodes archiving large numbers of patient records, enough to give several-times-over redundancy and to resist censorship attacks. The record must be securely encrypted and stored as an asset on the chain in order to meet the simultaneous and nearly contradictory requirements of public availability and patient privacy. The patient, however, cannot retain sole access to their record without sacrificing the possibility that it can be accessed in an emergency (with the possible exception of carrying a decryption key on their person), so some trust model must be developed to allow record sharing. The obvious solution seems to be a hybrid cipher as used in GnuPG and similar software: a symmetric key encrypts and decrypts the medical record itself, and asymmetric encryption wraps that symmetric key for each authorized party, allowing multiple parties to gain access to the same record. It’s worth noting that this incurs essentially zero overhead compared with healthcare systems that already use storage-level or database-level encryption.
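
The shape of that hybrid scheme can be sketched with nothing but the Python standard library. This is emphatically a toy. The symmetric cipher is a hash-based keystream and the key wrapping is plain Diffie-Hellman over a fixed prime, both swapped in only so the example runs without third-party packages. GnuPG uses vetted ciphers and adds integrity protection, and a real system should do the same. Every name here (`keystream_xor`, `wrap_key`, `unwrap_key`, the parties) is a hypothetical of mine.

```python
import hashlib
import secrets

# Toy parameters, for illustration only; do not use in a real system.
P = 2**255 - 19   # a well-known prime, used here as a Diffie-Hellman modulus
G = 2


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 counter keystream.
    Applying it twice with the same key returns the original data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)


def keypair() -> tuple:
    """Return (private, public) for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def wrap_key(record_key: bytes, recipient_public: int) -> tuple:
    """Encrypt the symmetric record key to one recipient's public key."""
    ephemeral_private, ephemeral_public = keypair()
    shared = pow(recipient_public, ephemeral_private, P)
    kek = hashlib.sha256(shared.to_bytes(32, "big")).digest()
    return ephemeral_public, keystream_xor(kek, record_key)


def unwrap_key(wrapped: tuple, private: int) -> bytes:
    """Recover the record key using the recipient's private key."""
    ephemeral_public, ciphertext = wrapped
    shared = pow(ephemeral_public, private, P)
    kek = hashlib.sha256(shared.to_bytes(32, "big")).digest()
    return keystream_xor(kek, ciphertext)


# One record, one symmetric key, one wrapped copy of that key per party.
record = b'{"allergies": ["penicillin"]}'
record_key = secrets.token_bytes(32)
encrypted_record = keystream_xor(record_key, record)

patient_priv, patient_pub = keypair()
hospital_priv, hospital_pub = keypair()
envelopes = {
    "patient": wrap_key(record_key, patient_pub),
    "hospital": wrap_key(record_key, hospital_pub),
}

hospital_key = unwrap_key(envelopes["hospital"], hospital_priv)
decrypted = keystream_xor(hospital_key, encrypted_record)
```

The record is encrypted once. Granting access to another party means adding one more small envelope, not re-encrypting the record.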

It should be clear at this point that the overhead of such a system, which keeps patient records in a secure format on the public internet, is limited to the time required to download an encrypted copy of the patient’s record and decrypt it. This is particularly fast if local availability and high-speed connectivity can be guaranteed. However, we have already run into some problems. First, it makes little sense to store the record itself on the blockchain, as patient records would be totally compromised if a flaw in the encryption algorithm were ever discovered. Second, data access is supposed to be rapid, and massive performance degradation would occur if record components with high data requirements had to be downloaded all at once.

So we have to consider two additional requirements. First, the record ideally should not be stored on-chain. It may be secure under a given encryption algorithm today but proven insecure tomorrow. Instead, the data should be stored off-chain, and only a summary produced by an irreversible, lossy hashing algorithm should be stored on-chain. The risk of hash collisions still exists, but it is an acceptable one: the worst case is the substitution of false data in the next-to-impossible event of a collision, versus the mass exposure of PHI if an encryption algorithm fails. Second, the data cannot be stored as a single record. It is wiser to treat a patient record as a graph of components (in the graph-theory sense), that is, a manifest of parts together with pointers inside the data structure to those parts.
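
As a rough sketch of those two requirements together, the snippet below builds a manifest of record parts, each identified by its SHA-256 digest, and derives a single commitment that could be published on-chain while the parts stay off-chain. The file names, the manifest layout, and the helper names are all assumptions of mine for illustration; no existing EMR or chain uses this exact format.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical record components as they might sit in off-chain storage
# (these would be encrypted in practice; placeholder bytes here).
components = {
    "demographics.json": b'{"blood_type": "O+"}',
    "labs/result-0001.txt": b"placeholder lab result bytes",
    "imaging/mri-0001.dcm": b"\x00" * 1024,
}

# The manifest lists the parts and points to each one by its hash.
manifest = {
    "version": 1,
    "parts": {
        path: {"sha256": digest(blob), "size": len(blob)}
        for path, blob in components.items()
    },
}
manifest_bytes = json.dumps(manifest, sort_keys=True).encode("utf-8")
on_chain_commitment = digest(manifest_bytes)   # the only value published


def verify_manifest(raw: bytes, commitment: str) -> bool:
    """Does an off-chain manifest match what was committed on-chain?"""
    return digest(raw) == commitment


def verify_part(path: str, blob: bytes, manifest: dict) -> bool:
    """Does a fetched part match the hash recorded in the manifest?"""
    entry = manifest["parts"].get(path)
    return entry is not None and entry["sha256"] == digest(blob)


# A reader can check one small part without downloading the MRI.
genuine = verify_part("demographics.json",
                      components["demographics.json"], manifest)
forged = verify_part("demographics.json", b'{"blood_type": "AB-"}', manifest)
print(genuine, forged)   # True False
```

Because the hash is one-way, publishing `on_chain_commitment` reveals nothing usable about the record, yet any substituted part or manifest fails verification.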

A seasoned programmer might recognize an obvious possibility at this point: this is what a filesystem is tasked with, specifically an encrypted filesystem. The implementation that comes to mind for me is EncFS, in that it allows rapid access to a description of the data while the data itself is encrypted on its own (there is no reason why one encrypted EncFS filesystem cannot contain another, and it is not difficult to see why this might be a desired property). A node hosting a patient record can rapidly mount the encrypted filesystem and serve files from it on request without having to decrypt the entire filesystem. The filesystem can also be made divisible, to allow sharding of a record if necessary, so long as files or segments thereof can be reliably reconstructed from the sharded components.

So if we have taken the patient records off-chain for the sake of extreme caution in security, what is the use of a blockchain in this application?

There are still quite a few benefits. First, a patient name, date of birth, and other similar information provide a high-quality means of searching patient records, and technologies in the vein of Elasticsearch can easily be leveraged off-chain for that search. The blockchain can persist these lookup traits in tandem with identifiers for nodes on the network: the parties hosting the data, possibly the same ones that were entrusted with it for direct use, such as a hospital hosting patient records. An off-chain protocol can then be used to negotiate access to these records. The considerations usually made for PHI data sharing can likely be relaxed when the requester holds a valid decryption key, which immediately establishes trust.

For a trusted data host with data access, in brief, a secure connection would be made (over SSL/TLS, for example), and the requester would provide a patient identifier and sufficient information to establish trust. That information would be cryptographic proof of consent from the patient, or from a provider previously established on-chain as having access to the patient’s records or some component of them. Hypothetically, there could also be some sort of emergency override system in which enough verifiable affected parties collaborate to open a patient record when trust cannot otherwise be obtained. It should be noted that this is a can of worms, and in most cases likely incompatible with cryptographic enforcement of consent to record access.
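
I have not settled on a mechanism for the emergency override. One well-known candidate for "enough parties must collaborate" is threshold secret sharing, and the sketch below uses Shamir's scheme as a stand-in. That choice, the field prime, the 3-of-5 split, and the function names are my assumptions for illustration. It does nothing to resolve the consent problems noted above.

```python
import secrets

PRIME = 2**127 - 1   # a Mersenne prime; the field for this toy example


def split_secret(secret: int, threshold: int, shares: int) -> list:
    """Split `secret` into points on a random polynomial of degree
    threshold - 1 whose constant term is the secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME)
                         for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):   # Horner evaluation of the polynomial
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points


def recover_secret(points: list) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# An override key split among five parties; any three can rebuild it.
override_key = secrets.randbelow(PRIME)
shares = split_secret(override_key, threshold=3, shares=5)
rebuilt_a = recover_secret(shares[:3])
rebuilt_b = recover_secret([shares[0], shares[2], shares[4]])
```

Any three of the five shares reproduce `override_key`, while fewer than three reveal nothing about it. Who should hold the shares, and how their use is audited, is exactly the can of worms.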

For a non-trusted host with data access, we’re probably assuming patient records are being shared either on-chain or, more likely, on a side chain or similar, unless we’re employing a network-of-trust approach to try to minimize possible exposure of patient records due to failure of an encryption algorithm. It is vital to note that this general approach achieves poorer security, simply because of the lingering possibility of encryption algorithm failure (the search space for an encryption algorithm is pretty much always beyond human calculation, until someone figures out that it isn’t). The negotiation is probably less stringent in this case: we probably have the patient’s manifest as one item, accessed with a single, secure symmetric key that has previously been shared with the person attempting to access the record. The manifest itself would then probably point to other objects stored on-chain, essentially like a graph database. For the sake of patient privacy, this approach would probably work best if only the patient manifest were publicly associated with any information identifying the patient.

For an off-chain solution, a change to the filesystem (that is, to the patient’s records) attempted by someone trusted for those records, and permitted by the given EMR blockchain, can be implemented as an immediate change on a shared, SSH-accessible filesystem, likely backed by an in-memory store, with strict access requirements. The change itself can be described as a sequence of hashes over the filesystem, calling to mind the change tracking of systems like Git. The hashes of those changes can be broadcast on-chain and used as the single source of truth for who holds the most up-to-date records for a patient. Consider the importance of, say, being able to publicly and securely announce that the patient behind a unique, universal identifier has an allergy to some antibiotics, or is diabetic. Or, to be more accurate, of publicly denoting that a change in their record has occurred, so that interested parties can access it. At this point it should be clear that multi-signature techniques and approaches similar to the Lightning/Raiden/etc. systems can be employed for off-chain negotiation and on-chain record keeping, namely of patient identification information, record change hashes, access logs, and other information as desired.
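
A small sketch of that change tracking follows. A plain Python list stands in for the chain, and each announcement carries only identifiers and hashes, never record content. The entry fields (`parent_hash`, `new_hash`, `author`), the patient identifier, and the function names are hypothetical choices of mine, loosely modeled on how Git commits point at their parents.

```python
import hashlib
import json
from typing import Optional


def snapshot_hash(files: dict) -> str:
    """Hash a {path: bytes} tree: hash each file, then the sorted listing."""
    listing = sorted((path, hashlib.sha256(blob).hexdigest())
                     for path, blob in files.items())
    return hashlib.sha256(json.dumps(listing).encode("utf-8")).hexdigest()


def head_hash(ledger: list, patient_id: str) -> Optional[str]:
    """Latest announced snapshot hash for a patient, if any."""
    return next((e["new_hash"] for e in reversed(ledger)
                 if e["patient_id"] == patient_id), None)


def announce_change(ledger: list, patient_id: str,
                    files: dict, author: str) -> dict:
    """Append a change notice that links back to the previous snapshot."""
    entry = {
        "patient_id": patient_id,
        "parent_hash": head_hash(ledger, patient_id),
        "new_hash": snapshot_hash(files),
        "author": author,
    }
    ledger.append(entry)
    return entry


def is_up_to_date(ledger: list, patient_id: str, local_files: dict) -> bool:
    """A host is current if its tree hashes to the latest announcement."""
    return head_hash(ledger, patient_id) == snapshot_hash(local_files)


ledger: list = []   # stand-in for the on-chain log
v1 = {"allergies.json": b"[]"}
v2 = {"allergies.json": b'["penicillin"]'}
announce_change(ledger, "patient-123", v1, author="clinic-a")
announce_change(ledger, "patient-123", v2, author="hospital-b")

stale = is_up_to_date(ledger, "patient-123", v1)
current = is_up_to_date(ledger, "patient-123", v2)
print(stale, current)   # False True
```

An interested party watching the ledger learns that the record for `patient-123` changed and which host announced it, and can then negotiate access off-chain.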

Economics

For the most part, these considerations have been hashed out on existing cryptocurrencies: the balance between miners, who encode transactions into blocks, and full nodes and light nodes, which store complete or partial records of the blockchain. It is important, at least in the context of market economies, to consider that economic incentives are necessary both for computing the chain and for storing patient records, and that a transaction market is appropriate for both. Given the trust and access control requirements for healthcare data, however, incentives may also exist to promote some nodes having access to specific data. One example would be a specification, in on-chain negotiations or similar, that fees should be prioritized for some nodes over others, perhaps based on an off-chain network of trust.

Interoperability

Any such solution would almost certainly be required to comply with existing standards, particularly HL7, ICD, and SNOMED-CT. Each of those standards already exists in many versions and has been implemented hundreds of times. A blockchain EMR or meta-EMR system as described would have to be able to reliably interface with a variety of existing EMRs as a form of migration from the existing EMR layer. Having worked with several and authored one HL7 implementation, I can at least comment that it is possible to deal abstractly with even a protocol that complex in a way that allows for high code reuse, or even for a domain-specific language (DSL) that describes the differing rules governing translation of data between two given EMRs. In essence, this is the same sort of problem faced by linguistic translation engines, in that mappings and composable functions/transformations have to be described between the two systems. If the on-chain technology contained its own reference implementation of an EMR in order to standardize use (reminiscent of the EOS vs. Ethereum issue), translations would only have to be defined between the non-blockchain EMRs and the on-blockchain EMR. But, speaking of Ethereum, it is also possible to implement individual EMRs, access systems, and really any other subject mentioned in this article against a standard similar or equivalent to Ethereum’s ERC20 standard. In this case, we might see the standard for translation between these sub-EMR systems even beginning to take the shape of District0x or a similar technology! There are certainly advantages to that approach as well. One that comes to mind is that a patient record could be stored in an encrypted, distributable format that could be loaded into an arbitrary EMR by a trusted node.

There are massive ramifications, and accordingly possibilities, in the idea of implementing EMR-like systems on a computing-centric blockchain such as Ethereum. While fascinating, they are outside the scope of this article.
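
To give the translation idea some shape, here is a minimal sketch of the "mappings and composable functions" approach as a table of rules. The two EMRs, their field names, and the code tables are invented for illustration and do not come from HL7 or any real product. A real mapping would be far larger and would have to handle missing and malformed values.

```python
from typing import Any, Callable, Dict, List, Tuple

Transform = Callable[[Any], Any]


def compose(*steps: Transform) -> Transform:
    """Chain transformations left to right into a single function."""
    def run(value: Any) -> Any:
        for step in steps:
            value = step(value)
        return value
    return run


def ymd_to_iso(value: str) -> str:
    """Reformat '19800131' as '1980-01-31'."""
    return f"{value[0:4]}-{value[4:6]}-{value[6:8]}"


# Hypothetical code table for a made-up target EMR "B".
SEX_CODES = {"M": "male", "F": "female", "U": "unknown"}

# Each rule reads: (source field, target field, transformation).
A_TO_B: List[Tuple[str, str, Transform]] = [
    ("PatientName", "name", compose(str.strip, str.title)),
    ("DOB", "birth_date", compose(str.strip, ymd_to_iso)),
    ("Sex", "gender", compose(str.upper, SEX_CODES.get)),
]


def translate(record: Dict[str, Any],
              rules: List[Tuple[str, str, Transform]]) -> Dict[str, Any]:
    """Apply every rule whose source field is present in the record."""
    return {target: fn(record[source])
            for source, target, fn in rules if source in record}


source_record = {"PatientName": "  DOE, JANE ", "DOB": "19800131", "Sex": "f"}
translated = translate(source_record, A_TO_B)
print(translated)
# {'name': 'Doe, Jane', 'birth_date': '1980-01-31', 'gender': 'female'}
```

The rule table is data, so the same `translate` function serves any pair of systems. With a reference on-chain EMR, each legacy EMR would need only one table to it and one table from it.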

Last notes

This is a very basic, un-proofread brainstorming overview, based on my brief experience in the healthcare sector and with blockchain technology. I don’t dare attempt to write an implementation of a system like this, knowing that it would be years before it could be deployed with full confidence, and being fully aware of the massive momentum required to introduce and standardize new technologies in the healthcare space. This article was written with the sole purpose of jotting down some thoughts on how the blockchain could affect the healthcare landscape and how implementations might look.