Suppose you are building a data exchange system for a consortium of organizations on an enterprise blockchain network. Large documents will need to be stored and retrieved routinely. The powers that be have decided to use Hyperledger Fabric as the framework.
Hyperledger Fabric is an excellent framework for running an enterprise blockchain network and works for many use cases. However, storing large data sets on the ledger is not advisable. This limitation is common to blockchain frameworks in general.
Many industries adopt enterprise blockchain technology for features such as security, high availability, data immutability and decentralization. However, a blockchain ledger is best suited to small data sets: ledger data is replicated to peers across the network, so disseminating large payloads creates lag. This lag may not be acceptable in a production system. We need a solution to this problem.
Enter IPFS, the InterPlanetary File System. IPFS helps solve the file storage problem: large files can be stored and managed efficiently on IPFS.
In the remainder of this article we will discuss three setups, in increasing order of security.
Simple storage and retrieval on IPFS
The idea is to create an IPFS network that runs in parallel to the Fabric network. In a multi-organization blockchain network, we can install and run the IPFS processes on the Fabric peers. However, separating the Fabric peer and IPFS processes will give better performance.
The app communicates with both networks. Step one is to send the file to the IPFS network. If the app belongs to Org1, it uploads the file through one of Org1's nodes in the IPFS network. On success, the IPFS API application returns a response containing the hash-code of the uploaded file. This hash-code is what gets stored on the Fabric ledger. A further breakdown of the whole process is shown below.
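The upload-then-record flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual integration: `InMemoryIpfs` and `InMemoryLedger` are hypothetical in-memory stand-ins for an IPFS node and the Fabric ledger, the document ID `doc-001` is made up, and a plain SHA-256 hex digest stands in for the content identifier that a real IPFS node would return.

```python
import hashlib


class InMemoryIpfs:
    """Hypothetical stand-in for an IPFS node: a content-addressed store."""

    def __init__(self):
        self._blocks = {}

    def add(self, data: bytes) -> str:
        # The address is derived from the content itself.
        file_hash = hashlib.sha256(data).hexdigest()
        self._blocks[file_hash] = data
        return file_hash

    def cat(self, file_hash: str) -> bytes:
        data = self._blocks[file_hash]
        # Re-hash on read so that tampered content is detected.
        if hashlib.sha256(data).hexdigest() != file_hash:
            raise ValueError("content does not match its hash")
        return data


class InMemoryLedger:
    """Hypothetical stand-in for the Fabric ledger: a key-value state."""

    def __init__(self):
        self._state = {}

    def put_state(self, key: str, value: str) -> None:
        self._state[key] = value

    def get_state(self, key: str) -> str:
        return self._state[key]


def upload_document(ipfs, ledger, doc_id: str, data: bytes) -> str:
    # Step 1: the file goes to IPFS; step 2: only its hash goes on the ledger.
    file_hash = ipfs.add(data)
    ledger.put_state(doc_id, file_hash)
    return file_hash


def download_document(ipfs, ledger, doc_id: str) -> bytes:
    # Look up the hash on the ledger, then fetch the content from IPFS.
    return ipfs.cat(ledger.get_state(doc_id))


ipfs = InMemoryIpfs()
ledger = InMemoryLedger()
document = b"large report contents " * 1000
file_hash = upload_document(ipfs, ledger, "doc-001", document)
restored = download_document(ipfs, ledger, "doc-001")
```

The point of the design is visible in the sizes involved: the ledger holds a 64-character string regardless of how large the document is, while the document itself lives only in the content store.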
Encrypted storage and retrieval
To secure our file, we will encrypt it with a key before it is stored on IPFS. Each of the IPFS and Fabric API applications sits in front of its respective network, and the infrastructure for each network may be located in a different geography.
As a bonus, using IPFS avoids the redundancy that comes with storing large files on a ledger.
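To make the encryption step concrete, here is a rough sketch of encrypting a file with a symmetric key before it is handed to IPFS. The article does not name a cipher, so everything here is an assumption: Python's standard library has no AES, so the sketch swaps in a toy construction (an HMAC-SHA256 keystream plus an HMAC integrity tag) purely to show the round trip. A production system would use a vetted authenticated cipher such as AES-GCM from an audited library instead. The function names `encrypt` and `decrypt` are hypothetical.

```python
import hashlib
import hmac
import secrets


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate keys for encryption and for the integrity tag.
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    # Illustration only: HMAC-SHA256 over nonce || counter as a keystream.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hmac.new(enc_key, nonce + counter.to_bytes(8, "big"),
                            hashlib.sha256).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    # Layout of the stored blob: 16-byte nonce, ciphertext, 32-byte tag.
    return nonce + body + tag


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted file")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = secrets.token_bytes(32)
document = b"confidential contract text"
ciphertext = encrypt(key, document)   # this blob is what would go to IPFS
recovered = decrypt(key, ciphertext)  # only possible with the right key
```

Because the ciphertext, not the plaintext, is what gets uploaded, the hash-code recorded on the ledger refers to the encrypted blob, and anyone who fetches the blob without the key sees only unreadable bytes.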
Authorization scheme with encrypted storage and retrieval
What about granting access to only a subset of the organizations in a Fabric channel?
Data can still be stored privately on the ledger while being shared with an arbitrary number of organizations. Organizations can be grouped into sets, called collections in Fabric.
Therefore, on the IPFS network, access to file data should depend on whether a node from an organization is permitted to see the file's hash-code on the ledger.
To achieve this, we devised a process in which files are encrypted with a symmetric key that is stored on the ledger along with the file hash-code.
Let's say that, for a successfully uploaded file, the IPFS API sends a response containing the hash-code of the file and the encryption key. This data is then passed to the Fabric API, which invokes the chaincode to store it on the ledger.
The above diagram shows how the hash-code and key are stored on the ledger. When the user needs to view a file, she will need to fetch both the hash-code and the key. Only organizations with the encryption key are able to view the content of the file. This ensures that the rules imposed by the Fabric collection configuration also apply on the IPFS network. We will try to go into more detail about Hyperledger Fabric and IPFS in upcoming articles.
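The authorization scheme described in this section can be sketched as follows. This is a simplified model under stated assumptions, not Fabric's actual API: `PrivateCollection` is a hypothetical stand-in for a Fabric collection that checks membership on both reads and writes, the collection name, the organization names and `doc-001` are made up, a dictionary stands in for the IPFS network, and random bytes stand in for the already-encrypted file.

```python
import hashlib
import secrets


class PrivateCollection:
    """Hypothetical stand-in for a Fabric collection with a member list."""

    def __init__(self, name, member_orgs):
        self.name = name
        self._members = set(member_orgs)
        self._data = {}

    def put(self, org, key, value):
        self._require_member(org)
        self._data[key] = value

    def get(self, org, key):
        self._require_member(org)
        return self._data[key]

    def _require_member(self, org):
        if org not in self._members:
            raise PermissionError(f"{org} is not a member of {self.name}")


# Hypothetical setup: Org1 and Org2 are in the collection, Org3 is not.
ipfs_store = {}
collection = PrivateCollection("org1-org2-docs", ["Org1", "Org2"])

# Org1 uploads an encrypted file (random bytes stand in for the ciphertext).
key = secrets.token_bytes(32)
encrypted_file = secrets.token_bytes(256)
file_hash = hashlib.sha256(encrypted_file).hexdigest()
ipfs_store[file_hash] = encrypted_file

# The hash-code and the key are stored together in the collection.
collection.put("Org1", "doc-001", {"hash": file_hash, "key": key.hex()})

# Org2 is a member, so it can read the record and locate the ciphertext.
record = collection.get("Org2", "doc-001")
fetched = ipfs_store[record["hash"]]

# Org3 is outside the collection, so it learns neither the hash nor the key.
try:
    collection.get("Org3", "doc-001")
    org3_allowed = True
except PermissionError:
    org3_allowed = False
```

In this model the IPFS store itself enforces nothing. Access control comes entirely from who can read the ledger record: without it, an organization has no hash to look up and no key to decrypt with.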