This section details potential solutions to three problems: centralized data sources, centralized oracles, and the need to trust that those oracles return valid and accurate data. It has two main subsections, distributing sources and distributing oracles. It also covers the in-contract aggregation algorithm in detail and gives insight into the planned off-chain aggregation solution.

Distributing sources is a way to ensure data validity by obtaining the result of a query from multiple providers of the same (or similar) data. A single oracle may retrieve data from multiple data sources using the same query and aggregate the responses into one answer. As mentioned in Section 2 (page 6) of the white paper, the method of aggregation can be customized for the data being processed (for example, by discarding outliers). A related problem with distributed data sources arises when one data provider obtains and republishes information it received directly from another provider. In that case, the same invalid data can appear to come from two separate data sources.
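To make the idea of source aggregation with outlier discarding concrete, here is a minimal Python sketch. The function name `aggregate`, the 5% tolerance, and the choice of "drop values far from the median, then average" are all assumptions made for illustration; the white paper leaves the aggregation method open.

```python
from statistics import mean, median

def aggregate(values, max_dev=0.05):
    """Drop responses more than max_dev (relative) away from the median,
    then average what is left. The rule and the 5% default are illustrative."""
    if not values:
        raise ValueError("no source responses to aggregate")
    mid = median(values)
    kept = [v for v in values if abs(v - mid) <= max_dev * abs(mid)]
    # If every value was filtered out (e.g. the median is 0), fall back to the median.
    return mean(kept) if kept else mid

# Four hypothetical sources answering the same query; one is far off.
responses = [100.0, 101.0, 99.0, 250.0]
answer = aggregate(responses)
```

Note that a filter like this does not help when two sources share the same upstream data, since the copied value then looks like agreement rather than an outlier.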

Distributing oracles is the major advantage of the ChainLink network. It is a distributed system in which multiple oracles (the term “nodes” can be used interchangeably within the ChainLink network) contact multiple data sources (some overlapping, some distinct) and together return a single answer.

In-contract aggregation will be used with the initial implementation of the ChainLink network. The aggregated answers from each node will be run through a smart contract, which will compute the final answer A to return to the user-created smart contract. In-contract aggregation is valuable because it is simple, trustworthy, and flexible. The smart contract is available on the public blockchain and may be audited for errors.

However, in-contract aggregation is vulnerable to freeloading, where an oracle simply copies another oracle's answer and returns it so that it does not need to query a data source itself. Algorithm 1 (below) shows how freeloading can be eliminated by using a commit / reveal scheme. Each oracle first submits its answer to the contract in the form of a commitment, which prevents other nodes from seeing the answer itself. After enough commitment messages have been received by the aggregation smart contract (how long this takes can vary with block times and data provider response speed), nodes reveal their answers in a decommit message.
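A commitment can be sketched as a salted hash: the oracle publishes the hash first and the answer plus salt later. The Python below is a minimal illustration, assuming SHA-256 and a random 32-byte nonce; the names `commit` and `verify` are hypothetical and do not describe ChainLink's actual contract interface.

```python
import hashlib
import hmac
import secrets

def commit(answer: str, nonce: bytes) -> str:
    """Return a hash that binds the oracle to `answer` without revealing it."""
    return hashlib.sha256(nonce + answer.encode()).hexdigest()

def verify(commitment: str, answer: str, nonce: bytes) -> bool:
    """Check a revealed (answer, nonce) pair against the earlier commitment."""
    return hmac.compare_digest(commitment, commit(answer, nonce))

# Commit phase: only the digest is published.
nonce = secrets.token_bytes(32)
digest = commit("42.17", nonce)

# Reveal phase: the oracle publishes the answer and nonce.
honest_ok = verify(digest, "42.17", nonce)
changed_ok = verify(digest, "42.18", nonce)
```

The random nonce matters: without it, a freeloader could hash a small set of plausible answers and recognize which one was committed.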

Algorithm 1: In-Contract Aggregation

1. Wait until a query is received from the user-created smart contract

2. Select nodes from the set of all valid session IDs

3. Broadcast the query to the selected nodes

4. Wait until commit messages are received from enough oracles to account for faulty nodes

5. Broadcast that enough messages were received

6. Wait until decommitments are received from enough oracles to account for honest nodes

7. Send the answer along with the nodes that provided it to the user-created smart contract
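Steps 4 through 7 above can be simulated off-chain in a few lines of Python. This is a sketch under stated assumptions, not the contract itself: the class name `Aggregator`, the quorum sizes (n - f commits, f + 1 reveals, for n nodes of which at most f are faulty), and the use of the median as the final answer are all illustrative choices. Node selection and broadcasting (steps 2 and 3) are left out.

```python
import hashlib
from statistics import median

def commitment(value: float, nonce: bytes) -> str:
    """Salted SHA-256 commitment to a numeric answer (illustrative encoding)."""
    return hashlib.sha256(nonce + repr(value).encode()).hexdigest()

class Aggregator:
    """Toy stand-in for the aggregation contract. Quorum sizes are assumptions."""

    def __init__(self, n: int, f: int):
        self.commit_quorum = n - f   # step 4: "enough" commits
        self.reveal_quorum = f + 1   # step 6: "enough" decommitments
        self.commits = {}
        self.reveals = {}

    def reveal_open(self) -> bool:
        # Step 5: the reveal phase opens once enough commits have arrived.
        return len(self.commits) >= self.commit_quorum

    def submit_commit(self, node: str, digest: str) -> None:
        if self.reveal_open():
            raise RuntimeError("commit phase closed")  # too late to copy an answer
        self.commits.setdefault(node, digest)

    def submit_reveal(self, node: str, value: float, nonce: bytes) -> bool:
        if not self.reveal_open():
            raise RuntimeError("reveal phase not open")
        if self.commits.get(node) != commitment(value, nonce):
            return False  # reveal does not match the earlier commitment
        self.reveals[node] = value
        return True

    def answer(self):
        # Step 7: the answer plus the nodes that provided it, or None if not ready.
        if len(self.reveals) < self.reveal_quorum:
            return None
        return median(self.reveals.values()), sorted(self.reveals)

agg = Aggregator(n=4, f=1)
honest = {"a": (100.0, b"nonce-a"), "b": (101.0, b"nonce-b"), "c": (99.0, b"nonce-c")}
for node, (value, nonce) in honest.items():
    agg.submit_commit(node, commitment(value, nonce))

bad_reveal = agg.submit_reveal("a", 250.0, b"nonce-a")  # not what "a" committed to
for node in ("a", "b"):
    agg.submit_reveal(node, *honest[node])
result = agg.answer()
```

A node that waits to see revealed answers before committing is rejected, because commits are no longer accepted once the reveal phase has opened.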

Off-chain aggregation is the longer-term solution for the ChainLink network and addresses the key problem that persists with in-contract aggregation: transaction cost. As more oracles (or nodes) are used, the transaction cost increases. Off-chain aggregation allows ChainLink to send a single aggregated answer to the smart contract, so that no matter how many oracles are used, only one transaction is needed. Reaching consensus on an aggregated answer in ChainLink differs from Byzantine Fault Tolerant algorithms because the smart contract needs to receive an answer without participating in the aggregation itself. Consensus is accomplished by having each oracle generate a partial signature of its answer; together, these partial signatures form a single verifiable signed answer from the collection of answers provided by the participating oracles. Freeloading is prevented in off-chain aggregation by requiring each oracle to obtain its data from the data provider instead of from another oracle.
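The way partial signatures can combine into one value can be illustrated with a toy threshold scheme. The Python sketch below (Python 3.8+) is an assumption-heavy illustration, not ChainLink's construction: it uses Shamir-style key shares and Lagrange interpolation in the exponent over a deliberately tiny group, all names and parameters are hypothetical, and it is not secure. It only shows that any two of three oracles produce the same combined value that the full key would.

```python
import hashlib

P = 2039   # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 1019   # prime order of the subgroup generated by G
G = 4

def poly_eval(coeffs, x):
    """Evaluate a polynomial (constant term first) at x, modulo Q."""
    return sum(c * pow(x, k, Q) for k, c in enumerate(coeffs)) % Q

def hash_to_group(message: bytes) -> int:
    """Map a message to a non-identity subgroup element (toy construction)."""
    e = int.from_bytes(hashlib.sha256(message).digest(), "big") % (Q - 1) + 1
    return pow(G, e, P)

def partial_sign(share: int, message: bytes) -> int:
    """One oracle's partial signature: H(m) raised to its key share."""
    return pow(hash_to_group(message), share, P)

def lagrange_at_zero(i: int, ids) -> int:
    """Lagrange coefficient for participant i, evaluated at 0, modulo Q."""
    num, den = 1, 1
    for j in ids:
        if j != i:
            num = num * (-j) % Q
            den = den * (i - j) % Q
    return num * pow(den, -1, Q) % Q

def combine(partials: dict) -> int:
    """Combine partial signatures from the given participants into one value."""
    ids = list(partials)
    sig = 1
    for i, s in partials.items():
        sig = sig * pow(s, lagrange_at_zero(i, ids), P) % P
    return sig

# A hypothetical dealer splits the key among oracles 1..3 with threshold 2.
secret = 777
coeffs = [secret, 123]  # degree-1 polynomial, so any 2 shares suffice
shares = {i: poly_eval(coeffs, i) for i in (1, 2, 3)}

msg = b"aggregated answer: 100.0"
full = pow(hash_to_group(msg), secret, P)  # what the undivided key would produce
sig_12 = combine({i: partial_sign(shares[i], msg) for i in (1, 2)})
sig_23 = combine({i: partial_sign(shares[i], msg) for i in (2, 3)})
sig_1 = combine({1: partial_sign(shares[1], msg)})  # below threshold
```

Here `sig_12` and `sig_23` both equal `full`, while a single share does not. The comparison against `full` is for demonstration only; a real scheme would let the contract check the combined signature against a public key in one transaction.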

Section 4 ends here, but the algorithm mentioned for off-chain aggregation still needs explaining. Below is a simplified version of Algorithm 2. The term “enough” is used frequently, and in all cases it means enough to account for faulty nodes.