Collusion detection and prevention mechanisms within Volunteer Desktop Grids

Customminer ( @cm-steem ) - February 9, 2017

The following is a literature review (more of a paper review tbh) for a university assignment in Dec 2016. I achieved a 60% grade (oh well), time to share it! What do you think? Don't be too harsh :P

1 Introduction

This literature review will introduce the concept of volunteer desktop grids, the weaknesses of BOINC’s majority voting mechanism, and the possible motivations for collusion, before discussing multiple research papers which focus on the detection and prevention of collusion and manipulation within VDGs.

1.1 What are volunteer desktop grids?

One of the most successful Volunteer Desktop Grids (VDGs) is the Berkeley Open Infrastructure for Network Computing (BOINC); BOINC is an open-source distributed computing platform which utilises volunteer computing resources for the purposes of scientific research [1].

Currently, there are approximately 56 active projects [2] ranging from attempting to solve diseases [3] to searching for extraterrestrial intelligence [4]. VDGs offer significant cost savings over public cloud computing platforms [5], and the scale of volunteer computing resources available for scientific research is immense [6].

1.2 Weakness of BOINC’s default result verification mechanism

By default within BOINC, tasks are replicated to a minimum of 3 workers to ensure the validity of returned results, at the cost of decreased throughput. However, if the majority of workers within a voting pool work together (collude) to return an identical result, they can mask a fraudulent result as valid; collusion therefore presents a significant threat to the validity of research performed upon VDGs.
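
As a rough sketch of how such a quorum can be subverted (illustrative code only, not BOINC's actual validator API):

```python
from collections import Counter

def majority_vote(results, min_quorum=3):
    """Accept the most common result if it holds a strict majority
    across at least `min_quorum` replicas; otherwise signal that
    more replicas are needed by returning None."""
    if len(results) < min_quorum:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None

# Honest majority: the correct result wins the vote.
assert majority_vote(["42", "42", "41"]) == "42"
# Two colluders returning an identical bogus result outvote the honest worker.
assert majority_vote(["bogus", "42", "bogus"]) == "bogus"
```

With a quorum of 3, just two coordinated accounts per pool are enough to certify an invalid result.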

The default BOINC platform includes neither the detection nor the punishment of collusive behaviour; many mechanisms have been proposed in research papers, however their complex implementation is entirely at the discretion of project administrators. It’s possible that, due to the complexity of counter-collusion mechanisms, newly created BOINC projects which lack significant funding don’t explore their implementation, leaving them potentially exposed to collusive behaviour.

1.3 Motives for collusion

Despite BOINC existing for over 14 years, the motives behind worker collusion were largely limited to the manipulation of user credit scores or the disruption of scientific research; there was no financial incentive to collude.

In October 2013, the cryptocurrency ’Gridcoin’ was created which began continuously rewarding verified BOINC computation within team Gridcoin [7]. Shortly after the launch of Gridcoin, the cryptocurrency ’Ripple’ [8] performed a temporary distribution of their XRP token in return for verified World Community Grid computation within team Ripple [9].

The BOINC project ’[email protected]’ is planning to distribute an Ethereum token to their workers for verified computation [10], and the upcoming BOINC project ’Project Rain’, which I am developing, encourages the distribution of approximately 36 cryptocurrencies to BOINC users [11].

A consequence of the monetization of BOINC computation is the growing financial incentive for malicious workers to collude; it’s therefore important that counter-collusion mechanisms are researched more thoroughly to combat the increasing probability of collusion.

2 Sabotage-tolerance mechanisms for volunteer computing systems[12]

2.1 Abstract

Sarmenta, L.F’s research into Sabotage-Tolerance mechanisms laid significant groundwork for the detection and handling of malicious workers within volunteer desktop grids (VDGs)[12].

Prior to this research paper, BOINC VDGs relied entirely upon the replication of tasks and majority voting and did not punish users for returning invalid results[1].

Sarmenta [12] introduced multiple innovative concepts such as:

Spot-checking (SC), which occasionally distributes precomputed tasks to detect misbehaving workers.

Credibility Based Fault Tolerance (CBFT), an improvement upon majority voting which requires a minimum probability of correctness, based upon the worker’s total SC count, prior to validating returned results.

Blacklisting (BL), which immediately bans malicious workers who fail SC tasks or fall within the minority of CBFT as SC voting pools.

Backtracking (BT), which redistributes tasks that blacklisted workers previously worked upon, increasing confidence in result validity.
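
A minimal sketch of how SC, BL and BT interact; the credibility formula below is a simplified stand-in for Sarmenta's, and all names are illustrative:

```python
class Worker:
    def __init__(self, wid):
        self.wid = wid
        self.spot_checks_passed = 0
        self.blacklisted = False
        self.completed = []  # accepted task ids, retained for backtracking

def credibility(worker, assumed_fault_rate=0.1):
    # Credibility grows with passed spot-checks; a simplification of
    # Sarmenta's credibility formula, for illustration only.
    return 1 - assumed_fault_rate / (1 + worker.spot_checks_passed)

def spot_check(worker, returned, precomputed, redistribution_queue):
    """SC: compare a returned result against the precomputed answer.
    On failure, blacklist (BL) the worker and backtrack (BT) all of
    their previously accepted tasks for redistribution."""
    if returned == precomputed:
        worker.spot_checks_passed += 1
        return True
    worker.blacklisted = True
    redistribution_queue.extend(worker.completed)
    worker.completed.clear()
    return False
```

Note that a single failed spot-check triggers both BL and BT here, mirroring the aggressive all-or-nothing policy discussed later in this review.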

2.2 Methodology

The proposed algorithms were not implemented within a BOINC environment; instead, 100 Monte Carlo simulations were performed, involving 10,000 randomly generated tasks spread across 200 workers.

Workers who were banned due to failing spot-checks were reintroduced as new workers in the subsequent simulations.

Workers that were not blacklisted retained their credibility statistics between simulations.

Multiple combinations of the proposed mechanisms were tested.

2.3 Results

The combination of CBFT, SC and BL resulted in significant improvements in error rate and runtime efficiency over majority voting; however, an increased redundancy requirement was observed as the faulty fraction and sabotage rate parameters were increased.[12]

The combination of CBFT and SC, without BL, resulted in a significant surge in error rate after a length of stay of 120. Sarmenta [12] attributed this observation to positive credibility from previous simulations not being reset; this flaw exposed a potential weakness in the proposed ST mechanisms against scenarios where workers change behaviour over time.

Utilizing the outcome of CBFT-enhanced voting as a form of SC was shown to improve the error rate and efficiency of computation over the combination of CBFT, SC and BL.[12]

2.4 Areas of potential improvement

Repeating the simulations with data sourced from an operational BOINC VDG environment could further demonstrate the suitability of the proposed mechanisms within VDGs.

Collusion scenarios where workers collude in an attempt to manipulate majority voting pools into accepting invalid results as valid should be explored.

Applying unique worker fingerprints to returned results could potentially expose colluding workers who attempt to return identical results.
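
One hypothetical way to surface such fingerprints is to compare raw result payloads across accounts: byte-identical payloads from supposedly independent workers are suspicious when legitimate results would differ in incidental detail (float rounding, timing fields). A sketch, with invented names:

```python
import hashlib
from collections import defaultdict

def flag_identical_payloads(submissions):
    """submissions: iterable of (worker_id, result_bytes) pairs.
    Returns groups of distinct workers whose result payloads are
    byte-for-byte identical -- candidate colluders (or one person
    operating several accounts)."""
    by_digest = defaultdict(set)
    for worker_id, payload in submissions:
        by_digest[hashlib.sha256(payload).hexdigest()].add(worker_id)
    return sorted(sorted(ws) for ws in by_digest.values() if len(ws) > 1)

assert flag_identical_payloads(
    [("a", b"result-1"), ("b", b"result-1"), ("c", b"result-2")]
) == [["a", "b"]]
```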

The researchers believe that the concept of CBFT is the most important development within this research paper, and that its adaptation to additional areas of VDGs should be explored in future research.

2.5 Conclusions

CBFT introduced the concept of credibility to the evaluation of result validity; it was an innovative development which deserves further research.

Blacklisting can be subverted by the creation of a new user account; the effectiveness of blacklisting could be improved by distributing tasks only to workers who were registered prior to the current batch of tasks. Such an improvement would however result in the temporary underutilisation of newly registered worker’s computing resources.

If workers can differentiate between normal and SC tasks, they could act honestly for the SC task to evade blacklisting and preserve previously submitted fraudulent results. Further research into the obfuscation of SC tasks, or their inclusion within normal tasks as proposed by Araujo et al.[13], is worthwhile.

The strategy of blacklisting workers for failing a single SC task without taking prior SCs into consideration is too aggressive. Likewise, the practice of blacklisting workers for falling within the minority of a CBFT as SC voting outcome is vulnerable to manipulation; colluding workers could force bans upon honest workers.

When a worker is identified as malicious either through spot-checking tasks or CBFT as spot-checking, a large amount of potentially legitimate work is backtracked. Further research in the area of partial backtracking rather than complete backtracking may be worthwhile, especially if colluding workers begin forcing honest workers onto the blacklist.

Without blacklisting, the combination of CBFT and SC results in an error rate surge during later simulations due to maintaining worker credibility between simulations. Further research into ageing credibility or isolating credibility within batches of work to prevent sophisticated collusion scenarios is worthwhile.

3 Counter-Collusion mechanisms

3.1 A Dynamic Approach for Characterizing Collusion in Desktop Grids[14]

3.1.1 Abstract

Canon et al.[14] present two new algorithms: one for the categorization of workers into groups based upon historical behaviour observations, and one for estimating the probability that collusion exists between workers within Desktop Grids (DGs).

The proposed characterization algorithm was able to accurately assign workers to either colluding or non-colluding categories in the majority of tested scenarios.

Collusion detection is performed during the result verification phase rather than as a post-process at a later date, reducing error rates and the replication of work on the fly whilst minimizing delay in the verification of returned tasks.

3.1.2 Methodology

The simulations were performed against a modified combination of traces from four different DG platforms, including two BOINC based projects. Approximately 1 million completed work units were gathered from the combined DG traces. The traces were sourced from the Failure Trace Archive (FTA)[15].

Up to 200 workers were involved in the simulations, which covered 27 scenarios.

3.1.3 Results

Canon et al.[14] observed that in scenarios with small quantities of workers where the majority of users are honest, the proposed collusion characterization mechanism is able to accurately and efficiently detect colluding users, resulting in a reduction of replication and error rate in comparison to majority voting.

3.1.4 Conclusions

A significant resource overhead was observed during the categorization and collusion probability calculation phases when 200 workers were involved; Desktop Grids (DGs) with a significant quantity of workers will be unable to implement such algorithms without further optimization. Additional research into reducing the frequency of evaluating worker collusion categories and probabilities should be performed.

Collusion can be accurately detected if a majority of workers are honest; however, since the researchers assume that the largest group of users is honest, their proposed algorithms are potentially vulnerable to a 51 percent attack.

It was demonstrated that the proposed mechanisms work across multiple types of DG platforms; further literature review of counter-collusion mechanisms implemented within alternative DG platforms should be performed.

3.2 A Scheduling and Certification Algorithm for Defeating Collusion in Desktop Grids[16]

3.2.1 Abstract

This paper is a continuation of Canon et al.’s prior research regarding collusion within DGs; their previous research covered the characterization of workers and the estimation of their collusion probabilities[14], whereas this paper proposes additional counter-collusion techniques within the DG’s result verification and task allocation components.[16]

The proposed task distribution mechanism sacrifices task throughput in return for a higher confidence that collusion has not taken place.

The researchers continue to make the assumption that the majority of workers in a desktop grid are non-colluding; as a result, the proposed mechanisms remain vulnerable to 51 percent collusion attack scenarios.

This research paper introduces two improved VDG components:

1. Result certification system (RCS)

This takes details produced by the characterization component into account, specifically the worker’s probability of collusion and collusion group, when deciding upon a valid voting pool outcome.

The proposed Collusion Aware Algorithm (CAA) requires a minimum level of ’correctness probability’ before authenticating a result within a voting pool as valid. If the minimum is not met, additional tasks are replicated to random workers until it is, after which the result with the highest correctness probability is accepted as the valid result.

2. Job allocator

Tasks are replicated on an individual basis instead of the default BOINC system where multiple workers are assigned the task in parallel. By distributing tasks in a serial manner, attackers are forced to risk exposure as they cannot guarantee that their colluding partners will be assigned the same task.
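
The serial replicate-until-confident loop can be sketched as follows; the correctness estimate here (one minus the product of the agreeing workers' collusion probabilities) is a deliberate simplification of the paper's formula, and the worker interface is invented for illustration:

```python
class DemoWorker:
    # Illustrative stand-in for a grid worker.
    def __init__(self, wid, answer):
        self.wid, self.answer = wid, answer

    def compute(self, task):
        return self.answer

def certify(task, workers, p_collude, threshold=0.999):
    """Replicate `task` to one worker at a time (serial allocation),
    stopping as soon as some result's estimated correctness
    probability reaches `threshold`. `p_collude[wid]` is the
    characterization component's collusion probability for a worker."""
    votes = {}  # result -> worker ids that returned it
    for worker in workers:
        votes.setdefault(worker.compute(task), []).append(worker.wid)
        for result, voters in votes.items():
            p_all_colluding = 1.0
            for wid in voters:
                p_all_colluding *= p_collude[wid]
            if 1 - p_all_colluding >= threshold:
                return result
    return None  # threshold never met with the workers available

p = {"a": 0.01, "b": 0.01}
# Two agreeing low-collusion-probability workers are enough to certify.
assert certify("wu-1", [DemoWorker("a", "42"), DemoWorker("b", "42")], p) == "42"
# A single worker cannot reach the threshold alone.
assert certify("wu-1", [DemoWorker("a", "42")], p) is None
```

Because each worker is contacted only after the previous reply arrives, colluders never know in advance whether their partners will share the same task.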

3.2.2 Results

CAA is more accurate and has less overhead than the default BOINC replication mechanism in the majority of tested scenarios.

CAA has a higher overhead than BOINC when faced with a large quantity of colluding groups.

3.2.3 Methodology

Similarly to their prior research[14], the researchers sourced DG trace data from the FTA.[15]

They chose 2,000 workers out of the 220,000 available, and 150,000 extracted work units were duplicated until 1 million were available for the simulations.

They ran 38 scenarios 10 times, resulting in 380 million total jobs being simulated.

3.2.4 Areas of potential improvement

Canon et al.[16] express an interest in further researching sophisticated collusion scenarios, such as workers building up an honest reputation before attempting collusion attacks, which was observed as a potential weakness within Sarmenta’s[12] Sabotage-Tolerance mechanisms. They briefly mention how they would approach hardening their mechanisms against such attack scenarios.

Whilst the researchers were able to demonstrate considerable improvements of CAA over BOINC, they did not compare the throughput of the two systems. Since CAA distributes tasks serially instead of in parallel, there should be a noticeable difference which should have been acknowledged.

An area of further improvement could be temporarily denying the allocation of tasks to workers with a poor reputation.

3.2.5 Conclusions

Moving from parallel to serial task replication forces colluding workers to risk exposure whilst improving accuracy and decreasing overhead, although potentially to the detriment of throughput.

The computing overhead of CAA increases exponentially when faced with a large quantity of groups containing at least two workers. There is an upper limit on the quantity of groups, but this limit is not explicitly quantified.

A potential weakness observed is that malicious actors could potentially flood the RCS across multiple voting pools with a significant quantity of groups (each containing two colluding workers with poor reputation) to consume server resources.

3.3 An Evaluation of Two Trust-Based Autonomic Organic Grid Computing Systems for Volunteer-Based Distributed Rendering[17]

3.3.1 Abstract

Kantert et al.[17] present three new self-adapting trust-based counter-collusion algorithms which run on their Trusted Desktop Grid (TDG) platform. They compare the replication factor, correctness and throughput of their algorithms against BOINC’s majority-based voting mechanism under multiple attack scenarios.

Kantert et al.[17] implemented a form of reputation for workers within their TDG platform, establishing the trustworthiness of workers to characterise and minimize the impact of collusion within Volunteer Desktop Grids (VDGs).

The proposed trust-based algorithms are able to modify a worker’s reputation immediately after result verification, and are thus able to react quickly to workers changing behaviour.[17]

Initial tests involving the multiple proposed algorithms revealed Dynamic Grouping Distribution Strategy (DGDS) to be the most effective; further experiments focused on the comparison between DGDS and BOINC’s default mechanism.[17]

3.3.2 Results

The researchers achieve a significantly reduced replication factor in scenarios where colluders make up less than 50 percent of the worker base.[17]

The researchers achieve an improved confidence in validity of returned results in scenarios where colluders make up less than 50 percent of the worker base.[17]

3.3.3 Methodology

Kantert et al.[17] modified their TDG platform to mimic the centralized BOINC platform by ensuring there is only a single work submitter during the experiment.

Kantert et al.[17] state that they utilised 17 nodes with a combined 204 CPU cores to represent 100 separate workers for the simulations.

Kantert et al.[17] replicated the rendering of the film ’Big Buck Bunny’[18], which was previously rendered via the BURP[19] BOINC project; 30 sections of the film were selected, each containing 1,000 frames, for a total of 30,000 work units of similar length.[17]

A source for the gathered BURP statistics was not provided.

3.3.4 Areas of potential improvement

To combat periodic worker collusion attack scenarios, the researchers propose to investigate additional reputation parameters.

Reversing the validity of tasks returned by workers who were previously honest but at a later date acted maliciously, in an effort to improve the accuracy of returned results.

Investigating additional counter-collusion mechanisms to monitor for potential ongoing collusion attacks.

Expanding test scenarios to take into account resource utilisation of both the central server and workers for the proposed algorithms.

3.3.5 Conclusions

In scenarios where colluding workers make up less than 50 percent of the total worker pool, DGDS and the other proposed algorithms outperform BURP significantly in terms of replication factor.[17]

DGDS is significantly more vulnerable than BURP to bursty collusion attack scenarios by periodic workers, which could result in successful targeted attacks against specific work units.[17]

Within their 4th experiment, Kantert et al.[17] declare an inability to simulate collusion higher than 20 percent, despite demonstrating up to 60 percent collusion in the first 3 experiments. It’s unclear why this is the case; perhaps they were unwilling to expose their true scale of vulnerability towards dynamic attackers, considering the 4th experiment demonstrated that only 60-second bursts are required to deteriorate correctness by 15 percent.[17]

Kantert et al.’s[17] replication factor efficiency claims are suspect; they define ’replication factor’ as a single task being distributed to multiple workers[17], yet they count the dynamic group of workers within DGDS as a single occurrence of replication rather than counting every worker within the group.

A sophisticated mechanism by which only a single worker within the DGDS pool has to perform the computation whilst maintaining a low error rate is not described; it appears that each worker is required to perform the calculation, which would result in DGDS’s replication factor being greater than what Kantert et al.[17] disclose.

The omission of a source for the gathered BURP statistics makes it difficult to trust the figures against which they compare their algorithms.

DGDS is potentially more efficient in terms of replication factor; however, further investigation into the TDG platform is required to verify the legitimacy of the claimed replication factor.

3.4 Collusion Detection for Grid Computing[20]

3.4.1 Abstract

Staab and Engel [20] propose a collusion detection mechanism used to post-process large sets of results returned by workers within a volunteer desktop grid (VDG). The goal of such analysis is to flag potentially fraudulent results by evaluating the correlation of behaviour between workers.[20]

In order to accurately detect collusion, the proposed mechanism requires a significant quantity of observations, hence execution is performed as a post-process.[20]

Unlike other counter-collusion mechanisms, preventative measures are not taken during task distribution. Suspicious verified results are redistributed to random workers to improve confidence in the validity of the computation.[20]

Cluster graphing was utilised in an attempt to separate workers into honest and malicious groups; Staab and Engel [20] compare the MCL and MinCTC graphing algorithms for performing this task. Prior to this paper, no comparison between MCL and MinCTC for the purpose of colluding-worker detection had been made.

Staab and Engel’s [20] collusion detection mechanism evaluates how often workers vote together, how often they vote against one another, and their historical outcomes within voting pools when attempting to detect the following two types of colluders:

Unconditional Colluders (UC): workers who always attempt to collude to return invalid results.[20]

Conditional Colluders (CC): workers who only collude when they are certain that their collusion attempt will succeed.[20]
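
The vote-together/vote-against statistics feeding the clustering step can be sketched as a pairwise agreement matrix; this is a minimal illustration, not the paper's exact estimator:

```python
from collections import Counter
from itertools import combinations

def pairwise_agreement(voting_pools):
    """voting_pools: list of dicts mapping worker_id -> returned result.
    Returns {(w1, w2): fraction of shared pools in which they agreed}.
    Pairs that agree unusually often, especially while being outvoted,
    are candidates for the same collusion cluster (MCL / MinCTC input)."""
    shared, agreed = Counter(), Counter()
    for pool in voting_pools:
        for w1, w2 in combinations(sorted(pool), 2):
            shared[(w1, w2)] += 1
            if pool[w1] == pool[w2]:
                agreed[(w1, w2)] += 1
    return {pair: agreed[pair] / n for pair, n in shared.items()}

pools = [{"a": "x", "b": "x", "c": "y"},
         {"a": "x", "b": "x", "c": "x"}]
scores = pairwise_agreement(pools)
assert scores[("a", "b")] == 1.0   # always vote together
assert scores[("a", "c")] == 0.5   # agree in half of their shared pools
```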

3.4.2 Results

MinCTC did not produce any false positive results for the UC scenario, whereas MCL did.[20]

MinCTC requires more observations than MCL.[20]

MinCTC performs worse than MCL as the quantity of workers increases.[20]

3.4.3 Methodology

The researchers randomly generated 1,000 voting pool outcomes with 100 workers for the two simulations. In comparison to other counter-collusion mechanism papers, this is a small-scale experiment.

The researchers only performed the cluster graphing simulations with a Percentage of Malicious workers (Pmal) of 0.1 (10 percent), despite evaluating the correlation values for UC and CC with values ranging from 0 to 0.5 in 0.1 increments. They refer to a non-peer-reviewed technical report which provides further information on scenarios with a Pmal greater than 0.1; however, said research does not concern cluster graphing algorithms. [?]

3.4.4 Areas of potential improvement

Staab and Engel [20] propose that future research explore the replacement of cluster graphing with a less complex solution; they express an interest in distributing the process to volunteers to reduce the cost of detection.

• A graph figure for MinCTC accuracy when faced with CC wasn’t produced, potentially due to MinCTC’s inability to detect CC at a Pmal of 0.1. However, in figure 1 the CC case shows a correlation of 0.9 at a Pmal of 0.1, decreasing to approximately 0.5 at a Pmal of 0.5; had they expanded the MinCTC simulations to include increased levels of Pmal, it’s possible that MinCTC may have successfully detected CC.

3.4.5 Conclusions

The proposed collusion detection mechanism is unable to detect collusion in scenarios where malicious workers collude at a rate less than 30 percent.[20]

Staab and Engel [20] didn’t explore scenarios where colluding workers make up the majority of workers; consequently, detection performance above 50 percent collusion is unclear.

Both MCL and MinCTC require refinement of their input parameters through tool-assisted trial and error to select values which produce accurate collusion detection; the results of the simulations indicated that the input parameters required further improvement.

MCL is more appropriate for a production DG environment than MinCTC, despite its high false positive rate, due to its ability to detect CC.

The algorithm presented to calculate correlation of collusive workers is the most important development within this paper; it should be considered by others for future research or implementation.

3.5 Defeating Colluding Nodes in Desktop Grid Computing Platforms[21]

3.5.1 Abstract

Silaghi et al.[21] propose new mechanisms for evaluating the legitimacy of large sets of completed work units; they are able to differentiate invalid results accidentally submitted by independent honest workers from intentionally invalid results submitted by groups of colluders.

The verification of returned work is delayed until there is a sufficient quantity of data to evaluate workers’ trustworthiness. The proposed mechanism is reactive rather than preventative, as malicious workers are not denied task distribution.

The mechanism replicates tasks to additional workers when a voting pool’s projected outcome is deemed suspicious; if a pool only contains malicious workers, the work is entirely redistributed, and if an honest worker loses the vote, a single task is replicated to an honest worker to prevent sabotage.[21]

The proposed mechanism sorts workers into the following types based on their voting characteristics:

Unintentional collusion (UC) - a worker whose computer is malfunctioning, constantly returning invalid results.[21]

Intentional collusion (IC) - workers who maliciously attempt to collude when they believe they are able to manipulate a voting pool outcome.[21]

Occasional collusion (OC) - workers who switch between unintentional and intentional colluding behaviour, however they never act as an honest worker.[21]

Honest worker (HW) - these workers do not display signs of collusion and always return valid results.[21]
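
One illustrative way to map observed submissions onto these categories; the counters and thresholds are my own stand-ins, not the paper's statistics:

```python
def classify_worker(n_valid, n_invalid_alone, n_invalid_matching):
    """Rough mapping onto the UC/IC/OC/HW taxonomy.
    n_invalid_alone: invalid results matching no other worker's result
    (suggesting a malfunctioning host); n_invalid_matching: invalid
    results identical to another worker's (suggesting collusion)."""
    if n_invalid_alone == 0 and n_invalid_matching == 0:
        return "HW"  # always returns valid results
    if n_valid == 0 and n_invalid_matching == 0:
        return "UC"  # constantly returns lone invalid results
    if n_invalid_alone == 0:
        return "IC"  # works honestly except when collusion may pay off
    if n_valid == 0:
        return "OC"  # never honest, mixes both invalid behaviours
    return "uncategorised"  # partially honest: outside the taxonomy

assert classify_worker(10, 0, 0) == "HW"
assert classify_worker(0, 10, 0) == "UC"
assert classify_worker(8, 0, 2) == "IC"
assert classify_worker(0, 3, 3) == "OC"
```

The final branch illustrates a gap in the taxonomy: workers who mix mostly honest work with occasional lone sabotage fall outside all four categories.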

3.5.2 Methodology

A fixed quantity of 1,000 workers each ran approximately 30 work units, for a total of 30,000 completed tasks per simulation; the simulation was repeated 100 times and average values were calculated from the 100 runs for use in the graphs.[21]

Honest workers formed the majority of workers within the performed simulations.

Silaghi et al.[21] assume that workers are unaware of the implemented counter-collusion mechanisms, so sophisticated collusion scenarios are not evaluated.

Baseline statistics were gathered for comparison using BOINC’s default voting mechanism.

3.5.3 Results

For the majority of tested scenarios, the proposed sabotage tolerance mechanism was capable of maintaining a low error rate.

Malicious workers are detected once their frequency of collusion is between 30 and 49 percent, at the cost of additional task replication. If the frequency of collusion is below 30 percent, collusion goes entirely undetected.

UC detection has an estimated 1000 percent improvement over BOINC’s default majority pool voting mechanism.

3.5.4 Areas of potential improvement

The researchers express an interest in continuing to refine and refactor their mechanisms for characterising workers.

Silaghi et al.[21] acknowledge their assumption that attackers are unaware of the counter-collusion mechanisms in place, expressing a potential interest in pursuing further research to combat sophisticated collusion attack scenarios. However, they also state that there’s no proof that their collusion tolerance mechanism can be evaded, potentially dismissing such future research.

Silaghi et al.[21] assume that the majority of workers are honest; in a production VDG environment this assumption cannot be made, therefore 51 percent attack scenarios should be investigated further, as the proposed sabotage tolerance mechanism is potentially vulnerable to them.

Silaghi et al.[21] don’t characterise workers who mainly act honestly, but occasionally either submit invalid work or participate in targeted bursts of collusion.

3.5.5 Conclusions

Silaghi et al.[21] make unsubstantiated claims regarding the successful detection of OC workers, omitting evidence apparently due to space limitations. It’s difficult to trust the validity of such claims without backing evidence.

Whilst the researchers assume that colluding workers are unaware of implemented counter-collusion mechanisms, these mechanisms are documented and publicly available online. If an attacker were to model a collusion scenario that took all current counter-collusion research into account, they may be able to escape detection.

IC workers could maintain a sabotage rate below 30 percent to evade detection; improved collusion detection for low sabotage rates should be researched further as targeted attacks are possible.

When a voting pool is deemed suspicious, the proposed mechanism picks a random honest worker to perform additional verification. If the chosen honest worker is secretly a colluder they may be able to subvert this mechanism, resulting in the verification of an invalid result.

The researchers imply that there are no possible monetary punishments within volunteer computing; this is not true in cases where users are externally rewarded for their average BOINC credit[7], as users can be punished monetarily by silently reducing their rewarded credit.

4 Conclusions

The reviewed papers made several similar assumptions:

Colluding workers do not represent the majority of the user base - this assumption can only be made for large projects such as World Community Grid or [email protected], which have 100k+ workers; new or lesser-known projects with only a few thousand workers are vulnerable to colluding workers representing the majority of active workers. Further research into hardening VDGs against such a scenario is required.

Colluding workers only return invalid results - colluding workers could honestly solve a work unit with one user account and immediately return the identical result with a second user account, minimizing computing resource utilization whilst maximizing rewarded credit. This scenario is most plausible when attackers are monetarily motivated.

The most substantial research paper was by Sarmenta [12]; it introduced multiple new concepts for detecting and punishing malicious workers, without this paper it’s likely that BOINC would not be as successful as it currently is.

The counter-collusion research papers share a similar trait - they’re too complex for inexperienced BOINC project administrators to implement, and they’re not implemented by default within the BOINC platform. These mechanisms should be open sourced or included within the BOINC platform so that small BOINC projects gain improved security against malicious workers, allowing project owners to focus their time on scientific research.

Malicious workers have nothing at stake when they attempt to manipulate the outcome of a vote; if there were a backing cryptocurrency asset which influenced vote weight, it’s possible that collusion attacks would be minimized, even in scenarios where colluders are the majority of workers. Projects such as [email protected][10] may be the first to explore such concepts.

References