Very simply, Silk Road was the worst thing to happen on Bitcoin. Freedom is not absolute. We live and act in a society, and selling drugs was one thing, but the reality was it also allowed things that were far worse.

I will not share all of the work I used to do. Much of it was not nice. I have enclosed a report below, suitably anonymised, that Dr Pang and I completed for the SA Police. It was 12 years ago now, and we used network-analysis tools to trace chats in a child-predation and child-porn case.

The party involved was sentenced to prison for grooming girls aged between 11 and 15 for sex and also for the sale of photographs of these girls (he sold these for money). The report, “Three-dimensional visualization of social interactions networks”, was used in the creation of a system that the state and federal police used in tracking and prosecuting many sex crimes.

It is a part of my life that still haunts me.

At one point in my life, I would come home from a case and cry in a corner. My trips to South America and Sub-Saharan Africa in this period involved government forensic work. I was contracted on anti-people-smuggling and anti-sex-trafficking engagements. My role was in tracing communications and the money trail.

Bitcoin is not designed to be anarchist. That would be a system that allows all I despise.

I was a pastor and a trustee of a church bank once. Something that is a lifetime ago now. I talked to a lot of people who were victims of drugs then. I have seen what bath salts and ice do to people. It broke me.

It shattered a part of me, and then I would come back and make the cracks wider. Over and over.

This is the reason I took the moniker “Prof. Faustus.” I sold my soul for knowledge, the knowledge that could allow me to make life better for a few others, and it was at the cost of my own self. It is why I walked away and stopped being a pastor, and it is why I cannot ever see a foolish idea such as anarchy with anything other than disdain. There is no other way to say it: doing what I did is corrosive. Seeing what others did and being involved in it removes the humanity from you. A little at a time, but it leaves you less human each time you see it.

Some live in isolation, see the tip of an iceberg that they take to be evil, and never come to find what lies below and what is suppressed as a result. Government, like all other things, is just people. Some are good and some evil, and we can only ensure that the world is free and that others do not suffer through our eternal vigilance.

My wife knows and understands, but even my children do not know what I used to do. So, yes, I used to work for government. I was a cyber-forensic analyst for a long time. I worked on tracing money flows and more. And this is why we have Bitcoin and not Zcash, and why Zcash will never be legal (even if it was not flawed, and no, I am not saying what the flaw is here).

I watched people I cared for degrade and fall apart under the influence of hard drugs. I watched myself change when I was working in forensics to stop sex crimes and people smuggling. I saw the layers of humanity planed off painfully. One part of what made me more human after another. And, I became angrier with a few friends outside my family and, years ago, Dave.

It made studying and gaining more knowledge and skills easier. I became a semi-recluse outside of work and family, and even then, I had been absent. There is no other way I would have completed nearly all the SANS certifications that were available, and no way I would have been doing double digits in post-grad degrees otherwise. Knowledge comes with costs.

The only reason some want to turn Bitcoin, Zcash, or Monero into an anonymous system is to try and stop the forms of analysis that you can do on Bitcoin. At scale, Bitcoin is private, not anonymous. That is, you can be secure in the knowledge that you cannot be traced short of a massive and costly effort. That said, when you do find something of the form I was analysing in the report below, then you have a means to stop it.

Many want to make a better version of Silk Road. This will never be on Bitcoin. Silk Road was not merely about drugs and guns as many try to say. It was not a victimless crime. The take-down of Silk Road resulted in more than 17 convictions for sex crimes. Richard Huckle (here in the UK) was given 22 life sentences for 71 child sex offences for sites on the dark web that linked to SR.

It is time we started to talk about the dark places.

Bitcoin is a system designed to act as honest money.

Three-dimensional visualization of social interactions networks

Abstract:

This paper presents a study of data visualisation as used in the investigation of a child grooming case. We demonstrate how a series of data mining and network visualization technologies can be used to map and report on the relationships between individuals in a social network and to uncover seemingly hidden relationships between them. Due to the nature of the investigation and the data reported, the names of the people involved in the chatrooms have been modified.

Keywords

Security, Forensics, Social Networks, Visualisation

1 Introduction

This study involves the case of an adult who engaged in inappropriate online chat with an underage girl for the purposes of soliciting sex (child grooming). Data mining and network visualization were used to discover the evidence both for and against the accused. The benefits of this form of three-dimensional visualisation arise from the simplification of complex datasets (such as social networks, chats and logs) into an easily comprehensible 3-D map that a user can rotate, zoom and otherwise interact with. This is where the strengths of visualisation technologies come to the fore. These technologies allow the forensic investigator to uncover previously hidden relationships that exist within the data. More importantly, the visualisation techniques that are available today make reporting to a lay jury simpler, and allow the jurors to see the evidence in a manner more consistent with human thought processes.

In the visualization, tightly connected groups are packed closely together, and outsiders to the conversations are displayed further apart on the edges of the network display. Although not originally designed for this purpose, the GEOMI program has allowed for the display of social relationships between chat users in social networks. This program has further been used to model changes and alterations to logs and to detect tampering with digital evidence. This paper concludes with the argument that, through the use of simple tools, an investigator can be empowered to formulate better hypotheses concerning the activities and intentions of suspected individuals in a manner that allows for the discovery of further evidence.

2 Methods

The inspiration for this project was taken from the biological interactions of complex systems. Networked relationships, such as the complex interactions of bees and ants in a hive environment, were noted to result in coherent behaviour from seemingly unrelated and chaotic interactions.

2.1 Network visualisation and statistical analysis tools

The GEOMI software (Ahmed et al., 2005; Ho et al., 2008), downloaded from www.systemsbiology.org.au, was used to visualize the network. The friendship network data was stored in a PostgreSQL database (version 8.3) and accessed by GEOMI using a JDBC database connection. Statistical analysis was performed using GEOMI, the Jung Java package[1]

Key features of GEOMI:

· Visualizes the network in 3-D

· Lets the user zoom, rotate and interact with the map

· Can be adapted to analyse many different types of networks

· Arranges the network so that tightly connected groups are shown closer together, and loosely connected nodes are placed further apart in the visualized network

2.2 Attributes of the social interactions dataset

The social interaction dataset was taken from an online web chat room. The data was obtained as a result of the investigation of a suspected child grooming incident. The dataset, a capture of messages and communications including private messaging and broadcasts to the user's wall, was supplied to the SA Police by the operators of the social network.

This study analyses the social interaction networks built from person-to-person conversations. Note that the direction in which a message is sent is important for the purpose of this study; the number of times a message is sent is not. Messages broadcast to all other users have been removed from the analyses, as these do not directly allow for the correlation of directed communications. The dataset contained the content of the conversations, the IP address of each username, and the date and time of online events such as log-in, log-out, broadcast, and conversation. Whilst the analysis utilised the authentic usernames, the date and time of each communication, the complete content of the conversations, and the original source IP addresses, the authors have altered the usernames and IP addresses for the purposes of reporting this study. The accounts discussed in this report have also been kept isolated to protect the privacy of the people concerned. To protect the identity of the users, each actual username is replaced with an anonymous alias using the following methods.

· Random words were chosen from the OpenOffice spellcheck dictionary (http://wiki.services.openoffice.org/wiki/Dictionaries) and these were used to replace the actual usernames.

· Any non-printable characters are removed using the ‘sed’ Linux command line filter.

Figure 1. Alice’s friends and their connections. A) Original network. B) Resolving multiple identities. C) Using only two-way connections. D) Resolving multiple identities + only using two-way connections. For the above networks, only the principal connected component of the network is shown, except for C), as Bob is not in the principal connected component of this network. The node for Bob’s network is larger than other nodes and is highlighted in yellow.

The data was parsed into tab-separated format using customised Perl scripts and was then stored in a PostgreSQL relational database. This was selected as it could be directly accessed by GEOMI using a JDBC connection to generate visualizations of the social interaction network.

3 Analysis on the friendship network between Alice and Bob

This study focuses on the analysis of the social interactions between Alice and Bob (their anonymous aliases are used here to protect their real identities). In this study, the social interaction network was built for an individual referred to as Alice. This is in effect a social network of everyone to whom the selected person talks. To simplify the visual representation and the statistical analysis, Alice was then removed from the network (Figure 1). Next, a social network consisting of Bob’s friends and other connections was created using the same process and method described above (Figure 2). The anonymous aliases for those friends that are shared by Alice and Bob have been maintained in each analysis.

Figure 2. Bob’s friends and their connections. A) Original network. B) Resolving multiple identities. C) Using only two-way connections. D) Resolving multiple identities + only using two-way connections. For the above networks, only the principal connected component of the network is shown. The nodes for Alice’s network are larger than other nodes and are highlighted in yellow.

3.1 Using IP addresses to find users with multiple online identities, and to find geographical proximity of users

The IP addresses recorded in the log-in and log-out records were used to locate and correlate users with multiple usernames. The first three layers (octets) of the IP address are used to determine whether the same user utilised more than one username. Similarly, using the first two layers of the IP address, a network linking each person to each two-layer IP prefix they used was formed. The resulting network provides a rough idea of the geographical proximity of each user to other users (Figure 1).

3.2 Using two-way conversations to find more reliable social interactions

To improve the reliability of the social interaction networks, interactions which only occur in one direction were removed from the network. These included unanswered requests. To find the two-way conversations in Alice’s social network, the first step is to find all of Alice’s friends with whom she has two-way interactions. This is followed by finding all the two-way interactions between Alice’s friends. The same steps are then performed to uncover all the two-way interactions for Bob’s social network. These two-way conversation networks are compared to the original interaction networks.
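The two steps described above can be sketched in a few lines of Python. This is an illustration only, not the authors' Perl and GEOMI pipeline: the function names and the toy usernames are hypothetical, and the input is assumed to be a collection of directed (sender, recipient) pairs.

```python
def reciprocal_edges(edges):
    """Keep only the pairs for which a message was sent in both directions."""
    edges = set(edges)
    return {frozenset((a, b)) for (a, b) in edges if a != b and (b, a) in edges}


def two_way_ego_network(edges, ego):
    """Step 1: find the ego's two-way friends.
    Step 2: keep the two-way interactions among those friends."""
    mutual = reciprocal_edges(edges)
    friends = {next(iter(pair - {ego})) for pair in mutual if ego in pair}
    between_friends = {pair for pair in mutual if pair <= friends}
    return friends, between_friends


# Toy data (hypothetical): the message from alice to u3 is never answered.
toy_edges = {
    ("alice", "u1"), ("u1", "alice"),
    ("alice", "u2"), ("u2", "alice"),
    ("alice", "u3"),
    ("u1", "u2"), ("u2", "u1"),
    ("u2", "u3"), ("u3", "u2"),
}
friends, between_friends = two_way_ego_network(toy_edges, "alice")
```

In this toy case only u1 and u2 remain as friends, and the single surviving interaction is the reciprocated one between them.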

Figure 3. Friends in common between Alice and Bob and their connections. The network is built after resolving multiple identities and using two-way connections only.

3.3 Combining the search for multiple online identities with the use of reciprocal conversations to improve interaction networks data

Two separate approaches to improve the reliability of social interaction networks were described above. To create a high-confidence interaction network, both methods are used to build the social interaction networks for Alice and for Bob. The search for multiple online identities using IP addresses is applied before the use of reciprocal conversations to build the high-confidence network. These high-confidence friendship networks are compared to the original interaction networks in terms of their network properties. The degree of overlap between the high-confidence social interaction networks of Alice and Bob is measured. This shows how many friends they share and provides insight as to whether they are both part of a larger community, part of a small close-knit network of friends, or friends who met online (Figure 4).
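As an illustration of measuring the degree of overlap, the sketch below counts shared friends and also reports a Jaccard index. The Jaccard index is an assumption made here for illustration; the paper does not say which overlap measure was used. The function name and the friend sets are hypothetical.

```python
def friend_overlap(friends_a, friends_b):
    """Return the shared friends and the Jaccard index of two friend sets."""
    shared = friends_a & friends_b
    union = friends_a | friends_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical high-confidence friend sets for two people.
alice_friends = {"u1", "u2", "u3"}
bob_friends = {"u3", "u4", "u5", "u6"}
shared, jaccard = friend_overlap(alice_friends, bob_friends)
```

A small shared set relative to the union suggests the two people are not part of the same close-knit group.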

Figure 4. Visualisation of the first two layers of IP addresses amongst chat-room users. Chat-room users are represented as blue nodes, and the first two layers of the IP address are represented as blue nodes. There are no connections between two IP addresses, and similarly there is no connection between two chat-room users.

From the above analysis, nine (9) networks were created and returned via the processes detailed in this paper. Their network properties were summarised and compared, providing insight into whether each step for improving the reliability of the network was useful. Their scale-free distribution and the distribution of the clustering co-efficient were then compared and analysed.

3.4 Using the high-confidence interaction network to provide insight to the relationship between Alice and Bob

The aim of this study was to understand the relationship between Alice and Bob. It is important to analyse whether Alice was central to the social network of Bob and vice versa. This also provided insight as to whether Bob and Alice are close friends who share many mutual friends, or are likely to have met each other online and are not otherwise related. This analysis is also useful for similar types of study.

4 Results

The results of the analysis have been reported in the sections below.

4.1 Simple test of the online chatroom

Registration to participate in a chat on this website involves the user logging in with an individualised username and a password. Multiple simultaneous log-ins from the same IP address under different usernames are possible. If another person attempts to log in with the same username as a user who is already logged in, the user who was in the chat room first is automatically kicked out of the chatroom.

4.2 The nature of the interaction dataset

This study analyses the social interaction networks built from person-to-person conversations. The dataset contains data on pairwise interactions, for example, person A sent a message to person B. Note that the direction in which the message is sent is important for the purpose of this study, but the number of times a message is sent from person A to person B is unimportant. Although a user may broadcast a message to all other users, these messages are removed from the rest of the analyses due to their unspecific nature. The dataset also contains the content of the conversations, the IP address of each username, and the date and time of online events such as log-in, log-out, broadcast, and conversation.
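The reduction from a message log to directed pairwise interactions can be sketched as follows. This is a minimal illustration rather than the authors' code; the `Message` record, the use of `None` to mark a broadcast, and the sample log are assumptions made for the example.

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Message(NamedTuple):
    sender: str
    recipient: Optional[str]  # assumed encoding: None marks a broadcast


def directed_edges(messages: Iterable[Message]) -> Set[Tuple[str, str]]:
    """Collapse a message log into a set of directed (sender, recipient) pairs."""
    edges = set()
    for msg in messages:
        if msg.recipient is None:
            continue  # broadcasts have no specific target, so they are dropped
        edges.add((msg.sender, msg.recipient))  # a set ignores repeat counts
    return edges


# Hypothetical log: A writes to B twice, B replies once, A broadcasts once.
log = [
    Message("A", "B"),
    Message("A", "B"),
    Message("B", "A"),
    Message("A", None),
    Message("C", "A"),
]
edges = directed_edges(log)
```

Storing the pairs in a set keeps the direction of each interaction while discarding how often it occurred.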

4.3 Replacing actual username with anonymous alias

Using words randomly selected from the dictionary, each anonymous alias can be easily referred to during the analysis, with the advantage that it is easier to remember than an identity code. A table which converts each username to an anonymous alias was created and maintained, but this table is not provided, in order to protect the identity of the subjects. Anonymous aliases for friends shared by Alice and Bob are kept consistent.
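One way such an alias table could be built is sketched below. The authors used Perl scripts and the OpenOffice dictionary; the function name, the short word list and the usernames here are hypothetical stand-ins. Because each username appears once in the table, a friend shared by two networks always receives the same alias.

```python
import random


def build_alias_table(usernames, words, seed=None):
    """Map each distinct username to a distinct randomly chosen word."""
    unique_users = sorted(set(usernames))
    if len(unique_users) > len(words):
        raise ValueError("not enough dictionary words for the usernames")
    rng = random.Random(seed)
    chosen = rng.sample(list(words), len(unique_users))  # no word is reused
    return dict(zip(unique_users, chosen))


# Hypothetical inputs: a tiny word list and a log with a repeated username.
words = ["lantern", "orchard", "pebble", "harbour", "thimble"]
seen_usernames = ["user_17", "xx_kitty", "user_17", "dave99"]
alias_table = build_alias_table(seen_usernames, words, seed=1)
```

Passing a fixed seed makes the mapping reproducible, which is convenient while an analysis is being repeated; the table itself would still need to be kept private.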

4.4 Analysis on the friendship network between Alice and Bob

This study focuses on the analysis of the social interactions between Alice and Bob; their anonymous aliases are used here to protect their real identities. The aim is to analyse the interactions between Alice and Bob in the context of their social interaction networks and friendship networks. We introduce the concept of a social interaction network for an individual person, which is the social network of everyone to whom the selected person talks. For example, the social interaction network for Alice will include everyone she talks to online. Since Alice interacts with everyone in her social network, the number of edges she contributes to the network is equal to the number of friends she has. A reasonable simplification, therefore, is to omit Alice from the network to make the network visualization clearer. She is also omitted from the statistical network analysis to make the statistical results consistent with the visualization. We have also created a social network for Bob using the method described above. From visual inspection of the network and the interaction data, we realised that there could be multiple similar usernames, and that these may represent multiple identities used by the same person. Also, users with many friends tend to send messages to many people without receiving a reply. This motivated us to improve the original network by consolidating multiple usernames and removing interactions which only occur in one direction.
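A minimal sketch of building one person's social interaction network and then omitting that person is given below. It assumes, for illustration, that a "friend" is anyone who exchanged at least one directed message with the selected person in either direction; the function name and toy data are hypothetical.

```python
def ego_network_without_ego(edges, ego):
    """Return the ego's friends and the directed edges among those friends."""
    edges = set(edges)
    friends = {b for a, b in edges if a == ego} | {a for a, b in edges if b == ego}
    friends.discard(ego)
    # The ego's own edges are dropped; the ego would add one edge per friend.
    among_friends = {(a, b) for a, b in edges if a in friends and b in friends}
    return friends, among_friends


# Hypothetical directed interactions.
toy_edges = {
    ("alice", "u1"), ("u2", "alice"),
    ("u1", "u2"), ("u2", "u3"), ("u3", "u4"),
}
friends, among_friends = ego_network_without_ego(toy_edges, "alice")
```

Here u3 and u4 fall outside the network because neither interacted with the selected person directly.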

4.5 Disambiguating multiple online identities using IP addresses changes network properties

In the online chatroom, a user can utilize one or more usernames, and the number of usernames they use is not restricted. It is easy and common for online chatroom users to employ more than one online username, and this may affect the quality of the network and pose problems for network analysis. It was desirable to be able to measure the number of people in the network that use multiple identities. It was also desirable to know if merging multiple usernames affected the result of network analysis in comparison to the original network. The first three layers (octets) of the IP address are used to determine whether the same user utilises more than one username. For example, if user ‘magnification’ has IP address 59.94.96.XXX, where XXX are numbers, and username ‘hallway’ has IP address 59.94.96.YYY, where YYY are numbers, they are defined as the same user.

The first table of anonymous aliases was used to create a second, updated table. In this updated table, online identities judged to belong to the same user were replaced with a single unique anonymous alias. Continuing with the example, both username ‘magnification’ and username ‘hallway’ now have the anonymous alias ‘cake’ (Table 1). A total of 13 users that use multiple identities were noted in the dataset. These would have been identified as 33 different users without the use of this approach.
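The matching rule can be sketched with a small union-find structure. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the addresses come from the reserved documentation ranges, and the transitive merging of usernames seen on several prefixes is an assumption the paper does not spell out.

```python
def ip_prefix(ip, layers):
    """First `layers` dot-separated parts of an IPv4 address."""
    return ".".join(ip.split(".")[:layers])


def merge_identities(login_records, layers=3):
    """Map each username to a canonical username sharing an IP prefix with it."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the alphabetically first name as the group's label.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_user_for_prefix = {}
    for user, ip in login_records:
        find(user)  # register the user even if no merge happens
        prefix = ip_prefix(ip, layers)
        if prefix in first_user_for_prefix:
            union(user, first_user_for_prefix[prefix])
        else:
            first_user_for_prefix[prefix] = user
    return {user: find(user) for user in parent}


# Hypothetical log-in records: (username, IP address).
records = [
    ("user_a", "192.0.2.10"),
    ("user_b", "192.0.2.77"),
    ("user_c", "198.51.100.5"),
    ("user_b", "203.0.113.9"),
    ("user_d", "203.0.113.200"),
]
canonical = merge_identities(records)
```

In this toy log user_a, user_b and user_d collapse into one identity, while user_c stays separate. The canonical label could then be swapped for a single anonymous alias.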

Table 1. Multiple identities which represent the same person detected by IP address.

The next stage was to compare the social interaction networks obtained after the users with multiple online identities had been identified to the original networks in terms of their network properties. For Alice’s network, the number of nodes for the largest connected component decreases from 53 to 48, and the number of edges increases from 89 to 92. The average clustering co-efficient (average CC) for the biggest connected component increases from 0.38 to 0.48. The diameter of the network decreases from 8 to 6. For Bob’s network, the number of nodes for the biggest connected component decreases from 96 to 89, and the number of edges increases from 319 to 321. The average CC for the biggest connected component increases from 0.33 to 0.39, but the diameter remains unchanged. Therefore, this method decreases the number of nodes in the network and slightly increases the number of connections, the combination of which leads to a higher value for the average CC.
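For readers unfamiliar with these measures, the sketch below computes them for a small undirected toy graph using only the Python standard library. It is not the GEOMI, Jung or igraph code used in the study. Treating the edges as undirected and counting a clustering co-efficient of zero for nodes with fewer than two neighbours are assumptions made here.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from (a, b) pairs; self-loops are ignored."""
    adj = {}
    for a, b in edges:
        if a != b:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def largest_component(adj):
    """Node set of the biggest connected component, via breadth-first search."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def edge_count(adj, nodes):
    """Number of undirected edges with both ends inside `nodes`."""
    return sum(len(adj[n] & nodes) for n in nodes) // 2


def average_clustering(adj, nodes):
    """Mean local clustering co-efficient over `nodes`."""
    if not nodes:
        return 0.0
    total = 0.0
    for n in nodes:
        neighbours = adj[n] & nodes
        k = len(neighbours)
        if k < 2:
            continue  # assumed convention: co-efficient 0 for degree below 2
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(nodes)


def diameter(adj, nodes):
    """Longest shortest path (in hops) inside a connected node set."""
    longest = 0
    for source in nodes:
        dist, queue = {source: 0}, deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour in nodes and neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        longest = max(longest, max(dist.values()))
    return longest


# Toy graph: a triangle a-b-c with a tail c-d, plus a separate pair x-y.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("x", "y")]
adj = build_adjacency(toy_edges)
bcc = largest_component(adj)
stats = (len(bcc), edge_count(adj, bcc), average_clustering(adj, bcc), diameter(adj, bcc))
```

Merging two usernames into one node removes a node and can close triangles that were previously split across identities, which is consistent with the rise in average CC reported above.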

4.6 Visualising similarity in IP addresses amongst chat-room users

Using the first two layers of the IP address, a network linking each person to each two-layer IP prefix they used was created. The resulting network provides a rough idea of the geographical proximity of each user to other users. The reader should observe that some usernames are commonly used and could represent more than one person, for example the user ‘cake’, which is related to 24 different IP address nodes. This is highlighted by the many geographical locations and IP addresses that the username is linked to. In other areas, it is possible to observe how users relate to each other in terms of IP address. It is possible to use a dictionary of IP addresses to find the geographical area of each person and use this information as evidence in court and to assist in criminal investigations.
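The user-to-prefix network is bipartite: edges only run between a username and a two-layer IP prefix. A minimal sketch follows, with a hypothetical function name and addresses drawn from the reserved documentation ranges rather than from the case data.

```python
from collections import defaultdict


def user_prefix_network(login_records, layers=2):
    """Build (username, prefix) edges and count distinct prefixes per user."""
    edges = set()
    for user, ip in login_records:
        prefix = ".".join(ip.split(".")[:layers])
        edges.add((user, prefix))
    prefixes_per_user = defaultdict(set)
    for user, prefix in edges:
        prefixes_per_user[user].add(prefix)
    counts = {user: len(prefixes) for user, prefixes in prefixes_per_user.items()}
    return edges, counts


# Hypothetical log-in records: (username, IP address).
records = [
    ("user_a", "192.0.2.1"),
    ("user_a", "192.0.2.44"),
    ("user_b", "192.0.2.200"),
    ("user_a", "198.51.100.7"),
    ("user_c", "203.0.113.9"),
]
edges, counts = user_prefix_network(records)
```

A username linked to an unusually large number of prefixes, as with ‘cake’ above, is a candidate for being a common name shared by several people.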

4.7 Using reciprocal conversations affects network properties

The directionality of the message being sent is important for the analysis of a social interaction network. Social relationships in which communications are reciprocated are more reliable than those where a message is sent in only one direction, also called one-way communication. Users commonly establish new social interactions by sending a message to many people and only receive replies from a small proportion of these, which results in many one-way interactions. These one-way interactions may increase both network connectivity and network size, but they are less important for this study, where two-way communications are more reliable evidence of a relationship. They pose a problem for network analysis and could affect the quality of the results. The two-way conversation networks were compared to the original interaction networks, in which all interactions, including one-way interactions, are included.

For Alice’s network, the number of nodes for the biggest connected component (BCC) decreases from 53 to 34, and the number of edges decreases by more than half, from 89 to 36. The average clustering co-efficient (average CC) for the biggest connected component increases from 0.38 to 0.39. The diameter of the network increases from 8 to 13. For Bob’s network, the number of nodes for the biggest connected component decreases from 96 to 56, and the number of edges decreases from 319 to 110. The average CC for the biggest connected component decreases from 0.33 to 0.31, and the network diameter increases from 6 to 7. In both Alice’s and Bob’s network, this method almost halves the number of nodes in the network. Therefore, this method decreases the number of nodes and the number of edges in the BCC of social interactions; however, its effect on the average CC is different for each network.

4.8 Determination of multiple online identities and use of reciprocal conversations to improve the reliability of social interaction networks

Two separate approaches to improve the reliability of social interaction networks have already been presented leading to the analysis of their effects on network properties. These two approaches were then combined leading to the disambiguation of multiple online identities and the removal of one-way interactions to improve the quality of the social interaction networks for both Alice and Bob. The aim was to understand whether both methods work synergistically in changing network properties.

In comparison to the original network for Alice, the number of nodes for the largest connected component decreases from 53 to 36, and the number of edges decreases by almost half, from 89 to 49. The average clustering co-efficient (average CC) for the biggest connected component increases from 0.38 to 0.50. The diameter of the network increases from 8 to 10. For Bob’s network, the number of nodes for the biggest connected component decreases from 96 to 55, and the number of edges decreases from 319 to 116. The average CC for the biggest connected component decreases from 0.33 to 0.29, and the network diameter remains at 6. In both Alice’s and Bob’s network, this method almost halves the number of nodes in the network. Therefore, this method decreases the number of nodes and the number of edges in the biggest connected component (BCC) of social interactions; however, its effect on the average CC is different for each network.

Table 2. Network statistics of biggest connected component

4.9 Alice and Bob only share a small number of common friends

The end result (Figure 3) was a low number of social interactions between Bob and Alice and their mutual “friends”.

5 Discussion: The effect of multiple identities

The presence of multiple identities causes the network to be less connected and creates the illusion that there are more users in the network. Consolidating multiple identities helps recover missing connections in the network and could therefore increase the clustering co-efficient.

Through the use of these techniques, it is possible to correlate multiple identities even when a proxy server is used and several individuals are located on the same IP address.

6 Conclusion

This analysis could also be applied to other social networks and in the detection of fraud in connected datasets. These processes showed the interconnectivity between social networks and provided evidence that an individual was engaged in communications with a number of minors even though the individual had attempted to cover his tracks using a number of identities.

From the results, the reader can see that a study of data using advanced visualisation can aid in the determination of subjects in social network investigations, including child grooming cases. The use of this form of data mining and network visualization technology can create a map of the detailed relationships between individuals in a social network, allowing an investigator to uncover seemingly hidden relationships between the individuals.

References

1. Ahmed A, Dwyer T, Forster M, Fu X, Ho J, Hong S-H, Koschutzki D, Murray C, Nikolov N S, Taib R, Tarassov A, Xu K (2005). GEOMI: GEOmetry for Maximum Insight. In Proceedings of the 13th International Symposium on Graph Drawing, Limerick, Ireland, September 2005; pp 468–479.

2. Ho E, Webber R, Wilkins MR (2008). Interactive three-dimensional visualization and contextual analysis of protein interaction networks. Journal of Proteome Research, 7: 104–12.

3. Ridge E., Kudenko D., Kazakov D. and Curry E., (2005) “Moving Nature-Inspired Algorithms to Parallel, Asynchronous and Decentralised Environments,” in Self-Organization and Autonomic Informatics (I), vol. 135, pp. 35–49.

[1] (http://jung.sourceforge.net/download.html), the R statistical analysis software (2005), version 2.7.0, and the igraph library version 0.5.1, (http://cneurocvs.rmki.kfki.hu/igraph).