Introduction

What is Real-Time Crypto

Have you ever wondered how to build a cryptocurrency trading bot? Have you wondered how you can apply machine learning to the financial markets? Have you wondered how you can get out of Excel and into the world of big data? You have come to the right place! Welcome to Real-Time Crypto. In this book, we will build on your basic Python and data skills and turn you into a bona fide big data engineer. Along the way, you will learn how to trade cryptocurrency in real time, using cutting-edge big data technology and machine learning.

Let's start with a basic definition of what we mean by Real-Time Crypto and what we will build in this book:

A big data pipeline that receives streaming data from a cryptocurrency exchange, processes the data, applies a predictive algorithm, executes a trade, and stores the results for future use.

To accomplish this we are going to have to:

Process large volumes of data

Combine multiple data sources

Apply a machine learning model
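
To make the definition above a little more concrete, here is a minimal sketch of its five stages (receive, process, predict, trade, store) in plain Python. Everything in it is an assumption made for illustration: the function names are hypothetical, the "exchange" is just a list of prices, the "predictive algorithm" is a placeholder rule rather than a real model, and the trade is only recorded, never sent anywhere.

```python
from typing import Iterable, Iterator, List


def receive(prices: Iterable[float]) -> Iterator[dict]:
    """Stage 1: stand-in for a streaming feed from an exchange."""
    for seq, price in enumerate(prices):
        yield {"seq": seq, "price": price}


def process(ticks: Iterable[dict]) -> Iterator[dict]:
    """Stage 2: attach the change from the previous price to each tick."""
    previous = None
    for tick in ticks:
        change = 0.0 if previous is None else tick["price"] - previous
        previous = tick["price"]
        yield {**tick, "change": change}


def predict(tick: dict) -> str:
    """Stage 3: placeholder rule (assumption), not a trained model."""
    if tick["change"] > 0:
        return "buy"
    if tick["change"] < 0:
        return "sell"
    return "hold"


def run_pipeline(prices: Iterable[float], store: List[dict]) -> List[dict]:
    """Chain the stages; stages 4 and 5 are reduced to recording the decision."""
    for tick in process(receive(prices)):
        signal = predict(tick)                    # stage 3: apply the algorithm
        store.append({**tick, "signal": signal})  # stages 4-5: "trade" and store
    return store


if __name__ == "__main__":
    for row in run_pipeline([100.0, 101.5, 101.0, 101.0], []):
        print(row)
```

Each stage only consumes what the previous one yields, so any one of them could later be swapped for a real component without touching the others.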

Processing Data

In this book you will gain exposure to the fundamentals of data processing. These include:

normalizing data fields: putting like with like

information extraction: finding critical information buried within the data

data enrichment: adding relevant information to improve the quality of the data

Combining Sources

We will need to combine multiple data sources to accurately inform our trading decisions. This is a common task for a data scientist or data engineer so we'll get hands-on by combining historic pricing data and real-time data from an exchange.
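
As a rough illustration of normalization, enrichment, and combining a historic source with live data, the sketch below brings two differently shaped messages into one common record and then adds the previous day's close from a historic table. The message layouts, field names, and prices are hypothetical examples invented for this sketch; they are not the actual format of any exchange.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical raw messages: two shapes describing the same kind of event.
RAW_EVENTS = [
    {"symbol": "btcusd", "price": "30100.50", "timestampms": 1700000000000},
    {"pair": "BTC-USD", "last": 30120.0, "ts": "2023-11-14T22:14:20+00:00"},
]

# Hypothetical historic daily closes, keyed by ISO date.
HISTORIC_CLOSE = {"2023-11-13": 30000.0}


def normalize(event: dict) -> dict:
    """Put like with like: one symbol format, float prices, UTC datetimes."""
    symbol = (event.get("symbol") or event.get("pair")).upper().replace("-", "")
    price = float(event.get("price", event.get("last")))
    if "timestampms" in event:
        time = datetime.fromtimestamp(event["timestampms"] / 1000, tz=timezone.utc)
    else:
        time = datetime.fromisoformat(event["ts"])
    return {"symbol": symbol, "price": price, "time": time}


def enrich(record: dict, historic_close: dict) -> dict:
    """Combine sources: add the previous day's close and the relative change."""
    prev_day = (record["time"].date() - timedelta(days=1)).isoformat()
    close = historic_close.get(prev_day)
    change = None if close is None else (record["price"] - close) / close
    return {**record, "prev_close": close, "change_vs_close": change}


if __name__ == "__main__":
    for event in RAW_EVENTS:
        print(enrich(normalize(event), HISTORIC_CLOSE))
```

Keeping normalization separate from enrichment means the historic lookup only ever sees records in one shape, whichever source they came from.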

Applying Models

We are going to train a machine learning model on a historic, static data set. In our real-time pipeline we will learn how to apply this model to data on the fly.
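
The split between training on static data and applying the result on the fly can be sketched as follows. The "model" here is a deliberately tiny stand-in chosen for this example (a one-parameter least-squares fit of the next return on the current return), not the forecasting model the book builds; the function names and prices are likewise hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def returns(prices: List[float]) -> List[float]:
    """Relative change between consecutive prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def train(historic_prices: List[float]) -> float:
    """Offline step: fit next_return ~ beta * current_return (no intercept)."""
    r = returns(historic_prices)
    x, y = r[:-1], r[1:]
    denom = sum(v * v for v in x)
    return sum(a * b for a, b in zip(x, y)) / denom if denom else 0.0


def apply_on_the_fly(beta: float, live_prices: Iterable[float]) -> Iterator[Tuple[float, float]]:
    """Online step: yield (price, predicted next return) as each price arrives."""
    previous = None
    for price in live_prices:
        if previous is not None:
            current_return = (price - previous) / previous
            yield price, beta * current_return
        previous = price


if __name__ == "__main__":
    beta = train([100.0, 110.0, 121.0, 133.1])  # static, historic data set
    for price, predicted in apply_on_the_fly(beta, [200.0, 220.0, 209.0]):
        print(price, round(predicted, 4))
```

The trained parameter is computed once and then held fixed, while the streaming function keeps only the last price in memory, so it can run for as long as data keeps arriving.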



Cryptocurrency trading pipelines don’t have to be complicated, and they shouldn’t be. Often, the simplest solution is the best one. The complexity of your pipeline's architecture should depend on the volume of your data and the sophistication of your strategy. Remember, though: simple can be powerful.

Goals of the Book

By the time you finish this book you'll have a foundational understanding of cryptocurrency analysis and automation. You'll have a solid understanding of machine learning-based cryptocurrency trading pipelines (how they work and how to build one), and you’ll have a complete pipeline running on your computer! You will feel confident in your ability to take what you've learned and apply it to the creation of production-level pipelines for processing whatever data may come your way.

Real-World Use Case

Rather than work with a toy dataset that doesn’t look anything like the data you typically deal with, you’ll be implementing a real-world use case with real-world data: predicting how Bitcoin prices will change in order to time trades with your very own trading bot.



First, we will process historical data and train a predictive machine learning model. We’ll be using the past few years of minute-by-minute Bitcoin prices. Our time series forecasting model will predict price swings in Bitcoin so that we know when to trade.



Next, we will incorporate this algorithm into a streaming data pipeline. This pipeline will take real-time data from Gemini, a major cryptocurrency exchange, and use our algorithm to determine whether to execute trades. Our pipeline's architecture will include websockets, Kafka, Spark, and Elasticsearch.