Jesus Rodriguez is the CTO and co-founder of IntoTheBlock, a platform focused on enabling an intelligent infrastructure for the crypto markets, as well as chief scientist of AI firm Invector Labs and an active investor, speaker and author in crypto and artificial intelligence. This article originally appeared in CoinDesk’s Institutional Crypto newsletter.

One of the established beliefs about the cryptocurrency market is that it is susceptible to news and social media. As in any other nascent and still irrational financial market, unexpected developments captured in news or social media tend to impact price. As a result, there is increasing interest in leveraging machine learning techniques such as sentiment analysis to detect possible correlations with the price of cryptocurrencies and digital tokens. Despite that interest, most attempts to leverage sentiment analysis are too basic to output any tangible intelligence and quite often produce misleading results.

The challenges of efficiently leveraging sentiment analysis to evaluate the behavior of an asset are not unique to the crypto space. Producing true insights based on textual sentiment is a very difficult task that, most of the time, requires natural language processing (NLP) models optimized for a specific financial domain. Large quantitative hedge funds use armies of machine learning experts to train NLP models on a very specific task, such as analyzing earnings reports, in order to get an edge in a medium-frequency trade. Efficiently leveraging sentiment analysis for crypto assets requires the same machine learning depth and rigor.

To understand that statement, let’s start by diving a bit deeper into the characteristics of sentiment analysis methods.

A gentle introduction to sentiment analysis

In Act II, Scene II of the famous play Richelieu; Or the Conspiracy, British playwright Edward Bulwer-Lytton coined a phrase that has transcended generations: “The pen is mightier than the sword.” Nearly two centuries later, that famous quote brilliantly encapsulates the importance of sentiment analysis. The emotions expressed in text are sometimes more likely to trigger a reaction than physical actions themselves.

Conceptually, sentiment analysis is a subdiscipline of NLP that focuses on identifying the affective states of textual communications. Contrary to popular belief, sentiment analysis is not a single technique but rather a family of methods, many of them based on deep learning, that cover different types of affect detection in textual data. From that perspective, there are several types of sentiment analysis that could be relevant in the context of crypto-asset intelligence:

Polarity Analysis: This type of sentiment analysis classifies textual sentiment as positive, negative or neutral. For instance, the sentence “the bitcoin price rally has reenergized the market” would likely be classified as positive by most models.

Emotion/Tone Analysis: Instead of an overall qualifier for the text, this type of analysis centers on scoring the different types of emotions present in a particular text. Emotions such as sadness, happiness or anger are a common focus of emotion analysis algorithms. For instance, the sentence “this bitcoin rally is crazy” will show high levels of excitement and joy.

Aspect Sentiment Analysis: This type of sentiment analysis focuses on interpreting the sentiment about specific subjects within a sentence rather than a sentence as a whole. For instance, in the sentence “Bakkt futures are a major milestone for the bitcoin market,” aspect analysis will determine the sentiment related to “Bakkt futures” instead of the complete sentence.

Looking at the previous list, we can clearly see the benefits of sentiment analysis for crypto assets. However, there are also plenty of challenges that should be considered before venturing into using these types of techniques. Contextualization, subjectivity, irony or even bad grammar are among the factors that can easily trick the best NLP algorithms.

Sentiment analysis for crypto assets

Crypto is a nascent asset class that is still vulnerable to the irrationality of financial markets and the lack of proper disclosure channels. From that perspective, it is only logical to assume NLP techniques such as sentiment analysis can identify factors that generate alpha or smart beta and help predict the behavior of crypto assets. Reality is a bit different.

When applying sentiment analysis to crypto assets, we are likely to encounter two main types of challenges:

Limitations of mainstream NLP technologies when applied to a domain-specific problem such as crypto asset analysis.

Incorrect assumptions about how sentiment is reflected in news and social media.

The first challenge can almost be seen as an unexpected side effect of the rapid growth of NLP technologies. Today, it is relatively easy for a developer to incorporate sentiment analysis into applications using simple APIs that don’t require any deep learning expertise.

While NLP APIs can be effective at analyzing the sentiment of a generic sentence, they perform extremely poorly when the sentiment depends on domain-specific knowledge. For instance, analyzing the sentence “a bitcoin ETF approval could be imminent” requires NLP models that are specialized in the semantics of market-specific terminology and that are able to extract sentiment at a more granular level than the sentence as a whole.
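As a rough illustration of that gap, the sketch below scores the ETF sentence twice: once with a toy general-purpose word list, and once with the same list extended by a few crypto-market phrases. Both lexicons, their weights and the `polarity` helper are hypothetical and made up for this example; real systems rely on trained models rather than word counts.

```python
import re

# Hypothetical general-purpose lexicon: term -> polarity weight.
GENERIC_LEXICON = {"good": 1.0, "great": 1.0, "rally": 0.5, "bad": -1.0, "crash": -1.0}

# Hypothetical crypto-market phrases that a generic lexicon does not cover.
CRYPTO_PHRASES = {"etf approval": 1.0, "exchange hack": -1.0, "delisting": -0.8}


def polarity(text, lexicon):
    """Sum the weights of every lexicon term (word or phrase) found in the text."""
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalized} "  # pad so that terms only match on word boundaries
    return sum(weight for term, weight in lexicon.items() if f" {term} " in padded)


sentence = "A bitcoin ETF approval could be imminent"
generic_score = polarity(sentence, GENERIC_LEXICON)
domain_score = polarity(sentence, {**GENERIC_LEXICON, **CRYPTO_PHRASES})
print(generic_score, domain_score)
```

The generic list finds nothing to score in the sentence, while the extended list registers “ETF approval” as positive. A word list is only a stand-in here; the point is that the market-specific meaning has to come from somewhere.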

The second challenge is related to misconceptions about how sentiment is reflected in news and social media commentary. As a source of intelligence, news can be highly informative but quite useless when it comes to sentiment analysis. The reason is obvious: the sentiment in well-written news should trend around neutral. Social media behaves in the exact opposite way. Conversations about cryptocurrencies on Twitter or Telegram tend to contain relevant sentiment but, for the most part, are reactions to public, material information, which means that they are unlikely to generate any informational edge. Additionally, social media threads tend to be noisy and relatively subjective, which can produce misleading sentiment analysis results.

From a purely technological standpoint, building effective sentiment analysis for crypto assets requires models that are trained in the terminology of crypto markets and that treat news as a source of information and social media feeds as amplifiers of sentiment. However, even if we get past this technological challenge, we are still faced with one of the biggest psychological misconceptions when it comes to sentiment analysis models in the crypto space.

The sentiment-market impact fallacy

The sentiment-market impact fallacy describes a phenomenon that is notorious in irrational, nascent financial markets, in which investors assume a direct correlation between a sentiment score and a price movement. To explain this behavioral economics dynamic, let’s imagine that you are using an analytics tool that analyzes the sentiment of recent bitcoin tweets. Psychologically, most investors are inclined to interpret the sentiment as a leading indicator based on the following rules:

If the sentiment is positive, that’s a bullish indicator for the price of bitcoin.

If the sentiment is negative, that’s a bearish indicator for the price of bitcoin.

However, if your model is analyzing public, material information, the sentiment should be interpreted as a lagging indicator following some non-intuitive rules:

If the sentiment is positive and the price of bitcoin does not go up, that is a bearish signal.

If the sentiment is negative and the price of bitcoin does not go down, that is a bullish signal.
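The lagging-indicator rules can be written down as a small function. This is a minimal sketch: the name `interpret`, the score range and the `threshold` parameter (a neutral band around zero) are assumptions added for illustration, not part of any particular tool.

```python
def interpret(sentiment, price_change, threshold=0.0):
    """Read sentiment as a lagging indicator against the observed price move.

    sentiment: polarity score in [-1, 1] derived from public, material information.
    price_change: fractional price move observed after the sentiment reading.
    threshold: hypothetical neutral band; scores within +/- threshold are ignored.
    """
    if sentiment > threshold and price_change <= 0:
        return "bearish"  # positive sentiment failed to lift the price
    if sentiment < -threshold and price_change >= 0:
        return "bullish"  # negative sentiment failed to push the price down
    return "no signal"  # price moved with sentiment, or sentiment was neutral


print(interpret(0.6, -0.01))
print(interpret(-0.4, 0.02))
```

The signal comes from the divergence between sentiment and price, not from the sentiment score alone. When the two agree, the function deliberately returns no signal.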

Being aware of this fallacy positions sentiment analysis not as a leading indicator but as an often relevant factor in a trading strategy.

From sentiment analysis to market impact analysis

From an informational standpoint, the crypto market is noisy and full of unexpected events. In terms of sentiment analysis, that combination of factors is a nightmare. Instead of narrowly focusing on sentiment analysis, we should probably develop a more holistic approach. A sentiment-market impact indicator would be a combination of polarity (negative, positive, neutral), emotion (anxious, excited, sad…) and aspect-based (topics, entities…) analysis over long periods of time. This approach would require the training of models specialized in the dynamics of crypto assets to evaluate the sentiment in the context of specific market conditions.

The idea of sentiment-market impact models is conceptually simple: quantify the impact that combinations of sentiment, emotions and topics can have on a crypto asset during specific market conditions. Part of the beauty of this approach is that it doesn’t have to be completely unsupervised like most sentiment models today; it can be trained on domain-specific knowledge of crypto markets. For instance, we could train a model to learn that positive articles about Chinese investment in crypto can have a positive impact on a market that had been relatively bearish for the last week. The core principle of sentiment-market impact analysis models would be to contextualize the knowledge of sentiment models to the specifics of the crypto market.
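To make the idea concrete, the sketch below groups labeled observations by polarity, emotion, topic and market regime, then averages the return that followed each combination. Everything in it is hypothetical: the observations are invented, the labels and the `impact_table` helper are illustrative names, and a plain conditional average stands in for the trained model described above.

```python
from collections import defaultdict
from statistics import mean

# Invented observations: (polarity, emotion, topic, market regime, next-day return).
observations = [
    ("positive", "excited", "china_investment", "bearish_week", 0.04),
    ("positive", "excited", "china_investment", "bearish_week", 0.02),
    ("positive", "excited", "china_investment", "bullish_week", 0.00),
    ("negative", "anxious", "exchange_hack", "bullish_week", -0.03),
]


def impact_table(rows):
    """Average forward return for each (polarity, emotion, topic, regime) combination."""
    grouped = defaultdict(list)
    for polarity, emotion, topic, regime, forward_return in rows:
        grouped[(polarity, emotion, topic, regime)].append(forward_return)
    return {key: mean(values) for key, values in grouped.items()}


table = impact_table(observations)
for key, avg_return in table.items():
    print(key, round(avg_return, 4))
```

In this made-up data, the same positive, excited coverage of Chinese investment is followed by a larger average move in a bearish week than in a bullish one. That is the kind of context-dependent relationship a sentiment-market impact model would try to learn from real data.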