Compiled by Rekhit Pachanekar

The automated trading system, or algorithmic trading, has been at centre stage of the trading world for more than a decade now. A “trading system”, more commonly referred to as a “trading strategy”, is nothing but a set of rules applied to the given input data to generate entry and exit (buy/sell) signals.

Although formulating a trading strategy seems like an easy task, in reality, it is not! Creating a successful trading strategy requires exhaustive quantitative research, and the brains behind a quantitative trading strategy are known as “Quants” in the algorithmic trading world. We can define a quant as a professional, employed by a quantitative trading firm, who applies advanced mathematical and statistical models with the sole objective of creating an alpha-seeking strategy, i.e. a profitable trading strategy that can consistently generate returns independent of the direction of the overall market.

The percentage of volume attributed to algorithmic trading has seen a significant rise in the last decade. In the US and other developed markets, High-Frequency Trading and algorithmic trading account for an estimated 70% of the equities market. In India, the share of total turnover has risen to around 49.8%.

The growth in automated trading has led to significant changes in the basic architecture of automated trading systems over the past decade and continues to do so. For firms, especially those using high-frequency trading systems, it has become a necessity to innovate on technology in order to compete in the world of algorithmic trading, thus, making the algorithmic trading field a hotbed for advances in computer and network technologies.

In this post, we will demystify the architecture behind automated trading systems for our readers. We compare the new architecture of automated trading systems with the traditional trading architecture and understand some of the major components behind these systems.

Traditional Architecture

Any trading system, conceptually, is nothing more than a computational block that interacts with the exchange on two different streams. It:

1. Receives market data
2. Sends order requests and receives replies from the exchange

In its basic form, we can portray the flow of data between the exchange and the automated trading system as follows:

The market data that is received typically informs the automated trading system of the latest order book. It might contain some additional information like the volume traded so far, and the last traded price and quantity for a scrip. However, to make a decision on the data, the trader might need to look at old values or derive certain parameters from history. To cater to that, a conventional system would have a historical database to store the market data, along with tools to use that database. The analysis would also involve a study of the trader's past trades, hence another database for storing the trading decisions. Last, but not least, there is a GUI for the trader to view all this information on the screen.

Taking all the points above into consideration, the traditional architecture of the entire automated trading system can now be broken down into:

The exchange(s) – the external world

The server
- Market data receiver
- Store market data
- Store orders generated by the user

The application
- Take inputs from the user, including the trading decisions
- Interface for viewing the information, including the data and orders
- An order manager sending orders to the exchange

Limitations of traditional architecture

However, it was found that traditional architecture could not scale up to the needs and demands of automated trading with DMA (Direct Market Access). The latency between the origin of an event and the order generation went beyond the dimension of human control and entered the realm of milliseconds and microseconds. Order management also needs to be more robust and capable of handling many more orders per second. Since the time frame is minuscule compared to human reaction time, risk management also needs to handle orders in real time and in a completely automated way.

For example, even if the reaction time for an order is 1 millisecond (which is a lot compared to the latencies we see today), the system is still capable of making 1000 trading decisions in a single second. Thus, each of these 1000 trading decisions needs to go through risk management within the same second to reach the exchange. You could say that when it comes to automated trading systems, this is just a problem of complexity.

Another point which emerged is that since the architecture now involves automated logic, 100 traders can now be replaced by a single automated trading system. This adds scale to the problem. So each of the logical units generates 1000 orders and 100 such units mean 100,000 orders every second. This means that the decision-making and order sending part needs to be much faster than the market data receiver in order to match the rate of data.

The New System architecture of the Automated Trading System

To overcome the limitations of the traditional system architecture, the engine which runs the decision-making logic, also known as the ‘Complex Event Processing’ engine, or CEP, moved from within the application to the server. The application layer is now little more than a user interface for viewing and providing parameters to the CEP.

The problem of scaling in an automated trading system also leads to an interesting situation. Let us say 100 different logics are being run over a single market data event (as discussed in the earlier example). However, there might be common pieces of complex calculations that need to be run for most of the 100 logic units, say, the calculation of greeks for options. If each logic were to function independently, each unit would perform the same greek calculation, unnecessarily using up processor resources. To avoid this redundancy, complex calculations are typically hived off into a separate calculation engine which provides the greeks as an input to the CEP in the automated trading system.
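
A minimal sketch of this idea, assuming a hypothetical strategy-unit interface: a single cached Black-Scholes delta calculation stands in for a full greeks engine, and every strategy unit draws on the shared result instead of recomputing it.

```python
import math
from functools import lru_cache

# Hypothetical shared calculation engine: each parameter combination is
# computed once and reused by every strategy unit that needs it.
@lru_cache(maxsize=100_000)
def bs_delta(spot, strike, vol, rate, t):
    """Black-Scholes delta of a European call (a stand-in for a full greeks engine)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    # Standard normal CDF via the error function
    return 0.5 * (1 + math.erf(d1 / math.sqrt(2)))

class StrategyUnit:
    def __init__(self, name):
        self.name = name

    def on_tick(self, spot):
        # All units share the cached engine instead of recomputing the greek
        return bs_delta(spot, 100.0, 0.2, 0.05, 0.25)

units = [StrategyUnit(f"strat-{i}") for i in range(100)]
deltas = [u.on_tick(100.0) for u in units]
# 100 calls, but only one actual calculation thanks to the cache
print(bs_delta.cache_info().misses)  # 1
```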

Although the application layer is primarily a view, some of the risk checks (which have now become resource-hungry operations owing to the problem of scale), can be offloaded to the application layer, especially those that are to do with the sanity of user inputs like fat finger errors.

The rest of the risk checks in an automated trading system are now performed by a separate Risk Management System (RMS) within the Order Manager (OM), just before releasing an order. The problem of scale also means that where earlier there were 100 different traders managing their risk, there is now only one RMS to manage risk across all logical units/strategies. However, some risk checks may be particular to certain strategies, while some might need to be done across all strategies. Hence the RMS itself involves a strategy-level RMS (SLRMS) and a global RMS (GRMS). It might also involve a UI to view the SLRMS and GRMS.
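
The layering can be sketched as follows. This is an illustrative toy, not any real RMS: the limit values, field names and check logic are all assumptions made for the example.

```python
# Hypothetical layered risk checks: each order passes its strategy-level RMS
# (SLRMS) first, then the global RMS (GRMS), before the order manager
# releases it to the exchange.

class Order:
    def __init__(self, strategy, symbol, qty, price):
        self.strategy, self.symbol, self.qty, self.price = strategy, symbol, qty, price

def slrms_check(order, strategy_limits):
    """Strategy-level check: per-strategy order size limit (illustrative only)."""
    return order.qty <= strategy_limits.get(order.strategy, 0)

def grms_check(order, open_exposure, firm_limit):
    """Global check: firm-wide exposure across all strategies (illustrative only)."""
    return open_exposure + order.qty * order.price <= firm_limit

def release_order(order, strategy_limits, open_exposure, firm_limit):
    if not slrms_check(order, strategy_limits):
        return "REJECTED_SLRMS"
    if not grms_check(order, open_exposure, firm_limit):
        return "REJECTED_GRMS"
    return "SENT"

limits = {"momentum": 500}
print(release_order(Order("momentum", "INFY", 100, 1500.0), limits, 0.0, 1_000_000))   # SENT
print(release_order(Order("momentum", "INFY", 1000, 1500.0), limits, 0.0, 1_000_000))  # REJECTED_SLRMS
```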

Now let us understand the server components in more detail.

Market Adapter

The exchange or any market data vendor sends data in its own format. Your algorithmic trading system may or may not understand that language. The exchange provides you with an Application Programming Interface (API), which allows you to program and create your own adapter that converts the incoming data into a format your system can understand.
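
A minimal sketch of such an adapter, assuming a made-up pipe-delimited wire format (real exchange APIs define their own): it translates a raw vendor message into the system's internal tick representation.

```python
from dataclasses import dataclass

@dataclass
class Tick:
    """Internal tick format understood by the rest of the (hypothetical) system."""
    symbol: str
    price: float
    qty: int

def parse_vendor_message(raw: str) -> Tick:
    """Translate e.g. 'TRADE|RELIANCE|2450.55|200' into the internal Tick."""
    msg_type, symbol, price, qty = raw.split("|")
    if msg_type != "TRADE":
        raise ValueError(f"unsupported message type: {msg_type}")
    return Tick(symbol=symbol, price=float(price), qty=int(qty))

tick = parse_vendor_message("TRADE|RELIANCE|2450.55|200")
print(tick)  # Tick(symbol='RELIANCE', price=2450.55, qty=200)
```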

Complex Event Processing Engine

This part is the brain of your strategy. Once you have the data, you need to work with it as per your strategy, which involves performing various statistical calculations, comparisons with historical data and decision making for order generation. The type of order and the order quantity are prepared in this block.

What you call a Trading System is actually a CEP System

A complex event is nothing but a set of incoming events. These include stock trends, market movements, news etc. Complex event processing is performing computational operations on complex events in a short time. In an automated trading system, the operations can include detecting complex patterns, building correlations and relationships such as causality and timing between any incoming events.

CEP systems process events in real-time, thus the faster the processing of events, the better a CEP system is. For example, if an automated trading system is designed to detect a profit-making opportunity for the next 1 second, but the time taken by the CEP system exceeds this threshold, then the trading system won’t be able to make any profits.

The CEP system comprises four parts:

CEP engine

CEP rules

CEP WS

CEP result interface

The two primary components of any CEP system are the CEP engine and the set of CEP rules. The CEP engine processes incoming events based on CEP rules. These rules and the events that go as an input to the CEP engine are determined by the trading system (trading strategy) applied.
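
The engine/rules split can be illustrated with a toy sketch: the engine only matches incoming events against pluggable rules, while the rules encode the (entirely hypothetical) trading logic. The thresholds and event fields are assumptions for the example.

```python
class CEPEngine:
    """Toy CEP engine: applies every registered rule to each incoming event."""

    def __init__(self):
        self.rules = []  # each rule is a (predicate, action) pair

    def add_rule(self, predicate, action):
        self.rules.append((predicate, action))

    def on_event(self, event):
        # Collect the signals produced by every rule that matches the event
        return [action(event) for predicate, action in self.rules if predicate(event)]

engine = CEPEngine()
# Rule: if the last trade price crosses above a threshold, emit a sell signal
engine.add_rule(lambda e: e["price"] > 105.0,
                lambda e: ("SELL", e["symbol"], e["price"]))
# Rule: if it falls below a lower threshold, emit a buy signal
engine.add_rule(lambda e: e["price"] < 95.0,
                lambda e: ("BUY", e["symbol"], e["price"]))

print(engine.on_event({"symbol": "TCS", "price": 106.2}))  # [('SELL', 'TCS', 106.2)]
print(engine.on_event({"symbol": "TCS", "price": 100.0}))  # []
```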

For a quant, the majority of their work is concentrated in this CEP system block. A quant spends most of their time formulating trading strategies and performing rigorous backtesting, optimization, and position sizing, among other things. This is done to ensure the viability of the trading strategy in real markets. No single strategy can guarantee everlasting profits. Hence, quants are required to come up with new strategies on a regular basis to maintain an edge in the markets.

There are a number of popular automated trading systems widely used in current markets, ranging from momentum strategies and statistical arbitrage to market making. See our very insightful blog on Algorithmic Trading Strategies, Paradigms and Modelling Ideas to know more about these trading systems.

Order Routing System

The order is encoded in the language which the exchange can understand, using the APIs provided by the exchange. There are two kinds of APIs provided by exchanges: native APIs and the FIX API. Native APIs are those which are specific to a certain exchange. The FIX (Financial Information eXchange) protocol is a set of rules used across different exchanges to make the data flow in securities markets easier and more effective. We will talk about FIX further in the next section.

In an open market structure, one can send orders through the automated trading system to exchanges as well as non-exchange venues, and the order routing system should be able to handle orders to these different destinations.

Here, we would like to point out that the order signal can either be executed manually by an individual or in an automated way. The latter is what we consider an “automated trading system”. The order manager module comprises different execution strategies which execute the buy/sell orders based on pre-defined logic. Some of the popular execution strategies include VWAP, TWAP etc. There are different processes like order routing, order encoding, transmission etc. that form part of this module. See our blog on Order Management System (OMS) to know more about these processes.
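
As a hedged sketch of one such execution strategy, a TWAP-style scheduler can be reduced to slicing a parent order into equal child orders spread evenly over a time window. The function and parameter names are illustrative, not any particular OMS's API.

```python
def twap_slices(total_qty, window_secs, n_slices):
    """Return (send_offset_seconds, qty) pairs for each child order."""
    base, rem = divmod(total_qty, n_slices)
    interval = window_secs / n_slices
    slices = []
    for i in range(n_slices):
        qty = base + (1 if i < rem else 0)  # spread any remainder across early slices
        slices.append((round(i * interval, 3), qty))
    return slices

schedule = twap_slices(total_qty=1000, window_secs=60, n_slices=8)
print(schedule[:3])  # [(0.0, 125), (7.5, 125), (15.0, 125)]
```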

Risk management in Automated Trading Systems

Since automated trading systems work without any human intervention, it becomes pertinent to have thorough risk checks to ensure that the trading systems perform as designed. The absence of risk checks or faulty risk management can lead to enormous irrecoverable losses for a quantitative firm. Thus, a risk management system (RMS) forms a very critical component of any automated trading system.

There are 2 places where Risk Management is handled in automated trading systems:

Within the application – We need to ensure that wrong parameters are not set by the trader. The application should not allow a trader to set grossly incorrect values, nor permit any fat-finger errors.

Before generating an order in OMS – Before the order flows out of the system we need to make sure it goes through some risk management system. This is where the most critical risk management check happens. See our blog on “Changing trends in trading risk management” to know more about risk management aspects and risk handling in an automated trading system.

For learning more on order manager you can see our post on “Order Management System in Automated Trading System”

For learning more on risk management you can see our post on “Changing Trends in Trading Risk Management in Automated Trading Systems”

The emergence of protocols for automated trading systems

As we have seen previously on the automated trading system tutorial, since the new architecture was capable of scaling to many strategies per server, the need to connect to multiple destinations from a single server emerged. So the order manager hosted several adaptors to send orders to multiple destinations and receive data from multiple exchanges.

Each adaptor acts as an interpreter between the protocol that is understood by the exchange and the protocol of communication within the system. Multiple exchanges would thus, require multiple adaptors.

However, to add a new exchange to the automated trading system, a new adapter has to be designed and plugged into the architecture, since each exchange follows its own protocol, optimized for the features that the exchange provides. To avoid this hassle of adapter addition, standard protocols have been designed. The most prominent amongst them is the FIX protocol. This not only makes it manageable to connect to different destinations on the fly but also drastically reduces the go-to-market time when it comes to connecting with a new destination.

The presence of standard protocols makes it easy for the automated trading system to integrate with third-party vendors for analytics or market data feeds as well. As a result, the market becomes very efficient as integrating with a new destination/vendor is no more a constraint.

In addition, simulation becomes very easy as receiving data from the real market and sending orders to a simulator is just a matter of using the FIX protocol to connect to a simulator. The simulator itself can be built in-house or procured from a third-party vendor. Similarly recorded data can be replayed with the adaptors being agnostic to whether the data is being received from the live market or from a recorded data set.
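
The adaptor-agnosticism point can be sketched as a common destination interface: the strategy code is identical whether a live session or a simulator sits behind it. The class and method names here are assumptions for the example, not a real FIX library's API.

```python
from abc import ABC, abstractmethod

class OrderDestination(ABC):
    """Common interface the trading system talks to, whatever sits behind it."""
    @abstractmethod
    def send_order(self, symbol: str, side: str, qty: int) -> str: ...

class FixSession(OrderDestination):
    def send_order(self, symbol, side, qty):
        # A real implementation would encode and transmit a FIX NewOrderSingle here
        return f"FIX order sent: {side} {qty} {symbol}"

class Simulator(OrderDestination):
    def __init__(self):
        self.fills = []

    def send_order(self, symbol, side, qty):
        self.fills.append((symbol, side, qty))  # pretend an immediate fill
        return f"SIM fill: {side} {qty} {symbol}"

def run_strategy(dest: OrderDestination):
    # The strategy code does not change when the destination is swapped
    return dest.send_order("NIFTY", "BUY", 50)

print(run_strategy(Simulator()))   # SIM fill: BUY 50 NIFTY
print(run_strategy(FixSession()))  # FIX order sent: BUY 50 NIFTY
```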

The emergence of low latency architectures

With the building blocks of an automated trading system in place, the strategies now have the ability to process huge amounts of data in real time and make quick trading decisions. Today, with the advent of standard communication protocols like FIX, the technology entry barrier to set up an algorithmic trading desk or an automated trading system has become lower and, consequently, the world of algorithmic trading has become more competitive. As servers got more memory and higher clock frequencies, the focus shifted towards reducing the latency of decision making. Over time, reducing latency has become a necessity for many reasons, like:

The strategy makes sense only in a low latency environment

Survival of the fittest – competitors pick you off if you are not fast enough

The problem, however, is that latency is really an overarching term that encompasses several different delays. Although it is very easily understood, it is quite difficult to quantify. It, therefore, becomes increasingly important how the problem of reducing latency is approached.

If we look at the basic life cycle in an automated trading system:

1. A market data packet is published by the exchange
2. The packet travels over the wire
3. The packet arrives at a router on the server side
4. The router forwards the packet over the network on the server side
5. The packet arrives on the Ethernet port of the server
6. Depending on whether this is UDP/TCP, processing takes place and the packet, stripped of its headers and trailers, makes its way to the memory of the adaptor
7. The adaptor then parses the packet and converts it into a format internal to the algorithmic trading platform
8. This packet now travels through the several modules of the system – CEP, tick store, etc.
9. The CEP analyses the data and sends an order request
10. The order request again goes through the reverse of the cycle as the market data packet

In an automated trading system, high latency at any of these steps means high latency for the entire cycle. Hence, latency optimization usually starts with the first step in this cycle that is in our control, i.e., “the packet travels over the wire”. The easiest thing to do here would be to shorten the distance to the destination as much as possible.

Colocations are facilities provided by exchanges to host the trading server in close proximity to the exchange. The following diagram illustrates the gains that can be made by cutting the distance.

In an automated trading system, for any kind of high-frequency strategy involving a single destination, colocation has become a de facto must. However, strategies that involve multiple destinations need some careful planning. Factors like the time taken by the destination to reply to order requests, compared with the ping time between the two destinations, must be considered before making such a decision. The decision may be dependent on the nature of the strategy as well.

Network latency is usually the first step in reducing the overall latency of an automated trading system. However, there are plenty of other places where the architecture can be optimized.

Propagation latency

In an automated trading system, propagation latency signifies the time taken to send the bits along the wire, constrained by the speed of light of course.

Several optimizations have been introduced to reduce propagation latency, apart from reducing the physical distance. For example, the estimated roundtrip time for an ordinary cable between Chicago and New York is 13.1 milliseconds. Spread Networks, in October 2012, announced latency improvements which brought the estimated roundtrip time down to 12.98 milliseconds. Microwave communication was further adopted by firms such as Tradeworx, bringing the estimated roundtrip time down to 8.5 milliseconds. Note that the theoretical minimum is about 7.5 milliseconds. Continuing innovations are pushing the boundaries of science and fast approaching the theoretical limit set by the speed of light. The latest developments in laser communication, earlier adopted in defence technologies, have further shaved nanoseconds off an already thin latency over short distances.

Network processing latency

Network processing latency signifies the latency introduced by routers, switches, etc.

The next level of optimization in the architecture of an automated trading system would be the number of hops that a packet takes to travel from point A to point B. A hop is defined as one portion of the path between source and destination during which a packet doesn’t pass through a physical device like a router or a switch. For example, a packet could travel the same distance via two different paths, but it may have two hops on the first path versus three hops on the second. Assuming the propagation delay is the same, the routers and switches each introduce their own latency, and as a rule of thumb, the more hops, the more latency is added.

Network processing latency may also be affected by what we refer to as microbursts. Microbursts are defined as a sudden increase in the rate of data transfer which may not necessarily affect the average rate of data transfer. Since automated trading systems are rule-based, all such systems will react to the same event in the same way. As a result, a lot of participating systems may send orders leading to a sudden flurry of data transfer between the participants and the destination leading to a microburst. The following diagram represents what a microburst is.

The first figure shows a 1-second view of the data transfer rate. We can see that the average rate is well below the available bandwidth of 1 Gbps. However, if we dive deeper and look at the second image (the 5-millisecond view), we see that the transfer rate has spiked above the available bandwidth several times each second. As a result, the packet buffers on the network stack, both in the network endpoints and in the routers and switches, may overflow. To avoid this, a bandwidth much higher than the observed average rate is usually allocated for an automated trading system.

Serialization latency

Serialization latency for an automated trading system signifies the time taken to pull the bits on and off the wire.

A packet size of 1500 bytes transmitted on a T1 line (1,544,000 bps) would produce a serialization delay of about 8 milliseconds. However, the same 1500-byte packet using a 56K modem (57,344 bps) would take about 200 milliseconds. A 1G Ethernet line would reduce this latency to about 12 microseconds.
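
These figures can be reproduced directly: serialization delay is simply the packet size in bits divided by the line rate in bits per second.

```python
def serialization_delay(packet_bytes, line_bps):
    """Time to push a packet onto the wire: bits divided by line rate."""
    return packet_bytes * 8 / line_bps

t1 = serialization_delay(1500, 1_544_000)        # T1 line
modem = serialization_delay(1500, 57_344)        # 56K modem
gige = serialization_delay(1500, 1_000_000_000)  # 1G Ethernet

print(f"T1:   {t1 * 1e3:.1f} ms")    # ~7.8 ms
print(f"56K:  {modem * 1e3:.0f} ms") # ~209 ms
print(f"1GbE: {gige * 1e6:.0f} us")  # ~12 us
```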

Interrupt latency

Interrupt latency in an automated trading system signifies a latency introduced by interrupts while receiving the packets on a server.

Interrupt latency is defined as the time elapsed between when an interrupt is generated to when the source of the interrupt is serviced. When is an interrupt generated? Interrupts are signals to the processor emitted by hardware or software indicating that an event needs immediate attention. The processor, in turn, responds by suspending its current activity, saving its state and handling the interrupt. Whenever a packet is received on the NIC, an interrupt is sent to handle the bits that have been loaded into the receive buffer of the NIC. The time taken to respond to this interrupt not only affects the processing of the newly arriving payload, but also the latency of the existing processes on the processor.

Solarflare introduced “OpenOnload” in 2011, which implements a technique known as kernel bypass, where the processing of the packet is not left to the operating system kernel but to the userspace itself. The entire packet is directly mapped into the userspace by the NIC and is processed there. As a result, interrupts are completely avoided.

Thus, the rate of processing each packet is accelerated. The following diagram clearly demonstrates the advantages of kernel bypass.

Application latency

Application latency for an automated trading system signifies the time taken by the application to process.

This depends on the volume of packets, the processing allocated to the application logic, the complexity of the calculations involved, the efficiency of the program, etc. Increasing the number of processors on the system would, in general, reduce the application latency, as would increased clock frequency. A lot of automated trading systems take advantage of dedicating processor cores to essential elements of the application, like the strategy logic, for example. This avoids the latency introduced by the process switching between cores.
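
As a minimal sketch of core dedication: on Linux, a process can pin itself to a CPU set via `os.sched_setaffinity`. This is only an illustration of the idea, not a production setup, and the syscall is not exposed on all platforms.

```python
import os

def pin_to_core(core_id: int) -> bool:
    """Pin the current process to a single core, if the platform supports it."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. macOS/Windows: the syscall is not exposed
    try:
        os.sched_setaffinity(0, {core_id})  # 0 = the current process
        return core_id in os.sched_getaffinity(0)
    except OSError:
        return False

# Pin the (hypothetical) strategy-logic process to core 0
print("pinned:", pin_to_core(0))
```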

Similarly, if the programming of the strategy in an automated trading system has been done keeping in mind the cache sizes and locality of memory access, then there would be a lot of memory cache hits, resulting in a further reduction of latency. To facilitate this, a lot of systems use very low-level programming languages to optimize the code to the specific architecture of the processors. Some firms have even gone to the extent of burning complex calculations onto hardware using Field Programmable Gate Arrays (FPGAs). With increasing complexity comes increasing cost, and the following table aptly illustrates this.

Levels of sophistication

The world of high-frequency algorithmic trading has entered an era of intense competition. With each participant adopting new methods of ousting the competition, technology has progressed by leaps and bounds. Modern-day algorithmic trading architectures are quite complex compared to their early-stage counterparts. Accordingly, advanced automated trading systems are more expensive to build both in terms of time and money.

|                    | Standard 10GE network card        | Low-latency 10GE network card    | FPGA                      | ASIC                     |
|--------------------|-----------------------------------|----------------------------------|---------------------------|--------------------------|
| Latency            | 20 microseconds + application time | 5 microseconds + application time | 3-5 microseconds          | Sub-microsecond latency  |
| Ease of deployment | Trivial                           | Kernel driver installation       | Retraining of programmers | Specialists              |
| Effort to develop  | Weeks                             | Months                           | 2-3 man-years             | 2-3 man-years            |

Building an entire automated trading system can be beyond the scope of an individual retail trader. Traders who want to explore the algorithmic way of trading can opt for automated trading systems that are available in the market on a subscription basis. A trader can subscribe to these automated systems and use the algorithmic trading strategies that are made available to users on these systems. We have highlighted some of the popular automated trading systems in our blog, “Top Algo Trading Platforms in India”. Traders who know programming can formulate and backtest their strategies in programming platforms like Python and R.

Build Your Own Automated Trading Systems

Beginner traders can learn to build their own algorithmic trading strategies and trade profitably in the markets. The following steps can serve as a rough guideline for building an algorithmic trading strategy:

1. Ideation or strategy hypothesis – come up with a trading idea which you believe would be profitable in live markets. The idea can be based on your market observations or can be borrowed from trading books, research papers, trading blogs, trading forums or any other source.
2. Get the required data – to test your idea you would require historical data. You can get this data from sites like Google Finance, Yahoo Finance or from a paid data vendor.
3. Strategy writing – once you have the data, you can start coding your strategy, for which you can use tools like Excel, Python or R.
4. Backtesting your strategy – once coded, you need to test whether your trading idea gives good returns on the historical data. Backtesting would involve optimization of inputs, setting profit targets and stop-loss, position sizing etc.
5. Paper trading your strategy – after the backtesting step, you need to paper trade your strategy first. This means testing your strategy on a simulator which simulates market conditions. There are brokers which provide an algorithmic trading platform for paper trading your strategy.
6. Taking your strategy live – if the strategy is profitable after paper trading, you can take it live. You can open an account with a suitable broker that provides the algorithmic trading facility.
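
Steps 3 and 4 above can be illustrated with a self-contained toy: a simple moving-average crossover strategy "backtested" on made-up price data. Real work would use proper historical data, transaction costs and tooling; everything here is an assumption made for the example.

```python
def sma(prices, window, i):
    """Simple moving average of the `window` prices ending at index i."""
    return sum(prices[i - window + 1 : i + 1]) / window

def backtest(prices, fast=3, slow=5):
    """Toy long-only crossover backtest; returns final PnL in price units."""
    position, cash = 0, 0.0
    for i in range(slow - 1, len(prices)):
        f, s = sma(prices, fast, i), sma(prices, slow, i)
        if f > s and position == 0:    # fast crosses above slow: buy
            position, cash = 1, cash - prices[i]
        elif f < s and position == 1:  # fast crosses below slow: sell
            position, cash = 0, cash + prices[i]
    if position:                       # liquidate at the last price
        cash += prices[-1]
    return cash

prices = [100, 101, 99, 102, 104, 107, 110, 108, 105, 103, 101, 104, 108, 111]
pnl = backtest(prices)
print(f"PnL on toy data: {pnl:.2f}")
```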

The number of exchanges that allow algorithmic trading for professional, as well as retail traders, has been growing with each passing year, and more and more traders are turning to algorithmic trading.

Conclusion

This was a detailed post on automated trading system architecture which, we are sure, gave you valuable insight into the components involved, as well as the various challenges that architecture developers need to handle in order to build a robust automated trading system. So what are you waiting for? Go Algo!!

If you want to learn various aspects of Algorithmic trading and automated trading systems, then check out the Executive Programme in Algorithmic Trading (EPAT®). The course covers training modules like Statistics & Econometrics, Financial Computing & Technology, and Algorithmic & Quantitative Trading. EPAT® equips you with the required skill sets to build a promising career in algorithmic trading. Enroll now!

Disclaimer: All data and information provided in this article are for informational purposes only. QuantInsti® makes no representations as to accuracy, completeness, currentness, suitability, or validity of any information in this article and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis.