This is part of a series describing the history, context, and technical details of the modding we’ve been doing resulting in this big En Masse / NA TERA commotion:

Introduction. What’s all this about?

Part 1: The History – A Timeline. Take a walk down memory lane from 2012 to today.

Part 2: Techno Mumbo Jumbo ⬅ You’re here!

Part 3: What Can Be Done? What do we know, and how should we address it?

Conclusion (not yet published). Obligatory closing thoughts.

Feel free to skip around to the parts that interest you.

Last time we talked about what happened and when, but now we’re going to get into the how. How do DPS meters and tera-proxy work? How even do the skill prediction and other things work? I’ll be answering that in this installment.

First, a disclaimer. This post analyzes in detail the alteration of game behavior. Many places will consider this “discussion of cheats, hacks, or exploits”. If you want to post a link to this post anywhere, make sure it follows any submission rules there about this sort of topic.

If it’s so sensitive and controversial, why am I still posting it? Because I would argue this post has nothing new to offer to anyone with malicious intentions. Remember, TERA has been out for 5 whole years already, and there’s plenty of underground communities dedicated to illicit activities. Plus, the recent storm has only brought even more attention to tera-proxy. The proxy, along with ShinraMeter, CasualMeter, and Alkahest, are all open source. Everyone can look at the code, and anyone with the technical knowledge and ability to do bad things would have already figured out what we’re doing by now.

So, again. I’ll be doing my best to explain everything with the expectation that you, the reader, have minimal knowledge of these sorts of systems. For developers, this may be a total slog and you probably already knew all of this anyway. For anyone else, bear with me, and if there’s anything confusing, please let me know what could use some clarification and I’ll be happy to edit it in.

First, we’ll talk about how TERA’s networking works normally. Then we’ll look at how we use that knowledge in a packet sniffer (such as Shinra), and then we’ll extend that idea onto tera-proxy. Now, tera-proxy does nothing on its own; everything it does comes from plugins, or “modules”. So to round it all off, we’ll talk about how tera-proxy modules work at the end.

If you’d like, you can skip all the way down to the tera-proxy sections. Everything above is not strictly necessary for those, so that’s going to cut your reading time by a decent chunk.

Throughout this post, I’ll be adding a number of charts and diagrams. Some may be helpful, and some may not. If you’re not sure how to read a diagram, don’t fret too much. They mostly reiterate what I’ve already written about, so you’re not missing anything if you just don’t get it.

How TERA Talks to the Server

When you click the big “PLAY” button on the launcher, or when you back out to server select, TERA sends a request to a specific website to get a list of game servers. From there, it can do things like log you into the last server you played on based on what EME’s account server said it was, or it can show you the server list.

Either way, once a game server is selected, the TERA client tries to connect to it. I’m going to assume the vast majority of people are familiar enough with the concept of “server” and “client”, but here’s a diagram anyway, which will serve as a convenient reference point for the various approaches we’ll be taking later on.

In particular, we’ll be zooming in on exactly what’s going on in the “Client” and “Server” bubbles here, and then we’ll be taking a step back to look at how other programs fit into this.

After the client establishes a connection to the server, they do something that’s called a handshake; the client and server talk to each other to make sure that this is indeed for a TERA connection and they are ready, and then they share their encryption keys.

I assume most people are at least familiar with the concept of encryption, but we’re going to want to define a few more terms just to be sure we know what’s going on. Remember, this is going to be a very simplified explanation, so don’t expect it to be 200% accurate. Just enough to help everyone understand the concepts.

Let’s choose a very simple example. Let’s say you want to encrypt a message like “hello”. There’s lots of ways (infinite, actually) to go about doing that, so we’re going to need to pick a specific encryption algorithm. Let’s make one up right now.

Start with a number n—for instance, 2. Then, for each letter in the original message, you turn it into the nth letter after it, and then increase n by 1 and go to the next letter. So, in this example:

You see “h”. Since n = 2, you turn it into “j” (h → i → j) and make n = 3.

You see “e”. Since n = 3, you turn it into “h” (e → f → g → h) and make n = 4.

… and so on, until you end up with “jhpqu”.

You know how we picked the starting number of 2? That’s called the key. You can sort of think of it like a password; if you know the correct key, you’ll know how to correctly decrypt the result, but if it’s wrong, all you get is meaningless data.

But the starting number wasn’t the only thing that mattered. Our encryption also had to keep track of what n was at each step of the way. That’s the encryption state. This isn’t something you’re supposed to modify and play around with yourself, but it is important. For “hello” with a key of 2, the last three letters came out to “pqu”. If we skip the “e” and try “hllo”, then the last three letters become “opt”. Even though we started with the same letters, we got a very different result because we had a different encryption state when we reached them.
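If it helps to see that made-up algorithm as code, here’s a minimal sketch in JavaScript. The function name `makeCipher` is mine, and wrapping around from “z” back to “a” is an assumption the description above never needed. It has nothing to do with TERA’s real encryption; it only shows a key seeding a state that changes with every letter.

```javascript
// Toy cipher from the text: shift each letter by n, then increase n by 1.
// Only lowercase a-z is handled; wrapping past "z" is an added assumption.
function makeCipher(key) {
  let n = key; // the encryption state, seeded by the key

  const shift = (ch, amount) => {
    const index = ch.charCodeAt(0) - 97; // "a" has character code 97
    const wrapped = (((index + amount) % 26) + 26) % 26;
    return String.fromCharCode(97 + wrapped);
  };

  return {
    encrypt(text) {
      let out = '';
      for (const ch of text) {
        out += shift(ch, n);
        n += 1; // the state advances after every letter
      }
      return out;
    },
    decrypt(text) {
      let out = '';
      for (const ch of text) {
        out += shift(ch, -n);
        n += 1;
      }
      return out;
    },
  };
}

console.log(makeCipher(2).encrypt('hello')); // jhpqu
console.log(makeCipher(2).encrypt('hllo'));  // jopt
console.log(makeCipher(2).decrypt('jhpqu')); // hello
console.log(makeCipher(3).decrypt('jhpqu')); // gdkkn (wrong key, meaningless data)
```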

So, with the definitions of encryption key and encryption state out of the way, we can get back to how the client and server talk to each other.

The two establish a connection and send their encryption keys to each other. (In this case, the encryption algorithm requires two keys.) After that, they both hold a copy of their encryption states: one state for themselves, and one state for the other. Why two states? Because this is a “duplex” connection: data goes both ways. The client can send stuff to the server, and the server can send stuff to the client, and each of those directions needs its own encryption.

The rest of the connection is then encrypted. So it’s important that both the server and client are tracking the correct encryption states. As they process incoming data, they have to decrypt it, which updates the incoming encryption state. As they send outgoing data, they have to encrypt it, which updates the outgoing encryption state. If either side gets either state wrong, you end up with garbage values and eventually a disconnect.

In fact, if anyone tries to add, modify, or delete a single thing in the encrypted data as it gets passed between the client and server, the decryption will go into a wrong state and you’ll never be able to recover. You’ll read garbage values until you realize something isn’t going right. TERA chalks it up to a disconnect and boots you right back to server select.
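To see that fragility with the toy cipher from earlier (again, not TERA’s real algorithm), here’s a small sketch where one encrypted letter goes missing in transit. The class name `ToyState` is made up for this example.

```javascript
// Same toy cipher as in the text (shift by n, then n += 1), lowercase a-z only.
function shiftLetter(ch, amount) {
  const index = ch.charCodeAt(0) - 97;
  return String.fromCharCode(97 + ((((index + amount) % 26) + 26) % 26));
}

class ToyState {
  constructor(key) { this.n = key; }
  encrypt(text) { return [...text].map((ch) => shiftLetter(ch, this.n++)).join(''); }
  decrypt(text) { return [...text].map((ch) => shiftLetter(ch, -this.n++)).join(''); }
}

const sent = new ToyState(2).encrypt('helloworld');

// Untouched stream: the receiver's state stays in step with the sender's.
const intact = new ToyState(2).decrypt(sent);
console.log(intact); // helloworld

// Remove one encrypted letter in transit. From that point on the receiver's
// state is off by one, and it never recovers.
const tampered = sent.slice(0, 3) + sent.slice(4);
const broken = new ToyState(2).decrypt(tampered);
console.log(broken); // helpxpsme
```

The first three letters still come out right because the states matched up to that point. Everything after the missing letter is garbage, and it stays garbage for the rest of the connection.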

To the casual eye, it might sound pretty fragile, so why don’t you disconnect very often at all? How does it handle receiving data in the wrong order, or packet loss that drops packets like the bass at an EDM concert? Well, the short answer is that it never has to. TERA’s network connections use something called TCP. Here’s a quick summary: TCP is built to be reliable, error-free, and in-order. If you receive something, but you know there was something sent before it and you never got that, you have to ask the other side to resend the previous message and wait for it before you can continue.

TCP means that as far as TERA is concerned, packets are never received out of order, packets are always correct, and you can never miss one (unless you get totally disconnected, in which case you obviously miss all). That also explains some lag spikes. If you have packet loss, the client has to play catch up, and it can’t go forward while it eagerly awaits retransmission of every single packet that it missed. All the server can do is resend those and everything else that happened while trying to recover, and for some connections that can snowball hard.

So now we know how data is encrypted, transmitted, and decrypted between a TERA client and server, but what exactly is being transmitted? To answer that, we’ll need to talk about protocols.

A protocol, in terms of network communications, is a specific format, or a set of rules, for how to transmit information. We know one already, actually; TCP is a very heavily used protocol that dictates how tons of packets are being sent over the internet on a daily basis. In fact, you’re most likely viewing this post thanks to a TCP connection. A TCP packet can hold any arbitrary data, but for our analysis on TERA networking, it holds TERA’s network data. That data follows BHS’s own custom protocol.

As I go into the specifics of the custom protocol, let’s take a minute to think about why we even need a protocol. As humans, natural language is easy for us to use and process. “Move my character one step forward” is very clear to an English speaker, but computers aren’t humans, and while it’s a bit of a cliche to say that computers “speak in ones and zeroes”, the idea of it should be enough to show why an English sentence just won’t cut it. We need to represent that same idea in a way that’s compact and efficient for a computer to process.

Under this custom protocol, we can construct what I personally call a “message” (think “sentence”). Each message starts with two numbers. The first number says how long the message is (as in, “how many letters?”), and the second is an opcode—short for “operation code”, which defines what operation to perform. Like taking a step forward. Any additional data will depend on the opcode. For player movement, you’ll want to send details like your speed and position in game, so for the message that uses the player movement opcode, there’s a very specific order and format in which to put information like your in game position and other things like movement type.

So, if that’s how we can construct a valid message, then the game’s protocol is comprised of the rules for what opcodes mean and how to format and order additional data and all of that stuff.
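Here’s a rough sketch of what building and reading one of these messages could look like. The byte order (little-endian), the choice to count the header bytes in the length, the opcode number, and the payload layout are all assumptions for illustration, not a description of the real protocol.

```javascript
// Layout assumed for this sketch: [length: uint16][opcode: uint16][payload...]
// Little-endian byte order and a length that includes the 4 header bytes
// are assumptions, not confirmed details of the real protocol.
function buildMessage(opcode, payload) {
  const header = Buffer.alloc(4);
  header.writeUInt16LE(4 + payload.length, 0);
  header.writeUInt16LE(opcode, 2);
  return Buffer.concat([header, payload]);
}

function parseMessage(message) {
  return {
    length: message.readUInt16LE(0),
    opcode: message.readUInt16LE(2),
    payload: message.subarray(4), // meaning depends entirely on the opcode
  };
}

// Hypothetical "player movement" payload: x, y, z as floats, then a move type.
const payload = Buffer.alloc(14);
payload.writeFloatLE(100.5, 0);
payload.writeFloatLE(-20.25, 4);
payload.writeFloatLE(3, 8);
payload.writeUInt16LE(1, 12);

const message = buildMessage(0x0123, payload);
const parsed = parseMessage(message);
console.log(parsed.length, parsed.opcode.toString(16), parsed.payload.length); // 18 123 14
```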

Notice that I switched from “packet” to “message”, even though you probably only ever heard about packets. Semantically speaking, they’re different; you may want to send two actions happening at the same time, in which case you can just bunch them together under the same TCP packet. Other times, a message is just too large and you don’t want to make the packet too big, so you have to split it across multiple TCP packets. But for the most part you’ll see people just calling them “packets” because it’s a lot less ambiguous than “message”, which could also refer to things like chat messages or system messages. So from now on, I will likely use the two terms interchangeably.
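Since messages and TCP packets don’t line up one-to-one, whoever reads the stream has to cut it back into messages. A minimal sketch of that, assuming a hypothetical framing where each message starts with a little-endian two-byte length that covers the whole message (the class name `MessageSplitter` is made up):

```javascript
// Collects decrypted stream chunks and emits complete messages.
// Assumed framing: uint16 little-endian length (counting the whole message),
// then a uint16 opcode, then the payload.
class MessageSplitter {
  constructor() {
    this.pending = Buffer.alloc(0); // bytes received but not yet a full message
  }

  push(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const messages = [];
    while (this.pending.length >= 2) {
      const length = this.pending.readUInt16LE(0);
      if (length < 4) throw new Error('invalid message length');
      if (this.pending.length < length) break; // the rest has not arrived yet
      messages.push(this.pending.subarray(0, length));
      this.pending = this.pending.subarray(length);
    }
    return messages;
  }
}

const a = Buffer.from([0x06, 0x00, 0x01, 0x00, 0xaa, 0xbb]); // length 6, opcode 1
const b = Buffer.from([0x05, 0x00, 0x02, 0x00, 0xcc]);       // length 5, opcode 2

const splitter = new MessageSplitter();
const bunched = splitter.push(Buffer.concat([a, b])); // two messages in one packet
const firstHalf = splitter.push(a.subarray(0, 3));    // one message, first part only
const secondHalf = splitter.push(a.subarray(3));      // the remainder arrives
console.log(bunched.length, firstHalf.length, secondHalf.length); // 2 0 1
```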

Now you know pretty much all there is to know on how TERA’s network communication works. One side, client or server, constructs a message following the rules set by TERA’s protocol, then it gets encrypted and sent over, where it’s decrypted and read according to the protocol.

Packet Sniffers and DPS Meters

The first type of publicly released mods that wanted to look at this network data used a method called packet sniffing. A packet sniffer is able to read (and only read) the data that’s being transmitted over a network, and often has tools to help analyze the packets.

So we set up a packet sniffer to read the data over your net connection, but remember: TCP is a standardized and widely known protocol while BHS’s TERA protocol is custom made, so our packet sniffer can only help us as far as TCP. Even then, we’re seeing much more than what the server and client see. We’re seeing all the other stuff that happens from TCP: packets being resent and packets being received out of order. So we have to do all this work of figuring out how to recombine the packets that we’re seeing so that we can most accurately reconstruct what the server and client are sending and receiving. On rare occasions we get it wrong, and everything breaks because our encryption and decryption depend on their state, and if we mess up reconstructing a message, we end up with an incorrect encryption state. (If your meter occasionally breaks, this may be why.)
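As a very simplified sketch of that recombining work, here’s a reorder buffer for one direction of a connection. The `{ seq, data }` segment shape and the class name are assumptions of mine, and real captures need a lot more care (sequence numbers wrapping around, partially overlapping resends) than this shows. It isn’t how any particular meter does it.

```javascript
// Reorders captured TCP segments for one direction of a connection.
// Segments are hypothetical { seq, data } objects; sequence-number wraparound
// and partially overlapping retransmissions are deliberately not handled.
class StreamReassembler {
  constructor(initialSeq) {
    this.nextSeq = initialSeq; // sequence number of the next byte we expect
    this.waiting = new Map();  // seq -> data for segments that arrived early
  }

  // Returns whatever contiguous bytes became available (possibly none).
  push({ seq, data }) {
    if (seq + data.length <= this.nextSeq) return Buffer.alloc(0); // resend of old data
    this.waiting.set(seq, data);
    const ready = [];
    while (this.waiting.has(this.nextSeq)) {
      const chunk = this.waiting.get(this.nextSeq);
      this.waiting.delete(this.nextSeq);
      ready.push(chunk);
      this.nextSeq += chunk.length;
    }
    return Buffer.concat(ready);
  }
}

const stream = new StreamReassembler(1000);
const early = stream.push({ seq: 1003, data: Buffer.from('lo') }).toString();
const filled = stream.push({ seq: 1000, data: Buffer.from('hel') }).toString();
const resent = stream.push({ seq: 1000, data: Buffer.from('hel') }).toString();
console.log([early, filled, resent]); // [ '', 'hello', '' ]
```

Only the bytes that come out of a buffer like this, in order and without duplicates, are safe to feed into decryption.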

But let’s assume that, as is the case the vast majority of the time, we got it right. We’re reading the packets properly—but they’re still encrypted! Well, that’s easy to handle. We know how the handshake looks, and how the server and client exchange encryption keys, so we can just recreate the same encryption states and we’re ready to decrypt.

We’ve rebuilt the packets being sent and we’ve decrypted their data, so now we’re down to BHS’s custom protocol.

As mentioned, each message under this protocol always begins with the message length and the opcode. That’s convenient for us, because if we see an opcode that we don’t care about, we can totally skip ahead to the next message. Unfortunately, opcodes generally change on every major TERA patch, and that’s why you need to update meter and proxy. (A “major patch” is one where the first number of the patch version goes up; check the version list on the right side of the EME patch notes for an example.) Luckily, we can actually very easily call a function in the TERA client to convert an opcode to a readable name like C_PLAYER_LOCATION.
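A sketch of that “skip what we don’t care about” idea. The opcode numbers here are invented, `S_EXAMPLE_DAMAGE` is a made-up name, and the little-endian length and opcode layout is an assumption; only C_PLAYER_LOCATION is a name from this post.

```javascript
// Hypothetical opcode table for one patch version. The numbers are made up;
// a real table would have to be regenerated whenever the opcodes change.
const opcodeNames = new Map([
  [0x0123, 'C_PLAYER_LOCATION'],
  [0x0456, 'S_EXAMPLE_DAMAGE'], // invented name for illustration
]);
const interesting = new Set(['S_EXAMPLE_DAMAGE']);

// Walks a buffer of back-to-back messages, assuming a little-endian uint16
// length (covering the whole message) followed by a uint16 opcode.
function* interestingMessages(data) {
  let offset = 0;
  while (offset + 4 <= data.length) {
    const length = data.readUInt16LE(offset);
    if (length < 4) throw new Error('invalid message length');
    const name = opcodeNames.get(data.readUInt16LE(offset + 2));
    if (interesting.has(name)) {
      yield { name, payload: data.subarray(offset + 4, offset + length) };
    }
    offset += length; // the length field lets us jump over anything we ignore
  }
}

const decrypted = Buffer.from([
  0x06, 0x00, 0x23, 0x01, 0x01, 0x02, // C_PLAYER_LOCATION: known, but skipped
  0x04, 0x00, 0x99, 0x09,             // unknown opcode: skipped
  0x05, 0x00, 0x56, 0x04, 0x2a,       // S_EXAMPLE_DAMAGE: kept
]);
const kept = [...interestingMessages(decrypted)];
console.log(kept.map((m) => m.name)); // [ 'S_EXAMPLE_DAMAGE' ]
```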

When we do encounter an opcode we recognize and care about, we need to figure out how to read the rest of the message. Say we found a message with the opcode corresponding to C_PLAYER_LOCATION and we want to use the info from it, but we don’t know how that info was stored. That’s where more work comes in, because we haven’t yet figured out a way to automatically dump this from the client.

If we don’t know how the rest of C_PLAYER_LOCATION is formatted, we look at the data to try to see what they correspond to. Don’t forget that we’re working with computers, so at the end of the day, everything is just numbers. “Looks like this number is a 1 when you start running, and a 2 when you stop, and a 3 when you jump. This number looks like it could correspond to a letter of the alphabet, so we can try converting it to text and seeing what it says.” And so on.
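One way to picture that guessing game: take the raw bytes and print a few plausible readings of each position, then compare captures taken while doing different things in game. This is only a sketch. The payload is made up, and little-endian numbers and two-byte characters are assumptions.

```javascript
// Lists a few plausible readings of the bytes at one offset of an unknown payload.
// Little-endian numbers and two-byte (UTF-16) characters are assumptions;
// figuring out where each field actually starts is the hard part.
function candidates(payload, offset) {
  const left = payload.length - offset;
  return {
    offset,
    uint16: left >= 2 ? payload.readUInt16LE(offset) : null,
    int32: left >= 4 ? payload.readInt32LE(offset) : null,
    float: left >= 4 ? payload.readFloatLE(offset) : null,
    char: left >= 2 ? String.fromCharCode(payload.readUInt16LE(offset)) : null,
  };
}

// A made-up capture: a float, then a small number, then a letter.
const sample = Buffer.alloc(8);
sample.writeFloatLE(1234.5, 0);
sample.writeUInt16LE(1, 4);
sample.writeUInt16LE('A'.charCodeAt(0), 6);

for (let offset = 0; offset < sample.length; offset += 2) {
  console.log(candidates(sample, offset));
}
```

Most of the readings are nonsense at any given offset. The job is spotting the one that changes in a way that matches what you just did in game.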

This is a tedious process known as reverse engineering.

DPS meters are only interested in a very specific set of opcodes, and since both Shinra and CasualMeter are in popular use (and especially since one developer works closely with both), they usually just need to figure out how that handful of messages looks and then they’re done dealing with TERA’s networking without too much hassle.

tera-proxy

But we had to go a step further. For the sake of completeness, let’s first talk about what exactly a proxy server is. You can imagine the connection between a client and server as being like two people having a chat with each other. A packet sniffer sits off to the side and lets the client and server chat directly with each other. All it can do is eavesdrop and make as much sense of the conversation as it can or needs to; it can’t interfere or even interact with the conversation at all.

A proxy server, the middleman, sits between them. The client must talk to the proxy and receive replies from the proxy. The server must also talk to the proxy and receive replies from the proxy. The client and server no longer interact directly with each other.

This can be used in a number of ways. Maybe the middleman is a translator, and the client and server need something like that between them in order to understand each other. Maybe the client and server are just really mad at each other because the server is based in Japan and only wants to serve clients that live in Japan, in which case the proxy can act as a Japanese client on behalf of the real client. Even if the real client is completely fluent in Japanese, everything the client sends must still go through the proxy, likely completely unmodified, because the real client is not living in Japan and those are the server’s rules.

You should have a pretty good idea by now how tera-proxy fits into this. Actually, there’s two proxy servers; remember how the client has to fetch the server list so it knows where to make a connection? The first thing we do here is proxy the server list, so when the client requests it, that request goes to us first. We ask the real server list what’s up, and when we get the reply, we can change whatever it says before handing it back to the client. Like adding an extra entry called “Mount Tiltrannas” that points to a game proxy server.

So the game client got this new server list and now it wants to connect to this special server we’ve added. We know the connection handshake, and we know that the real client and real server are going to exchange encryption keys which we can save and forward very easily.

Since the proxy talks to both the client and server, it handles two connections. Each connection has two ways data can flow (towards or away from the proxy), so that means we now have four encryption states to take care of: the two between client ↔ proxy and the two between proxy ↔ server.

But if we do our job right, both sides are oblivious that there’s even anyone in between them, because as far as they’re concerned, the data they’re receiving is getting decrypted properly. Everyone’s happy.

Unlike packet sniffers, we’re not just eavesdropping anymore. This data from both sides is being sent straight to the proxy and only the proxy. It’s our job to decrypt according to one side, re-encrypt for the other side, and then send it to its intended destination. Since we can do that, we can also do things like modify the data, adding and removing things at our leisure because now we have control of the encryption states.
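Here’s a toy model of those four states, using the made-up letter-shifting cipher from earlier as a stand-in for the real encryption. The class names are mine, and giving each leg of the connection a single shared key is a simplification; the real key exchange is more involved.

```javascript
// Toy stand-in for the real cipher: shift each letter by n, then n += 1 (a-z only).
class ToyState {
  constructor(key) { this.n = key; }
  apply(text, direction) {
    return [...text].map((ch) => {
      const index = ch.charCodeAt(0) - 97;
      const moved = (((index + direction * this.n++) % 26) + 26) % 26;
      return String.fromCharCode(97 + moved);
    }).join('');
  }
  encrypt(text) { return this.apply(text, 1); }
  decrypt(text) { return this.apply(text, -1); }
}

// One state per direction per connection: four in total.
class ToyProxy {
  constructor(clientKey, serverKey) {
    this.fromClient = new ToyState(clientKey); // decrypts client -> proxy
    this.toServer = new ToyState(clientKey);   // encrypts proxy -> server
    this.fromServer = new ToyState(serverKey); // decrypts server -> proxy
    this.toClient = new ToyState(serverKey);   // encrypts proxy -> client
  }
  // `edit` may return a changed message, or null to drop it entirely.
  clientToServer(encrypted, edit = (m) => m) {
    const plain = edit(this.fromClient.decrypt(encrypted));
    return plain === null ? null : this.toServer.encrypt(plain);
  }
  serverToClient(encrypted, edit = (m) => m) {
    const plain = edit(this.fromServer.decrypt(encrypted));
    return plain === null ? null : this.toClient.encrypt(plain);
  }
}

const client = new ToyState(2); // the client's outgoing state
const server = new ToyState(2); // the server's matching incoming state
const proxy = new ToyProxy(2, 5);

// The first message is dropped by the proxy; the server never sees it.
const dropped = proxy.clientToServer(client.encrypt('skip'), () => null);
// The second is rewritten in flight, yet still decrypts cleanly on the server.
const forwarded = proxy.clientToServer(client.encrypt('walk'), () => 'run');
const seenByServer = server.decrypt(forwarded);
console.log(dropped, seenByServer); // null run
```

The client-side states advanced by eight letters while the server-side states advanced by only three, and nobody noticed, because each pair of states only has to agree with its own end of the conversation.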

Besides everything already mentioned, there’s one more thing that distinctly set tera-proxy apart from DPS meters.

tera-proxy doesn’t have a specific focus like meters have with DPS. We no longer care only about meter-related packets. If someone finds any opcode at all that looks interesting to them, they can try to figure out how its packet data looks, write up a definition for it, and (hopefully) add it to tera-proxy. It might get used by others too, and it can also be verified or updated by other people because that’s the beauty of open source. In fact, as of the time of writing this, we have well over 400 definitions mapped out thanks to a lot of people. We’ve been hard at work.

That means we know how to handle a lot more opcodes, but what does that mean for our proxy?

tera-proxy Modules

By itself, tera-proxy does nothing substantial. It overrides the server list to point to itself, it decrypts and encrypts the data, and it’s capable of reading and writing messages with the knowledge we’ve given it of TERA’s protocol. But by default, it has no reason to read or write anything.

That’s what modules are for. Each module essentially says, “Hey, when you see any of these particular packets, I want you to tell me the data it contains and then I want you to do these things.” This is called a “hook”. If you hook C_PLAYER_LOCATION, then when tera-proxy sees a C_PLAYER_LOCATION packet, it reads it and converts it to a format that’s much simpler for a programmer to use.

Again, let’s look at some old diagrams:

We start with this 0012 0123 0256 0390 4800 0002 message, the section outlined in the middle. After decrypting a packet, that’s what tera-proxy might see, so it checks what opcode 0123 is and finds out that it’s C_PLAYER_LOCATION. It knows that you registered a hook for that packet, so it first tries to make sense of the packet. Checking the definition we’ve written for that packet, we have names for each field like before:

So now it runs your hook, and in your hook, you can retrieve (and modify) the values by messing with a variable called position or moveType.

You can also silence the packet to prevent the intended recipient from ever seeing it, and you can also construct your own packets.
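To make those three abilities (read and modify, silence, construct) concrete, here’s a miniature dispatcher. To be clear, this is a hypothetical sketch and not tera-proxy’s actual API: `MiniDispatch`, its method names, the “return false to silence” convention, and the two cutscene packet names are all invented. Only C_PLAYER_LOCATION, `position`, and `moveType` come from the example above.

```javascript
// A hypothetical miniature dispatcher, not tera-proxy's actual API.
class MiniDispatch {
  constructor() {
    this.hooks = new Map(); // packet name -> list of callbacks
    this.injected = [];     // packets constructed by modules
  }
  hook(name, callback) {
    if (!this.hooks.has(name)) this.hooks.set(name, []);
    this.hooks.get(name).push(callback);
  }
  send(name, event) {
    this.injected.push({ name, event });
  }
  // Called with an already-parsed packet. Returns the (possibly modified)
  // event to forward, or null if a hook silenced it by returning false.
  handle(name, event) {
    for (const callback of this.hooks.get(name) || []) {
      if (callback(event) === false) return null;
    }
    return event;
  }
}

const dispatch = new MiniDispatch();

// A "module" that reads and modifies one field of C_PLAYER_LOCATION.
dispatch.hook('C_PLAYER_LOCATION', (event) => {
  console.log('moving to', event.position, 'with type', event.moveType);
  event.moveType = 0;
});

// A "module" that silences one packet and constructs another in its place.
dispatch.hook('S_EXAMPLE_CUTSCENE', (event) => {
  dispatch.send('C_EXAMPLE_SKIP_CUTSCENE', { id: event.id });
  return false;
});

const moved = dispatch.handle('C_PLAYER_LOCATION', { position: { x: 1, y: 2, z: 3 }, moveType: 2 });
const cutscene = dispatch.handle('S_EXAMPLE_CUTSCENE', { id: 7 });
console.log(moved.moveType, cutscene, dispatch.injected.length); // 0 null 1
```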

And that’s pretty much all a module can do to a TERA connection.

Maybe you’ve noticed it by now, or maybe you haven’t.

It took well over 3,000 words and plenty of graphs and charts and diagrams to explain how TERA’s whole networking system works. Then we described tera-proxy modules in just a handful of sentences.

There’s a reason for that—and it’s not because you learned everything you needed to know.

The real reason is that you didn’t need to know in the first place. That’s the proxy’s blessing and curse.

In our example, we wanted to get a C_PLAYER_LOCATION and modify it. We can do it in three lines of code. We never had to bother ourselves with the details of the encryption, the protocol, the proxying, or any of that.

There’s more. Many languages require the use of a compiler; you write code, give it to the compiler, and you get back “machine code” that you can run. For tera-proxy, I use a language called JavaScript. It’s an “interpreted language”, which means it doesn’t need a separate compile step before someone can run it. Actually, you’ve probably already been using JavaScript quite a lot; it’s the language used to make webpages interactive, and you’d be hard-pressed to find a website that doesn’t use a single line of JavaScript.

Since we don’t need to compile anything, anyone can just open up a text editor, write some module code in JavaScript, run the proxy, and it just works.

Does that sound very powerful? It should. Can you make really cool stuff with it? You bet. Can you also cheat really hard with it?

Unfortunately, yeah. But that discussion is reserved for part 3.

Most modules will be rather self-explanatory (when a cutscene happens, hide it from the client and instantly send the skip packet; when a Vanguard is completed, instantly send the accept rewards packet; etc). But a lot of people are up in arms about the skill prediction stuff, so it’s time to get into a little more detail about that.

Skill Prediction

First, let’s pick some arbitrary ping. Say 200 ms. That means you sent a “ping” packet to the server at some time, and when you received a “pong” packet back from the server, it had taken 200 ms. For now, it will be sufficient to assume that this means the time it takes for a packet to reach the other side was 100 ms, so when we sent something and got something back, it took twice that time, or 200 ms.

There’s only one thing in TERA that happens on your screen as soon as you press a button, and that’s movement. This is a technique called client-side prediction. It’s still going to take 100 ms for the server to receive your new position, and then half of everyone else’s ping for them to see it, but at least on your own client, you see when you step forward or back instantaneously. It feels smooth and fluid.

That’s just about the only thing client-side predicted as far as we’re concerned.

What does that say about casting skills? Well, when you press the button, your client doesn’t actually do much of anything yet. It sends a request to do it to the server, who verifies a number of things (Was it off cooldown? Was it on a valid target? Are you not stunned or slept?) and then sends back an acknowledgment along with other information like the attack speed. Only then does your client animate that skill.

Let’s ignore chain skills for a moment. We’ve cast our first skill, but now we want to cast the second. We can’t do that while we’re already in a skill, so we have to wait for the first skill to finish animating.

But our first skill already took 200 extra ms to start.

And then we have to wait another 200 ms for the server to acknowledge our second skill.

That’s what’s colloquially known as “ping tax”, and you pay it on most everything you do in TERA.

Lockons are extra awful in this department. Entering the lockon pays ping tax, acquiring a lock pays ping tax, and then firing the skill pays ping tax.

And Rapid Fire? All right, let’s look at this. Let’s say you’ve got a total of an arbitrarily chosen 120 + 60 = 180 attack speed, which translates to 150% animation speed. Each hit of Rapid Fire chains into the next after somewhere around 300 ms at base speed (I think?), so we can cut that down to 300 ms ÷ 1.5 = 200 ms per hit of RF.

First, a perfect world where latency doesn’t exist. We can send each hit of RF as soon as we get the chance, so we can send the seventh and final hit after 6 animation cycles of 200 ms each. 6 × 200 ms = 1.2 seconds.

But we have 200 ms ping. That means to send the next hit, we first have to wait for the acknowledgement of the previous hit to reach us (200 ms), and then animate it (200 ms). Six cycles of that is 6 × 400 ms = 2.4 seconds.

In fact, it doesn’t matter how much attack speed you get. In this case, you’ll always take 1.2 seconds longer to reach the seventh hit of RF because you have to pay that ping tax six times no matter what. That’s disgusting.

(As an aside, for someone living very close to the servers with the luxury of about 30 ms ping, their RF time comes out to 6 × 230 ms = 1.38 seconds. That’s only 0.18 seconds (180 ms) more than the perfect case scenario—less than the time we’d be waiting for our very first hit at 200 ms ping. If your ping is already very low, there’s very little to gain, especially because you have to balance those millisecond bonuses with just how much desyncing you’re going to be dealing with. I explain that in more detail below.)
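If you want to plug in your own numbers, the arithmetic above fits in a few lines. The function and parameter names are made up, and the 300 ms base cycle is the same rough guess as before.

```javascript
// Time until the final hit can be sent: each of the (hits - 1) chained cycles
// costs one animation plus one full round trip of ping.
function timeToFinalHitMs({ hits, baseCycleMs, animationSpeed, pingMs }) {
  const cycleMs = baseCycleMs / animationSpeed + pingMs;
  return (hits - 1) * cycleMs;
}

// 7 hits, roughly 300 ms per hit at base speed, 150% animation speed.
const rapidFire = { hits: 7, baseCycleMs: 300, animationSpeed: 1.5 };

for (const pingMs of [0, 30, 200]) {
  console.log(`${pingMs} ms ping: ${timeToFinalHitMs({ ...rapidFire, pingMs })} ms`);
}
// 0 ms ping: 1200 ms
// 30 ms ping: 1380 ms
// 200 ms ping: 2400 ms
```

Changing `animationSpeed` only shrinks the animation part of each cycle. The `(hits - 1) * pingMs` part stays exactly the same, and that part is the ping tax.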

So how do we fix it?

Well, we can’t get rid of latency. It’ll always take 100 ms for the server to receive anything you send.

But what we can do is try our best to eliminate the ping tax on the client. When you try to cast a skill, skill predictors are really just tera-proxy modules that predict the acknowledgment reply from the server and send it to the client.

You say you want to cast Rapid Fire? The predictor instantly replies that you did it. That’s a whopping 0 ms on the ping tax. No need to wait for the server when you can just assume it was successful.

Sometimes, we get it wrong. There’s a lot of things the client doesn’t check before sending the server a request to cast a skill, so skill predictors have to be checking if you’re stunned or slept or knocked down, if you have appropriate prerequisite buffs to cast something, if you have glyphs or buffs that alter a specific skill’s attack speed… If we guess any of those wrong, you get desynced. Your client might still animate it, but it might be too slow, or too fast, or it wasn’t actually cast at all. Damage doesn’t happen, but you’re still stuck in animation.

The three main skill predictors, each written by a now-banned developer, handle things very differently. Mine were lockons, for lockon skills, and fast-fire, which focused solely on three skills (Rapid Fire, Burning Heart, and Burst Fire); neither made any attempt to resolve desyncs. Both Bern and Pinkie aimed to generalize the approach for all skills in the game, but I can’t discuss how they went about resolving desyncs and similar issues because I haven’t personally examined any of their code.

But that’s pretty much it. Your client asks to cast a skill and the proxy instantly replies with an acknowledgment so it doesn’t have to wait. We’re not actually casting anything if we couldn’t already do it. We’re just tricking the client into thinking it’s doing it instead of waiting for the server to say it was valid.
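As a toy timeline of that idea (this is not code from any real predictor, and every name in it is invented), compare when the client starts animating with and without a predictor, and what happens when the predictor guesses wrong:

```javascript
// Toy timeline, not real module code. All names and checks are invented.
function castSkill({ pingMs, predictor, serverAccepts }) {
  const timeline = [];
  let animationStartMs = null;

  // The client's cast request passes through the proxy at t = 0.
  if (predictor && predictor.looksCastable()) {
    animationStartMs = 0; // the proxy fakes the acknowledgment immediately
    timeline.push([0, 'proxy -> client: predicted acknowledgment']);
  }

  // The server's real answer arrives one full round trip later.
  if (serverAccepts) {
    timeline.push([pingMs, 'server -> proxy: real acknowledgment']);
    if (animationStartMs === null) animationStartMs = pingMs;
  } else {
    timeline.push([pingMs, 'server -> proxy: cast rejected']);
  }

  // Animating something the server never accepted is a desync.
  const desynced = animationStartMs !== null && !serverAccepts;
  return { animationStartMs, desynced, timeline };
}

const alwaysYes = { looksCastable: () => true };
const plain = castSkill({ pingMs: 200, predictor: null, serverAccepts: true });
const predicted = castSkill({ pingMs: 200, predictor: alwaysYes, serverAccepts: true });
const wrongGuess = castSkill({ pingMs: 200, predictor: alwaysYes, serverAccepts: false });
console.log(plain.animationStartMs, predicted.animationStartMs, wrongGuess.desynced); // 200 0 true
```

Notice that the server’s decision is identical in every case. The only thing the predictor changes is when, and whether, the client starts animating.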

That covers all the technical parts of how everything is working, from TERA’s network connection to DPS meters and from the proxy to its modules.

I hope it wasn’t too difficult to get through, but there’s a ton of information to cover, and a lot of details that typically get glossed over because most people don’t care to know much more beyond “tera-proxy is hacks and skill prediction is lagging servers”.

Armed with all this knowledge, the next thing we’ll be looking at is what EME and BHS have done to try to address these things and what they can do about it in part 3: What Can Be Done?