Why LD_PRELOAD?

Because most services do not expose TCP_INFO for their sockets natively. There are two ways to get at it: one is harder, one is trivial. I will talk about the latter.

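LD_PRELOAD lets you intercept library calls, including the libc wrappers around syscalls such as accept(): your function runs first, grabs the data it needs, and then hands control back to the real implementation. The interposing library is easy to inject: set the LD_PRELOAD environment variable (globally or per service), link the wrapper into the binary at build time, or whatever you prefer. Note that the dynamic loader does the interposition, so LD_PRELOAD only affects dynamically linked binaries.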

Here is a basic example that intercepts the accept() call, reads the TCP metrics, and returns the result of the real accept():

    /*
     * # gcc -shared -fPIC accept.c -o accept.so -ldl
     * # LD_PRELOAD=/usr/share/accept.so /usr/local/bin/service
     */
    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <stdio.h>
    #include <dlfcn.h>

    static int (*orig_accept)(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

    int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen)
    {
        struct tcp_info tcp;
        socklen_t len = sizeof(tcp);
        const char *logfile = "/var/log/pdns/rtt.log";
        FILE *fp;
        int fd;

        setbuf(stdout, NULL);

        /* Resolve the real accept() once and reuse the pointer. */
        if (orig_accept == NULL)
            orig_accept = dlsym(RTLD_NEXT, "accept");

        fd = orig_accept(sockfd, addr, addrlen);
        if (fd == -1)
            return fd;          /* nothing to measure on a failed accept */

        if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcp, &len) == -1) {
            printf("error: getsockopt()\n");
            return fd;
        }

        fp = fopen(logfile, "a");
        if (fp == NULL) {
            printf("error: cannot open %s\n", logfile);
            return fd;          /* never write through a NULL FILE pointer */
        }

        /* tcpi_rtt is reported in microseconds; log milliseconds. */
        fprintf(fp, "%u\n", tcp.tcpi_rtt / 1000);
        fclose(fp);

        return fd;
    }

To verify that the library is actually loaded into the running process, check its memory map:

    # grep accept.so /proc/26663/maps
    7fb375525000-7fb375526000 r-xp 00000000 08:01 50341026 /usr/share/accept.so
    7fb375526000-7fb375725000 ---p 00001000 08:01 50341026 /usr/share/accept.so
    7fb375725000-7fb375726000 r--p 00000000 08:01 50341026 /usr/share/accept.so
    7fb375726000-7fb375727000 rw-p 00001000 08:01 50341026 /usr/share/accept.so
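The TCP_INFO read that the accept() wrapper performs can also be tried in isolation, without LD_PRELOAD. The following is a minimal sketch, assuming Linux with glibc: it opens a TCP connection over loopback and queries the accepted socket. The function names tcp_rtt_us() and loopback_rtt_us() are hypothetical helpers made up for this illustration, not part of any library.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: smoothed RTT of a connected TCP socket in
 * microseconds, or -1 if getsockopt() fails. */
long tcp_rtt_us(int fd)
{
    struct tcp_info info;
    socklen_t len = sizeof(info);

    memset(&info, 0, sizeof(info));
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) == -1)
        return -1;
    return (long)info.tcpi_rtt;     /* tcpi_rtt is in microseconds */
}

/* Hypothetical demo: connect to ourselves over loopback and read the RTT
 * on the accepted side, which is what the wrapper sees for real clients.
 * Returns the RTT in microseconds, or -1 on any error. */
long loopback_rtt_us(void)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    long rtt = -1;
    int lfd, cfd = -1, afd = -1;

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the kernel pick a free port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(lfd, 1) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) == -1)
        goto out;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd == -1 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto out;

    afd = accept(lfd, NULL, NULL);
    if (afd != -1)
        rtt = tcp_rtt_us(afd);

out:
    if (afd != -1)
        close(afd);
    if (cfd != -1)
        close(cfd);
    close(lfd);
    return rtt;
}
```

Calling loopback_rtt_us() from a small main() and printing the result gives a feel for the values involved. On loopback the RTT is tiny, so dividing by 1000 as the wrapper does would typically log 0 there; the millisecond resolution only becomes meaningful for remote clients.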

Playground

It’s always useful to measure performance before and after a change to see whether it was worth your time. I used this technique to evaluate the effect ahead of a switch to an anycast-based DNS solution.

But wait, DNS mostly runs over UDP ;-) Yes, it does, unless the response exceeds 512 bytes (the classic limit without EDNS0), in which case the client retries over TCP. Hence I employed this hackish method to measure DNS over TCP, because with UDP it is not possible due to the nature of the protocol:

UDP has no three-way handshake (3WHS), so the kernel has no RTT estimate to report;

It would be possible to measure this if clients timestamped the data they send and the receiving side compared that with its own receive timestamp (for example one obtained via SO_TIMESTAMP).
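The timestamp comparison described above can be sketched as follows. This is one possible reading of the idea, not a tested measurement setup: the sender puts its wall-clock send time into the payload, and the receiver enables SO_TIMESTAMP and reads the kernel's receive timestamp from the SCM_TIMESTAMP control message. It assumes Linux on a 64-bit system, and the helper names recv_with_timestamp() and loopback_udp_delay_us() are hypothetical. Both ends run on the same host here, so the clocks agree by construction; between real hosts the result is only as good as their clock synchronization.

```c
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: receive one datagram and copy out the kernel's
 * receive timestamp. Returns the payload size, or -1 on error or when no
 * timestamp was attached. */
ssize_t recv_with_timestamp(int fd, void *buf, size_t buflen,
                            struct timeval *stamp)
{
    union {                          /* union keeps the buffer aligned */
        char buf[CMSG_SPACE(sizeof(struct timeval))];
        struct cmsghdr align;
    } ctrl;
    struct iovec iov = { .iov_base = buf, .iov_len = buflen };
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    n = recvmsg(fd, &msg, 0);
    if (n == -1)
        return -1;

    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMP) {
            memcpy(stamp, CMSG_DATA(c), sizeof(*stamp));
            return n;
        }
    }
    return -1;
}

/* Hypothetical demo: send a timestamped datagram to ourselves over
 * loopback and compute the one-way delay in microseconds.
 * Returns 0 on success, -1 on error. */
int loopback_udp_delay_us(long *delay_us)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval sent, payload, rcvd;
    int on = 1, rc = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;               /* let the kernel pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &alen) == -1 ||
        setsockopt(fd, SOL_SOCKET, SO_TIMESTAMP, &on, sizeof(on)) == -1)
        goto out;

    /* "Client" side: the send time travels inside the payload. */
    gettimeofday(&sent, NULL);
    if (sendto(fd, &sent, sizeof(sent), 0,
               (struct sockaddr *)&addr, sizeof(addr)) != (ssize_t)sizeof(sent))
        goto out;

    /* "Server" side: compare the payload with the receive timestamp. */
    if (recv_with_timestamp(fd, &payload, sizeof(payload), &rcvd)
            != (ssize_t)sizeof(payload))
        goto out;

    *delay_us = (long)(rcvd.tv_sec - payload.tv_sec) * 1000000L
              + (long)(rcvd.tv_usec - payload.tv_usec);
    rc = 0;
out:
    close(fd);
    return rc;
}
```

Note that this yields a one-way delay rather than a round-trip time, and it requires cooperation from the client, which ordinary DNS resolvers do not offer. That is why measuring over TCP remains the practical option here.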

Results