Introduction

We often hear about vulnerabilities in HTTP clients, such as web browsers, that are exploited by malicious web content; there’s nothing new there. But did you know that FTP clients can have exploitable vulnerabilities of their own? FTP clients can be targeted by the very servers they connect to.

In this blog post, I’ll show an interesting path traversal vulnerability we identified and responsibly disclosed to several affected vendors in November 2017. This vulnerability affects multiple applications and libraries, allowing a malicious FTP server to create or overwrite files anywhere on the local file system. As you will see in the details below, it results from a lack of validation that is common not just in FTP clients, but in applications and libraries across ecosystems such as Java, npm and others.

The Vulnerability

OK, let’s get into the issue! Suppose we want to write a function that downloads the contents of a remote FTP folder into a local one. As most of us already know, the FTP protocol itself does not offer a “download folder” command, but we can combine several other commands to achieve our goal.

We can:

1. List all the files in the remote folder (the LIST or NLST FTP commands)

2. For each file in the list results above: download the file and save it to a local folder (the RETR FTP command, exposed as GET or MGET in most client tools)

An example of some Java code performing this behaviour, using the Apache commons-net library, might look like this:

private void downloadDirectory(FTPClient ftpClient, String remoteDir, String localDir) throws IOException {
    FTPFile[] subFiles = ftpClient.listFiles(remoteDir);
    for (FTPFile aFile : subFiles) {
        if (!aFile.isDirectory()) {
            String remoteFile = ftpClient.printWorkingDirectory() + File.separator + aFile.getName();
            String localFile = localDir + File.separator + aFile.getName();
            OutputStream downloadedStream = new BufferedOutputStream(new FileOutputStream(new File(localFile)));
            boolean success = ftpClient.retrieveFile(remoteFile, downloadedStream);
            downloadedStream.close();
        }
    }
}

The code above iterates over each file returned by the server and downloads it into a local destination folder. So, for example, if the first file in the remote folder is named passwd and our local destination folder is /var/data/sync/, we’d end up downloading the file to /var/data/sync/passwd.

But what if the FTP server turns malicious and, instead of responding to the LIST command with passwd, responds with ../../../../etc/passwd as the filename? The code above will end up writing the file to /var/data/sync/../../../../etc/passwd, effectively overwriting /etc/passwd with the newly downloaded content.
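You can watch the traversal resolve with a few lines of Java; this is just an illustration using the paths from the example above (output shown for a Unix-like system):

import java.io.File;
import java.io.IOException;

public class TraversalDemo {
    public static void main(String[] args) throws IOException {
        // Filename exactly as a malicious LIST reply could return it
        String maliciousName = "../../../../etc/passwd";
        // Join it to the local destination folder, as the download loop does
        File target = new File("/var/data/sync", maliciousName);
        // Prints /etc/passwd -- the ".." segments walk right out of the sync folder
        System.out.println(target.getCanonicalPath());
    }
}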

You might say that ../../../../etc/passwd is not a valid filename, and indeed it isn’t. But the FTP RFC does not say so: it stays agnostic to the file system, leaving it to both the client and the server to interpret names for themselves. For example, a Windows-based FTP server’s response to a LIST command might look like:

"05-26-95 10:57AM 143712 $LDR$", "05-20-97 03:31PM 681 .bash_history", "12-05-96 05:03PM <DIR> absoft2", "11-14-97 04:21PM 953 AUDITOR3.INI", "05-22-97 08:08AM 828 AUTOEXEC.BAK", "01-22-98 01:52PM 795 AUTOEXEC.BAT", "05-13-97 01:46PM 828 AUTOEXEC.DOS", "12-03-96 06:38AM 403 AUTOTOOL.LOG", "12-03-96 06:38AM <DIR> 123xyz", "01-20-97 03:48PM <DIR> bin", "05-26-1995 10:57AM 143712 $LDR$",

While a Unix-based one might look like:

"zrwxr-xr-x 2 root root 4096 Mar 2 15:13 zxbox", "dxrwr-xr-x 2 root root 4096 Aug 24 2001 zxjdbc", "drwxr-xr-x 2 root root 4096 Jam 4 00:03 zziplib", "drwxr-xr-x 2 root 99 4096 Feb 23 30:01 zzplayer", "drwxr-xr-x 2 root root 4096 Aug 36 2001 zztpp", "-rw-r--r-- 1 14 staff 80284 Aug 22 zxJDBC-1.2.3.tar.gz", "-rw-r--r-- 1 14 staff 119:26 Aug 22 2000 zxJDBC-1.2.3.zip", "-rw-r--r-- 1 ftp no group 83853 Jan 22 2001 zxJDBC-1.2.4.tar.gz", "-rw-r--r-- 1ftp nogroup 126552 Jan 22 2001 zxJDBC-1.2.4.zip", "-rw-r--r-- 1 root root 111325 Apr -7 18:79 zxJDBC-2.0.1b1.tar.gz", "drwxr-xr-x 2 root root 4096 Mar 2 15:13 zxbox",

In fact, there are many other file system listing formats; here’s a list of those supported by the Apache commons-net library:

OS400, AS400, L8, MVS, NETWARE, NT, OS2, UNIX, VMS, MACOS_PETER
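For reference, commons-net lets the caller pin the expected listing format explicitly via FTPClientConfig instead of relying on auto-detection. A minimal sketch (the host name is just a placeholder):

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPClientConfig;

FTPClient ftpClient = new FTPClient();
// Parse directory listings using the Windows/NT format; by default,
// commons-net selects a parser based on the server's SYST reply.
ftpClient.configure(new FTPClientConfig(FTPClientConfig.SYST_NT));
ftpClient.connect("ftp.example.com"); // placeholder host; throws IOException on failure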

So the typical FTP client does not validate filenames; it returns them as-is, leaving validation to the developer.

Needless to say, that validation is often overlooked. This is obvious when looking at projects hosted on GitHub, or at various snippet/example websites such as Stack Overflow and CodeJava.
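A defensive version of the download loop needs only a canonical-path check before opening the output stream. Here is a minimal sketch; the helper name is our own and not part of commons-net:

import java.io.File;
import java.io.IOException;

// Accept a listing entry only if, once joined to localDir, it stays inside localDir.
private static boolean isSafeName(String localDir, String name) throws IOException {
    String allowed = new File(localDir).getCanonicalPath() + File.separator;
    String resolved = new File(localDir, name).getCanonicalPath();
    return resolved.startsWith(allowed);
}

With this check, passwd is accepted while ../../../../etc/passwd is rejected; an even stricter policy would refuse any name containing a path separator at all.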

Case Study: Apache Hive

Apache Hive is a data warehouse software project built on top of Apache Hadoop that provides data summarization, query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Among other things, its HPL/SQL extension supports copying data from FTP servers via the COPY FROM FTP statement.

COPY FROM FTP host [USER user [PWD password]]
    [DIR directory] [FILES files_wildcard]
    [TO [LOCAL] target_directory]
    [options]

options:
    OVERWRITE | NEW
    SUBDIR
    SESSIONS num

Looking at the code, we see the retrieveFileList() call.

/**
 * Run COPY FROM FTP command
 */
Integer run(HplsqlParser.Copy_from_ftp_stmtContext ctx) {
    trace(ctx, "COPY FROM FTP");
    initOptions(ctx);
    ftp = openConnection(ctx);
    if (ftp != null) {
        Timer timer = new Timer();
        timer.start();
        if (info) {
            info(ctx, "Retrieving directory listing");
        }
        retrieveFileList(dir);
        timer.stop();
        if (info) {
            info(ctx, "Files to copy: " + Utils.formatSizeInBytes(ftpSizeInBytes) + ", " +
                Utils.formatCnt(fileCnt, "file") + ", " +
                Utils.formatCnt(dirCnt, "subdirectory", "subdirectories") +
                " scanned (" + timer.format() + ")");
        }
        if (fileCnt > 0) {
            copyFiles(ctx);
        }
    }
    return 0;
}

Inside the retrieveFileList function, we see that the filename returned by the server is appended to the directory name without any validation (name = dir + name;). The file is then added to a queue, to be downloaded later.

void retrieveFileList(String dir) {
    if (info) {
        if (dir == null || dir.isEmpty()) {
            info(null, "  Listing the current working FTP directory");
        }
        else {
            info(null, "  Listing " + dir);
        }
    }
    try {
        FTPFile[] files = ftp.listFiles(dir);
        ArrayList<FTPFile> dirs = new ArrayList<FTPFile>();
        for (FTPFile file : files) {
            String name = file.getName();
            if (file.isFile()) {
                if (filePattern == null || Pattern.matches(filePattern, name)) {
                    if (dir != null && !dir.isEmpty()) {
                        if (dir.endsWith("/")) {
                            name = dir + name;
                        }
                        else {
                            name = dir + "/" + name;
                        }
                    }
                    if (!newOnly || !isTargetExists(name)) {
                        fileCnt++;
                        ftpSizeInBytes += file.getSize();
                        filesQueue.add(name);
                        filesMap.put(name, file);
                    }
                }
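To make the danger concrete, here is what that concatenation produces for the attack described below; the values are illustrative:

String dir = "data/sales/in"; // from the DIR clause of the COPY FROM FTP statement
String name = "../../../../../../../home/root/.ssh/authorized_keys"; // attacker-controlled
name = dir + "/" + name;
// name is now "data/sales/in/../../../../../../../home/root/.ssh/authorized_keys"
// and is queued for download with the traversal intact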

Later, in the downloader thread, the file is pulled off the queue and downloaded from the server.

java.io.File targetLocalFile = null;
File targetHdfsFile = null;
if (local) {
    targetLocalFile = new java.io.File(targetFile);
    if (!targetLocalFile.exists()) {
        targetLocalFile.getParentFile().mkdirs();
        targetLocalFile.createNewFile();
    }
    out = new FileOutputStream(targetLocalFile, false /*append*/);
}
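Note that the code creates any missing parent directories before writing, so an unvalidated name does not just overwrite existing files: the write lands wherever the traversal points. A small illustration, under the assumption that targetFile is the queued name joined to the target directory (the exact composition is simplified here):

// Hypothetical value reaching the downloader thread
String targetFile = "/data/sales/raw/data/sales/in/../../../../../../../home/root/.ssh/authorized_keys";
java.io.File targetLocalFile = new java.io.File(targetFile);
// mkdirs() and createNewFile() hand the literal path to the OS, which resolves
// the ".." segments -- the file materializes at /home/root/.ssh/authorized_keys
targetLocalFile.getParentFile().mkdirs();
targetLocalFile.createNewFile();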

A possible attack is overwriting the SSH authorized_keys file of the root user, making it possible to later log in as root. Let’s assume an Apache Hive instance connects to our FTP server to download some merchant data daily. To execute this attack, we’d modify our FTP server to send back malicious path traversal filenames to the client. For instance, we can respond to a LIST command with ../../../../../../../home/root/.ssh/authorized_keys.

When Hive executes the following statement (assuming it’s running as root), root’s authorized_keys SSH file will be overwritten with one controlled by the attacker.

COPY FROM FTP remote.merchant.domain.com USER 'foo' PWD '***' DIR data/sales/in FILES '.*' TO /data/sales/raw OVERWRITE

The above vulnerability was responsibly disclosed to the Apache Software Foundation. Timeline:

Date        Event
2/11/2017   Vulnerability discovered by Snyk Security Research
8/11/2017   List of affected Apache products disclosed to the foundation
5/2/2018    Apache informed us that they plan to release a fixed version by the end of February
4/4/2018    Post published

Details were also published in the CVE database on 4/4/2018 for the Apache Hive project.

CVE-2018-1315: ‘COPY FROM FTP’ statement in HPL/SQL can write to arbitrary location if the FTP server is compromised

Severity: Moderate

Vendor: The Apache Software Foundation

Versions Affected: Hive 2.1.0 to 2.3.2

Description: When the ‘COPY FROM FTP’ statement is run using the HPL/SQL extension to Hive, a compromised or malicious FTP server can cause the file to be written to an arbitrary location on the cluster where the command is run from. This is because the FTP client code in HPL/SQL does not verify the destination location of the downloaded file. This does not affect the hive cli user and hiveserver2 user, as hplsql is a separate command line script and needs to be invoked differently.

Mitigation: Users who use HPL/SQL with Hive 2.1.0 through 2.3.2 should upgrade to 2.3.3. Alternatively, the usage of HPL/SQL can be disabled through other means.

Summary

We have outlined how a vulnerability in some FTP client applications and libraries is caused by data from the FTP server not being validated correctly. Similar issues have been found in the past: in 2002, Steve Christey, a Principal Information Security Engineer at MITRE, found that the problem existed in multiple FTP clients, including the native Linux FTP client and wget.