After an informative presentation by Armon Dadgar at QCon New York that explored security requirements within modern production systems, InfoQ sat down with Dadgar and asked questions about HashiCorp’s Vault, an open source tool for managing secrets at scale.

In April HashiCorp announced the release of Vault, an open source tool for securely managing secrets and encrypting data in transit within the modern datacenter. According to the HashiCorp blog, a modern production system often has complex requirements for secret management. Secrets can be found throughout the application stack, such as credentials for databases, API keys for external services, and communication certificates. Comprehensive auditing is also typically required, as is secure storage of data at rest and the ability to perform regular key rolling. Vault aspires to solve all of these issues.

InfoQ sat down with Armon Dadgar, co-founder and CTO of HashiCorp, and asked questions about the usage of Vault, storing secrets within production, and how to implement security within the modern datacenter.

InfoQ: Welcome Armon. Could you briefly introduce Vault by HashiCorp please, and describe the problems that this tool is attempting to solve?

Dadgar: Hey Daniel, thanks for having me. When we first started thinking about Vault, we primarily sought to tackle the problem of secret distribution. By this I mean the process of getting things like API keys, database usernames and passwords, and TLS certificates to end applications. We wanted a way to store and distribute secrets that was secure, easy to use and cloud friendly. Once we built Vault, we realized it could do much more than that. Another HashiCorp project, Consul, provides a datacenter runtime with service discovery, configuration and coordination solutions for a modern infrastructure. We see Vault as the equivalent security runtime. We've merged community contributions to support running Vault as an internal certificate authority, enabling organizations to issue certificates and use mutual TLS between applications. Many groups are now using Vault for "encryption-as-a-service" as well, which allows their applications to be more secure without needing to handle the nuances of cryptography. Vault was started with the goal of solving secret distribution, and it does that very well, but it also provides a complete solution for security-related problems like auditing, certificate management, key rotation and encryption.

InfoQ: What kind of operational scale is Vault targeted at? For example, what benefits should I expect to see when deploying the tool at a startup, SME or large enterprise?

Dadgar: In general, I think the challenges of infrastructure grow exponentially with the size of an organization. At a small startup you know everybody personally, things are a bit like the wild west and this works because you trust the entire team. As an organization grows, this trust model begins to break down. You can't know every application, every server and every person anymore. Attackers know that and exploit that. It becomes much more critical to leverage tools like Vault to manage trust and enforce access controls. I think the value in Vault is that it provides a great story around auditing, access control, key management, and secret distribution. These are hard problems that must be solved, and they become more critical to get right the bigger an organization is. For smaller organizations, Vault has a ton of features like encryption as a service, which lets teams focus on their product instead of making sure they get the cryptography correct. Small teams that adopt Vault tend to have loose access controls which echo their organizational trust model, but as the team grows they can tighten Vault access controls instead of re-architecting their entire application to bolt on security.

InfoQ: Could you explain more about the dynamic secret and ACL support within Vault, and how this might assist with working with cloud vendor platforms, many of which appear to offer proprietary solutions to security?

Dadgar: One nasty problem with secret distribution is that you cannot force the other party to "forget" a secret. For example, once an application has been provided a database username and password it is very difficult to force the application to forget those credentials. The application may inadvertently log it to disk or transmit it over the network, at which point auditing and controlling access to the secret is nearly impossible. Dynamic secrets provide an elegant solution to this problem. Instead of providing an application with a shared, static username and password for the database, Vault generates a new set of credentials when a client requests access. These newly created credentials are provided to the client and tracked in a lease. When the client is done, or the lease is revoked, Vault deletes the credentials. Dynamic secrets are created on-demand, audited on a per-client basis and easily revoked. We no longer need to be as concerned if the application inadvertently logs to disk or leaks those credentials. Vault both reduces the impact of credential leakage and makes it faster and easier to remediate when it occurs. As you hinted at, dynamic secrets are extremely useful when dealing with the varied access control APIs of cloud platforms. There is no standardization between the APIs and this gets exposed to developers and operators, making it difficult to enforce a common set of ACLs. Vault helps solve this by presenting a single interface for clients and masking the differences between cloud vendors, database vendors, and other applications. The ACL system in Vault is agnostic to the underlying API, so that ACLs can be managed centrally and uniformly instead of trying to figure out how to achieve the same level of access control across each new tool and vendor.
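
To make the leasing model concrete, here is a minimal Python sketch of the idea Dadgar describes: each request yields a fresh credential tracked by a lease, and revoking or expiring the lease deletes the credential. This is an illustration only, not Vault's implementation or API. The names `DynamicSecretBroker`, `Lease`, and `db_users` are hypothetical, and the "database" is just an in-memory dictionary.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    lease_id: str
    username: str
    password: str
    expires_at: float


class DynamicSecretBroker:
    """Toy stand-in for a dynamic secret backend (hypothetical, in-memory)."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self.leases: Dict[str, Lease] = {}   # lease_id -> Lease
        self.db_users: Dict[str, str] = {}   # pretend database user table

    def issue(self, client: str, now: Optional[float] = None) -> Lease:
        """Create a unique credential for this request and track it in a lease."""
        now = time.time() if now is None else now
        username = f"{client}-{secrets.token_hex(4)}"
        password = secrets.token_urlsafe(16)
        lease = Lease(secrets.token_hex(8), username, password, now + self.ttl_seconds)
        self.db_users[username] = password
        self.leases[lease.lease_id] = lease
        return lease

    def revoke(self, lease_id: str) -> None:
        """Delete the credential behind a lease; unknown leases are ignored."""
        lease = self.leases.pop(lease_id, None)
        if lease is not None:
            self.db_users.pop(lease.username, None)

    def expire(self, now: Optional[float] = None) -> None:
        """Revoke every lease whose expiry time has passed."""
        now = time.time() if now is None else now
        expired = [lid for lid, lease in self.leases.items() if lease.expires_at <= now]
        for lease_id in expired:
            self.revoke(lease_id)
```

Because every client receives its own username, a leaked credential can be traced to a single lease and removed without disturbing other clients, which is the property the answer above highlights.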

InfoQ: We see that Vault supports leasing and renewing of credentials. How much will developers/operators implementing this feature have to modify their normal workflow?

Dadgar: There are fundamentally only two choices when integrating Vault: applications can either be Vault aware or not. The simplest is to be Vault unaware and to rely on a co-process that interfaces with Vault. If an application is Vault aware, it can make richer use of Vault and achieve the highest levels of security. However, this can be more challenging due to the need to make code modifications. Applications depending on a co-process still receive the secrets they need and do not need to handle lease management. For the co-process approach we recommend people use the consul-template tool, which was built to render templates based on data in Consul and supports Vault as well. It renders the secrets to a file that a Vault-unaware application can read, and we recommend writing that file to a volatile RAM disk like tmpfs. Consul-template manages the leases, and if any of the secrets change it will re-render the template and reload the application to update credentials in a timely manner.
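
The co-process pattern can be sketched in a few lines of Python. This is not consul-template and does not use its template syntax. It is a simplified illustration, assuming a hypothetical `fetch_secrets` callable that returns the current secrets and a hypothetical `reload_app` callable that signals the application. The file is rewritten, and the application reloaded, only when the rendered content actually changes.

```python
import os
import tempfile
from string import Template


def render_secret_file(template_text, secrets_map, dest_path):
    """Render the template to dest_path; return True only if the content changed."""
    rendered = Template(template_text).substitute(secrets_map)
    try:
        with open(dest_path) as fh:
            if fh.read() == rendered:
                return False  # nothing changed, leave the file alone
    except FileNotFoundError:
        pass  # first render
    # Write to a temporary file in the same directory (mkstemp creates it
    # readable by the owner only), then swap it into place atomically so the
    # application never reads a half-written file.
    directory = os.path.dirname(dest_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        fh.write(rendered)
    os.replace(tmp_path, dest_path)
    return True


def sync_once(fetch_secrets, template_text, dest_path, reload_app):
    """One pass of the co-process loop: fetch, render, reload if needed."""
    if render_secret_file(template_text, fetch_secrets(), dest_path):
        reload_app()
        return True
    return False
```

In practice `dest_path` would point at a RAM-backed directory, as recommended above, and `sync_once` would run in a loop or be triggered when a lease is renewed or a secret is rotated.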

InfoQ: Auditing is obviously a vital component of many organisations' security policies. What does Vault offer here?

Dadgar: Auditing is built into the core of Vault, so every request to and response from Vault can be audited. Vault has a notion of pluggable backends that make it easy to extend its functionality. Currently Vault has support for sending audit logs to disk and syslog, with planned integrations with Splunk and ELK as well. Because Vault is a security focused tool, it is designed to fail closed, meaning that if audit logging is enabled and Vault fails to audit a request or response, it will refuse the client request. The thinking is that you never want to service a request if you would not be able to audit it later. Vault allows you to specify multiple audit backends to mitigate this risk so that no single audit backend is responsible for the availability of Vault.
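
The fail-closed behaviour can be illustrated with a short Python sketch. It is a simplification under stated assumptions rather than Vault's code: audit backends are modelled as plain callables, `AuditError`, `audit`, and `handle` are hypothetical names, and the sketch assumes a request may proceed as long as at least one configured backend records the entry.

```python
class AuditError(Exception):
    """Raised when auditing is enabled but no backend recorded an entry."""


def audit(entry, backends):
    """Send entry to every backend; fail closed if none of them succeeds."""
    if not backends:
        return  # auditing is not enabled, nothing to enforce
    succeeded = 0
    for write in backends:
        try:
            write(entry)
            succeeded += 1
        except Exception:
            # A single broken backend should not take the service down,
            # so keep trying the remaining ones.
            pass
    if succeeded == 0:
        raise AuditError("no audit backend recorded the entry; refusing")


def handle(request, handler, backends):
    """Audit the request, service it, then audit the response."""
    audit({"type": "request", "data": request}, backends)
    response = handler(request)
    audit({"type": "response", "data": response}, backends)
    return response
```

With two backends configured, losing one (a full disk, for example) leaves the service available, while losing both causes every request to be refused. A real system would also avoid writing raw secret values into the audit trail, which this sketch does not attempt.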

InfoQ: How secure is Vault, and what encryption is used within the tool? Has Vault been audited by any external third-party security experts?

Dadgar: We recently completed an external code audit with iSEC and the NCC Group. They did a full audit of the Vault code base, with a specific focus on our use of cryptography. While no system is perfect, Vault achieves a high level of security assurance, validated in part by iSEC and NCC. The project is written in Go, which is memory safe, preventing an entire class of security vulnerabilities that depend on improper error checking by developers. Vault uses AES-GCM-256, which is considered to be the state of the art, for data at rest, and TLS 1.2 for data in transit to clients. We are committed to Vault's future and plan to do scheduled assessments with our security auditors to make sure we are not introducing any security vulnerabilities as the project evolves. Users of Vault get the benefit of "many eyes" from the open source community as well as our full-time development and paid third-party code audits to ensure a high level of security assurance.

InfoQ: Vault is an online system that clients must request secrets from. What risk is there that a Vault outage causes downtime?

Dadgar: HashiCorp has been in the datacenter automation space for several years, and we understand the highly-available nature of modern infrastructure. When we designed Vault, high availability was a critical part of the design, not something we tried to bolt on later. Vault makes use of coordination services like Consul or ZooKeeper to perform leader election. This means you can deploy multiple Vault instances, such that if one fails there is an automatic failover to a healthy instance. We typically recommend deploying at least two Vault servers to mitigate the impact if a single instance should fail.
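
The active/standby arrangement can be pictured with a toy lock-based election in Python. This sketch does not use the Consul or ZooKeeper APIs. `LockStore` is a hypothetical in-memory stand-in for the coordination service, and `Node` is a hypothetical server instance that is active only while it holds a time-limited lock.

```python
class LockStore:
    """In-memory stand-in for a coordination service (hypothetical API)."""

    def __init__(self):
        self.holder = None
        self.expires_at = 0.0

    def acquire_or_renew(self, node, now, ttl):
        """Grant the lock if it is free, expired, or already held by node."""
        if self.holder is None or now >= self.expires_at or self.holder == node:
            self.holder = node
            self.expires_at = now + ttl
            return True
        return False


class Node:
    """A server instance that serves requests only while it holds the lock."""

    def __init__(self, name, store, ttl=10.0):
        self.name = name
        self.store = store
        self.ttl = ttl
        self.active = False

    def tick(self, now):
        """Called periodically: try to take or keep leadership."""
        self.active = self.store.acquire_or_renew(self.name, now, self.ttl)
        return self.active
```

If the active node stops calling `tick`, its lock expires after `ttl` seconds and the next standby to call `tick` takes over, which is the automatic failover described above. The timestamps are passed in explicitly so the behaviour is easy to follow and test.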

InfoQ: Thanks for your time today. Is there anything else that you would like to share with the InfoQ readers?

Dadgar: For a long time, the network perimeter was considered to be the first, best, and last line of defense. The Google "Aurora" attack is probably the best-known case in which an attacker got into the private network and was able to exploit the high-trust environment to exfiltrate data. Target was recently hacked through an HVAC system that happened to be on an internal network. The Natanz Iranian nuclear facility, built underground and disconnected from the Internet, was attacked by a virus that was able to access the internal network. There is a long list of recent attacks, including Neiman Marcus and Home Depot, that have caused security professionals to re-assess their approach. The key conclusion is that we cannot consider an internal network secure because of a firewall, VPN, or even air gap. While the network perimeter is a fantastic line of defense, it shouldn't be the only one. One of the goals with Vault is to enable users to move towards a "zero trust" network, in which just being on the network does not imply any level of access. This is especially critical as organizations adopt microservices, since their potential attack surface increases from a handful of monolithic applications to hundreds or thousands of independent services. As we continue to push Vault forward, our goal is to help secure those environments by supporting things like mutual TLS between microservices, one-time passwords for SSH, and fine-grained service-to-service access controls. Security is often an afterthought, but as attackers become increasingly sophisticated, developers and operators must do the same to stay ahead.

More information about Vault can be found on the Vault homepage, the HashiCorp blog or the project’s GitHub repository.