“If Google ever stops working, it’s my fault.” That’s a lot of responsibility to heap on one person’s shoulders, but Benjamin Treynor Sloss seems to take it in his stride. The vice-president of engineering for the global company says it was a simpler job when he took on the role back in 2003, but over the years both Google and his job have expanded significantly.

The same can be said of the Dublin operation. Google announced plans in 2003 to open its Irish office, then formally opened it in 2004 with just a handful of employees and modest plans for expansion. The company now employs about 6,000 people, including contractors. Although Google hasn’t broken down its current numbers, the last count showed 2,800 full-time employees.

The general perception of tech companies in Ireland seems to revolve around two things. The first is that multinationals are only in Ireland for the tax breaks; the second is that, despite evidence to the contrary, Ireland is still mainly involved in support roles, providing customer service, localisation and other business functions, rather than in the critical tech roles developing intellectual property and processes.

The latter perception is a hard one to shake, and one that Sloss says can have an impact on recruitment.

“I think that’s the major issue for me,” he says. “Once they talk to people on the team, they see it is a real engineering group.”

In fact, Sloss adds, Dublin was originally chosen because Google found it could recruit effectively.

“We were bottlenecked by our ability to recruit engineers,” he says.

Sloss plays a pivotal role in Google. Shortly after joining the company he developed its site reliability team, and he was also responsible for the development of its networking team. He is now responsible for Google’s data centres and operators too.

In total, Google has about 400 engineers working in Dublin.

“We’ve had engineering teams in Ireland for longer than we’ve had data centres here,” he says.

Teams

The site reliability team was formed in Ireland towards the end of 2004, and it has been growing ever since, now numbering about 180. The networking team, meanwhile, was set up within 18 months of the site reliability team and now involves about 130 employees. Both of those teams, he says, are central to Google’s infrastructure.

But, he adds, there is a bit of a misconception about what engineering entails.

“People say ‘what is engineering?’. They think this – you make a phone. Yes, that’s true, but at the end of the day you type stuff into this phone, you may use Gmail or maybe you store photos on it, or put documents on it – it all has to go somewhere,” he says. “The engineering for that eventually is storage systems, traffic management systems, networking systems, collectively what we call infrastructure. It has all the building blocks for the applications you use.”

That infrastructure is what the two engineering teams at Google are involved in and, as Sloss says, it’s an integral part of Google’s business. When Gmail goes down, or Google Docs has an interruption, it makes news. Aside from its email users, Google has business customers that rely on Google Apps to keep their business up and running. Billions of people use Google services. As Sloss puts it, if his teams don’t do their jobs, Google goes down.

“I try not to think about it too much,” says Sloss. “You can’t think about what all the failure roads look like, or all the things that could go wrong, or how complex it is. The truth is that none of us can keep it all in our heads any more, so you develop divisions of labour and practices, and ways of monitoring how the systems are working so you can detect if things are going wrong or are inefficient.”

This means delivering relevant results to users quickly and efficiently. Failure to do so will see fickle consumers go elsewhere for their services, as other companies have found out.

Security problems

The Irish engineering teams have been involved in numerous projects, but one of the most significant to Google was a solution to security problems. The Irish site reliability team developed a system that implemented security for individual apps, moving away from the idea of a traditional perimeter firewall.

“Traditionally in enterprise security, the idea is that from the outside you can’t access anything the company does because there’s a firewall. [So] once you get through the firewall, everything is open and you can get to every service internally,” Sloss says. “That turns out to be actually a pretty bad model, because you only have to break security once to get in. Everyone who writes an application internally can ignore security because they rely on a firewall. It’s a walled castle.”

The security project, named BeyondCorp, replaced the walled-castle model with the assumption that every running application is exposed to the world. Each application has to be secured in its own right, without regard to how other applications are secured, which allows systems to apply differing levels of security to people internally.

“The team here originated the idea. We knew in general that perimeter firewall security was not a great way to go,” Sloss says.

The BeyondCorp approach makes not only individual applications more secure, but also the system as a whole. It is a concept that has now been taken on board throughout Google, and in discussions with enterprise partners.
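For readers who want to see the difference between the two models in concrete terms, the short Python sketch below contrasts a perimeter check with a per-application check. It is an illustration only, not Google's implementation: the names `Request`, `AppPolicy`, `perimeter_allows` and `app_allows`, along with the example users and applications, are all hypothetical and chosen purely for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    # Hypothetical description of who is asking and from where.
    user: str
    device_trusted: bool
    on_internal_network: bool


@dataclass(frozen=True)
class AppPolicy:
    # Hypothetical per-application rules; each application owns its own.
    allowed_users: frozenset
    require_trusted_device: bool = True


def perimeter_allows(request: Request) -> bool:
    # Walled-castle model: being inside the firewall is the only check.
    return request.on_internal_network


def app_allows(policy: AppPolicy, request: Request) -> bool:
    # Per-application model: network location is deliberately ignored,
    # so the application behaves as if it were exposed to the world.
    if policy.require_trusted_device and not request.device_trusted:
        return False
    return request.user in policy.allowed_users


# Two applications with differing levels of security (assumed examples).
payroll = AppPolicy(allowed_users=frozenset({"alice"}))
wiki = AppPolicy(
    allowed_users=frozenset({"alice", "bob"}),
    require_trusted_device=False,
)

# Someone who has got through the firewall, and a legitimate remote user.
intruder = Request(user="mallory", device_trusted=False, on_internal_network=True)
remote_alice = Request(user="alice", device_trusted=True, on_internal_network=False)
```

In this sketch the intruder passes the perimeter check simply by being inside the network, but is refused by the payroll application, while the legitimate user is admitted from outside. Giving `payroll` and `wiki` separate policies mirrors the point that each application can set its own level of security.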

Google’s Irish engineers also took a major role in the publication earlier this year of Site Reliability Engineering: How Google Runs Production Systems.

“We’ve had, over the years, a set of practices and philosophy that have kept it an engineering team,” says Sloss. “Collectively it’s become a body of knowledge and practice within site reliability. Eventually my team said we should write a book about it; we’ve got to a point where this could be useful for everyone.”

The book, published by O’Reilly Media, is now available. Sloss says it was driven by the Irish team.

Input

Allowing teams all over the world to have input into how the company develops and improves itself is key to Google’s ability to remain innovative. Its former chief executive, Eric Schmidt, said a number of years ago that, as companies grow, they accumulate people who have the ability to say no, and cautioned against this habit. Sloss says Google has taken that to heart.

“Within Google it still is very straightforward to say ‘I have an idea’. We make distributed engineering efforts thrive in parts until you get to the point where someone wants you to invest a quarter of a million dollars in the idea,” he says.

“You can say it’s empowering, but it is making people aware what the problem is we’re trying to solve, making them aware of what the payoff is for solving those problems to Google, and then giving them the means to do it.

“One of the reasons I’m still here at Google more than 12 years on is that the culture has basically remained the same. The pace of innovation hasn’t slowed down. It’s one of the things I love.

“That’s ultimately what innovation looks like: a lot of well-intentioned, well-considered failures followed by success.”