After surviving 35 years, dozens of languages, hundreds of projects, thousands of meetings and millions of LOC, I now teach the basics to the computer-phobic

Bryan is a highly paid consultant in a position as a senior architect at a really big company. In the first part of his assignment, he concentrated heavily upon gathering requirements and designing a high-level architecture. In the latter part of his assignment, development tasks were thrown at the inexpensive offshore team.

Documents for architecture, detailed design specifications and development guidelines were written. Specifications for defining major interfaces and mocking up external systems (e.g., databases and web services) for unit tests were written up, each with specific examples of which library to use, and how to use it to perform a given type of test. It was even explicitly stated that no external system was to be accessed from unit tests; everything was to be mocked. Real testing could be done in an integration environment.

Source control was set up.

There was a configuration file for each environment (development, integration, QA, production-parallel and production). Each developer had their own configuration file (which was also loaded by the infrastructure) to override the default values of system variables for purposes of local testing.
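For readers who haven't seen this kind of layering, here is a minimal sketch of how an environment file plus a per-developer override might be resolved. Only the lookup name `getEnvironmentValueFor` and the key `MYWIDGET_URL` come from the story; the `LayeredConfig` class, the `TIMEOUT_SECONDS` key and the `example.invalid` and `localhost` URLs are assumptions made for illustration, not the firm's actual infrastructure.

```java
import java.util.Properties;

public class LayeredConfig {
    private final Properties effective;

    public LayeredConfig(Properties environmentDefaults, Properties developerOverrides) {
        // Properties passed to the constructor act as fallbacks: they are
        // consulted only when a key is missing from the main table.
        effective = new Properties(environmentDefaults);
        // Copy the developer's values on top so they win for any shared key.
        for (String key : developerOverrides.stringPropertyNames()) {
            effective.setProperty(key, developerOverrides.getProperty(key));
        }
    }

    public String getEnvironmentValueFor(String key) {
        return effective.getProperty(key);
    }

    public static void main(String[] args) {
        // Hypothetical contents of the shared environment file.
        Properties dev = new Properties();
        dev.setProperty("MYWIDGET_URL", "http://qa.example.invalid/widgets");
        dev.setProperty("TIMEOUT_SECONDS", "30");

        // Hypothetical contents of one developer's local override file.
        Properties local = new Properties();
        local.setProperty("MYWIDGET_URL", "http://localhost:8080/widgets");

        LayeredConfig config = new LayeredConfig(dev, local);
        System.out.println(config.getEnvironmentValueFor("MYWIDGET_URL"));    // override wins
        System.out.println(config.getEnvironmentValueFor("TIMEOUT_SECONDS")); // default remains
    }
}
```

The point of the sketch is that the override silently wins: nothing in the lookup tells the caller which file a value came from.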

An automated build system was set up. Every check-in would trigger an automated check-out, full build and full run of all unit tests; if anything failed, the check-in was aborted. If there wasn't a check-in, the build process would trigger every 30 minutes.

Modules and tests were written. All the tests were passing and everything seemed fine. However, over time, each build-and-test run took longer and longer, but everyone just assumed that it was because more and more code and tests were being written, and so it just took longer to check out, build and test.

As time went by, emails started going around between the production support folks and various development teams. It seems that the entire production environment, across all applications, would mysteriously slow to a crawl every now and then during the day. Over time, the problem got worse. And worse. And worse.

The developers on this team didn't think anything of it because their brand-spanking-new application hadn't even made it to QA, let alone production, so surely it was safe to ignore all of those emails.

Management, on the other hand, didn't have the luxury of ignoring the problem. Meetings were held. Theories were postulated. Fingers were pointed. Network engineers were dragged into the conversation to find the source of the load on the network and servers. After some diagnostic testing and network packet sniffing, the source of the load was identified as the development server of Bryan's team.

Wait, a development server was pummelling the production environment? Tempers flared and directives were issued to shut it down until the source of the problem could be identified.

The development server was cut off from the network until the analysis had been performed. The SAs found that nothing was running that wasn't supposed to be there. The DBAs found nothing was spewing into or out of the database that wasn't supposed to be spewing. The problem was not with the server or database.

After a detailed code walk-through, Bryan found that the following anti-pattern was used in every test case that relied on another system in the firm:

    class MyWidgetTest {
        private static final URL REMOTE_SYSTEM =
            new URL(getEnvironmentValueFor(Constants.MYWIDGET_URL));

        @Test
        public void getWidgetsFromRemoteSystemTest() {
            HttpURLConnection con = (HttpURLConnection) REMOTE_SYSTEM.openConnection();
            InputStream response = null;
            BufferedReader reader = null;
            List<MyWidget> list = new ArrayList<MyWidget>();
            String line;
            try {
                response = con.getInputStream();
                if (con.getResponseCode() != HttpURLConnection.HTTP_OK) {
                    Assert.fail("Can't connect to remote system");
                }
                reader = new BufferedReader(new InputStreamReader(response, "UTF-8"));
                while ((line = reader.readLine()) != null) {
                    // parse 'line' into a MyWidget and add it to the list
                }
                Assert.assertTrue("Make sure we got data", list.size() > 0);
            } finally {
                // close the response, reader and connection
            }
        }
    }

In the default dev.properties file, there was:

MYWIDGET_URL=http://xyz-QA.companyName.com/RemoteSystem/Function/...

And in each and every developer's local property-override file was:

MYWIDGET_URL=http://xyz-PROD.companyName.com/RemoteSystem/Function/...

That's right, each developer was overriding the default environment configurations and pointing at the production web services. Every time they ran the test-suite locally, they were performing large queries against those production web services, and bringing back huge data sets, only to validate that they got back at least one record, and then discard the whole thing!

All ten developers. Thousands of tests. Numerous times (randomly) during the day. Every day.

When the configuration overrides were changed to point at the QA web servers, the load on the production network and servers dropped back to normal. Of course, the tests still took forever to complete because they were hitting the grossly underpowered QA web servers that were backed by the grossly underpowered QA databases. Finally, Bryan forced them to mock up the web services to basically ignore the query parameters and just return a small set of canned records for each query.
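A mock of that kind can be very small. What follows is only a sketch under stated assumptions, not the code Bryan's team actually wrote: the `WidgetSource` interface, the `CannedWidgetSource` class, the `countWidgets` helper and the widget names are all hypothetical, and plain strings stand in for `MyWidget` so the example is self-contained and needs no test library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CannedWidgetExample {

    /** Hypothetical seam: code under test talks to this, never to a URL directly. */
    interface WidgetSource {
        List<String> fetchWidgets(String query);
    }

    /** Test double: ignores the query and returns a small fixed set of records. */
    static class CannedWidgetSource implements WidgetSource {
        @Override
        public List<String> fetchWidgets(String query) {
            return Collections.unmodifiableList(
                    Arrays.asList("widget-1", "widget-2", "widget-3"));
        }
    }

    /** Stand-in for application logic that consumes widgets. */
    static int countWidgets(WidgetSource source, String query) {
        return source.fetchWidgets(query).size();
    }

    public static void main(String[] args) {
        WidgetSource source = new CannedWidgetSource();
        int count = countWidgets(source, "any query at all");
        // Same check as the original test, minus the network round trip.
        if (count <= 0) {
            throw new AssertionError("Make sure we got data");
        }
        System.out.println("widgets returned: " + count);
    }
}
```

Because the test double never opens a connection, a test written this way cannot reach QA or production no matter what any properties file says.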

The developers were finally able to run the entire test suite in seconds instead of 45 minutes. The network load had dropped dramatically. The QA servers were free to service actual QA testing instead of pointlessly feeding developer tests. The production servers were free to do actual work.

Afterward, there was a huge meeting to discuss why the offshore developers had ignored guidelines and pointed at production servers for the other systems. They replied that they wanted to make sure that they could access the production systems, and accessing them was the only way.

There is now a firm-wide effort under way to preclude users from directly accessing anything in QA or production as themselves. Only QA- or production-specific application IDs will be permitted, and an individual must be explicitly granted permission to su to that ID in order to gain access.