How I used Mutation Testing to Improve my Code

And make my tests more honest

A while ago I wrote a small utility called amqp-delegate that uses the standard amqplib library to allow the easy creation and invocation of remote workers via an AMQP message bus such as RabbitMQ.

I wrote an article about this called ‘Delegating Work using NodeJS and AMQP’.

I was at the beach when I wrote it and was feeling pretty lazy. In order to get a nice green 100% coverage badge on my repo I cheated and used /* istanbul ignore next */ to completely ignore my work delegator’s invoke function.

In my tests I added a little TODO note:

// TODO: work out how to test the channel.consume callback

and then I skipped the test.

I figured I’d tested this code with my integration tests, so obsessing over unit test code coverage was just a waste of time. My code worked; it was therefore good code.

Then I read about a mutation testing library called Stryker Mutator and figured I’d add it to my library just to see what it might do for me.

What is mutation testing?

Mutation testing is a way of testing your tests. It’s easy, as outlined above, to cheat your test coverage reporting by skipping bits of code, but it’s also sometimes not obvious that your unit tests are not really doing their job.

Mutation testing breaks your code in clever ways: changing false to true, changing the values of strings and numbers, changing plus to minus, that sort of thing. It then runs your tests again and again, once for each mutation of your code. If the tests still pass despite a change made to your code, that mutant "survives" and your tests are considered broken.
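To make that concrete, here is a hand-written sketch of the kind of mutant a tool like Stryker generates automatically (the functions and tests here are illustrative, not from amqp-delegate):

```javascript
// A deliberately tiny function, and a "mutant" of it where the
// equality operator has been changed, as a mutation tool might do.
const isFree = price => price === 0

// Mutant: === flipped to <=
const isFreeMutant = price => price <= 0

// A lazy test that only checks the happy path...
const weakTest = fn => fn(0) === true

// ...and a stricter test that also probes values on either side of zero.
const strongTest = fn =>
  fn(0) === true && fn(5) === false && fn(-1) === false

console.log(weakTest(isFree)) // true
console.log(weakTest(isFreeMutant)) // true: the mutant survives the weak test
console.log(strongTest(isFreeMutant)) // false: the stricter test kills it
```

The weak test passes against both the original and the mutant, so the mutant survives, which is exactly the signal that the test isn't really doing its job.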

In an ideal world no mutant will survive: every mutation to your code should cause at least one of your tests to fail.

Trying it out

Running my mutation tests showed huge swathes of red in my terminal as all the code that I’d told Istanbul to ignore got flagged, along with a bunch of other code that I’d assumed was well tested.
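Getting to that point is mostly a matter of a small config file and running Stryker via npx stryker run. A minimal sketch for a Mocha-based project like this one might look as follows; the globs and options are illustrative, so check the Stryker documentation for your own test runner:

```javascript
// stryker.conf.js: a minimal sketch, not the exact config from amqp-delegate.
const config = {
  mutate: ['src/**/*.js'], // which files Stryker should mutate
  testRunner: 'mocha', // run the existing mocha suite against each mutant
  reporters: ['clear-text', 'progress'],
  coverageAnalysis: 'perTest' // only run the tests that cover each mutant
}

module.exports = config
```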

To give a more detailed example, here’s the code from the makeDelegator function I mentioned above.

Specifically here’s what the invoke function looked like. You can see why this is hard to unit-test.

```javascript
const invoke = async (name, ...params) => {
  if (!channel) throw new Error(QUEUE_NOT_STARTED)
  const queue = await channel.assertQueue('', { exclusive: true })
  const buffer = Buffer.from(JSON.stringify(params))
  const correlationId = v4()
  const replyTo = queue.queue
  return new Promise((resolve, reject) => {
    channel.consume(
      replyTo,
      message => {
        if (message.properties.correlationId === correlationId) {
          try {
            const result = JSON.parse(message.content.toString())
            return resolve(result)
          } catch (err) {
            return reject(err)
          }
        } else return reject(WRONG_CORRELATION_ID)
      },
      { noAck: true }
    )
    channel.sendToQueue(name, buffer, { correlationId, replyTo })
  })
}
```

The function works by registering a message consumer, then sending the data to the message queue. The message consumer waits until it gets the response with the correct correlationId, and only then does it resolve or reject the promise.

The call to resolve or reject is buried deep in the response-message handler, making it very hard to unit test. Being something of a completist, however, I just had to give it a go.

Refactoring

The first step was to extract the response-message handler and test that in isolation.

The handler function needs access to the overarching promise’s resolve and reject functions, as well as the correlationId to compare against the message’s own correlationId. I created the following curried utility function:

src/utils/messageCorrelator.js

```javascript
const { WRONG_CORRELATION_ID } = require('../errors')

const messageCorrelator = (correlationId, resolve, reject) =>
  message => {
    if (message.properties.correlationId === correlationId) {
      try {
        const result = JSON.parse(message.content.toString())
        return resolve(result)
      } catch (err) {
        return reject(err)
      }
    }
    return reject(WRONG_CORRELATION_ID)
  }

module.exports = messageCorrelator
```

This is easy to test. Just pass in stubs for the resolve and reject functions, and set up scenarios for when the correlationIds do or don’t match. Also, for completeness, throw in a test for when the response message’s content is not parsable as JSON.

```javascript
const { expect } = require('chai')
const { stub, resetHistory } = require('sinon')

const messageCorrelator = require('../../../src/utils/messageCorrelator')
const { WRONG_CORRELATION_ID } = require('../../../src/errors')

describe('utils/messageCorrelator', () => {
  const RESOLVED = 'resolved'
  const REJECTED = 'rejected'
  const resolve = stub().returns(RESOLVED)
  const reject = stub().returns(REJECTED)
  const correlationId = '123456'
  const content = { test: 'data', is: 'good data' }

  const correlate = messageCorrelator(correlationId, resolve, reject)

  let result

  context('given matching correlationId', () => {
    context('given unparsable message content', () => {
      const message = {
        properties: { correlationId },
        content: 'junk'
      }

      before(() => {
        result = correlate(message)
      })

      after(resetHistory)

      it('invoked reject with a SyntaxError', () => {
        expect(reject).to.have.been.called
        const err = reject.firstCall.args[0]
        expect(err).to.be.instanceof(SyntaxError)
      })

      it('returned the evaluated rejection', () => {
        expect(result).to.equal(REJECTED)
      })
    })

    context('given parsable message content', () => {
      const message = {
        properties: { correlationId },
        content: JSON.stringify(content)
      }

      before(() => {
        result = correlate(message)
      })

      after(resetHistory)

      it('invoked resolve with parsed content', () => {
        expect(resolve).to.have.been.calledWith(content)
      })

      it('returned the evaluated resolution', () => {
        expect(result).to.equal(RESOLVED)
      })
    })
  })

  context('given non-matching correlationId', () => {
    const message = {
      properties: { correlationId: 'some-other-id' },
      content: JSON.stringify(content)
    }

    before(() => {
      result = correlate(message)
    })

    after(resetHistory)

    it('invoked reject with WRONG_CORRELATION_ID', () => {
      expect(reject).to.have.been.calledWith(WRONG_CORRELATION_ID)
    })

    it('returned the evaluated rejection', () => {
      expect(result).to.equal(REJECTED)
    })
  })
})
```

Now that I had this utility, I could pull the invocation logic out of the invoke function by making a simple curried invoker function as follows:

```javascript
const messageCorrelator = require('./messageCorrelator')

const invoker = (correlationId, channel, replyTo) =>
  async (name, params) =>
    new Promise((resolve, reject) => {
      channel.consume(
        replyTo,
        messageCorrelator(correlationId, resolve, reject),
        { noAck: true }
      )
      channel.sendToQueue(
        name,
        Buffer.from(JSON.stringify(params)),
        {
          correlationId,
          replyTo
        }
      )
    })

module.exports = invoker
```

This uses the messageCorrelator and is very easy to test. There’s no branching logic in this function at all. The function doesn’t care if the resulting promise is resolved or rejected, so the tests are very simple.

In a number of tests I use a test utility to create a fake channel that I can pass in instead of a real amqp channel.

```javascript
const fakeChannel = () => ({
  assertExchange: stub(),
  publish: stub(),
  close: stub(),
  assertQueue: stub(),
  purgeQueue: stub(),
  bindQueue: stub(),
  prefetch: stub(),
  consume: stub(),
  ack: stub(),
  nack: stub(),
  sendToQueue: stub()
})
```

I use proxyquire to stub out the messageCorrelator as I’ve already tested that separately, so the test looks like this:

```javascript
const { expect } = require('chai')
const { stub, match } = require('sinon')
const proxyquire = require('proxyquire')

const { fakeChannel } = require('../fakes')

describe('utils/invoker', () => {
  const channel = fakeChannel()
  const messageCorrelator = stub()
  const correlationId = '12345'
  const name = 'some name'
  const param = 'some param'
  const replyTo = 'some replyTo address'

  const invoker = proxyquire('../../../src/utils/invoker', {
    './messageCorrelator': messageCorrelator
  })

  const invocation = invoker(correlationId, channel, replyTo)
  const message = 'a message'

  before(() => {
    messageCorrelator.returns(message)
    invocation(name, [param])
  })

  it('called the messageCorrelator with the right values', () => {
    expect(messageCorrelator).to.have.been.calledWith(
      correlationId,
      match.func,
      match.func
    )
  })

  it('called channel.consume with the right values', () => {
    expect(channel.consume).to.have.been.calledWith(
      replyTo,
      message,
      { noAck: true }
    )
  })

  it('called channel.sendToQueue with the right values', () => {
    expect(channel.sendToQueue).to.have.been.calledWith(
      name,
      Buffer.from(JSON.stringify([param])),
      { correlationId, replyTo }
    )
  })
})
```

Then I could refactor the original invoke function to be much simpler:

```javascript
const invoke = async (name, ...params) => {
  if (!channel) throw new Error(QUEUE_NOT_STARTED)
  const queue = await channel.assertQueue('', { exclusive: true })
  const correlationId = v4()
  const replyTo = queue.queue
  const invocation = invoker(correlationId, channel, replyTo)
  return invocation(name, params)
}
```

This is much easier to test.

```javascript
describe('invoke', () => {
  const invocation = stub()

  context('before the delegator was started', () => {
    before(() => {
      delegator = makeDelegator({ exchange })
    })

    after(resetHistory)

    it('throws QUEUE_NOT_STARTED', () =>
      expect(delegator.invoke())
        .to.be.rejectedWith(QUEUE_NOT_STARTED))
  })

  context('after the delegator was started', () => {
    const name = 'some name'
    const param = 'some param'

    before(async () => {
      queue = fakeQueue()
      delegator = makeDelegator({ exchange })
      channel = fakeChannel()
      connection = fakeConnection()
      channel.assertQueue.resolves(queue)
      connection.createChannel.resolves(channel)
      amqplib.connect.resolves(connection)
      await delegator.start()
      invocation.resolves()
      invoker.returns(invocation)
      await delegator.invoke(name, param)
    })

    after(resetHistory)

    it('called channel.assertQueue with the right params', () => {
      expect(channel.assertQueue).to.have.been.calledWith('', {
        exclusive: true
      })
    })

    it('called the invoker with the right params', () => {
      expect(invoker).to.have.been.calledWith(
        correlationId,
        channel,
        queue.queue
      )
    })

    it('called invocation with the right params', () => {
      expect(invocation).to.have.been.calledWith(name, [param])
    })
  })
})
```

I then applied the same sorts of decomposition to the other parts of the code that were previously very hard to test.
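The extract-and-curry pattern generalises well: any inline callback buried inside a promise or event subscription can be pulled out into a small named function that takes resolve (or whatever else it closes over) as an argument. A minimal sketch of the before-and-after shape, with illustrative names that are not from the library:

```javascript
// Before: the handler logic is trapped inside the subscription, so it
// can't be unit tested without driving the whole event source.
const subscribeInline = emitter =>
  new Promise(resolve => {
    emitter.on('data', chunk => resolve(chunk.toString().trim()))
  })

// After: the handler is a named, curried function that can be tested
// directly by passing in a stub for resolve.
const makeDataHandler = resolve =>
  chunk => resolve(chunk.toString().trim())

const subscribeExtracted = emitter =>
  new Promise(resolve => {
    emitter.on('data', makeDataHandler(resolve))
  })

module.exports = { makeDataHandler, subscribeExtracted }
```

The behaviour is identical, but the extracted handler can now be exercised with every input a mutation tool might probe, without any event machinery in the test.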

Conclusion

By adding mutation testing I was forced to go back and make my code more amenable to being tested, and as a consequence I’ve made it much more modular and much easier to reason about. The result is without a doubt better code, even though it actually works no better than the pre-mutation testing code.

The benefits are such that over the last couple of weekends I went back over all of the open-source code bases I maintain and added mutation testing to all of them, and fixed up all of the issues that the mutation testing revealed.

Links