Exercism: The RNA Transcription Exercise

The Readme

The Test Suite

As with the Gigasecond exercise, it doesn’t take much to get this to pass. The example solution is:

1 2 3 4 5 6 7 8 9 class Complement def self . of_dna ( strand ) strand . tr ( 'CGTA' , 'GCAU' ) end def self . of_rna ( strand ) strand . tr ( 'GCAU' , 'CGTA' ) end end

(An aside: Did you know that Exercism has example solutions in their Git repo? I did not. I was wondering why 40% of people’s solutions looked exactly the same. Copying and pasting ain’t learning, folks)

I, personally, have never seen tr in the wild, so that was not my first solution. Mine was super dumb, and much longer.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 class Complement def self . of_dna ( strand ) ret = "" strand . each_char { | x | ret << find_dna_complement_of ( x ) } ret end def self . of_rna ( strand ) ret = "" strand . each_char { | x | ret << find_rna_complement_of ( x ) } ret end def self . find_dna_complement_of ( nucleotide ) case nucleotide when 'C' 'G' when 'G' 'C' when 'T' 'A' when 'A' 'U' end end def self . find_rna_complement_of ( nucleotide ) case nucleotide when 'C' 'G' when 'G' 'C' when 'U' 'A' when 'A' 'T' end end end

My commit message talks a bit about the duplication in this code and I try to highlight the knowledge duplication that I want to remove in my next commit. Which I did, thusly:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 #... def self . of_dna ( strand ) build_complement_for ( strand , "dna" ) end def self . of_rna ( strand ) build_complement_for ( strand , "rna" ) end def self . build_complement_for ( strand , type ) ret = "" strand . each_char { | x | ret << public_send ( "find_ #{ type } _complement_of" . to_sym , x ) } ret end #...

Better. But a giant red flag just went up — a parameter named type . That almost always means one thing: I’m trying to implement polymorphism without using classes. Rarely a good idea. Here come the RNA and DNA classes:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 class Complement def self . of_dna ( strand ) build_complement_for ( strand , DNA ) end def self . of_rna ( strand ) build_complement_for ( strand , RNA ) end def self . build_complement_for ( strand , type ) ret = "" strand . each_char { | x | ret << type . new ( x ) . complement } ret end end class RNA attr_accessor :nucleotide def initialize ( nucleotide ) self . nucleotide = nucleotide end def complement case nucleotide when 'C' 'G' when 'G' 'C' when 'U' 'A' when 'A' 'T' end end end class DNA #...same as RNA except for the complement return values end

Ok, but why don’t RNA and DNA know anything about strands? That seems like something they should know.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 class Complement def self . of_dna ( strand ) DNA . new ( strand ) . complement end def self . of_rna ( strand ) RNA . new ( strand ) . complement end end class RNA attr_accessor :strand def initialize ( strand ) self . strand = strand end def nucleotides strand . chars end def complement nucleotides . inject ( "" ) do | ret , nucleotide | ret << complement_of ( nucleotide ) end end #... end #...

And then we can make it more Ruby-like by using its to_* idiom, creating the to_dna and to_rna methods

Also, here I’m moving the resonsibility of knowing about complements. Previously, DNA would know what its RNA complement was. Why? DNA should know DNA complements and RNA should know RNA complements. If DNA wants to know about RNA, it should ask RNA.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 class Complement def self . of_dna ( strand ) DNA . new ( strand ) . to_rna end def self . of_rna ( strand ) RNA . new ( strand ) . to_dna end end class RNA attr_accessor :strand def self . complement_for ( nucleotide ) { "C" => "G" , "G" => "C" , "T" => "A" , "A" => "U" } [ nucleotide ] end #... def to_dna nucleotides . inject ( "" ) do | ret , nucleotide | ret << DNA . complement_for ( nucleotide ) end end end class DNA #... you can imagine what this looks like end

Finally, I tackle the obvious problem of inheritance. DNA and RNA are both examples of nucleic acids and their behavior is almost exactly the same. I’m comfortable with using inheritance here because I don’t see any of the common inheritance problems arising. This is a shallow, narrow object family; and there aren’t going to be weird grand-children classes or partial API implementations. My final solution to the problem:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 class Complement def self . of_dna ( strand ) DNA . new ( strand ) . to_rna end def self . of_rna ( strand ) RNA . new ( strand ) . to_dna end end class NucleicAcid attr_accessor :strand def initialize ( strand ) self . strand = strand end def nucleotides strand . chars end def transcribe_to ( acid ) nucleotides . inject ( "" ) do | ret , nucleotide | ret << acid . complement_for ( nucleotide ) end end def to_dna raise StandardError , "Call this on descendants" end def to_rna raise StandardError , "Call this on descendants" end def self . complement_for ( nucleotide ) raise StandardError , "Call this on descendants" end end class RNA < NucleicAcid def self . complement_for ( nucleotide ) { "C" => "G" , "G" => "C" , "T" => "A" , "A" => "U" } [ nucleotide ] end def to_dna transcribe_to ( DNA ) end def to_rna self end end class DNA < NucleicAcid def self . complement_for ( nucleotide ) { "C" => "G" , "G" => "C" , "U" => "A" , "A" => "T" } [ nucleotide ] end def to_rna transcribe_to ( RNA ) end def to_dna self end end

My one hesitation here was having a to_dna method on DNA and a to_rna method on RNA. These methods have to be there to satisfy Liskov, but I wondered if they were necessary. However, Ruby actually has a lot of methods like this. For example, String instances respond to to_s , and Integers respond to to_i . Realizing that made me a lot more comfortable with my approach.

As has been true in all of my Exercism solutions, this code goes far beyond what it needs to do in order to get the tests to pass. It’s also about 60 lines longer than Exercism’s own example solution. Is there value in this verbosity? That is not a question with a single answer. If I were writing this code to help me pass a Biology 101 class, then no. If I were writing it for use in a synthetic biology lab that is creating their own nucleic acids? Then maybe.

But I’m writing it for Exercism (and for these blog posts). Exercism wants me to, “Make the tests pass. Keep the code as simple, readable, and expressive as you can.” And it advises nitpickers to make suggestions that make the code:

Simple Readable Maintainable Modular

Those 4 rules are pretty close to the “4 Rules of Simple Design”, as stated by Corey Haines

Tests pass Express Intent No duplication of knowledge Small

Or, if you prefer Sandi Metz’s acronym, code that is TRUE

Transparent Reasonable Usable Exemplary

These descriptions of “good design” (or “better design”, if you’re Corey Haines) are all different words to describe code that people have found easy to work with over a long period of time. Because design doesn’t matter if you just want to run the code once. If you want to do that, just slap in Exercism’s example code and move on. Plenty of people do.

But if you’re trying to design better code, then you need to look at more than just making a few tests pass. The questions I like to ask myself are:

Will I understand this code if I look at it in 6 months? Will my co-workers understand this code when they have to fix it? Will future maintaners be able to extend this code with very little hassle? Will other teams be able to extend this code without problems?

The first question affects just me, the second affects 3-5 people, the third affects dozens and the fourth affects an uknown number of people. Thinking about the number of people that will be able to easily understand/use/modify your code is a usefuly way of thinking about design. Poorly designed code will not satisfy many people; well desgined code will.

So, with that in mind, let’s circle back to the original question. Is there value in my Exercism solution? I obviously think so, but I’m biased. I’ll leave the question for you to answer. Which code would you rather work with?