Data Abstraction and Problem Solving with C++

Walls and Mirrors. By Frank M. Carrano, Timothy Henry, Janet Prichard.

Amazon link to the book (you can find the same book with examples in Java).

Novice programmers often struggle to understand data structures and algorithms. Experienced developers suggest reading Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein (usually referred to as CLRS).

Many newcomers, and experienced coders who realized early enough that fundamental algorithms and data structures are a must for a successful programming career, have struggled with “hard” books like Introduction to Algorithms (though the title suggests otherwise).

One of the main reasons programmers find algorithms hard to understand is the difficulty of measuring the complexity of an algorithm (or of a data structure’s operations). Another difficulty for beginners is the variety of data structures and the pitfalls in their implementations.

I found “Data Abstraction and Problem Solving with C++” when I was searching for great introductory material that doesn’t leave the reader with just the easy stuff but dives into rather complex topics with an easy-to-follow approach. This book is a pearl for those of you who have, or have had, a similar problem. It introduces data structures and algorithms in a way that is both engaging and fruitful. Most of the topics are discussed in great detail, so even experienced programmers will find it truly attractive.

Why are algorithms and data structures so damn popular?

Because of companies like Google, Facebook, and the like. They tend to ask algorithmic problems in technical interviews. Strong problem-solving skills are a must for engineers working at big companies (and now at small companies, too).

Too often, people (sometimes I refer to developers as people) complain about the problems these companies (mostly the big ones) ask during interviews. Guys writing successful or famous applications get rejected from Google, Facebook, or others because they don’t know how to invert a binary tree.

These companies face challenging engineering problems every day, starting from planning and managing cheap and fast hardware resources and ending with loading the content on the end-user’s device in the fastest way. You might have written a couple of successful mobile or desktop applications, but the approach you choose for solving particular problems in the application defines your level as a programmer. For example, you can design an email client that shows the ten most recent emails [on its Home screen, I guess] and it could have the best UI out there; displaying ten recent emails will work smoothly on almost any device, trust me.

Now, let’s suppose the user of your email application receives a hundred thousand emails in, say, 2 years of using your application. When the user needs to search for a particular email (by subject or content or whatever), that’s where your skills may play a significant role. The way you stored those hundred thousand emails and the methods (algorithms) you used to sort and search them will now make you a rockstar developer or a hotdog beggar in the eyes of your end-user. Why so? Suppose you store all the incoming emails in an array.

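#include <chrono>
#include <string>

struct Email {
    std::string subject;
    std::string body;
    std::string from;
    std::chrono::system_clock::time_point when;  // timestamp of the email
};
...
// let's suppose a million emails is the max for anyone
const int MAX_EMAILS = 1000000;
Email inbox[MAX_EMAILS];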

We can store the ten most recent emails in any form without affecting the performance of the application. Issues arise when we try to manipulate thousands of emails stored in the inbox array. What if we want to search for the word “friend” in all emails? We have to scan all the emails in the array and collect the ones containing the word “friend” in a separate array.

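#include <vector>

std::vector<Email> search(const std::string& word) {
    std::vector<Email> search_results;
    // scan every slot of the inbox array
    for (int i = 0; i < MAX_EMAILS; ++i) {
        // std::string::contains requires C++23
        if (inbox[i].subject.contains(word)) {
            search_results.push_back(inbox[i]);
        }
    }
    return search_results;
}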

Obviously, the running time of this algorithm grows linearly with the number of emails, which is far from ideal: if the number of emails is N, then the search has to examine all N of them, because every email must be checked in order to collect all the matches. On top of that, each check is itself a search through the subject line of the email (the contains() call), so its cost grows with the length of the subject. If the search function also looked for matches in the email body, the running time would rise even more. So if examining one email takes a millisecond, then searching among one hundred thousand emails will take more than a minute (100 seconds, to be exact). I guess you’ve never waited that long when searching for emails in any of your email clients. This is where the proper use of data structures and algorithms comes in.
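To make the linear cost concrete, here is a minimal sketch of the same scan extended to the email body. It assumes the inbox is held in a std::vector rather than a fixed array, drops the timestamp field for brevity, and uses hypothetical names (linear_search, emails_examined) that are not from the book; the counter exists only to show that every email is visited.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified email record; the timestamp field is omitted for brevity.
struct Email {
    std::string subject;
    std::string body;
    std::string from;
};

// Linear scan: visits every email once, so the work grows with inbox.size().
// `emails_examined` is a hypothetical counter that makes that cost visible.
std::vector<const Email*> linear_search(const std::vector<Email>& inbox,
                                        const std::string& word,
                                        std::size_t& emails_examined) {
    std::vector<const Email*> results;
    emails_examined = 0;
    for (const Email& email : inbox) {
        ++emails_examined;
        // std::string::find is itself a scan over the characters of the text.
        if (email.subject.find(word) != std::string::npos ||
            email.body.find(word) != std::string::npos) {
            results.push_back(&email);
        }
    }
    return results;
}
```

Whatever the search term is, the counter ends up equal to the number of emails in the inbox, which is exactly the N operations described above.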

Just to make the idea clear, let’s suppose we process each incoming email’s subject and body and store each word as a key in a hashtable. The hashtable slot for a word will refer to the list of emails that contain that word. When the user searches for any word, for example, “friend”, we return the list referred to by the slot of the hashtable with the “friend” key.

search(word) {
    return hashtable[word];
}

Looking up a key in a hashtable is, on average, a constant-time operation, and in computer science that’s considered the ultimate goal of efficiency. In simple words, searching among emails would take less than a second.

Now imagine that you have a great startup that produces some fascinating application or provides an internet service. Would you hire someone who hasn’t mastered the art of problem-solving, or doesn’t know when and how to use a hashtable (or other data structures)?
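The hashtable idea can be sketched with std::unordered_map. This is only an illustration under simplifying assumptions: the WordIndex class and its member names are hypothetical, emails are identified by their position in the inbox, and words are split on whitespace only, so punctuation and letter case are not normalized.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical word index: maps each word to the inbox positions
// of the emails that contain it.
class WordIndex {
public:
    // Index one email's text under its inbox position.
    // Assumes all text for one email is added before moving to the next.
    void add(std::size_t email_id, const std::string& text) {
        std::istringstream words(text);
        std::string word;
        while (words >> word) {
            std::vector<std::size_t>& ids = index_[word];
            // Avoid listing the same email twice when a word repeats.
            if (ids.empty() || ids.back() != email_id) {
                ids.push_back(email_id);
            }
        }
    }

    // One hash lookup, constant time on average, no matter how many
    // emails have been indexed.
    std::vector<std::size_t> search(const std::string& word) const {
        auto it = index_.find(word);
        if (it == index_.end()) return {};
        return it->second;
    }

private:
    std::unordered_map<std::string, std::vector<std::size_t>> index_;
};
```

The price of the fast lookup is paid up front: every incoming email has to be split into words and indexed, and the index takes extra memory. Copying the list of matches out of the index still takes time proportional to the number of matches, but not to the size of the inbox.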

Now let’s explore one of the best books for studying algorithms and data structures.

Mmm… Algorithms