Goldman Sachs, Data Lineage, and Harry Potter Spells

Oh man…what a claim. I guess the job of the CEO is to rally the troops, but, really?

To me Goldman is the IBM of finance. Neither company has core values, which devalues their brands and lets them lurch from acquisition to “bold new idea” when all customers want is innovation or a good product at a good price. It is actually much more difficult to hold fast to core values and innovate around them than it is to make deals and buy other companies or technologies.

There is a section where Polya quotes a passage from Pappus of Alexandria. The point of the passage is to illustrate an approach to problem solving whereby you assume that what you’re trying to prove or find is in fact true. You do this to investigate the properties of the “object” you’re searching for, in a sort of working-backward approach. In effect, you start from the properties the proposed solution would have to have, rather than starting with the given data and moving forward to the end solution. This approach works well for proofs by contradiction, or what Polya calls Reductio ad Absurdum in the book.

“A wise man begins in the end, and a fool ends in the beginning.”

Translating ideas in English into mathematical notation is important, and also something I was particularly bad at. I didn’t know I was bad at it, though, until Polya spelled out that this translation was happening at all and that it was a necessary step in solving math problems. He compares translating an idea into mathematical notation to translating English into French. You can’t simply translate the English words one by one into French words, as you’re likely to miss some context. In the same way, you can’t always translate an idea directly into mathematical notation. Keep an eye out for the appropriate notation when translating, as it can be the difference between having an “a-ha” moment with a problem and missing it.

I happened upon this project through an interview on Software Engineering Daily about data engineering (in this case data lineage). I’d heard senior engineers discuss the idea and was curious to see how Airbnb was approaching the problem. The approach they’ve taken is to treat data exploration as a social media problem: users and knowledge holders of the data can connect, pin, and like data sources. Presumably, the more liked a data source is, the more useful it is. Also, if someone you’re working with has pinned a data source, it seems plausible that you would be interested in that data source too. It’s an interesting approach to the metadata problem, and it’s renewed my interest in experimenting with graph databases and PageRank (they’re using Neo4j for the project).
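To make the “more liked means more useful” idea concrete, here is a minimal sketch of PageRank by power iteration in plain Python. It is not Airbnb’s implementation and it doesn’t use Neo4j. The user names, table names, and the choice to model a pin or like as a directed edge from a user to a data source are all assumptions I made up for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start with rank spread evenly across every node.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is shared evenly,
        # so the total rank always stays at 1.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical interactions: each edge means "user pinned or liked data source".
edges = [
    ("alice", "bookings_table"),
    ("bob", "bookings_table"),
    ("carol", "bookings_table"),
    ("carol", "listings_table"),
]

ranks = pagerank(edges)
for name, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

In this toy graph the table that three people pinned ends up ranked above the table that only one person pinned, which is the intuition behind treating data discovery like a social network. A real system would presumably weight pins, likes, and query counts differently, but I’m guessing there.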

I’d heard a lot of talk about things like data lineage in college but had always cast them aside as pedantic ideas. It’s not until you work at a large company dealing with massive amounts of heterogeneous data, where the data is the business, that you realize how important a concept like this is.

This led me down a rabbit hole that I’ll likely continue to write about (see Lyft’s open source Amundsen and UC Berkeley’s Ground).

This content is for informational purposes only. You should not construe any such information or other material as legal, tax, investment, financial, or other advice. Nothing contained in this post constitutes a solicitation, recommendation, endorsement, or offer by myself to buy or sell any securities or other financial instruments in this or in any other jurisdiction in which such solicitation or offer would be unlawful under the securities laws of such jurisdiction.

The opinions expressed in this publication are those of the author. They do not purport to reflect the opinions or views of the author’s employers.

Don’t forget to follow me on Twitter, add me on LinkedIn, add me on Facebook, subscribe to my blog, or follow me on Medium for more content like this. I post a review of topics I find interesting every week. I’m always looking for new topics to cover, so if you have anything you find interesting and would like to discuss it with me, please reach out!