Fun Stuff

seriousness-free school zone

Geoff Hinton Facts

In Jan 2012, I made a post on Google+ reproducing a dialog that occurred between Geoff Hinton and Radford Neal, while Radford was giving a talk at a CIfAR workshop in 2004:

- Radford Neal: I don't necessarily think that the Bayesian method is the best thing to do in all cases...

- Geoff Hinton: Sorry Radford, my prior probability for you saying this is zero, so I couldn't hear what you said.

Vincent Vanhoucke sensed that this could be the start of a Geoff Hinton Facts meme. So, here we go:

Geoff Hinton doesn't need to make hidden units. They hide by themselves when he approaches.

Geoff Hinton doesn't disagree with you, he contrastively diverges (from Vincent Vanhoucke).

Deep Belief Nets actually believe deeply in Geoff Hinton.

Geoff Hinton discovered how the brain really works. Once a year for the last 25 years.

Markov random fields think Geoff Hinton is intractable.

If you defy Geoff Hinton, he will maximize your entropy in no time. Your free energy will be gone even before you reach equilibrium.

Geoff Hinton can make you regret without bounds.

Geoff Hinton can make your weight decay (your weight, but unfortunately not mine).

Geoff Hinton doesn't need support vectors. He can support high-dimensional hyperplanes with his pinky finger.

A little-known fact about Geoff Hinton: he frequents Bayesians with prior convictions (with thanks to David Schwab).

All kernels that ever dared to approach Geoff Hinton woke up convolved.

Most farmhouses are surrounded by nice fields. Geoff Hinton's farmhouse lies in a hyper-plain, surrounded by a mean field, and has kernels in la grange.

The only kernel Geoff Hinton has ever used is a kernel of truth.

After an encounter with Geoff Hinton, support vectors become unhinged and suffer optimal hyper-pain (with thanks to Andrew Jamieson).

Geoff Hinton's generalizations are boundless.

Geoff Hinton goes directly to third Bayes.

Never interrupt one of Geoff Hinton's talks: you will suffer his wrath if you maximize the bargin'.

Bayesian Sociology

Bayesians are the only people who can feel marginalized after being integrated. Considering the evidence a posteriori, my belief is that I should have conjugated this sentence with the conditional tense, in order to multiply the prior approval of the reader.

The Psychology of Miles per Gallon

Most countries in the world measure the fuel consumption of their cars in liters per 100 kilometers. The US, practically alone in the world (a very frequent state of affairs when it comes to measurement units), measures it in Miles per Gallon (MPG). You see, measuring fuel consumption backwards (distance traveled per unit of fuel), instead of in the proper way (fuel consumed per unit distance) allows marketers to call this fuel economy instead of fuel consumption. It's practically as if you actually save fuel by driving more.

Where does MPG come from? Could it be a gimmick fomented by clever marketing executives in Detroit? Perhaps the choice of unit reflects a cultural difference in attitudes toward consumption between Americans and (say) Europeans. By measuring fuel consumption in liters (or gallons) per 100 kilometers (or 100 miles), European drivers can easily answer questions like "I must drive 1000 km per month, how much will that cost me?": just multiply the distance by the liter/km rating. By measuring it in miles per gallon, American drivers would have a hard time answering this type of question (it would involve a division). Instead, they can easily answer the following type of question: "I have 10 Gallons, how much can I drive?": just multiply the fuel quantity by the MPG.

The European-style unit has a utilitarian/parsimonious flavor to it: "I don't want to drive more than necessary. This is how far I must drive. Tell me how much it will cost me".
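For the numerically inclined, the two kinds of questions above can be sketched in a few lines of Python. This is only an illustration: the function names, the 8 L/100 km rating, and the 25 MPG rating are made up for the example and do not come from any real car or library.

```python
# Hypothetical helpers illustrating the two unit conventions.
# All names and sample ratings are assumptions made for this sketch.

def fuel_needed(distance, consumption_per_100):
    """Consumption-style rating (liters per 100 km, or gallons per 100 miles):
    the fuel for a planned distance is a simple multiplication."""
    return distance * consumption_per_100 / 100.0


def range_on_fuel(fuel, miles_per_gallon):
    """Economy-style rating (MPG): the distance you can cover
    with a given amount of fuel is a simple multiplication."""
    return fuel * miles_per_gallon


def fuel_needed_from_mpg(distance, miles_per_gallon):
    """With an MPG rating, the fuel for a planned distance requires a division."""
    return distance / miles_per_gallon


def mpg_to_gallons_per_100_miles(miles_per_gallon):
    """The two ratings are reciprocals of each other, up to the factor of 100."""
    return 100.0 / miles_per_gallon


if __name__ == "__main__":
    # "I must drive 1000 km per month": assumed rating of 8 liters per 100 km.
    print("liters per month:", fuel_needed(1000, 8.0))
    # "I have 10 Gallons": assumed rating of 25 MPG.
    print("miles on 10 gallons:", range_on_fuel(10, 25))
    # The first question asked of an MPG rating needs the division.
    print("gallons for 1000 miles:", fuel_needed_from_mpg(1000, 25))
    print("gallons per 100 miles at 25 MPG:", mpg_to_gallons_per_100_miles(25))
```

Each unit makes one of the two questions a multiplication and the other a division, which is the whole point of the comparison.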
The American-style unit, by contrast, has a spend-as-much-as-you-have/consumerist flavor: "I don't really need to drive anywhere, but I have this much money/gas to spend, tell me how far I can go with that."

I have been wondering about this strange choice of units ever since I moved to the US. A paper by two economists from Duke University in the June 20, 2008 edition of Science Magazine confirms that using MPG instead of GPHM (Gallons per 100 miles) totally confuses most Americans about the best ways to save fuel. Here is an example: would you save more fuel by switching from a 15MPG truck to a 30MPG sedan, or by switching from a 30MPG sedan to a 60MPG hybrid? Let's translate MPGs to GPHM: 15 MPG = 6.67 GPHM; 30 MPG = 3.33 GPHM; and 60 MPG = 1.67 GPHM. By switching from the 15MPG truck to the 30MPG car, you will save 3.33 Gallons per 100 miles driven. By switching from the 30MPG car to the 60MPG hybrid, you will save 1.67 Gallons per 100 miles driven. In other words, if you drive a fixed number of miles, regardless of the car you own, you would save twice as much by ditching your truck for a sedan as by ditching your sedan for a hybrid.

When your ultimate objective function is Dollars, it seems strange to use a hyperbolic (inverse) mapping in which the Dollars (or Gallons) are in the denominator. The Duke paper actually suggests using Gallons per 10,000 miles, probably because it would produce whole numbers, and would not be confusing for a large segment of the US population that is still unfamiliar with the decimal notation.

Arguably, the best unit would be Gallons per hour, because most Americans measure driving distances in hours:

- "how far are we from New York City?"

- "About 2 hours"

- "I meant, how many miles?"

- "Oh, I dunno."

The Theory of Everything

Here it is, the Theory Of Everything:

F(X)=0

...for suitable values of F, and suitable interpretations of X (your mileage may vary). You can come up with another theory, but it will merely be a special case of this one.

Terrorized

I am everything the religious right despises: a scientist, an atheist, a leftist (by American standards at least), a university professor, and a Frenchman. Good thing Pat Robertson doesn't declare fatwahs. Oh wait.....

No, Your Name can't possibly be pronounced that way

This story illustrates how distinguished physicists, even those who, like Murray Gell-Mann, won the Nobel Prize for the Quark theory, can make mistakes like you and me (mostly you).

I have heard it all. Since moving to the US, I have heard my name pronounced in all kinds of interesting ways: "Yawn Lee Koon", "Yen Leh Kahn", "Yan Lee Chun". All kinds of badly programmed computers thought that "Le" was my middle name. Even the Science Citation Index knew me as "Y. L. Cun", which is one of the reasons I now spell my name "LeCun". Telemarketers call my home asking for a "Mr or Mrs Cun", to which I respond "there is no one by that name here" and hang up. The confusing morphology of my name has some advantages, like receiving colorful junk mail in Chinese or Vietnamese, receiving telemarketer offers for cheap international phone service to China, and even being on the mailing list of AT&T employees with Asian origins.

So where does my name come from anyway, and how is it pronounced? My name is pretty typical of Bretagne (or Brittany as it is known in English), the western part of France that sticks out in the Atlantic Ocean. The indigenous population there has Celtic origins, just like the Irish, Welsh, and Scots. Some people still speak the local Celtic language called Breton. "Yann" is the Breton form of Ian/John/Jean/Jan/Johannes in Irish/English/French/Dutch/German.
The French pronounce it "Yahn", but the real Bretons pronounce it "Yawnn" with a short "awn" and a long "n". "Le Cun" derives from the old Breton form "Le Cunff", which means something like "nice guy", and originates from the region of Guingamp in northern Brittany.

Back to Murray Gell-Mann. I was once at a very interesting workshop on the Physics of Computation at the Santa Fe Institute. At the break, Murray Gell-Mann, who was sitting across the table from me, told me: "your name is Breton, isn't it?". Not realizing that his favorite hobby is linguistics, I complimented him on the breadth of his culture. He then asked "How do you pronounce your name?". I said "Le Cun" with the nasalized "un", which sounds kinda like the "uh" in "huh?" (in other words, the "n" is silent). He paused for a minute and said "There are no nasalized consonants in Breton", then added with the assertive tone of a Nobel-prize-winning Caltech professor "your name cannot possibly be pronounced that way".

Many scientists (myself included) take a sadistic pleasure in proving other people wrong, but here he was telling me how to pronounce my own name. I was so flabbergasted by so much chutzpah (pardon my French) that whatever I knew about the Breton language was temporarily obliterated from my cerebral cortex. I just sat there for a while with my jaw dropped on the floor. The only response I could come up with was "uh, my grandfather pronounces it that way, and uh, he can speak some Breton, so it must be right."

Not only are there nasalized consonants in Breton, Professor Gell-Mann, but there is a veritable deluge of them. In fact, you would have a hard time finding another language with so many nasalized sounds. For a minute, I was impressed though.

Incidentally, be advised that Murray Gell-Mann gets upset if you do not pronounce his name Gell Mahnnn (with a hard "g" and a long "ah"), but pronounce it Gelmin, or Jelmin instead.

Shakespeare and Bayes are in a Boat...
Shakespeare and Bayes are in a boat, fishing. Bayes is trying to figure out which net to cast when Shakespeare says:

"loopy or not loopy? that is the question".

[inside joke for Bayesian Net geeks]

Who is Tex Avery Anyway?

This story was written in 1998. Since then, the appearance of zone-free DVD players, digital TVs with component inputs, and low-cost video projectors has made it much easier to play movies from across the Big Pond. Funnily enough, although the (almost) complete collection of Tex Avery cartoons appeared on DVD in 2001, it is only marketed in France (go to fnac.fr or Amazon.fr and search for "tex avery"), and is not available in the US (see Amazon.con). Sadly, the DVD edition has been censored for political correctness (the VHS version was untouched).

French people are generally known for their utter contempt of every product of the American culture ("or lack thereof", as my friend John Denker would say with a smile). But there are two notable exceptions to this attitude, two pure products of the American culture that the French have embraced wholeheartedly (and no, one of them is not Jerry Lewis): Jazz music, and Tex Avery cartoons.

"Who is Tex Avery anyway?" you may ask. Well, consult your local Frenchie. All the frogs know who he is (some local folks who happen to watch the Cartoon Channel on Sunday nights might also know). Tex Avery was the most inventive American creator of animated cartoons in the 40's and 50's. He practically invented all the ideas, the gags, and the style for all the characters we see today in non-Disney cartoons (Bugs Bunny, Road Runner, Tom and Jerry, ...). But his cartoons were not made only for kids, and sometimes not for kids at all. His relative anonymity is the result of political correctness, Disneyist revisionism, or the fact that in the Hollywood film business, an auteur recognized as such, especially when it comes to cartoons, is as much of a foreign concept as it is a foreign word.

At this point, your next question will be, "Where can I get those wonderful cartoons in the US?".

It's quite simple really:

Take the next flight to Paris

go to the FNAC superstore and buy the complete collection of Tex Avery cartoons on VHS (8 videocassettes).

back in the US, put the first tape in your VCR.

realize with horror that the tapes are in SECAM, the French TV standard, and not in your beloved NTSC.

go on the Web, and order a multi-standard VCR for $3,345,653

argh, the VCR can play NTSC, PAL, middle east SECAM, russian SECAM, but NOT FRENCH SECAM.

return the VCR, and order a "worldwide" model for $7,456,234. Realize that you could have bought a multi-system VCR in Paris for a fraction of the price.

Aaaarrgh, it has a European power plug, a funny antenna connector, and is not cable ready. Plug it in your TV anyway.

AAAWWAAAARGH, the VCR spits out SECAM, but your TV only accepts NTSC.

spend the next five years foraging electronics stores to realize in the end that multi-system TVs larger than 25 inches are virtually impossible to find in the US.

tell your story to your European acquaintances. Hear them respond that it's just another example of America acting like the rest of the world doesn't exist.

ridicule their un-American remarks by responding that the largest American TV manufacturer (RCA/ProScan) is actually a subsidiary of Thomson Multimedia, a French conglomerate whose largest shareholder is the French government, yet they do not offer their multi-system TVs on the US market.

draw the logical conclusion that there must be no demand for multi-system TVs in the US, and therefore that it is American consumers who act like the rest of the world does not exist.

get $6,000 out of your purse, and buy a professional LCD video projector, since most of them can do SECAM.

draw the logical conclusion that Corporate America, for whom these projectors are built, does know about the rest of the world.

smash your head on the wall after realizing that you could have bought a SECAM to NTSC digital converter for slightly less money.

watch the cartoons and laugh yourself to death at the sight of Adolf Wolf, Screwy Squirrel, Droopy, and Bad-Luck Blackie. You need a good laugh, after all the time and money you just wasted. Hey, the tapes even have French subtitles!

Steep Learning Curves and other erroneous metaphors

Interesting peculiarities of American English never cease to amuse non-native speakers like me. Many languages are full of idiotic idioms, but I find two idioms widely used in Corporate America particularly amusing because they reveal major misconceptions about simple mathematical concepts.

Have you ever wondered where the expression "steep learning curve" comes from? The first time you heard it was most likely from a suit-wearing business person talking about a new and intimidating piece of software he/she knew nothing about, as in: "I heard this Linux thing is cool but has a steep learning curve." What he/she meant by that was that learning to use this software was a long and arduous process.

Now folks, hear me loud and clear: steep learning curves are your friends. Take my word for it, I am a machine learning guy, I know what I'm talking about. Take the picture on the right. These are what we, learning folks, call learning curves. They represent skill, accuracy, or performance at a particular task (say as percentage of success) as a function of time. Here, the red curve corresponds to quick learning, and the blue curve to slower learning. Oddly enough, the steep curve is the red curve, the one that corresponds to a quickly learned task.

Behind the expression "steep learning curve" lies an everyday metaphor erroneously applied to a mathematical concept. The metaphor is that the poor learner must somehow "climb" the learning curve. So, steeper must be harder. But it's not like that at all. Learning curves are more like a mountain with a ski lift: you get pulled at constant horizontal speed. The steeper it is, the sooner you get to the top.
So next time a suit tells you about a technology with a "steep learning curve", you can reply "that means even you could learn it quickly".

The second amusing expression with an erroneous mathematical metaphor is "least common denominator". It designates something (generally a movie, TV show, or political agenda) that is sufficiently unsophisticated to be appealing to a large majority of the population. In other words, it is the most complex thing that everyone can understand.

Now arguably, the mathematical concept that best fits the metaphor would not be the least common denominator, but rather, the greatest common divisor (GCD). As any 5th grader could tell you (to which Groucho Marx would reply "get me a 5th grader") the GCD is the largest quantity by which you can divide a collection of numbers and still get whole numbers as a result. When two numbers have no divisors in common, the GCD is 1. When two people have no common cultural reference, their metaphorical GCD is small. Now, the problem is that having the word "greatest" as part of a derogatory expression somehow does not sound right. Hence the use of "least common denominator" which is metaphorically inadequate, but simply sounds right.

Vladimir Vapnik, Cosmic Conqueror

Warning: this is an inside-inside joke.

OPERATOR: MAIN SCREEN TURN ON.

CAPTAIN: IT'S YOU !!

VVCC: HOW ARE YOU GENTLEMEN !!

VVCC: ALL YOUR BAYES ARE BELONG TO US.

VVCC: YOU ARE ON THE WAY TO DESTRUCTION.

CAPTAIN: WHAT YOU SAY !!

VVCC: YOU HAVE NO CHANCE TO SURVIVE MAKE YOUR TIME.

VVCC: HA HA HA HA ....

If you get the joke, you probably belong to the intersection of two very exclusive subcultures: statistical learning theory scholars and classic computer game history scholars. If you don't get the joke, visit the official ALL YOUR BASE ARE BELONG TO US website. Then, learn about Vladimir Vapnik's approach to research.

Cheap Philosophy (42 cents)

Even the smartest people around think they are not smart enough to understand the meaning of life, the universe, and everything. Yet many not-so-smart people think they are smart enough to have the answer to the big question of life, the universe, and everything. And they don't even know it's 42.

A Mathematical Theory of Empty Disclaimers

"All copyrights and trademarks are property of their respective owners".

Bleeding-edge research in a new branch of mathematics (the application of non-standard elliptical analysis to information theory), has recently proved that the above sentence carries exactly 0 bits of information. Researchers in yet another field of applied mathematics (applications of non-standard elliptical analysis to decision making) have conjectured that only the fear of lawsuits will drive people to write sentences with provably zero information content.

The Axis of Rivals

March 2003: Rupert Murdoch has mounted a pretty nasty campaign of French bashing and jingoistic name-calling through his various rags on both sides of the Atlantic. Following Bush's use of the phrase "Axis of Evil", Murdoch pun-ished France and Germany by calling them the "Axis of Weasels", apparently stealing the joke from an Internet blog pun-dit. I don't know much about weasels, but I know a bit about bad puns.
So, let's start the Pun-ic war against le calembourage de crane:

Q: What do you call an exhibition of French and German paintings?
A: The axis of easel.

Q: What do you call simultaneous anti-war protests in Paris and Berlin?
A: the axis upheaval.

Q: What do you call simultaneous pro-war demonstrations in Washington and London?
A: the axis of fizzle.

Enough bad puns! (The Geneva convention against torture, and the US constitutional protection against cruel and unusual pun-ishments forbid me to write more than three atrocious puns in a row).

For my part, name-calling notwithstanding, and despite the seemingly astronomical distance along the "Axis of Rivals" between Washington and Paris these days, I want to believe that the imaginary line that links the Statue of Liberty in New York harbor to the banks of the Seine river and the Champ de Mars in Paris is still very short, at least in the minds of the people, if not the minds of their leaders. What is this imaginary line called? The Axis of Eiffel, bien sûr. Gustave Eiffel designed the Eiffel Tower, and designed the supporting structure of the Statue of Liberty, the most prominent symbol of Franco-American friendship and shared values.

Is Yann LeCun Philippe Kahn's Evil Brother?

Several years ago I ran into a friend from college whom I had not seen in a long while. When he saw me, his first words were "Philippe Kahn! you look just like Philippe Kahn". I knew who Philippe Kahn was (founder of Borland and all), but I did not know what he looked like, until I read an article about him recently. He and I do look somewhat alike (from a distance), but as I discovered, ze similarities don't stop zere.

Consider zis: both of us are French computer scientists who emigrated to ze US in our late 20's. Our favorite activities are hacking, racing sailboats, and playing Jazz on wind instruments. It is kind of suspicious zat our fazers are both aerospace engineers, but his dad worked on ze Concorde and mine on ze Airbus.
So you see, we can't be brozers. I did change ze spelling of my name when I moved to ze US, but I changed it to "LeCun" from "Le Cun", not from "Le Kahn". Ze fact zat his philanthropic organisation is called ze Lee Kahn foundation is purely coincidental.

Now for ze fun part. My latest project DjVu has to do with compression technology for distributing images over ze web (using wavelets and ozer things). His latest project SurfLight has to do with compression technology for distributing images over ze wireless web (using wavelets and ozer things). How's zat for a coincidence?

Zere are a few differences between us. His TurboPascal gazzered a few million users, while Lush, ze Lisp-like language I helped create has (so far) a couple thousand users (yeah, but what users!). At least Lush has more users today zan TurboProlog. Anozer difference is zat I made a name for myself in Science while he made his in ze high tech business. In ozer words, I have ze recognition of my peers (some of zem anyway ... I hope), while he only has a few million bucks, and ze hate of Bill G.

So you see, reports of a connection between us are greatly exaggerated. Besides, we never met.