
You could use a method like eigenfaces (http://en.wikipedia.org/wiki/Eigenface). The following tutorial gives a good walkthrough of the procedure, along with links to different implementations:

http://www.pages.drexel.edu/~sis26/Eigenface%20Tutorial.htm
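As a rough illustration of the core eigenfaces step, here is a minimal sketch using NumPy. It assumes each image has already been cropped, converted to grayscale, and flattened into a row vector; the random array is only stand-in data, and the names `fit_eigenfaces` and `project` are made up for this example, not taken from the tutorial.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    """faces: (n_images, n_pixels) array of flattened grayscale images."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # The rows of vt are the principal directions of the centered data,
    # i.e. the "eigenfaces", ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def project(faces, mean_face, eigenfaces):
    """Describe each face by its weights on the eigenfaces."""
    return (faces - mean_face) @ eigenfaces.T

rng = np.random.default_rng(0)
faces = rng.random((20, 64))  # stand-in for 20 flattened 8x8 images
mean_face, eigenfaces = fit_eigenfaces(faces, n_components=5)
weights = project(faces, mean_face, eigenfaces)
```

Each face is then represented by a short weight vector instead of raw pixels, and that is what you would compare or feed to a model.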

From there, it is common to turn this into a classification problem: train a model on labeled faces, then predict new cases. You could train on a set of known celebrities and, if a face from Twitter is predicted to be one of the celebrities in your trained model, remove it. This is similar to http://blog.cordiner.net/2010/12/02/eigenfaces-face-recognition-matlab/
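One simple way to sketch that classification step is nearest-neighbour matching in eigenface space with a distance cutoff, so that faces far from every known celebrity are left alone. This is only an assumption about how you might set it up: the synthetic "faces", the labels, the function names, and the `max_distance` value are all hypothetical, and the linked post may use a different classifier.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    mean_face = faces.mean(axis=0)
    _, _, vt = np.linalg.svd(faces - mean_face, full_matrices=False)
    return mean_face, vt[:n_components]

def predict_celebrity(face, mean_face, eigenfaces, train_weights,
                      train_labels, max_distance):
    """Return the label of the nearest training face, or None if it is too far."""
    w = (face - mean_face) @ eigenfaces.T
    dists = np.linalg.norm(train_weights - w, axis=1)
    nearest = int(np.argmin(dists))
    return train_labels[nearest] if dists[nearest] <= max_distance else None

# Synthetic stand-in data: three "celebrities", five noisy images each.
rng = np.random.default_rng(0)
bases = np.zeros((3, 64))
for k in range(3):
    bases[k, k * 20:(k + 1) * 20] = 10.0
train_faces = np.repeat(bases, 5, axis=0) + rng.normal(0, 0.1, (15, 64))
train_labels = [name for name in ("celebrity_a", "celebrity_b", "celebrity_c")
                for _ in range(5)]

mean_face, eigenfaces = fit_eigenfaces(train_faces, n_components=2)
train_weights = (train_faces - mean_face) @ eigenfaces.T

# A new image of the second celebrity, and a face unlike any of them.
known = predict_celebrity(bases[1] + rng.normal(0, 0.1, 64), mean_face,
                          eigenfaces, train_weights, train_labels, 5.0)
unknown = predict_celebrity(np.zeros(64), mean_face, eigenfaces,
                            train_weights, train_labels, 5.0)
```

Here `known` comes back with a celebrity label (so you would remove that image), while `unknown` is `None` and would be kept. On real data the cutoff would need tuning.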

The drawback is that the model needs constant updating. Soon there will be a new Justin Bieber who won't be in your trained model, so you can't predict him. There are also cases like Whitney Houston: you may never have thought to add her before, but her picture may be common for a few weeks out of respect and admiration. You will not have the downside of baby pictures mentioned above, though.

To overcome these problems, you could use more of a hierarchical clustering approach: remove the first few clusters of very close images if they reach a certain level of support, for example if your first cluster has 15 items before a second one is constructed. Now you don't have to worry about who's in your training model, but you will fall prey to the baby pictures issue.
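A minimal sketch of the clustering idea, assuming faces have already been reduced to feature vectors (for example eigenface weights). It cuts a single-linkage hierarchy at a fixed distance and drops any cluster that reaches the support threshold. The choice of single linkage, the function names, and the `max_distance` value are assumptions for illustration; only the support of 15 comes from the example above.

```python
import numpy as np

def single_linkage_clusters(points, max_distance):
    """Cut a single-linkage hierarchy at max_distance.

    Two points share a cluster when a chain of neighbours, each no farther
    apart than max_distance, connects them.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    for i in range(n):
        for j in range(i + 1, n):
            if dists[i, j] <= max_distance:
                parent[find(i)] = find(j)
    return np.array([find(i) for i in range(n)])

def keep_mask(points, max_distance, min_support):
    """True for images to keep, False for members of large, tight clusters."""
    labels = single_linkage_clusters(points, max_distance)
    ids, counts = np.unique(labels, return_counts=True)
    popular = ids[counts >= min_support]
    return ~np.isin(labels, popular)

# Stand-in data: 15 near-identical "celebrity" images plus 10 distinct faces.
rng = np.random.default_rng(0)
duplicates = rng.normal(0, 0.01, (15, 5))
distinct = np.zeros((10, 5))
distinct[:, 0] = 10.0 * np.arange(1, 11)
points = np.vstack([duplicates, distinct])

keep = keep_mask(points, max_distance=1.0, min_support=15)
```

In this toy data the 15 near-duplicates are dropped and the 10 distinct faces are kept. A popular baby picture shared many times would be dropped the same way, which is the issue noted above.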