Exciting times for machine learning and artificial intelligence fans: IBM’s Watson computer versus two of the flesh-bound starting tonight, Feb 14, 2011, on the game show Jeopardy. And not just any humans, I guess, but Ken Jennings and Brad Rutter (I suppose the latter is less known to non-Jeopardy fanatics). Too bad they aren’t competing against Seth Lloyd (someone have a link to the game show Seth won?). Here is a nice Nova special on the machine and the challenge. I sure hope Arnold Schwarzenegger is in the audience so he can tell us whether Watson is a good Terminator or a bad Terminator.
Update: Here is an amusing curmudgeonly discussion of Watson by Noam Chomsky. Don’t worry Noam, I’m sure Watson won’t feel sad about all your mean words 🙂

Mythical Man 26 Years

This morning I was re-reading David Deutsch’s classic paper “Quantum Theory, the Church-Turing Principle and the Universal Quantum Computer”, Proc. of the Roy. Soc. London A, 400, 97-117 (1985). This is the paper where he explicitly shows an example of a quantum speedup over what classical computers can do, the first time an explicit example of this effect had been pointed out. Amusingly, his algorithm is not the one most people call Deutsch’s algorithm. But what I found funny was that I had forgotten about the last line of the article:

From what I have said, programs exist that would (in order of increasing difficulty) test the Bell inequality, test the linearity of quantum dynamics, and test the Everett interpretation. I leave it to the reader to write them.

I guess we are still waiting on a program for that last problem?

Consequence of the Concept of the Universe as a Computer

The ACM’s Ubiquity has been running a symposium on the question What is Computation?. Amusingly they let a slacker like me take a shot at the question and my essay has now been posted: Computation and Fundamental Physics. As a reviewer of the article said, this reads like an article someone would have written after attending a science fiction convention. Which I think was supposed to be an insult, but which I take as a blessing. For the experts in the audience, the fun part starts at the “Fundamental Physics” heading.

A Mathematical Definition of News?

Lately I’ve been thinking about the news. Mostly this involves me shouting obscenities at the radio or the internet for wasting my time with news items the depth of which couldn’t drown an ant and whose factual status makes fairy tales look like rigorous mathematical texts (you know the kind labeled “Introductory X”.) But also (and less violently) I’ve been pondering my favorite type of question, the quantification question: how would one “measure” the news?
Part of the motivation for even suggesting that there is a measure of “news” is that if someone had asked me whether there was a measure of “information” back when I was a wee lad, I would have said they were crazy. How could one “measure” something so abstract and multifaceted as “information?” However, there is a nice answer to how to measure information, and this answer is given by the Shannon entropy. Of course this answer doesn’t satisfy everyone, but the nice thing about it is that it is the answer to a well-defined operational question about resources.
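For the record, here is a minimal sketch of that measure, assuming a finite distribution handed over as a plain list of probabilities. The function name `shannon_entropy` and the coin examples are mine, purely for illustration.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H(p) = -sum_x p(x) * log2 p(x).

    Outcomes with zero probability contribute nothing and are skipped.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative distributions (assumed, not taken from any real data).
fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

print(shannon_entropy(fair_coin))    # a fair coin carries one full bit
print(shannon_entropy(biased_coin))  # a biased coin carries less than one bit
```

The more predictable the source, the smaller the number, which is the sense in which entropy answers a resource question: how many bits you need, on average, to write the outcomes down.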
Another thought that strikes me is that, of course, Google knows the answer. Or at least there is an algorithm for Google News. Similarly, Twitter has an algorithm for spotting trending topics. And of course there are less well known examples like Thoora, which seeks to deliver news that is trending in social media. And probably there is academic literature out there about these algorithms; the best I could find with some small google-fu is TwitterMonitor: trend detection over the twitter stream. But all of this is very algorithm centered. The question I want to ask is what quantity are these services attempting to maximize (is it even the same quantity?)
The first observation is that clearly news has a very strong temporal component. If I took all of the newspapers, communications, books, letters, etc. that mankind has produced and regarded them without respect to time, I wouldn’t convince many that there is news in this body of raw data (except that there are some monkeys who can type rather well.) Certainly also it seems that news has a time-frame. That is, one could easily imagine a quantity that discusses the news of the day, the news of the week, etc.
A second observation is that we can probably define some limits. Suppose that we are examining tweets and that we are looking for news items on a day time scale. We could take the words in the different days’ tweets and make a frequency table for all of these words. A situation in which there is a maximum amount of news on the second day is then a situation where on the first day the frequency distribution over words is peaked on one word, while the second day is all concentrated on another word. One could probably also argue that, on the day time scale, if both frequency distributions were peaked on the same word, then this would not be (day scale) news (it might be week scale news, however.)
This all suggests that our friend, the news, is nothing more than the total variation distance. For two probability distributions $p(x)$ and $q(x)$, the variation distance between these distributions is $d(p,q)=\frac{1}{2} \sum_{x} |p(x)-q(x)|$. This is also equal to $\sup_{E \subset X} |P(E)-Q(E)|$, where $P(E)=\sum_{x \in E} p(x)$ and similarly for $Q(E)$. Ah, so perhaps this is not as exciting as I’d hoped 🙂 But at least it gives me a new way to talk about the variation distance between two probability distributions: this is a measure of the news that we could associate with changing from one probability distribution to another.
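To make the tweet thought experiment concrete, here is a rough sketch, assuming each day's tweets arrive as a non-empty list of strings and that splitting on whitespace is a good enough notion of a "word". The helper names `word_distribution` and `total_variation` and the toy tweets are made up for illustration; this is not how Google or Twitter actually do it.

```python
from collections import Counter

def word_distribution(tweets):
    """Empirical word frequencies over a list of tweet strings.

    Assumes the list contains at least one word in total.
    """
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def total_variation(p, q):
    """d(p, q) = 1/2 * sum_x |p(x) - q(x)|, summed over the union of supports."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

# Toy data: day one is peaked on one word, day two on a different word.
day1 = ["rain rain rain", "rain"]
day2 = ["snow snow", "snow snow"]

no_news = total_variation(word_distribution(day1), word_distribution(day1))
max_news = total_variation(word_distribution(day1), word_distribution(day2))
print(no_news, max_news)
```

Identical days give a distance of zero and two days peaked on different words give the maximum value of one, matching the two limiting cases described above.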
Of course this is just one approach to thinking about how to quantify “news.” What are the drawbacks of my method and what should a real measure have that this one lacks? I mean, what’s the worst that could happen in thinking about this problem? Okay, so maybe you would learn how many holes it takes to fill the Albert Hall.

Balancing the Budget, One NSF Grant at a Time

Now THIS is the kind of idea we pay our Republican house representatives to come up with. A website where we can go through NSF grants and identify the ones we think should not be funded, balancing the budget, one NSF grant at a time. But clearly this is barking up the wrong tree! The NSF budget is only $7 billion-ish (and there is no WAY that this budget pays for itself by barely maintaining the most innovative economy in the world. Psah you say!) So…
Anyone want to help me build a website where we go around and identify senior citizens that are collecting social security but have not contributed enough in their life to merit this money? Grandpa can appear on YouTube where he’ll describe what exactly it is that he did in his life that merits his current social security check. Too young to fight in World War Two, that no good lazy bum, cut his check! BAM, social security solved!
Next we can expand into hospitals where we will be able to identify tons of cost cutting measures. Does little Suzy really need that surgery? See little Suzy via a snazzy web interface. Ask her questions. Find out she is a very unproductive member of society, what with her 3rd grade reading skills and 4th grade math skills. No surgery for you little Suzy! BAM, Medicare problem solved!
Moving down the budget we get to the military. My first suggestion was that we take all members of the armed forces, count the number of people they have killed, sort the list, and start chopping from the bottom up. BAM, military spending cut! Okay that doesn’t use the web and well qualified internet surfers to help us solve this problem. We could have the surfers do the sorting (internet sort is a less well studied sorting algorithm taking 2N time to sort a list of length N, and usually results in the death of far too many neurons.) So instead we could put up videos of every member of the military and vote on whether they are dangerous enough looking to merit their pay. BAM, military spending cut! For a second time! And we’d win wars just by glancing menacingly at our enemies!
And what about income tax rates? Well I suggest we make a great tool where people can vote on what they’d like marginal tax rates to be. And then we can exactly INVERT the results. BAM, income distribution problem fixed!
Okay, enough with reason number 1231 why I am not a Republican.
P.S. If you go to the website for this spirited effort, http://republicanwhip.house.gov/YouCut/Review.htm, the web form doesn’t appear to verify that you’ve submitted a valid email address or a grant, and well, you know that those don’t have to be real anyway. Just saying. 😉

Quantum Job of the Teleportation Kind

teleportation technologists seek talented MBA (SOMA / south beach)

Date: 2010-12-03, 2:45PM PST

We are two engineers, brothers (both CalTech grads), who have developed and built a novel teleportation device. Over the last twelve months, we have tested our prototype system with up to 800 kg payloads, over distances of 300 miles. It is portable, safe, but does require a substantial (1800 W) power supply at both sending and receiving locations.
We believe our teleportation device could substantially disrupt many “last-mile” transit technologies and generate extraordinary returns for investors.
The only missing piece is YOU! We are looking for one or two newly-minted MBAs to help us develop critical assets — a logo, a Powerpoint deck, and Excel projections — needed to attract venture capital. In return, we would seek to retain a minority stake in the final venture.
Please send a resume and any other information that may set you apart from other MBAs with pitch experience. Can’t wait for you to join our team!

Morphing Science News?

Yesterday I noted that the New York Times article on the Nobel prize award for graphene said that the paper had been rejected by Nature and accepted by Science. Interestingly, today I got an email from a science journalist who noted that this statement doesn’t appear anywhere in the article. And the journalist is right! Since the New York Times isn’t cached by Google I have no way to verify the original statement. Anyone else remember that line from the article? And why does the New York Times not allow access to all versions of an article (like the arXiv!) or at least make a statement that the article has been modified from its original form? Inquiring minds want to know 🙂
Update: Of course, not to be a hypocrite, shouldn’t my blog also have access to all versions, including the ones where I spelled graphene “graphine” and the one where there isn’t this update? Is there a plugin that does this? And also I would like to know if I hallucinated this entire episode (i.e. the last sentences above only make sense if it was my own hallucination 🙂 )


Down the rabbit hole of universes I dread…
Opening up the daily scientific journal of Oct 23, 2051, a torrent of computer modern fonted journal articles confronts me with yet more additions to the proven facts of the Human and Robot American Republic’s known knowledge. New bits xored into the database of that which is true, to be consumed by the rational robots that govern our modern libertarian robotocracy.
Switching on the music now, the artificial intelligence picks out a song guaranteed to satisfy my previous preferences, clustering me into a machine learned Sisyphean hell, from which escape isn’t even listed on the menu of options. How soothed the past feels, served up one associative memory inspired lyric at a time…with catchy bridge lyrics to boot!
But short this entertainment must last, for the advertising dollars that are being lost not peddling to my immediate virtual surroundings the ABSOLUTE bargain that is a Google sponsored cruise to the port of Long Beach. Buy it now, see cargo shipped through direct express tubes to the heartland of America! Witness the end result of the optimal combinatorial solution to economic policy, executed within epsilon of the NP-hard solution.
But all of this is, of course, just a distraction from the true meaning of life: to appear unfiltered on national YouTube. To live, alas to be viral, among the social webs, that truly is the one goal that will make life worth living. A life lived adored by millions justifying any lack of substance, for what is substance if not another check mark in the list of known knowledge?
To sleep now, a day spent pushing the epsilons of an optimized marketing drive, epsilon small but multiplied by billions. Dreams now of Superman III, and salami slices so thin yet so numerous that they even evade the IRS. Sweet dreams to influence my vote, sweet dreams to keep me warm during everything but at night. Sleep. Sleep.