Fighting tuberculosis with BB84

Tuberculosis (TB) has been with humans for millennia, infects 1 in 3 people worldwide and kills almost 2 million people per year.  BB84 is everyone’s favorite information-theoretically secure key expansion system, and is secure at bit error rates up to at least 12.9%.  So what’s the connection?
TB is treatable, but the treatment involves taking multiple antibiotics daily for 6-9 months (or up to 24 months for drug-resistant strains).  The drugs have painful side effects (think chemotherapy) and most TB symptoms go away after a few months, so it can be hard for people to be motivated to complete the course.  In poor countries, where TB is most common, doctors are in short supply, and have little time for counseling about side effects, or patients might not have access to doctors, and just buy as many pills as they can afford from a pharmacist.  But when people stop treatment early, TB can return in a drug-resistant form, of which the scariest is XDR-TB.
As a result, the WHO-recommended treatment is DOTS (directly observed treatment, short course), in which a health worker watches the patient take all of their pills.  This is effective, though proving this is hard, and implementation is difficult.  The community health workers monitoring patients are paid little or nothing, are often unmonitored, and spend their time in the houses of people with active TB, often without good masks.  So absenteeism and low morale can be problems.  Patients also can find it condescending, disempowering, and stigmatizing, since neighbors can notice the daily community-health-worker visits.
One ingenious alternative is called X out TB.  Patients are given a device that dispenses a strip of paper once every 24 hours.  If a patient is taking their antibiotics, then peeing on the paper will cause a chemical reaction (with a metabolite of the drugs) that reveals a code, which patients send to the local clinic by SMS.  As a result, the clinic can remotely monitor which patients are reliably taking their pills.  Patients in turn are given a reward (cell phone minutes have been popular) for taking their pills every day.
This system seems to be working well in trials, but the presence of the dispenser means that batteries are necessary, and security considerations arise. For example, one could try to open the dispenser up, to save a jar of urine and keep dipping strips in it after stopping the pills, or even to pour urine inside the dispenser. Imagine that the unfortunate TB patient is actually Eve, who has a dark determination to cheat the system, even at the expense of her own health.
Fortunately, BB84 has already provided an elegant, if not entirely practical, solution to this problem.  The dispenser can be replaced by a numbered series of strips, and the bottle of pills needs to be replaced by a similarly numbered blister pack (for simplicity, the two could be packaged together). On day i, the patient takes pill i and several hours later, pees on strip i. The twist is that there are two types of strips—let’s call them X and Z—and two types of pills, which we will also call X and Z.  These appear the same visually, but have different chemical properties.  Peeing on an X strip after taking an X pill will reveal the code, as will peeing on a Z strip after taking a Z pill.  But if the strip type doesn’t match the pill type, or there are metabolites from both pill types present, then the code will be irrevocably destroyed.
For a patient following instructions, the pill on day i will always match the strip on day i, and so all of the codes will be properly revealed.  But any attempt to reveal codes without matching up pills and strips properly (e.g. peeing on all the strips at once) will inevitably destroy about half the codes.  The threshold for rewards could be set at something like 90-95%, which is safely out of range of any cheating strategy, but hopefully high enough to prevent resistance.
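To get a feel for why a 90-95% threshold is out of a cheater's reach, here is a minimal simulation sketch. It assumes (this is not stated in the scheme above) that each day's strip type is chosen independently and uniformly at random, and the function name and the three strategy names are purely illustrative.

```python
import random

def fraction_revealed(num_days, strategy, seed=0):
    """Fraction of strips whose code is revealed under a given strategy.

    Hypothetical strategies (names are illustrative only):
      "honest": take pill i, then use strip i
      "jar":    take only the first pill, save that urine, dip every strip in it
      "both":   have metabolites of both pill types present for every strip
    """
    rng = random.Random(seed)
    # Assumption: each day's pill/strip pair is secretly X or Z, chosen uniformly.
    strip_types = [rng.choice("XZ") for _ in range(num_days)]

    revealed = 0
    for strip in strip_types:
        if strategy == "honest":
            metabolites = {strip}
        elif strategy == "jar":
            metabolites = {strip_types[0]}
        elif strategy == "both":
            metabolites = {"X", "Z"}
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        # The code survives only if exactly the matching metabolite is present.
        if metabolites == {strip}:
            revealed += 1
    return revealed / num_days

REWARD_THRESHOLD = 0.90  # lower end of the 90-95% range suggested above

for name in ("honest", "jar", "both"):
    rate = fraction_revealed(10_000, name)
    print(name, round(rate, 3), "reward" if rate >= REWARD_THRESHOLD else "no reward")
```

Under these assumptions the honest patient reveals every code, the saved-jar cheater reveals roughly half, and mixing both metabolites reveals none.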
This scheme has its flaws.  For example, a patient could get a friend to take the pills for them (although this friend would probably suffer the same side effects).  The metabolites might not clear the system quickly enough, in which case honest patients would still invalidate strips sometimes when an X strip/pill is followed by a Z strip/pill or vice versa.  While the original X-out TB approach relied on using metabolites of common TB medications, the BB84 approach would probably want to use pharmacologically inactive additives, and I don’t know if drugs exist that are FDA-approved and have the necessary properties. On the other hand, this enables the additive to have a half-life much shorter than the medicine.  And of course, patients generally want to get better, and are likely to take their pills when given even mild encouragement, monitoring and counselling.  So information-theoretic security might be more than is strictly necessary here.
Can anyone else think of other applications of BB84?  Or other ways to stop TB?

A Mathematical Definition of News?

Lately I’ve been thinking about the news. Mostly this involves me shouting obscenities at the radio or the internet for wasting my time with news items the depth of which couldn’t drown an ant and whose factual status makes fairy tales look like rigorous mathematical texts (you know, the kind labeled “Introductory X”). But also (and less violently) I’ve been pondering my favorite type of question, the quantification question: how would one “measure” the news?
Part of my motivation for even suggesting that there is a measure of “news” is that if someone had asked me whether there was a measure of “information” back when I was a wee lad, I would have said they were crazy. How could one “measure” something so abstract and multifaceted as “information?” However, there is a nice answer to how to measure information, and this answer is given by the Shannon entropy. Of course this answer doesn’t satisfy everyone, but the nice thing about it is that it is the answer to a well-defined operational question about resources.
Another thought that strikes me is that, of course, Google knows the answer. Or at least there is an algorithm for Google News. Similarly, Twitter has an algorithm for spotting trending topics. And of course there are less well known examples like Thoora, which seeks to deliver news that is trending in social media. There is probably academic literature out there about these algorithms; the best I could find with some small google-fu is “TwitterMonitor: trend detection over the twitter stream.” But all of this is very algorithm centered. The question I want to ask is: what quantity are these services attempting to maximize (and is it even the same quantity)?
The first observation is that clearly news has a very strong temporal component. If I took all of the newspapers, communications, books, letters, etc. that mankind has produced and regarded them without respect to time, I wouldn’t convince many people that there is news in this body of raw data (except that there are some monkeys who can type rather well). Certainly it also seems that news has a time-frame. That is, one could easily imagine a quantity that describes the news of the day, the news of the week, etc.
A second observation is that we can probably define some limits. Suppose that we are examining tweets and that we are looking for news items on a day time scale. We could take the words in each day’s tweets and make a frequency table of these words for each day. A situation in which there is a maximum amount of news on the second day is then one where the first day’s frequency distribution over words is peaked on one word, while the second day’s is entirely concentrated on another word. One could probably also argue that, on the day time scale, if both frequency distributions were peaked on the same word, then this would not be (day scale) news (it might be week scale news, however).
This all suggests that our friend, the news, is nothing more than the total variation distance. For two probability distributions $p(x)$ and $q(x)$, the variation distance between these distributions is $d(p,q)=\frac{1}{2} \sum_{x} |p(x)-q(x)|$. This is also equal to $\sup_{E \subset X} |P(E)-Q(E)|$, where $P(E)=\sum_{x \in E} p(x)$ and similarly for $Q(E)$. Ah, so perhaps this is not as exciting as I’d hoped 🙂 But at least it gives me a new way to talk about the variation distance between two probability distributions: it is a measure of the news that we could associate with changing from one probability distribution to another.
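As a rough sketch of how this could be computed from two days of tweets, the following builds word-frequency tables and takes their variation distance. The function names and the toy tweets are made up for illustration, and splitting on whitespace is a simplifying assumption rather than a serious tokenizer.

```python
from collections import Counter

def word_distribution(tweets):
    """Normalized word-frequency table for a list of tweet strings."""
    counts = Counter(word for tweet in tweets for word in tweet.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}

def total_variation(p, q):
    """d(p, q) = (1/2) * sum over x of |p(x) - q(x)|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in support)

def best_event_gap(p, q):
    """P(E) - Q(E) for the event E = {x : p(x) > q(x)}, which attains the sup."""
    return sum(p[w] - q.get(w, 0.0) for w in p if p[w] > q.get(w, 0.0))

# Toy data (invented): day one is all about one word, day two about another.
day_one = word_distribution(["storm storm", "storm"])
day_two = word_distribution(["election"])

print(total_variation(day_one, day_two))  # 1.0, the maximum amount of news
print(total_variation(day_one, day_one))  # 0.0, no day-scale news
```

The `best_event_gap` helper is only there to check the second formula numerically: for normalized distributions it should agree with `total_variation` up to rounding.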
Of course this is just one approach to thinking about how to quantify “news.” What are the drawbacks of my method, and what should a real measure have that this one lacks? I mean, what’s the worst that could happen in thinking about this problem? Okay, so maybe you would learn how many holes it takes to fill the Albert Hall.