Monday, February 26, 2007

Word of the day: Statistic

There has been a lot of sound and fury recently over a statistic that appeared on the front page of the New York Times: that more than half of women in America are without a spouse.

One of the key parts of every economist's training is a very deep understanding of statistics. It was in one of those fundamental classes, pondering the definition of the term sufficient statistic, that I got a better appreciation of just what exactly a "Statistic" actually is. A statistic is a way to represent one set of numbers drawn from the real world (data) using another set of numbers drawn from a more manageable subset. If that other set is "sufficient" then it tells you everything you need to know about the data. Unfortunately that is typically impossible.
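To make "sufficient" a little more concrete, here is a minimal sketch using the textbook case of yes/no data (a Bernoulli sample), where the pair (sample size, number of yeses) is known to be sufficient for the underlying rate p. The samples are made up for illustration, and the function names are my own, not from any library.

```python
from math import isclose


def bernoulli_likelihood(sample, p):
    """Probability of seeing this exact 0/1 sequence if each entry is 1 with probability p."""
    likelihood = 1.0
    for x in sample:
        likelihood *= p if x == 1 else (1.0 - p)
    return likelihood


def count_statistic(sample):
    """The sufficient statistic for p: (sample size, number of ones)."""
    return (len(sample), sum(sample))


def headline_factoid(sample):
    """A yes/no factoid: are at least half of the entries ones?"""
    return 2 * sum(sample) >= len(sample)


# Hypothetical samples, invented purely for illustration.
sample_a = [1, 1, 0, 0, 1, 0, 1, 0]  # 4 ones out of 8
sample_b = [0, 0, 0, 1, 1, 1, 1, 0]  # 4 ones out of 8, different order
sample_c = [1, 1, 1, 1, 1, 1, 1, 0]  # 7 ones out of 8

# Same sufficient statistic means the same likelihood for every p,
# so the order of the answers tells you nothing more about p.
for p in (0.1, 0.3, 0.9):
    same = isclose(bernoulli_likelihood(sample_a, p), bernoulli_likelihood(sample_b, p))
    print(p, same)

# The yes/no factoid is not sufficient: sample_a and sample_c share it,
# yet their likelihoods at p = 0.3 differ.
print(headline_factoid(sample_a), headline_factoid(sample_c))
print(bernoulli_likelihood(sample_a, 0.3), bernoulli_likelihood(sample_c, 0.3))
```

The count keeps everything the data can say about p, while the yes/no factoid throws most of it away. Real-world data is rarely this tidy, which is why a sufficient summary is typically out of reach.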

The problem is that the data you care about is typically multi-dimensional. Very multi-dimensional. To understand whether the coupling habits of American women are changing with time requires a very complex, rich data set, with at least the age distribution of women (+2 dimensions), over time (+1 dimension). That's 3 dimensions at least. A more complete picture would include whether these women were widowed (+1 dimension), gay (+1 dimension), or living longer (+2 dimensions for the life expectancies of each age cohort). In fact, to really get a complete picture, some would say you need to understand the stories of each of the 150 million women in America (+150 million dimensions), each with their own set of characteristics and life histories (++++ dimensions).

The problem is that humans can barely picture 3 dimensions (the real world), can only readily print 2 dimensions (a graph), and really only a 1-dimensional statistic (a number) can fit into a headline. And if you want to make that headline snappy, you make it 0-dimensional (a binary yes-no factoid).

Which is where the New York Times article caused so many problems. They tried to make a big deal out of this arbitrary 0-dimensional statistic (that more than 50% of women are without a spouse), which is what newspapers and the media often do, and which I quickly dismissed. And so they were hit with people bringing up all the dimensions they missed.

If only we were built to picture n-dimensional hyperspaces and could fit them into a headline, we wouldn't have this problem. Edward Tufte (admittedly) has come closest, going so far as to try to fit highly multidimensional data in-line with text.

Thursday, February 22, 2007

Time magazine discovers a new concept: "Evidence"

Time Magazine has identified a revolutionary new idea in medicine: basing your opinions on evidence. One of the most frustrating things about medicine is how skeptical doctors are of evidence. In a world where those who deny the evidence for evolution or for climate change are sneeringly derided, doctors often still base decisions, as the article says, on "faith, bias or even an educated guess."

The book Complications discussed the difficulties in convincing doctors to accept studies that question doctors' judgment in favor of "evidence-based medicine": studies that find computers are better at reading EKGs, for example.

To be fair to R-'s profession, however, it is not so much that they distrust all scientific evidence; it is just that much evidence cannot be trusted. The latest study flip-flops all the time, with cholesterol being bad then good, low-fat diets now not necessarily good, estrogen bad then good, and even a finding that overweight people live longer than people who are not overweight. Ioannidis reports in JAMA that as many as 1/3 of the most cited studies in JAMA in the '90s have already been refuted.

But what the article rightly points out is that this rediscovery is not limited to medicine; it extends to many areas, like education or climate change. Development economists are gaining prominence in pushing the revolutionary idea that development programs should be based on evidence. It is scary that the idea of "evidence" has to be rediscovered, a concept that predates the invention of science by at least 1600 years, going back to Aristotle and before. Part of it is a post-modern rejection of objective reality. Part of it is just the poor quality of evidence at these new borders of knowledge.

It's about time.

Monday, February 12, 2007

I do sound silly, don't I?

Stanford Business Magazine somehow found my column for the Chronicle of Higher Education and wrote about it.
(See here for article)

My friends from undergrad days never tire of haranguing me for a line I wrote in a column for the MIT newspaper, about John Glenn, and the "re-penetration of our vast firmament."

Hopefully my writing has gotten better with age, but I am still overly fond of my verbiage and my overly precious turns of phrase. Seeing the most quotable/egregious examples picked out for their article really hits home...

Ah well, I never aspired to be a writer anyway.

Friday, February 02, 2007

an exegesis on comic strips (whatever that means)

I recently read a piece on the death of comic strips.

So I haven't read comics in ages. The only newspaper I get is the Sunday NYTimes. I remember there once was a time when subscribing to a newspaper without a comics page was anathema to me. (Though I have recently started to appreciate the high-brow funny pages the NYTimes Magazine has put in, as a clever ploy to get us internet users to actually subscribe. Well, it worked. The last was boring, but I’ve loved the Sprott series by Seth, with its Pulp Fiction-esque non-linearity and its personal, fourth-wall-breaking narration. I’m also excited about Chabon’s new literary D&D serialized novella, which was just launched in the magazine. The first installment was great, full of unreasonably obscure references that Chabon made easily accessible in context, but that I still had to think about. It ended in rather clichéd fashion, but was a lot of fun. End Digression)

But last time I checked (which admittedly was long ago), Foxtrot and Over the Hedge were still pretty on top of things (I especially appreciate the physics/trekkie/ifruity humor). And Over the Hedge carried on the full-panel, boxless glory that Calvin and Hobbes brought back in its late years.

And I have a friend who swears by Mary Worth and Prince Valiant. I think those are only meaningful to those who have been following them for decades, and who read them with a proper sense of irony.

A lot of good comic strips have gone online. I recall hearing some crazy stat that more comic strips are viewed that way than in newspapers (even my friend the Mary Worth reader gets his comics online).

My favorite is (written by a fellow Stanford PhD student while I was there; he was an engineer, though, but it captures grad school life perfectly).

Duvall clued me into which captures disaffected po-po-mo twenty-something New York (Asian-American) youth. I liked it enough to send money directly to the author for an original panel.

Something I never would have done for any of the old fashioned conventional strips.

The proverbial long tail; niche market.
