The Truth is Still Putting its Pants On

I’ve subscribed to Bob Park’s What’s New for the better part of the past decade; it’s a mailing list that’s still a mailing list rather than a blog, which makes it old-fashioned, to the extent that email can be old-fashioned. It’s quick commentary on what’s new and controversial, much like a blogger would do.

This past week, as part of his continuing commentary on things possibly related to global warming he mentioned something which I’m not sure the science media quite “gets,” and serves as a decent example of how science progresses:

Researchers at Dalhousie University in Halifax, Nova Scotia say phytoplankton are disappearing from the ocean. Strictly speaking it’s not really a science story — yet. There’s no independent confirmation, and until that happens scientists don’t get too excited. But Dalhousie is a respected school, and you can bet a lot of scientists are looking at sea water today.

This is how it usually goes — any new finding serves as a springboard for more investigation. A single experiment is usually not given an extraordinary amount of weight if the result is something new and unexpected, and the experiment represents a relatively small amount of data (results from large collaborations at accelerator labs are generally afforded more weight because they are gathering tremendous amounts of data). This is especially so if it appears to contradict previous experiments. Science is cautious this way. You always want to get more data, and maybe have someone else repeat the experiment, or possibly do a more advanced experiment which would only work if the foundational work is correct. That’s how you gain confidence in the results.

The unspoken part of this is that the results were properly published — there was a press release, but that was coordinated with the publication in Nature. This was not something just put out on the web or shouted from a rooftop — they followed the important first steps of the process by going through peer review.

I was thinking about this when I later read a story in the NY Times, Rumors in Astrophysics Spread at Light Speed, which discusses a number of recent stories in which results were aggressively interpreted. While the thrust of the piece seems to be how fast information can spread, and the author’s disappointment that none of the rumors he keeps hearing seem to pan out, I got a different message. I saw confirmation of the media’s tendency to pick up the ball and run with it in their rush to be first (or not be left out), with little regard for checking the facts, combined with the author not reading or listening very carefully. In the extrasolar planet example, the TED talk speaker is pretty clear that he’s talking about size, and he does call them candidates. If you don’t understand the jargon, how about checking with someone first? One would hope the lesson of climategate would not be lost here — an earlier case where misunderstood jargon was reported, only to have it turn out that there was nothing to see — but I fear that lesson has already been forgotten, since the blame went to the scientists (for using the word “trick”) but seemed to pass the media by. The Higgs at Fermilab? That was a rumor posted on a blog, and the linked Gawker story reports it as such.

These spread at the speed of light, in part, because nobody put the brakes on. Nobody said, “Hey, wait a tic. Maybe we should get someone else to weigh in on it.” This is the cautionary tale of Pons & Fleischmann going to the popular press before their paper had been peer-reviewed, let alone published. That was more than 20 years ago.

Hardly a week goes by, for example, that I don’t hear some kind of rumor that, if true, would rock the Universe As We Know It. Recently I heard a rumor that another dark matter experiment, which I won’t name, had seen an interesting signal. I contacted the physicist involved. He said the results were preliminary and he had nothing to say.

Smart guy. Very.

My view is that journalists shouldn’t just be relying on the restraint of scientists to remind them that preliminary results are preliminary. What if the scientist had commented? Would you run the story, knowing full well that it had not passed peer review, nor been independently confirmed? What is so hard about these caveats and disclaimers, which scientists take for granted and which come up over and over again when discussing science results? Is the collective journalistic memory so short that scientists (or their lawyers) have to start reading a statement before they ever make a comment?

Please understand that the following result is preliminary and should not be taken as the final word. For anyone unfamiliar with the field, an effort must be made on the reader’s part to see where this fits in with the prevailing models of the day. There is a chance that it could be wrong or have only limited applicability to broader problems being investigated by other research teams. Further investigation may confirm our findings, or show that our results were anomalous or contained errors.

Scientists already know this. Journalists should know this.

ZapperZ has also commented on the NYT story.

Bovine Milliners in the Lab

Experimental Error: Don’t Try This at Home

I recently watched the online trailer for a stage scientist named Doktor Kaboom!. (I presume it’s a pseudonym. Either that, or his grandparents had it changed from Kaboomowitz at Ellis Island.) From the trailer, I gleaned that Doktor Kaboom!’s primary mission, as one might imagine, is making various household objects go kaboom. Watching him catapult a banana across the stage, I realized exactly how Doktor Kaboom! and his ilk perpetuate myths about scientists.

“He’s completely misrepresenting us,” I complained to my wife as the video clip played. “He’s making us look awesome.”

Media portrayal of scientists is pretty much a binary state. Either we’re boring automatons in lab coats, babbling incomprehensibly, or we’re like Doktor Kaboom!, hiding the 80% of the job that’s not particularly exciting. Why do it, then? Because the other 20% of it is worth it.

One nit, though:

We are distrusted, feared, but most of all, misunderstood. We work, after all, in one of the only two professions that idiomatically follow the word “mad” — the other such profession being “hatter.”

The author forgets “cow.” Not that that is a profession to which one aspires. Outside of India.

via ZapperZ

All of Steve Jobs's Men

Those who follow the tech world are probably aware of the iPhone 4 antenna issues and all the media hoopla surrounding them. I have no real dog in the fight, horse in this race or cliché in this idiom. I don’t own an iPhone and I’m not shilling for Apple. But it pains me to see a bunch of tech-savvy people making crappy emotional arguments about something that should be quantifiable, and/or making crappy technical arguments because they don’t look at what the data are (or aren’t) telling them.

Apple had to respond, of course, and there are a number of articles out there explaining the business psychology of this; in some sense it’s already too late — once the idea that Al Gore invented the internet is out there, actual facts will do very little to change things, so the undercurrent that the phone is a dud cannot truly be slain (the best you can do is a flesh wound). There is no Vorpal blade for persistent myths of the internet. Some people will believe that because they heard it, and others will repeat it because they love to hate Apple. But you have to try, and so a solution was proposed. Free bumpers for everyone. Feel free to discuss whether Steve Jobs was not apologetic enough to suit you, or whatever.

That’s not my point.

My point is that people kept making this out as a technical problem, when all along it has been a PR problem, and a lot of people not employed by Apple kept insisting otherwise (except that perception is reality, hence the solution mentioned above). I’ve seen it called a design flaw and also called a defect. The latter is flat-out wrong — the problem is not with the phone itself being faulty, as if swapping it out for another phone would solve the issue. The problem is user-specific. Is it a design flaw? Yes and no. It is, in the sense that there is degradation in performance that can be avoided with a technical fix, but then you have to call any sub-optimal performance a design flaw. You have to insist that cheap technology suffers from a design flaw if it doesn’t work as well as a more expensive technology, and I think that this is not what we mean by flaw. It is a trade-off, a natural and expected offshoot from optimizing on multiple variables, including price. You want better performance? Spend a few extra bucks. In what industries is that not the case?

The real metric for seeing if this is a “flaw” is to do a proper analysis of performance and the analysis, for the most part, was absolute crap. Most of it concentrated on how much the signal dropped when you held the phone the “wrong” way, and went no further. BFD. That’s a science fair project. When you attenuate a signal, it goes down. When you short out an antenna (or at least change the capacitance or change the resistance of it, whatever was actually happening), you will lose signal. What the analyses lacked is any sort of context for these numbers, and while careful data-taking is important, the real tough part about science is in proper interpretation — figuring out what the data mean. And few of the stories did that. Diminished signal is not proper context, because all phones do that when you cover the antenna. All that these numbers show is that the phone works better when you don’t cover the antenna. Confirming this is not going to get you to Sweden.
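The missing context can be made concrete with a toy sketch (all numbers and the threshold below are invented for illustration, not real iPhone measurements): an attenuation figure on its own says nothing about whether a call survives; what matters is the margin between the received signal and the level at which calls actually drop.

```python
# Toy illustration with made-up numbers: the same "death grip"
# attenuation only matters relative to the level at which a call drops.

DROP_THRESHOLD_DBM = -110  # hypothetical level below which a call drops

def call_survives(signal_dbm, grip_attenuation_db):
    """Return True if the attenuated signal stays above the drop threshold."""
    return signal_dbm - grip_attenuation_db > DROP_THRESHOLD_DBM

# The same 20 dB attenuation...
print(call_survives(-70, 20))  # strong-signal area: plenty of margin, True
print(call_survives(-95, 20))  # weak-signal area: margin gone, False
```

The point of the sketch is that "the signal dropped by X" is only half a measurement; without the baseline and the threshold, it tells you nothing about dropped calls.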

You can’t compare it to a different phone on another network, because everyone knows AT&T sucks. Their network has made them infamous, like El Guapo. The real comparison of any validity would be to properly compare the phone to the one it replaced. Because the real question is this: Is the new one better? I haven’t done any exhaustive cataloging of all the stories on the iPhone 4, but of the dozens I’ve read, I have seen just one technical analysis that addresses this (though there are undoubtedly others). The conclusion? The new phone holds calls at a lower signal strength than the old one.

The other bad comparison was the number of dropped calls from the iPhone 3GS and the iPhone 4. The new phone drops more calls — that’s bad, right? What if I told you that I did a survey and found 25 people liked a name brand of soft drink, but another survey found that 100 people liked Crappa-Cola? Do those numbers mean anything? What if I had to survey 10,000 people to find the 100 who liked Crappa-Cola, but only 50 to get the result for the name brand? The numbers would be meaningless as a direct comparison — we have to normalize the responses. That’s just basic science analysis. So a direct comparison of the numbers of dropped calls is just as meaningless without knowing that the data are similarly normalized.
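The normalization argument can be sketched with the made-up soda-survey numbers from the paragraph above: the raw counts point one way, the rates point the other.

```python
# Sketch of the normalization argument, using the invented survey
# numbers from the text: raw counts mean nothing until you divide
# by how many people (or calls) were sampled.

def preference_rate(liked, surveyed):
    """Fraction of respondents who liked the drink."""
    return liked / surveyed

name_brand = preference_rate(25, 50)       # 25 of 50     -> 0.50
crappa_cola = preference_rate(100, 10000)  # 100 of 10,000 -> 0.01

# Raw counts: 100 > 25. Normalized rates: 0.50 > 0.01.
print(name_brand > crappa_cola)  # True
```

The same division applies to dropped calls: drops per call attempted, not total drops, is the comparable quantity.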

When John Gruber of Daring Fireball reported those numbers, I sent an email to point this out to him. I had to mention that this isn’t an Apples-to-Apples comparison (and, of course, I’m using the obvious pun, because that’s what I do. It was low-hanging fruit. Damn, I did it again). I wrote, in part:

What is important is the comparison to the previous version of the phone: does the iPhone 4 drop calls that the 3 or 3GS does not? And the answer to that seems to be, for the most part, “no.” It’s hard to tell, because most of the Geekmedia aren’t looking at it that way, and much of the remaining evidence is anecdotal.

In Antennagate Bottom Line, you mention the comparison of numbers of dropped calls, but I argue that this is not the right metric. What one needs to know is if the iPhone 4 drops a call that would not be dropped by a 3GS. If the additional drops are in areas where the 3GS would never have connected in the first place, then the statistic isn’t telling us what everyone claims it is. All that would mean is that there is a large drop rate in regions that were previously regarded as dead zones. That’s an improvement, not a regression.

Without that information, one does not truly know how to interpret the statistic.

And not only did he make a post addressing that, he frikkin’ quoted my email! (And this little ego-boost is the whole reason for finally writing this up. I’ve been quoted by Gruber and linked to by Kottke. In your face, world!)

You May Well Ask What It Is

Over at Starts With a Bang, Ethan has posted How Good is Your Theory? Open Thread I, in which he categorizes the spectrum of theories, from Scientific Law at one end, through Validated, Speculative and on to Ruled Out at the other end. My reaction is like Mammy’s in Gone With the Wind: it ain’t fittin’. It ain’t fittin’, it ain’t fittin’, it just ain’t fittin’.

First of all, theories do not grow up to be laws, which is one way this spectrum would be interpreted.

Scientific Law: This is really an elite category, reserved for the most thoroughly tested, rock-solid theories and ideas that we have. These are theories that have stood the test of time, as well, making many new predictions that have all been confirmed experimentally and observationally, where there’s practically no room for dispute other than making extensions or variations to the law itself.

If there are variations, then how rock-solid is it? No, that’s not the criterion. A law denotes a straightforward mathematical relationship, i.e. an equation. A law is closer to being synonymous with equation than it is with rock-solid theory. We have Ohm’s law, but we also have non-ohmic devices. We have Newton’s law of gravitation, but we know that fails under some conditions, and is just a subset of relativity. Laws have limits to their applicability.

We also have far-reaching and very well-established theories that are not called laws, simply because there is no simple equation associated with them. The theory of evolution is no less well-established than many laws that exist, for example. And this is a common debating tactic: denigrating the theory of evolution because it isn’t a law, to make it sound like it has less support and is more open to doubt.

The second objection I have is that the spectrum is actually two-dimensional. Ethan mentions how some speculative theories are not testable, or at least not currently testable. If it isn’t testable it really shouldn’t be considered a theory at all, but even ignoring that issue, this points toward the idea that there is a spectrum of the quality of a theory as well. Some theories are better than others, because they do a better job of precisely predicting behavior and/or explaining a wider range of phenomena. Some very elegant theories are on the “Ruled Out” side of things, because they were the proverbial beautiful theory slain by the ugly fact, but by virtue of being testable, they were still higher quality than some other “theories” that cannot be (easily) checked. Phlogiston was a better-constructed theory than Brontosauruses being thin at one end, thicker in the middle, and thinner again at the other end, even though the former is ruled out and the latter is true except for a naming issue. The Balmer, Lyman, Paschen, et al. series of hydrogen are less complete than the Rydberg formula and Bohr model, and the Bohr model is tossed into the “ruled out” heap in favor of quantum mechanics. So the breadth of a theory’s reach has to be considered as well — a model that explains one thing is not as highly regarded as one that encompasses many phenomena.

The true spectrum is in the amount of evidence which supports the theory, weighed against evidence that contradicts it. Keeping in mind, of course, that contradiction comes in two flavors: those which kill the theory outright, and those which narrow the boundaries of the theory or require it to be modified. The spectrum of quality is similar to the high jump or pole vault — set the bar at some level of prediction/falsification, and then see if you can make it over the top. Nobody will be impressed by a theory that makes only obvious predictions that are trivially fulfilled (OK, excepting the fans of John Edward and his ilk, who are utterly impressed by “You’ve lost someone recently” at a meeting of people wanting to talk to recently-departed loved ones). If you don’t make predictions, you don’t get to play.

On The Clavicles of Collossi

Research generally gets more difficult over time, in a quantifiable way, as you clear out the low-hanging fruit.

Hard to find

The fact that discovery can become extremely hard does not mean that it stops, of course. All three of these fields have continued to be steadily productive. But it does tell us what kind of resources we may need to continue discovering things. To counter an exponential decay and maintain discovery at the current pace, you need to meet it with a scientific effort that obeys an exponential increase. To find a slightly smaller mammal, or a slightly heavier chemical element, you can’t just expend a bit more effort. Sometimes you have to expend orders of magnitude more.
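A minimal sketch of that scaling, with invented numbers (assuming, purely for illustration, that each successive discovery takes a fixed multiple of the effort of the previous one):

```python
# Toy model with made-up numbers: if each successive discovery is
# 10x harder than the last, sustaining a steady pace of discovery
# demands exponentially growing effort.

GROWTH = 10  # hypothetical factor by which each discovery is harder

def effort_for_discovery(n, base_effort=1.0):
    """Effort required for the n-th discovery (n = 0, 1, 2, ...)."""
    return base_effort * GROWTH ** n

# Effort for discoveries 0 through 4: 1, 10, 100, 1000, 10000.
efforts = [effort_for_discovery(n) for n in range(5)]
print(efforts)
```

Under that assumption, "a bit more effort" buys you almost nothing; matching the exponential is the only way to keep the pace.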

Dissecting the Problem

A simple way to get the antiscience crowd to come around?

Maybe if those in the media and popular press would stop treating us like a different species, “the people” who we don’t reach would feel less wary about trusting us when the data we generate challenges their preconceptions. Maybe if the media would stop treating everything like a “controversy”, and stop giving free air time for dissemination of misinformation, we wouldn’t have to spend our time debunking crap that was debunked 150 years ago (in the case of evolution) and could focus more on education. Here’s an example; anybody even remotely familiar with the “controversy” surrounding mercury and autism knows who Andrew Wakefield is. He gets mentioned in practically every article and gets the media’s “equal time” treatment, even though the guy is a total slime and we’ve known it for years. How many legitimate medical researchers, on the other hand, get more than a two-sentence quote? How many autism researchers fighting the good fight get profiled to the extent that Wakefield does? If you’re not in the field, can you even name an autism researcher on the other side of the line from Wakefield?

I read this before reading Chris Mooney’s op-ed, but I think this, in particular, is spot-on. One of the many ways the battle is biased against science is the ease with which one can make a false claim, versus the difficulty of debunking it, because science is complicated. The artificially forced bilateral symmetry common in stories and debates works against us. I don’t know how much of a solution this ends up being, but it is part of the problem.

I think this also ties in with science needing to step up its PR game, though I think there are problems inherent in non-scientists becoming spokespersons; the more links you put between the people that best understand the research and the people interacting with the public, the greater chance you have of simplifying the science to the point it’s wrong. Somebody simply reciting talking points can’t interact and answer questions, which means that Evil Monkey’s point about scientists getting out and engaging the public is the best approach, and we scientists (and administrators who are our bosses) have to recognize the value of outreach. The other thing that bothers me about external PR that strays from the Sgt. Friday script (just the facts) is that appealing to emotion swings both ways. I think it would be much better if a person could sniff out false claims themselves, rather than having to rely on a PR firm to tell you. If you can be convinced by a persuasive but non-fact-based argument that something is true, you can also be convinced that it’s false. And then there’s the trump card — the antiscience crowd often wins the battle not by having great spokespeople, but having ones that are willing to lie, and science can’t go down that path.

One thing that all this ignores, however, is that many of the targets who disagree aren’t doing so because scientists aren’t putting forth a compelling argument. They made up their minds long ago — facts aren’t going to sway them, but neither is a smooth talker with a pretty face. I think that you have to recognize that there are people who will never be convinced — there is no strategy that will work. They are not interested in the facts, or in honest debate, and if what you have to say disagrees with Glenn Beck or Rush Limbaugh, you’re just flat out of luck. Confirmation bias is real.

More Sports and Science

The sports-reported-as-science bit has made its way around, and while I was thinking of things that might improve the analogies, it occurred to me that the whole “people don’t understand that jargon, can you dumb it down?” part can be recast as “science is like a sport you’ve never seen before.”

If you’ve never seen a particular activity, and your only option was to watch (i.e. there’s nobody to explain it to you), how would you figure out the rules? You’d observe and look for patterns. You’d take note of repeated actions to see that they are consistent: a player uses a foot or head to hit the ball. You might also notice that some things don’t happen: a hand touching the ball stops play. But then there are exceptions: that doesn’t apply to the guy with the big gloves. He seems to be the only one who can handle the ball, and he wears a different colored jersey. Or, with baseball: if the ball touches the ground, the players seem to react differently than when it is hit in the air.

With repeated observation, you can guess at some of the rules. You can build a model and start to predict what would happen under certain conditions, to see if your guess at the rules is correct. If a prediction fails, you have to know whether there was something different about the circumstances, to tell if this is an exception or you were just plain wrong. Some of the more obscure rules take a lot of watching to uncover, and will look like anomalies at first. How many baseball games would you have to watch to see the infield fly rule invoked, and how many times would you have to see it before you could figure out the specific conditions under which it applies?

Observational science is just like this. At least part of astronomy, geology and paleontology, and perhaps others, rely solely on the ability to make repeated observations and figure out laws from the patterns of what does and doesn’t happen.
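The observe-and-infer loop described above can be sketched as a toy tally (the observations and event names are invented for illustration): record what follows each event, propose the majority pattern as a rule, and flag the cases that consistently contradict it as exceptions.

```python
# A toy version of rule-guessing from observation alone (made-up data):
# tally what follows each event and flag apparent exceptions, the way a
# naive spectator might infer the handball rule and its goalkeeper caveat.

from collections import Counter

# (who touched the ball with a hand, did play stop?)
observations = [
    ("outfield_player", True),
    ("outfield_player", True),
    ("goalkeeper", False),
    ("outfield_player", True),
    ("goalkeeper", False),
]

outcomes = {}
for player, stopped in observations:
    outcomes.setdefault(player, Counter())[stopped] += 1

# Candidate rule: "a hand touch stops play" -- except where the data
# consistently contradicts it.
for player, counts in outcomes.items():
    rule_holds = counts[True] > counts[False]
    print(player, "stops play" if rule_holds else "exception to the rule")
```

With only a handful of observations the "rule" is just a majority vote; rarer patterns, like the infield fly rule, would need far more data before the tally separates rule from anomaly.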

At the next level, you can also infer behavior that is due to strategy, which is based on the rules but not strictly part of them. There’s no rule in baseball that says the first baseman must hold a runner on, but the ability to take a lead and steal a base dictates this action. Much like the elliptical orbits of planets being a derived behavior, based on the more fundamental rule of gravity being an inverse-square law. The orbits were noticed first, though, and the underlying rule was deduced later.

Just a Bit Outside

If sports got reported like science…

HOST: In sports news, Chelsea manager Carlo Ancelotti today heavily criticised a controversial offside decision which denied Didier Drogba a late equaliser, leaving Chelsea with a 1-all draw against Sunderland.
INTERCOM: Wait. Hold it. What was all that sports jargon?
HOST: It’s just what’s in the script. All I did was read it – I’ve got no idea what it’s really on about.
INTERCOM: Nobody without a PhD in football’s going to understand that. Who wrote this crap? It’s elitist rubbish, people will just turn off when they hear it. “Late equaliser”? “Offside”? We’ve got to get this rewritten so it’s more accessible.

To complete the parallel, they need to work in how the early goal by Sunderland violated or rewrote some rule (except that it didn’t), matching all of those science stories that claim relativity has been violated or evolutionary theory upset by some discovery, only to find that (of course) nothing of the sort actually happened.