NEJM editorial calls data scientists "research parasites." Can Joe Biden fix this?

Vice President Joe Biden recently called for a "moonshot" to cure cancer, which President Obama announced in his State of the Union address last week. Motivated by the tragic death of his son Beau, who died last year of brain cancer, Biden says he will devote his remaining time in office, and many years after, to helping fight cancer. On his VP blog, he writes that he wants to do two things:

  1. Increase resources — both private and public — to fight cancer.
  2. Break down silos and bring all the cancer fighters together — to work together, share information, and end cancer as we know it.

I'm 100% behind the Vice President on these efforts, and I hope he succeeds beyond his wildest ambitions. But he might discover, paradoxically, that raising money–his first goal–is easy compared to the challenge of getting scientists to share data.

Exhibit A is an editorial titled "Data Sharing" that appeared in last week's New England Journal of Medicine, written by Dan Longo and Jeffrey Drazen, the deputy editor and editor-in-chief of the journal. Drazen and Longo wrote that scientists who wish to use other people's data to make new discoveries are "research parasites." Or, to be more precise, they wrote that "some front-line researchers" (none of whom are named) have this view. They also argued that "someone not involved in the generation and collection of the data may not understand the choices made in defining the parameters" and thus have no business re-analyzing the data.

The condescension implicit in this statement is deeply troubling. Drazen and Longo are saying, essentially, that only the people who originally collect a data set can truly understand it, and anyone else who wants to take a look is a parasite.

The editorial has led to a firestorm on social media. For example, Nobel Laureate Barry Marshall tweeted that
"Plenty of Nobel prizes came from a new look at other people’s data."
UC Davis professor Jonathan Eisen tweeted that the "editorial by @nejm is simply deranged," and a new Twitter account under the name ResearchParasite quickly drew many followers.

I asked Dr. Drazen if he really meant to imply that scientists who use other people's data are parasites. He and I spoke on the phone, and he emphasized that he's a strong supporter of data sharing, and that he's been traveling the country promoting a new policy to share the information from clinical trials (something that rarely happens). Just a few days ago, he and other medical journal editors proposed a new policy on clinical trial data sharing, a policy that (while not perfect) would be a big step forward.

So why, I asked him, did he use the harshly negative phrase "research parasites"? Dr. Drazen pointed out that he had heard this term from others, and that's why he enclosed the phrase in quotation marks in his editorial (true). He shared with me an update that will appear in NEJM this week, in which he and Longo will explain further; however, the journal asked that I not quote from it.

I was relieved to hear that Dr. Drazen and his NEJM colleagues are supportive of data sharing, and that they are implementing new, more open policies on clinical trial data sharing for the journal. I asked him if he would also state directly that he did not believe the phrase "research parasites" was accurate or appropriate. He declined to comment, though he reiterated the point that this phrase came from others, not from him or Dr. Longo.

So the attitude is clearly out there. Indeed, it's not that unusual: I have encountered similar attitudes many times in my own career, although I should quickly add that they are far from universal.

It's a simple fact today that biomedical researchers (take note, Mr. Vice President) rarely share their data with others. Unless a funding agency or a journal in which they wish to publish requires them to share, they will sit on their data forever. I've personally been involved in projects where the various participants, funded by NIH or other federal agencies, refuse to share data even with other groups in the same consortium. For example (and this is just one of thousands of examples I could point to), the raw data behind a clinical exome sequencing study, led by Baylor College of Medicine and published in 2013 in NEJM, is not available. The data collected by the famous Framingham Heart Study, running since 1948, has been locked up by Boston University scientists for half a century, and only recently (after considerable pressure from their funders) have they agreed to let others take a look at small pieces of the data, if they beg hard enough.

Let's go back to Vice President Biden's blog, where he wrote:
"We’ll encourage leading cancer centers to reach unprecedented levels of cooperation, so we can learn more about this terrible disease and how to stop it in its tracks.... Data and technology innovators can play a role in revolutionizing how medical and research data is shared and used to reach new breakthroughs."

Again, I'm 100% behind the VP here. Biden is already meeting with cancer researchers to see what he can do to accomplish these goals, and I'm sure they will tell him what he wants to hear. In contrast, let's see what Drazen and Longo wrote in their NEJM editorial:
"...a new class of research person will emerge — people who use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited. There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”"
Shocking! If you share your data, someone might try to disprove your results! Could it be that a published result relies on misinterpreted data and is wrong? It took me less than a minute on Retraction Watch to find multiple articles retracted by the NEJM itself, including some that were retracted because the original data could not be found.

Disproving a claim using the same data is what reproducibility is all about, and this is one of the most important reasons that data needs to be shared. After all, if someone has distorted their data in order to reach a conclusion that isn't really justified, we need someone else–someone not invested in proving the same result–to re-analyze the data using independent methods. This is how science corrects itself.

These sentiments of the unnamed "front-line researchers" quoted by Drazen and Longo reveal the dangerously arrogant assumption that only they understand the data, and that no one should question their findings. And there's also that concern that another scientist might discover something that was missed by the original group. In what view of reality is this "stealing from the research productivity" of that group?

The phrase "research parasites" also reflects the view of some scientists that the data they collect is their property, despite the fact that their research is (frequently) funded by the public. It's time for the funding agencies to set some new ground rules: if the government funds a study, then we all own the data. Scientists who don't like the rule can find another source of funding (and believe me, they might grumble and complain, but they will do what their funders demand).

One final note: a quick scan of recent articles in the NEJM reveals that, not surprisingly, many of them rely on the human genome sequence. Did any of those authors contact the "data gatherers" to get permission to use the genome in their work? Did they offer to include the human genome sequencers as co-authors on their papers, a step that Drazen and Longo recommend? Of course not–and they shouldn't. When we publish papers, we cite the sources of our data, but we don't ask their permission nor do we include them as co-authors. Citations are the currency of modern science.

So here's some advice to Vice President Biden: don't just talk to scientists and urge them to collaborate. They'll all agree, and tell you wonderful things about their numerous collaborations, but once you leave the room, they'll go back to business as usual. If you really want to change the culture, Mr. Vice President, change the rules.

Alabama versus Clemson prediction: both teams will lose

Monday evening's college football championship game features two powerhouse teams: undefeated Clemson (14-0) versus perennial football power Alabama (13-1). With all the media attention on this game, expected to be one of the most-watched college football games ever, it's easy to find predictions for the winner. CBS has six experts all predicting an Alabama victory. SBNation lists Alabama as a one-touchdown favorite.

My prediction is different, and I'm 100% positive that I'm correct: both teams will lose. Don't get me wrong: the teams will play each other, and one of the teams will score more points, so the game itself will have a winner. But in this money-washed extravaganza, with coaches, schools, and television networks hauling in tens of millions of dollars, none of the players will be paid a single dollar.

Imagine this: the 100 players on each team's roster have spent their year entertaining millions of fans. They have played their hearts out on the field, risking injury (including the possibility of a life-altering concussion) in every game, all while pretending to be full-time college students pursuing an education. The pretense that they are "student athletes" is what allows the NCAA and the universities to maintain the fiction that players should not be compensated for their efforts on the field.

Don't get me wrong: there will be plenty of winners in Monday's game. The coaches, conferences, and colleges have already won. Alabama's head coach Nick Saban will be paid $7,087,481 this year. Clemson's coach Dabo Swinney makes $3,305,200. Alabama and Clemson's assistant coaches make $1,500,000 and $1,404,807, respectively. USA Today calculated the total payroll of the football coaching staff for the four teams in the final two playoff games: $35,981,491, not including bonuses.

It's not just the football coaches who have won big. Athletic directors and their staffs have cashed in handsomely too. As the Washington Post reported a few weeks ago,
"In a decade, the non-coaching payrolls at the schools [in the five wealthiest athletic conferences], combined, rose from $454 million to $767 million."
The Post also compiled numbers showing that 34 football teams had staff payrolls above $1 million for non-coaching staff. Clemson has created an "associate athletic director of football administration" who alone makes $252,000.

But wait, there's more. The athletic conferences in which Alabama, Clemson, and the other major football powers play have been rewarding themselves handsomely, paying their commissioners from $2.0 million to $3.5 million. As the Washington Post put it:
"As a reward for making an industry fueled by unpaid athletes more lucrative than ever, the men who run these conferences have enjoyed staggering pay hikes doled out by the leaders of many of America’s largest universities."
Much of this money comes from television contracts; the Wall Street Journal explains that the ACC (Clemson's conference) has a $3.6 billion contract with ESPN that lasts until 2027. ESPN is paying another $7.3 billion to televise the playoff and bowl games. None of that money goes to the players upon whom the entire enterprise depends.

When you watch the game on Monday (or any college football game), think about all that money going to the coaches, administrators, conference commissioners, and staff, while the players get nothing. The universities participating in this lucrative enterprise should be ashamed: they are making millions off the backs of unpaid athletes, while hiding behind the pretense that they are providing the athletes a fair return in the form of a college education. As I've written before, this is nonsense. Universities have been corrupted by the lure of cash, and they seem to have forgotten that they are in the business of educating students, not providing sports entertainment.

So yes, you will see some winners on Monday. They're the guys on both sidelines wearing headsets, making multimillion-dollar salaries. After the game, they'll drive their expensive cars to their multimillion-dollar homes. The players will return to their dorm rooms.

As for me, I find it increasingly difficult to enjoy watching college football. Every time a player gets shaken up by a hard tackle, I'm reminded that their playing careers are woefully short, and this might be the last time they have the chance to play in front of such a large audience. I think about how they'll feel 10 or 20 years from now, when they're limping around on bad knees while their former schools have long forgotten about them.

Colleges need to pay the players. And while they're at it, they can take steps to make sure that these students get a real education, such as offering free tuition, room, and board to the players for at least four years after they stop playing football. Until they do, all the players lose.