A tale of two pigs

I don't usually mix my own research into this blog, but I make exceptions for influenza. As everyone knows (if you're awake and sentient), there's been a huge outbreak of swine flu recently, starting a few weeks ago in Mexico, which has now spread all over the U.S., Europe, New Zealand, and elsewhere. You can read all about it in the major news media, so I'll just focus on a couple of things you might not find elsewhere.

First, the swine flu has been reported to be a mixture of human, avian, and swine influenza viruses. Although the source of these reports is the CDC, that's not an accurate picture. I read today in The Washington Post that this epidemic started when a single pig was infected simultaneously by bird, pig, and human viruses. That's a reasonable inference from the reports in the media, but it's not true.

In fact, as a number of researchers have now discovered, the new swine flu is a mixture of two different swine flu viruses. It's definitely a novel strain, but it's pretty clearly a mixture of two already-circulating pig strains. That sounds less exotic than the "human-bird-pig" theory, and it is. The reason for the "triple reassortant" story is somewhat complex, but to simplify: the history of one of the two parental swine flu strains indicates that part of that strain originated in birds - well over a decade ago. That strain is sometimes called "avian-like" as a result, but it's not an avian flu strain now. The history of the other parental strain includes a small piece (one gene) that appears to have originated in humans - over 15 years ago. Again, it's a swine flu virus now, but there's a piece of it that might have come from humans. The event that created today's swine flu - the one we're worried about - is a combination (called a reassortment) between two pig strains, pure and simple.
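For readers who think in code, a reassortment can be pictured as building a new genome by taking each of the eight influenza A segments from one of two parent viruses. The Python below is only a toy sketch: the parent names (`swine_a`, `swine_b`), the `reassort` helper, and the choice of which segments come from which parent are all assumptions made up for illustration, not results from the sequence data.

```python
# The eight RNA segments of an influenza A genome.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]

def reassort(parent_a, parent_b, take_from_b):
    """Build a reassortant genome: segments named in take_from_b come
    from parent_b, all other segments come from parent_a."""
    return {
        seg: (parent_b[seg] if seg in take_from_b else parent_a[seg])
        for seg in SEGMENTS
    }

# Hypothetical parents: each segment is just tagged with its strain of origin.
swine_a = {seg: "swine_A" for seg in SEGMENTS}
swine_b = {seg: "swine_B" for seg in SEGMENTS}

# Illustrative assumption: take two segments from the second parent.
child = reassort(swine_a, swine_b, take_from_b={"NA", "M"})
for seg in SEGMENTS:
    print(seg, child[seg])
```

The point of the sketch is that no new segments are invented: every piece of the child already existed in one of the two parents, which is why a reassortant can look novel as a whole while each segment has a traceable history.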

The other point I wanted to make is about data sharing. The sequences from the U.S. isolates have been deposited in GenBank - the public DNA database - immediately, and this allows people like me to start our analysis without delay. Many of us have been arguing for years how important it is to get the data out to the community fast, in order to accelerate the pace of scientific discovery. However, the isolates from Mexico have NOT been put into GenBank, even though these sequences first went through the CDC. Instead, they went into a database called GISAID, which was originally set up to facilitate sharing of avian influenza. Unfortunately, GISAID changed their data release policy about six months ago, and there's no guarantee that sequences deposited there will ever become public.

The CDC has been depositing influenza sequences in GISAID as if this were equivalent to making them public. It's not, and they shouldn't pretend otherwise. The CDC has not always been supportive of publicly releasing flu data - in fact, for years they deposited some of their sequences in a private database. They've recently made public statements about their commitment to public data release of influenza sequences, but it doesn't seem that they are following through with this commitment for all of the sequences from the swine flu outbreak. (Don't get me wrong: the CDC is doing a fantastic job in trying to track and understand this outbreak, and their work is incredibly important to public health, especially concerning the flu. I'd just like them to be a bit more open with their data.)

One last note, a technical one. I've looked at the Mexican sequences (I have a GISAID account) and the California sequences, and they are virtually identical. So it would appear that any differences in virulence are due to differences in the people being infected, not to the virus itself. At least that's what it looks like so far - the situation is changing rapidly.
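To give a rough idea of what "virtually identical" means in practice, here is a minimal sketch of a percent-identity calculation between two sequences that have already been aligned. It is not the analysis I actually ran: the `percent_identity` function is a hypothetical helper, and the two short fragments are made up, not real flu sequences.

```python
def percent_identity(seq1, seq2):
    """Percent of matching positions between two aligned sequences of
    equal length. Positions where either sequence has a gap are skipped."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap positions
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

# Made-up fragments for illustration only, not real isolates.
mexico_like = "ATGAAGGCAATACTAGTAGT"
california_like = "ATGAAGGCAATACTAGTTGT"
print(f"{percent_identity(mexico_like, california_like):.1f}% identical")
```

Real comparisons would run over full-length segments after a proper alignment, but the idea is the same: count the positions that agree and divide by the positions compared.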


  1. The CDC's Ruben Donis has done a detailed interview with Science where he explains very clearly why this is just two pig strains. He also explains the triple reassortant story very well. See the interview here.

  2. Thanks for this great post!

    1) Would you mind if I repost a translation of this entry in Spanish on my blog? With all the due credit and links to this original post.

    2) Do you know if the California sequences were isolated from Mexican individuals?

    Thanks again

  3. Daemios,
    1) I don't mind at all.
    2) The California sequences were all from U.S. residents. However, I think all of them can be traced to someone who had recently traveled to Mexico.

    Note that we now have U.S. cases that are not directly linked to travel to Mexico - we have one (at least) reported in my home state of Maryland now.

  4. By the way, my commentary on the flu vaccine from Nature last year is now available FREE (open access) in PubMedCentral. Here is the link.

  5. How likely is it that this flu's genome is capable of a major class shift that will lead to problems like the 1918 flu?

    joe MD

  6. Joe MD: this flu already had a major shift - that's what a reassortment (mixing of two separate lineages) is. It is very unlikely at this point to become dramatically more deadly like the 1918 flu - that flu started out deadly and got milder, which is usually what we see in new flu strains. Evolutionarily speaking, that's what we expect: a more adaptive strain will be milder, because it can spread more effectively if the victims don't get very sick. So I expect this flu to gradually get milder, as we've seen with previous flu outbreaks.

  7. I posted the Spanish translation here:

    Thanks a lot!

  8. Now CIDRAP has praised GISAID without mentioning the secrecy, and the Max Planck Institute hosts it after
    they had trouble with EpiFlu.
    Ukraine uploads critical D225G sequences only to GISAID, and the CDC just made a big update - at GISAID. Will we get two sorts of science, public and semi-secret? What starts with flu sequences may spread to other areas of science.

