Merck Murky Data

2009 March 6
by abhishektiwari
After my last controversial post, Data Vendetta, I was trying to stay away from writing anything about the latest open data buzz: Merck’s announcement that it will release its highly valuable proprietary data into the public domain. Merck has decided to donate a huge amount of data through a not-for-profit medical research organization called Sage, and in the process Merck & Company will also provide some necessary equipment and software. All of this is coming from the blogosphere, meaning I can’t find any official news release from Merck (that may be due to my own ignorance or looking in the wrong place), which is surprising given that this is going to be the biggest charity ever by any pharmaceutical giant. There is plenty of speculation about what kind of data Merck is going to give away, but you know what, it is going to be a gigantic hunk of expensive pharma data. It all started on the blog of John Wilbanks, of Science Commons fame, where he writes that
Merck has pledged to donate a remarkable resource to the commons – a vast database of highly consistent data about the biology of disease, as well as software tools and other resources to use it.

That sounds great (tons of data = tons of publications), and I believed it, but then he writes

This is all going to happen through the establishment of a non-profit organization called Sage to serve as the guardian of the resources. It’s not about making a quick data dump onto the web, however. Sage is going to take a while during an “incubation period of three to five years.”
Did you say guardian? Then what is the role of Science Commons, why not just give away the data, and why do you need to create a whole brand-new organization to control this process? Wilbanks answers it very cleverly:
This is complex content and it’s going to take some ongoing work to expose everything in a usable way.

I decided to check out what is going on at the Sage website and what they have to say:

The primary output from Sage will be an open access platform available in the public domain. An incubation period of three to five years is anticipated in which new project data are generated, critical tools for building and mining disease models are developed and governing rules for sharing, accessing, and contributing to the platform are established.
If I am not wrong, they are going to develop a new web platform, very much like NCBI and PDB, which will host the data and tools (heck, none of them exist yet). I have mixed feelings about this new development. This is not the first time that a pharma giant has decided to give away data; in the past Novartis made a similar arrangement in a bid to tackle diabetes, but with certain reservations that gave it a substantial lead over other companies attempting to exploit the open research. Everyone is asking why, when, and how, and no one has an answer, but for now open-source science advocates have a new agenda to cheer about. If you analyze the development carefully, you will realize that pharma companies are being forced to do this: their dry drug discovery pipelines are pushing them to the edge, and maybe Merck expected that if it did not do this, sooner or later someone else would take the lead. This does not reflect a positive move but looks like a sign of desperation. They are also realizing the power of collective intelligence in the web 2.0 era; unfortunately, finding new drugs is not as easy as contributing to Wikipedia or open source software (Harnessing the Crowd to Make Better Drugs). Take one more quote:
Biology has never really had a social-networking movement like open-source computing, where thousands of loosely-affiliated people around the world pool brainpower to make better software.
What hypocrisy: thousands of loosely connected people around the world can make working software, but they cannot discover a drug. So I am not convinced that this is going to harness the crowd to make better drugs; it does not make any sense to me. Further, the founder of Sage also claims that
We see this becoming like the Google of biological science. It will be such an informative platform, you won’t be able to make decisions without it, We want this to be like the Internet. Nobody owns it.
That does sound like an overstatement without any factual information behind it, and we cannot blame him alone; we are all equally fascinated by scientific buzzwords such as cloud computing, crowd computing, open science, open data, open foo, open bar. Luckily, any idiom or phrase that booms as hype and does not deliver on expectations is doomed to fail, so I don’t need to spend energy worrying about this.
7 Responses
  1. March 6, 2009

    Cool… thank you for reporting it!

  2. March 6, 2009

    Thanks. I just wanted to pull their leg a little, since things get publicized in an exaggerated and often misleading manner.

  3. March 6, 2009

    Merck Murky Data: After my last controversial post Data Vendetta, I was trying to keep away from writing anythin.. http://tinyurl.com/axshwb

  4. March 8, 2009

    What you say is absolutely true. I want to see the actual data before I praise Merck for their “bold” decision. I wonder how really “available” the data will be and what kind of tools they will create. If they were really doing it as a charitable act, they would just make their databases accessible from the internet, and publish a simple API to access the data in its entirety. I am sure that if the data was useful, in a few weeks you would start seeing bioinformaticians releasing software capable of mining it. Instead, Merck wants to make the tools themselves, so they can fully control what’s being done to the data and by whom, and to regulate access depending on what makes most sense to them. We will have to wait and see how useful to the non-profit scientific community this project will actually turn out to be. It is not inconceivable that both Merck and academic science can benefit from this venture, but it is too early to say.

  5. March 8, 2009

    I could not agree more: “if the data was useful, in a few weeks you would start seeing bioinformaticians releasing software capable of mining it” is very true. Surprisingly, this whole thing is making news in Nature. I don’t know what to say.

  6. March 24, 2009

    Abhishek, nice post with a healthy dose of skepticism. While I also applaud Merck for their announcement of a data commons, it’s a long way from announcing one to creating one. I believe that the folks behind the SAGE initiative are earnest in their goals, but openness isn’t a trait typically associated with Big Pharma. Nor necessarily should it be.

    As someone who worked at NCBI, I can say that they have done an admirable job creating this kind of scientific commons through PubMed, GenBank, GEO (microarray DB), etc. It’s not perfect, but NCBI’s goals will never conflict with its commitment to democratizing data.

  7. March 24, 2009

    That’s true. Rather than creating SAGE, they could have just donated the data to NCBI. We can only hope now that sooner or later there will be some open data.
