Re: Genomics [was Re: MEDIA: Celera profitability]

Robert J. Bradbury (
Tue, 28 Sep 1999 18:16:41 -0700 (PDT)

On Mon, 27 Sep 1999, Kathryn Aegis wrote:

> Y'know why I don't post the full URLs anymore? Because they shifted to
> afternoon and evening editions, and if you don't access it by late
> afternoon, the index prefixes change.

That's silly; it makes extra work for them. Links should last forever or reroute automatically.

> >Clearly the government scientists and strategies have been
> >scooped in this affair.
> Only on the basic components. Celera deliberately chose the 'fast and
> dirty' method, whereas the government-sponsored projects chose to focus on
> the full detailed information. Both should turn out to be useful for
> different kinds of applications.

That's what some of the government researchers would like you to believe. Celera had two things going for it: (a) access to 300 really good sequencing machines, and (b) Eugene Myers. Myers actually worked out the mathematics and ran the simulations showing that you really can reassemble small fragments into the complete genome. I won't go into the complexities of the methods they are using, but it basically comes down to this: you can use multiple tags that show up in the small fragments to virtually "guarantee" that you are connecting the dots correctly.

Historically, most of the people working on the genome project were biologists, and biologists generally don't do math, nor do they generally attack problems from the perspective of "what can one of the world's largest supercomputers do for me?"
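To make the idea of reassembling small fragments concrete, here is a toy sketch in Python. It is not Celera's or Myers's actual method (which I said I won't go into); it is a simple greedy merge on suffix/prefix overlaps, and the function names, the minimum overlap of 3, and the example sequence are all made up for illustration. Real assemblers also have to cope with repeats, sequencing errors, and both strands, which this ignores.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len characters exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def drop_contained(frags):
    """Remove fragments that are substrings of another fragment."""
    return [f for f in frags if not any(f != g and f in g for g in frags)]


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Hypothetical toy assembler: stops when one contig is left or when
    no remaining pair overlaps by at least min_len characters.
    """
    frags = drop_contained(list(dict.fromkeys(fragments)))  # de-duplicate
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to join; return the separate contigs
        merged = frags[best_i] + frags[best_j][best_n:]
        rest = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags = drop_contained(rest + [merged])
    return frags


# Made-up 16-base "genome" cut into three overlapping reads, given out of order.
reads = ["CCTTAGAC", "ATGGCGTA", "CGTACCTT"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTTAGAC']
```

The example only works this cleanly because the made-up sequence has no repeated stretches. With repeats, plain overlaps become ambiguous, which is where extra linking information between fragments earns its keep.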

> It's not mentioned in the article, but Venter is banking on marketing a
> software library system, one that would allow Celera to sell subscriptions
> to a certain method of accessing and utilizing genome data.

This has been the path of all of the sequencing companies (Human Genome Sciences, Incyte, Millennium, etc.) to a greater or lesser degree. Celera, I believe, intends to go farther, since I think they want to get into the position of determining what some of the genes do functionally.

For example, we know enough about gene structure now to recognize which genes look like "receptors" or "transcription regulatory factors", but we don't know enough to say what binds to the receptor or which piece of DNA a transcription factor regulates. After you have all that sequence, you have to take these next steps to add value to the data.

> His argument is that the government data will be free of charge, but it
> will be in raw form and difficult to sort through for specific purposes.

Yep, they are selling added value, because they have really good software people and will probably have a huge pool of gene "curators".

> Persons hoping to obtain patents using that data could use his library
> system for research and development in specialized areas.

It will be nice to see the whole patent debate move away from the information in the sequence itself and toward the "creative" applications of it (which was presumably what the law intended).

The government-funded efforts are valuable because they are publishing the data as fast as they produce it, which is forcing the corporations to be more innovative and add more value in order to have something that people will be willing to buy.