RE: # human genes creeps up again

From: Robert J. Bradbury (bradbury@aeiveos.com)
Date: Sat Aug 25 2001 - 11:57:14 MDT


The NY Times discusses this here:
http://www.nytimes.com/2001/08/24/health/genetics/24GENO.html

It's apparently work done by scientists at the Novartis
Genomics Institute in San Diego.

I think I've seen another report as well based on work done
on the East Coast.

Here is the problem. The historic paradigm has been
one gene = one protein. That idea works pretty well in
bacteria but has been under slow erosion in higher organisms
as scientists have found genes that have alternate
splice variants and proteins that may be cleaved into
different forms by proteases (the primary player in
Alzheimer's disease fits this pattern).

So for years the concept of a "gene" has been getting fuzzy.
It is bordering on being what Minsky calls a "suitcase word"
at this point. The estimates of 100K-140K genes were based
on "Expressed Sequence Tag" (EST) libraries done by Human Genome
Sciences, Incyte and a publicly funded effort. The numbers
of "genes" found by Celera and the public effort announced
in February were based on what programs designed to recognize
genes could identify in the sequence data. My conclusion from
the NY Times article is that the Celera group and the public
group were using different programs to identify the "genes".
The other groups that are weighing in now are trying to
integrate the Celera and public data and match that up
with the EST data.

My opinion at this point is leaning towards 40K-50K
"transcription regions" in the genome that end up getting
expressed and generate probably 120K-140K proteins.

Recent papers, I think in Science, have revised upwards
the gene count in both Drosophila and C. elegans, so
we haven't reached the bottom of the genome barrel yet
but we are getting there.

Harvey, you have to give the scientists a break here -- they
were under a lot of pressure to announce progress.
You aren't going to see any reasonable discussion about
the fraction of genes that they missed or how they missed
them in the public press. You have to go into a detailed
analysis of the published literature. That's probably
approaching 100+ pages at this point.

Robert


