Re: GENOMICS: getting more from less

From: Anders Sandberg (asa@nada.kth.se)
Date: Mon Aug 11 2003 - 10:17:00 MDT


    On Mon, Aug 11, 2003 at 08:17:43AM -0700, Robert J. Bradbury wrote:
    >
    > On Mon, 11 Aug 2003, Anders Sandberg wrote:
    >
    > > You don't really get more information. The data processing theorem
    > > of information theory shows that you will always lose information
    > > in every step of processing, you will not be able to increase it.
    >
    > I'm not so sure Anders -- it reminds me of the statement from
    > "Through the Looking Glass..." [1]. One may be able to make
    > a "word" mean many things depending on the context. I'll call
    > this "information overloading" and I'm not sure whether standard
    > information theories can be applied. At least not standard
    > theories of the information content of DNA or RNA.

    I think you can deal with it in the standard way; you just have to
    calculate information across the entire gene and not just over
    single bases, i.e. take correlations and higher-order links into
    account. It is not enough to sum the per-base entropies,

        -sum_i sum_{b in {A,T,C,G}} P(x_i=b) log P(x_i=b),

    you also have to look at longer stretches, starting with adjacent
    pairs,

        -sum_i sum_{b1 in {A,T,C,G}} sum_{b2 in {A,T,C,G}}
            P(x_i=b1,x_{i+1}=b2) log P(x_i=b1,x_{i+1}=b2),

    and so on.
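
    As a rough illustration of the difference between the two sums, here
    is a minimal Python sketch. It assumes the probabilities are
    estimated as column frequencies over a set of aligned sequences; the
    function names and the toy alignment are made up for the example.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def site_entropy_sum(seqs):
    """Sum over positions i of H(x_i), treating each base independently."""
    length = len(seqs[0])
    return sum(entropy(Counter(s[i] for s in seqs)) for i in range(length))

def pair_entropy_sum(seqs):
    """Sum over positions i of the joint entropy H(x_i, x_{i+1})."""
    length = len(seqs[0])
    return sum(entropy(Counter(s[i:i + 2] for s in seqs))
               for i in range(length - 1))

# Hypothetical toy alignment: the two positions are perfectly correlated.
seqs = ["AT", "AT", "GC", "GC"]
single = site_entropy_sum(seqs)  # H(x_0) + H(x_1)
joint = pair_entropy_sum(seqs)   # H(x_0, x_1)
print(single, joint)
```

    In this toy case the per-base sum counts each position as one bit,
    while the pair entropy shows that the two positions together carry
    only one bit, because the second base is fixed by the first.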

    The question is how the mutual information between the DNA and the
    finished protein differs from the mutual information calculated
    purely from codons; this is the real measure of how much information
    "sneaks in". It should be quite observable.
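
    A plug-in estimate of mutual information from paired samples could
    look like the sketch below. This is only an assumption about how one
    might set it up: the (codon, residue) pairs are a hypothetical toy
    sample, and a real comparison would apply the same estimator once to
    codon-level pairs and once to pairs built from longer stretches of
    DNA and the finished protein.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """I(X;Y) in bits from a list of (x, y) samples, using empirical frequencies."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # sum over observed (x, y) of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Hypothetical toy sample of (codon, amino acid) observations.
pairs = [("GCU", "Ala"), ("GCC", "Ala"), ("UGG", "Trp"), ("UGG", "Trp")]
mi = mutual_information(pairs)
print(mi)
```

    Because the residue here is a deterministic function of the codon,
    the estimate equals the entropy of the residue distribution. Any
    extra mutual information found when the context is included would be
    the part that "sneaks in".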

    Of course, entropy is not quite what we call information. It is
    maximal when all bases are uncorrelated and uniformly random, not
    when they follow complex correlation patterns.

    -- 
    -----------------------------------------------------------------------
    Anders Sandberg                                      Towards Ascension!
    asa@nada.kth.se                            http://www.nada.kth.se/~asa/
    GCS/M/S/O d++ -p+ c++++ !l u+ e++ m++ s+/+ n--- h+/* f+ g+ w++ t+ r+ !y
    

