Speaker: 

Ridgway Scott

Institution: 

Departments of Computer Sci. and Math., U. of Chicago

Time: 

Thursday, February 9, 2006 - 4:00pm

Location: 

MSTB 254

The digital nature of biology is crucial to its functioning
as an information system. The hierarchical development of
biological components (translating DNA to proteins, which form
complexes in cells, which aggregate to make tissues, which form
organs in different species) is discrete (or quantized) at
each step. It is important to understand what makes proteins
bind to other proteins predictably and not in a continuous
distribution of places, the way grease forms into blobs.

Data mining is a major technique in bioinformatics. It has been
used on both genomic and proteomic databases with significant
success. One key issue in data mining is the type of lens that
is used to examine the data. At the simplest level, one can just
view the data as sequences of letters in some alphabet. However,
it is also possible to view the data in a more sophisticated
way using concepts and tools from physical chemistry. We will
give illustrations of the latter and also show how data mining
(in the Protein Data Bank, or PDB) has been used to derive new results in physical
chemistry. Thus there is a useful two-way interaction between
data mining and physical chemistry.

We will give a detailed description of how data mining in the
PDB can give clues to how proteins interact. This work makes
precise the notion of hydrophobic interaction in certain cases.
It provides an understanding of how molecular recognition and
signaling can evolve. This work also introduces a new model of
electrostatics for protein-solvent systems that presents
significant computational challenges.