The British Biophysical Society

Registered Charity No. 254742

Building Better Science

Protein - DNA Interactions

Geoff Kneale

 

The sequence of bases in the DNA of the chromosomes determines the genetic make-up of an individual. It is the interaction of specific proteins with special control sequences in the DNA that determines whether or not a gene is expressed in a given cell. Numerous DNA binding proteins have now been identified by biochemical methods, and a considerable amount is known about their function in the cell from genetic studies. Tremendous effort has been put into elucidating the three-dimensional molecular structure of such proteins, and of their complexes with specific DNA sequences, principally by X-ray crystallography but also, for smaller proteins, by nuclear magnetic resonance (NMR). Thus, as with the discovery of the double helix, biophysics continues to play a vital role at the forefront of molecular biology.

Clear structural categories of DNA binding proteins are emerging. One such family, the so-called "zinc-finger" proteins, is exemplified by steroid hormone receptors such as the glucocorticoid receptor. Each receptor is specific for a given hormone. When the hormone binds, the structure of the protein changes in such a way that it can now interact with a specific DNA sequence that controls the expression of the target genes.

Another class of DNA binding proteins goes by the interesting name of leucine zipper proteins. These proteins join up in pairs, held together by complementary patches on their surfaces that interact somewhat like a zipper. This brings other regions of the two proteins (the DNA binding arms) together, forming a Y-shaped molecule in which the arms have the correct orientation to bind to specific DNA sequences and thereby regulate gene expression. The genes that these proteins control are often involved in the regulation of cell growth and, not surprisingly, mutations in these proteins are implicated in many forms of cancer.

As we discover more about the numerous proteins that interact with DNA, we can begin to understand how specific DNA sequences are recognised by proteins. Whether a cell has the thousands of genes found in the simplest organisms such as bacteria, or the tens of thousands of genes found in more complex organisms such as man, it is vital that the proteins that control the activity of the genes can find the precise sequence of bases in the genome. Indeed, the number of potential binding sites on the DNA runs to many millions. If the wrong genes are switched on at the wrong time, the consequences for the cell could be fatal. How are proteins able to discriminate the "correct" from the "incorrect" sequences?
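To get a rough feel for the scale of this search problem, the sketch below estimates how often a short sequence would turn up purely by chance. It assumes a random genome in which all four bases are equally likely (real genomes are not like this) and it counts one strand only. The function names, and the figure of about 4.6 million base pairs for a bacterial genome such as E. coli, are illustrative choices that do not come from the text above.

```python
def expected_chance_matches(genome_length: int, site_length: int) -> float:
    """Expected number of chance occurrences of one particular sequence.

    Assumes independent, equally likely bases (probability 1/4 each),
    so a given site of n bases matches at any position with
    probability (1/4) ** n.
    """
    positions = max(0, genome_length - site_length + 1)
    return positions / 4 ** site_length


def shortest_unlikely_site(genome_length: int) -> int:
    """Smallest site length expected to occur less than once by chance."""
    n = 1
    while expected_chance_matches(genome_length, n) >= 1:
        n += 1
    return n


if __name__ == "__main__":
    genome = 4_600_000  # approximate size of a bacterial genome (assumption)
    print(expected_chance_matches(genome, 6))
    print(shortest_unlikely_site(genome))
```

Under these simplifying assumptions, a 6-base sequence would occur by chance many hundreds of times in a genome of this size, which is why a regulatory protein generally needs to read a longer stretch of bases to find a unique site.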

The bases are stacked together in the double helix, and each is linked to its complementary base on the other strand by the attractive forces known as hydrogen bonds. But the edges of the bases are exposed in the grooves of the DNA. It turns out that proteins are able to sense the correct bases by binding in the grooves of the double helix and "reading" the sequence of bases from the pattern of atoms presented there. Again, hydrogen bonds are the crucial attractive forces, this time between specific amino acids on the "reading head" of the protein and the DNA recognition sequence.

DNA Methylation

There are many ways in which the information stored in the DNA double helix can be modified, the most important being a process known as "DNA methylation". This consists of changing the chemical structure of one of the bases (usually A or C) by the addition of a small group of atoms - a methyl group. This modification then acts as a label so that specific DNA sequences can be targeted for further action (or, in most cases, to prevent action!) by regulatory proteins. The presence of the methyl group in one of the grooves of the double helix effectively prevents the protein from binding to a sequence it would otherwise recognise.

DNA methylation is important for many biological processes in the cell. Whether genes are methylated or not can determine their level of expression; genes that are not required to be expressed in a particular cell are often highly modified in this way. However, the details of exactly how methylation affects gene expression are still not well understood.

DNA methylation is also intimately involved in DNA repair, for example when errors are introduced during DNA replication. Following replication of the genes, the presence of a methyl group on only one of the two DNA strands tells the DNA repair systems which strand is the original: the strand from the original chromosome is the one that is methylated, whereas the newly made strand is not yet methylated. Thus, if an error occurs during replication (e.g. an incorrect base pair such as G-T instead of G-C), the repair system knows which strand to correct. Without such a repair mechanism, these errors would lead to "scrambling" of the genetic code, with fatal consequences for the cell.
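The logic of this strand choice can be sketched in a few lines of Python. This is a toy model, not a description of the real repair machinery: it assumes we already know which strand is methylated (the original), it writes the two strands position by position so that base i of one pairs with base i of the other, and the function names and example sequences are invented for illustration.

```python
# Watson-Crick pairing partners
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(original: str, new: str) -> list[int]:
    """Positions where the new strand fails to pair with the original."""
    if len(original) != len(new):
        raise ValueError("strands must be the same length")
    return [
        i for i, (o, n) in enumerate(zip(original, new))
        if COMPLEMENT[o] != n
    ]


def repair_new_strand(original: str, new: str) -> str:
    """Correct the new (unmethylated) strand using the original as reference.

    The methylated original strand is trusted, so every mismatch is
    resolved by changing the base on the new strand only.
    """
    repaired = list(new)
    for i in find_mismatches(original, new):
        repaired[i] = COMPLEMENT[original[i]]
    return "".join(repaired)


if __name__ == "__main__":
    original = "ATGGCA"   # methylated strand from the parent chromosome
    new = "TATCGT"        # copy with one error: a T opposite G at position 2
    print(find_mismatches(original, new))
    print(repair_new_strand(original, new))
```

The key design point is the asymmetry: the mismatch G-T could equally be "fixed" to A-T by changing the original strand, which would make the error permanent. Trusting the methylated strand avoids this.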

Finally, DNA methylation plays a vital role in the system that bacteria use to defend themselves from invasion by "foreign" DNA, e.g. from attack by viruses. Each strain of bacteria has a specific enzyme that cuts DNA at a particular sequence of bases, normally about 6 bases long. (These are known as restriction enzymes and, incidentally, are the means by which molecular biologists can "clone" genes; they provide the tools for the whole of genetic engineering and DNA fingerprinting.) But how does the cell prevent these enzymes from destroying its own DNA? How does it distinguish "self" from "non-self"? It turns out that the cell makes another enzyme that methylates precisely this sequence in its own DNA, with the effect that the restriction enzyme cannot cut at these sequences in the cell's DNA, only at the same sequences in the invading "foreign" DNA. This defence mechanism therefore represents a primitive "immune" system, but one that works at the level of DNA.
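The "self" versus "non-self" test described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: DNA is treated as a single-stranded string, methylation is recorded per site rather than per base, and the function names and the two short example sequences are made up. GAATTC, the recognition sequence of the well-known restriction enzyme EcoRI, is used purely as an example of a 6-base site.

```python
def find_sites(dna: str, site: str) -> list[int]:
    """Start positions of every occurrence of `site` in `dna`."""
    hits = []
    start = dna.find(site)
    while start != -1:
        hits.append(start)
        start = dna.find(site, start + 1)
    return hits


def cut_positions(dna: str, site: str, methylated: set[int]) -> list[int]:
    """Sites the restriction enzyme can cut: present and not methylated."""
    return [pos for pos in find_sites(dna, site) if pos not in methylated]


if __name__ == "__main__":
    site = "GAATTC"                        # example recognition sequence
    host_dna = "TTGAATTCAAGGGAATTCTT"      # invented host sequence
    viral_dna = "ACGAATTCGT"               # invented viral sequence

    # The cell's own methylating enzyme labels every site in the host DNA.
    host_methylated = set(find_sites(host_dna, site))

    print(cut_positions(host_dna, site, host_methylated))  # host is protected
    print(cut_positions(viral_dna, site, set()))           # virus is not
```

Both sequences contain the same recognition site; the only thing that distinguishes them to the enzyme in this model is the methyl label.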

[Figure: structure of a methyltransferase bound to DNA (hhai.gif)]

The structure of a methyltransferase bound to DNA (above) was first elucidated by X-ray crystallography in 1994, and was quite revolutionary. The normal double helical structure of the DNA was disrupted, and the base to be modified was flipped out of the double helix and into the active site of the enzyme. An amino acid side-chain from the enzyme takes the place of the missing base! Recent research suggests that similar mechanisms of base flipping are quite widespread, and are utilised by a variety of DNA repair enzymes that are crucial for maintaining the integrity of our genes!

Geoff Kneale is Professor of Biomolecular Science at the University of Portsmouth and Director of the Institute of Biomedical and Biomolecular Sciences.