Genetics Unzipped is the podcast from the Genetics Society - one of the oldest learned societies dedicated to supporting and promoting the research, teaching and application of genetics. Find out more and apply to join at genetics.org.uk

The dark heart of the genome

Let me take you on a journey into the heart of darkness. A mysterious region of the genome where nothing makes sense, and everything looks the same, endlessly repeating over and over again. A forbidding genetic terrain where, until relatively recently, scientific explorers were unable to venture.

This is the centromere – a strange arrangement of repeated DNA sequences in the middle of every chromosome. If you’ve ever seen pictures of X-shaped chromosomes when a cell is ready to divide, the centromere is the bit where the two arms meet, and it is the anchor point for the biological machinery that pulls them apart during cell division.

Centromeres were first described in 1882 by the German microscopist Walther Flemming – a man we met back in episode 12 – who noticed that chromosomes appeared to have little waist-like constrictions roughly halfway down. And by the 1900s, scientists had figured out that centromeres were an absolutely vital part of a chromosome. Without them, chromosomes get mislaid during division so cells end up with the wrong number, leading to cancer or other problems. And human chromosomes that accidentally acquire two or more centromeres are also in trouble, as they are likely to be pulled apart if they’re being dragged in opposite directions.

For something so essential, there’s a lot of diversity in the world of centromeres, from short stretches of DNA just 125 ‘letters’ long in yeast to hundreds of thousands of letters in most plants and animals. But they do have one thing in common – they’re incredibly repetitive.

Centromeric DNA is usually made up from short sequences of DNA, repeated over and over again. In the case of human centromeres, much of each chromosome’s centromere consists of so-called alpha satellite repeats – a 171-letter sequence that is repeated many thousands of times. Other organisms have their own variations on the repeat theme: thale cress (Arabidopsis) centromeric repeats are 178 letters long, while fruit flies’ are just five.

Many organisms have just one centromere per chromosome, but tiny nematode worms have gone to extremes, with little regions spread across the full length of their chromosomes, effectively turning each chromosome into one long centromere.

But, curiously enough, it’s not the underlying DNA sequence that makes a centromere a centromere. Instead, it’s the presence of certain proteins, along with chemical marks known as epigenetic modifications on the ball-shaped histone proteins that package DNA.

In experiments on human cells grown in the lab, sticking centromere DNA into another region of a chromosome doesn’t automatically give it a second centromere. But using clever molecular manipulation to stick centromere proteins onto a random stretch of DNA does.

This idea that centromeres are determined by proteins rather than DNA was backed up by a recent study of orangutans living on the islands of Borneo and Sumatra. Within these populations, there are two versions of chromosome 12, with the centromere in different places.

One centromere looks a lot like our own human versions, packed with lots of repetitive DNA, and must be very old in evolutionary terms. The other is just a regular bit of DNA that happened to have attracted the right kind of centromere proteins relatively recently in evolutionary history. But it still works absolutely fine, and it has been perpetuated down the generations.

That’s not to say the DNA within centromeres isn’t interesting at all, but it has been very hard to study. There are very few genes in there and sequencing across the highly repetitive DNA in most centromeres is extremely difficult, so researchers have tended to skip over them in favour of nice, normal genes elsewhere. But improvements in sequencing technology mean that these previously dark hearts of our genomic map are now opening up for exploration.

One intriguing finding comes from the fact that there is very little mixing at centromeres when eggs and sperm are made, so the DNA within them tends to be passed down relatively unscathed from generation to generation. Now that researchers can finally map out the confusing genetic landscape within centromeres, they can start comparing sequences between organisms or populations to look at evolution and ancestry.

A recent study from a team of US researchers looking at DNA from the 1000 Genomes Project, which captures data from a wide range of human populations, discovered that there are several specific centromere lineages that stretch back half a million years. For example, some people with non-African recent ancestry have Neanderthal centromeres in their chromosome 11, while the centromere of chromosome 12 is home to an even more archaic stretch of DNA that must have come from an as-yet-unknown ancient human ancestor.

Despite all these new tools and techniques, we’re only just starting to scratch the surface of the mysterious terrain within centromeres, so who knows what other hidden genetic secrets might be lurking within the dark heart of the genome?
