LONG TERMINAL REPEATS


As we have seen, retroviruses (which have an RNA genome) copy their RNA to DNA using a virus-specific polymerase called reverse transcriptase. This creates a problem: when new virus is made, the DNA genome must be copied back to RNA using a host enzyme, RNA polymerase II. The host cell uses this enzyme to make messenger RNA, and when mRNA is made, part of the DNA gene is not copied to RNA (since that part of the genetic information is not required in coding for a protein). The non-copied parts include control elements of the gene such as promoters, enhancers, stop signals etc. Thus RNA polymerase II is ill-suited to the exact copying of a genome, but the retrovirus has no other choice and must overcome this limitation. It does this in the following way.

THE RNA GENOME IN THE VIRUS 

The RNA form of the genome of a retrovirus consists of a unique sequence that contains the three structural (protein coding) genes, gag, pol and env (black in the diagram below). These genes code for the internal antigens (gag), the envelope glycoproteins (env) and the enzymes needed for viral replication and maturation (pol). Bordering this region are two unique sequences, U5 (blue) at the 5 prime end and U3 (purple) at the 3 prime end of the RNA. These do not encode a protein. They encode controlling elements that are used later in the copying of the DNA back to RNA. At the two ends of the RNA form of the genome are two sequences that are very similar. These are the repeat (R) sequences and are shown in red.

In fact, things are a little more complex than this because of two additional regions, one at the U5/gag interface and one at the env/U3 interface. These are the primer binding site (PBS, below in green) and the polypurine tract (brown), respectively.

THE DNA PROVIRUS FORM FOUND IN THE HOST CELL

The DNA form of the retrovirus genome is larger than the RNA form and has an extra, duplicated sequence at each end.

Sequencing has shown that the extra sequences at each end are duplications of internal sequences in the RNA form of the virus. The U3 region (purple), an internal region near the 3 prime end of the RNA form, has been duplicated and the copy lies at the opposite end of the DNA strand. The U5 region (blue), near the 5 prime end of the RNA form, has also been duplicated and its copy lies at the other end of the DNA strand. The DNA is, of course, double stranded.

So, by some means, two internal regions of the RNA form of the genome are copied during reverse transcription and come to lie elsewhere in the DNA genome. Notice that at each end we now have U3-R-U5, that is, the two ends are now identical. These are the LONG TERMINAL REPEATS or LTRs.
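The relationship between the two forms can be written down as a short Python sketch. It treats each genome as an ordered list of region labels taken from the text above; the names `RNA_GENOME` and `provirus_layout` are illustrative only, and the model ignores actual sequence, strand polarity and region lengths.

```python
# Region order of the RNA genome, 5 prime to 3 prime, as described in the text.
RNA_GENOME = ["R", "U5", "PBS", "gag", "pol", "env", "PPT", "U3", "R"]


def provirus_layout(rna):
    """Return the provirus region order implied by an RNA-form layout.

    Assumes the RNA starts with R, U5 and ends with U3, R, as in the text.
    """
    ltr = ["U3", "R", "U5"]  # one complete long terminal repeat
    # Drop the partial ends (R-U5 on the left, U3-R on the right) and
    # put a complete LTR on each side of the internal regions.
    internal = rna[2:-2]
    return ltr + internal + ltr


PROVIRUS = provirus_layout(RNA_GENOME)
print("-".join(RNA_GENOME))  # R-U5-PBS-gag-pol-env-PPT-U3-R
print("-".join(PROVIRUS))    # U3-R-U5-PBS-gag-pol-env-PPT-U3-R-U5
```

In this label-level picture the DNA form is two regions longer than the RNA form (an extra U3 on the left and an extra U5 on the right), and its first three regions match its last three.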

How does this happen?

1. As you know, all DNA polymerases require a primer and reverse transcriptase is no exception. The primer is a transfer RNA (tRNA) of the host cell that is packaged into the virus particle. It does not bind to the end of the nucleic acid (in this case RNA) to be copied but, instead, it binds at the primer binding site (green). Reverse transcriptase copies the RNA into DNA from the primer as far as the 5 prime end of the RNA, that is, through U5 and R.

2. Now we have a double-stranded DNA/RNA hybrid at one end. An enzyme called RNase H degrades RNA that is base-paired to DNA, and it now removes the RNA strand of this hybrid.

3. FIRST JUMP: The new piece of DNA (which is no longer hybridized to a long strand of RNA), together with the primer, now jumps to the other end of the RNA, where the repeat (R) sequences hybridize.

4. Reverse transcriptase now copies the remainder of the RNA to the far end.

5. Again we have a long stretch of double-stranded RNA/DNA hybrid, and the RNA in this hybrid is digested by RNase H except for the polypurine tract (brown), which remains to act as a new RNA primer for reverse transcriptase. The second strand is now extended left to right from the polypurine RNA primer.

6. The RNA primers are now removed by RNase H.

7. Next, the above structure forms a circle, since the primer binding sites (green) can hybridize.

8. Using the PBS region (green) as a primer, the reverse transcriptase now copies a second strand of DNA.

9. Now the reverse transcriptase displaces one of the DNA strands and copies further. At this stage one of the long terminal repeats is complete.

10. SECOND JUMP: The reverse transcriptase jumps to the other strand and completes it, so that now both long terminal repeats are complete. This is best seen in the movie.
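The steps above can be followed at the level of region labels with a small Python sketch. This is a deliberate simplification, not a model of the enzyme: every strand is written in the same left-to-right orientation as the RNA, the circular intermediate of steps 7 to 9 is collapsed into a single pairing of PBS labels, and it assumes the short second-strand fragment ends in a PBS copy (which the second jump needs in order to pair). The function name `simulate_reverse_transcription` is hypothetical.

```python
RNA = ["R", "U5", "PBS", "gag", "pol", "env", "PPT", "U3", "R"]


def simulate_reverse_transcription(rna):
    """Trace region labels through the two jumps; returns the provirus order."""
    pbs = rna.index("PBS")

    # Step 1: the tRNA primer sits at the PBS; DNA is made from there
    # to the 5 prime end of the RNA, covering R and U5.
    first_piece = rna[:pbs]  # ["R", "U5"]

    # Steps 2-3 (first jump): with its RNA template removed, the R copy
    # in the new DNA can pair with the R at the 3 prime end of the RNA.
    assert first_piece[0] == rna[-1] == "R"

    # Step 4: the rest of the RNA is copied, back to the PBS.
    # The jumped piece supplies R-U5 at the right-hand end.
    first_strand = rna[pbs:-1] + first_piece

    # Step 5: second-strand synthesis starts from the polypurine tract
    # and runs rightward over U3-R-U5; a PBS copy at its end is assumed.
    ppt = first_strand.index("PPT")
    second_piece = first_strand[ppt + 1:] + ["PBS"]

    # Steps 7-10 (second jump): the PBS copies pair, and each strand is
    # extended on the other, placing U3-R-U5 at the left-hand end too.
    assert second_piece[-1] == first_strand[0] == "PBS"
    return second_piece[:-1] + first_strand


provirus = simulate_reverse_transcription(RNA)
print("-".join(provirus))  # U3-R-U5-PBS-gag-pol-env-PPT-U3-R-U5
```

The two assertions inside the function mark the two jumps: each one is possible only because the same label (R for the first jump, PBS for the second) is present at both places that must hybridize.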
