Intein motifs
Motifs are shown as sequence logos.
[Figure: sequence logos of motifs N1, N2, N3 and N4, in that order. Spacers between consecutive motifs: 1-33 aa, 16-75 aa and 3-37 aa.]
The first motif is found at the N-terminus of inteins (motif A in
Pietrokovski '94 and Perler et al. '97).
Note the conserved Ser and Cys at the intein's N-terminal splice site.
The hydroxyl (OH) or thiol (SH) side group of the residue in this position is
necessary for the N-O or N-S acyl shift of the peptide bond to the N-extein.
The His at position 10 of the third motif is the most conserved intein residue.
It was first predicted (Pietrokovski '94) and then shown (Kawasaki et al. '97)
to be involved in the protein splicing reaction. Intein structures
(Duan et al. '97, Klabunde et al. '98) also showed this residue to be
positioned at the protein splicing active site.
[Figure: sequence logos of motifs C2 and C1, in that order. Spacer between the two motifs: -1-9 aa.]
The two C domain motifs (F and G in Pietrokovski '94 and Perler et al. '97)
are involved in the final steps of protein splicing. A Ser, Thr or Cys is
found at the N-terminus of the C-extein (the last position of the second motif).
The hydroxyl or thiol group of this residue attacks the last residue of the
N-extein in a transesterification reaction. The resulting branched
intermediate is resolved by cyclization of the Asn preceding the attacking
residue. This Asn, the intein's C-terminal residue, is the second most
conserved residue in inteins.
Gln residues are now also known to occur in this position. They are found
in the CIV RIR1 and Pho polC inteins. These inteins are integrated in
conserved regions of vital proteins (subunits of ribonucleotide
reductase and replicative DNA polymerases), so it is likely that they
are active and capable of protein splicing. Their splicing reaction is
suggested to be a variation of the Asn cyclization in which the Gln
cyclizes to form a glutarimide ring.
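The residue requirements described above can be sketched as a simple check on a precursor sequence. This is only an illustration: the function name, the coordinate convention and the example sequence are hypothetical, and the allowed residue sets are taken from the text above (Ser or Cys at the intein N-terminus, Asn or Gln at the intein C-terminus, Ser, Thr or Cys at the start of the C-extein).

```python
# Residue sets taken from the description above (one-letter codes).
INTEIN_N_TERMINUS = set("SC")   # Ser or Cys
INTEIN_C_TERMINUS = set("NQ")   # Asn, or Gln in a few inteins
C_EXTEIN_FIRST = set("STC")     # Ser, Thr or Cys


def check_splice_junctions(precursor: str, intein_start: int, intein_end: int) -> dict:
    """Report whether the splice-junction residues match the expected sets.

    Hypothetical convention: coordinates are 0-based and intein_end is
    exclusive, so precursor[intein_start:intein_end] is the intein.
    """
    if not (0 <= intein_start < intein_end < len(precursor)):
        raise ValueError("intein must lie inside the precursor and be followed by a C-extein")
    intein = precursor[intein_start:intein_end]
    return {
        "intein_n_terminus": intein[0] in INTEIN_N_TERMINUS,
        "intein_c_terminus": intein[-1] in INTEIN_C_TERMINUS,
        "c_extein_first": precursor[intein_end] in C_EXTEIN_FIRST,
    }


# Made-up toy precursor: N-extein "MKG", intein "CLAAAAHN", C-extein "SGG".
report = check_splice_junctions("MKGCLAAAAHNSGG", 3, 11)
```

A check like this only tests the junction residues. It says nothing about whether the motifs themselves are present or whether the intein is active.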
DOD domain
[Figure: sequence logos of motifs EN1, EN2, EN3 and EN4, in that order. Spacers between consecutive motifs: 53-106 aa, 4-18 aa and 0-23 aa.]
The first and third EN motifs (C and E in Pietrokovski '94 and
Perler et al. '97) are the DOD motifs found in the DOD homing endonucleases
(Mueller et al. '94).
The two motifs are similar to each other and probably have similar roles.
The structures of the yeast Sce VMA intein (Duan et al. '97)
and of a DOD-type endonuclease (Heath et al. '97)
showed the motifs to be alpha helices that hold together the two halves
of the protein and also form the endonuclease active site. The Sce VMA
structure also showed the conserved basic residue in the second position of
the second EN motif (motif D in Pietrokovski '94 and Perler et al. '97)
to be another part of the active site.
Mutating the DOD motifs in the Tli pol2 (Hodges et al. '92)
and Sce VMA (Gimble and Stephens '95)
inteins abolished their endonuclease activity.
However, the protein splicing activity of Tli pol2 was not affected by
the mutation. Genetically engineered Sce VMA
(Chong and Xu '97) and Mtu recA (Derbyshire et al. '97) inteins lacking the
EN domain were both shown to protein splice.
The EN domain is also naturally missing from
various inteins. Together, these results show
that the endonuclease domain is optional and not required for intein splicing.
[Figure: sequence logo of the HNH motif, shown with a 2-10 aa spacer.]
The HNH motif is found in bacterial and organellar endonucleases that occur as
independent genes and inside group I and group II introns
(Shub et al. '94, Gorbalenya '94).
The endonuclease domain is optional
in inteins: mutations in it affect the intein's endonuclease activity but
not its protein splicing activity, some inteins are missing this domain,
and inteins were shown to protein splice without this domain
(Chong and Xu '97, Derbyshire et al. '97).

The role of the EN domain
is to enable inteins to transfer horizontally to unoccupied intein
integration sites by a process termed homing. This process was first
studied in group I introns that code for proteins called homing endonucleases.
It proceeds in the same way in both introns and inteins and is described
here for inteins. The cleavage site of the intein's endonuclease domain is
made up of the two flanks of the integration site in its host gene. This
was experimentally verified for many of the group I intron-encoded
endonucleases and for some inteins. The target sites are very long relative to
those of restriction endonucleases, spanning 12-40 bp. This usually ensures
that only one such site is present in the genome: the one in an unoccupied
intein host gene.
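The effect of target-site length on uniqueness can be illustrated with a back-of-the-envelope estimate. The sketch below assumes a random genome with equal base frequencies and counts one strand only. These are simplifying assumptions and not part of the original description, and the 12 Mbp genome size is a hypothetical example value.

```python
def expected_sites(genome_length: int, site_length: int) -> float:
    """Expected number of occurrences of one specific site of the given length.

    Assumes independent, equally frequent bases (probability 1/4 each) and
    ignores the second strand and edge effects.
    """
    return genome_length * 0.25 ** site_length


GENOME_BP = 12_000_000  # hypothetical genome size, for illustration only

# A 6 bp site (typical restriction endonuclease) versus 12 bp and 40 bp sites.
for length in (6, 12, 40):
    print(length, expected_sites(GENOME_BP, length))
```

Under these assumptions a 6 bp site is expected thousands of times in such a genome, a 12 bp site less than once, and a 40 bp site is expected essentially never by chance.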
For homing to occur, the DNA of an intein-containing gene must be
present in a cell together with an intein-less allele of this gene.
This can happen in sexual mating in eukaryotes or when a bacterium or archaeon
ingests or exchanges DNA. The intein is either transferred with the DNA or
transcribed and translated from it. It then proceeds to cleave the
intein-less allele of its host gene. If the resulting double-strand break is
repaired by ligation of its ends, the restored site will be cleaved again.
However, the break can also be repaired
using the intein+ allele as a template. In this case the repaired gene will
include the intein region in exactly the same position as in the intein+ allele.
This gene conversion process is called homing because the transferred element
(intein or intron) can only move to unoccupied sites homologous to its
integration point. The cleavage site can also be lost through mutations or
through errors in repair of the double-strand break. In such cases the site
will be immune to cleavage by that
intein. Indeed, inteins and group I introns are found integrated in highly
conserved sites where changes are unlikely and usually deleterious.
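The outcome of homing can be sketched as a toy string model. It is a deliberately simplified illustration, not a model of the repair machinery: alleles are plain strings, the target site is the exact concatenation of the two flanks, and the function name and example sequences are hypothetical.

```python
def home(recipient: str, left_flank: str, right_flank: str, intein_dna: str) -> str:
    """Return the recipient allele after one round of homing.

    The endonuclease recognizes left_flank + right_flank in an unoccupied
    allele. Template-directed repair then copies the intein between the flanks.
    """
    target = left_flank + right_flank
    cut = recipient.find(target)
    if cut == -1:
        # No intact target site: already occupied, or lost through mutation.
        return recipient
    insert_at = cut + len(left_flank)
    return recipient[:insert_at] + intein_dna + recipient[insert_at:]


# Toy example: the intein is written in lower case to make it visible.
empty_allele = "AAACCGGTTT"
converted = home(empty_allele, "CCG", "GTT", "tgcaac")

# The converted allele no longer contains the intact target site, so a second
# round leaves it unchanged. A mutated site is likewise immune.
unchanged = home(converted, "CCG", "GTT", "tgcaac")
immune = home("AAACCAGTTT", "CCG", "GTT", "tgcaac")
```

The model captures two points from the text: insertion happens at exactly the same position as in the donor allele, and an allele whose site is occupied or mutated is no longer a substrate.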
The positions of the motifs relative to each other are conserved across
different inteins. This can be seen in the inteins motif map.
Intein structures show that the motifs have important
functional and structural roles, forming the protein splicing and
endonuclease active sites.
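The conserved relative positions can be expressed as spacer constraints between consecutive motifs. In the sketch below the spacer ranges are the ones given with the logo figures on this page, assigned to motif pairs in the order the motifs are listed. That assignment is an assumption, and the function name, the coordinate convention and the example motif coordinates are hypothetical.

```python
# Allowed spacer lengths (aa) between consecutive motifs, as (min, max).
# A value of -1 means the two motifs may overlap by one residue.
SPACERS = {
    ("N1", "N2"): (1, 33),
    ("N2", "N3"): (16, 75),
    ("N3", "N4"): (3, 37),
    ("EN1", "EN2"): (53, 106),
    ("EN2", "EN3"): (4, 18),
    ("EN3", "EN4"): (0, 23),
    ("C2", "C1"): (-1, 9),
}


def spacer_violations(motifs: dict) -> list:
    """List motif pairs whose spacing falls outside the allowed range.

    motifs maps a motif name to (start, end), 0-based with end exclusive.
    Pairs with a missing motif are skipped, since the EN domain is optional.
    """
    problems = []
    for (first, second), (low, high) in SPACERS.items():
        if first in motifs and second in motifs:
            gap = motifs[second][0] - motifs[first][1]
            if not low <= gap <= high:
                problems.append((first, second, gap))
    return problems


# Hypothetical motif coordinates for an intein without an EN domain.
hits = {"N1": (0, 13), "N2": (20, 27), "N3": (60, 74), "N4": (80, 96)}
violations = spacer_violations(hits)
```

Skipping pairs with a missing motif keeps the check usable for inteins that lack the endonuclease domain.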
Motif designations
| Pietrokovski '97 | Perler '97 & Pietrokovski '94 | Other names |
| N1 | A | Inteins N-terminal splicing point |
| N2 | - | - |
| N3 | B | - |
| N4 | - | - |
| EN1 | C | DOD, dodecapeptide, LAGLIDADG, P1 |
| EN2 | D | - |
| EN3 | E | DOD, dodecapeptide, LAGLIDADG, P2 |
| EN4 | H | - |
| HNH | - | I-TevIII family motif |
| C2 | F | - |
| C1 | G | Inteins C-terminal splicing point |
Page last modified July 1998
Shmuel Pietrokovski <pietro@weizmann.ac.il>