Difference between revisions of "General Information/Target Choice"
Revision as of 16:57, 22 June 2020
The influence of different IS on genome architecture will depend not only on their levels of activity but also on the type of target into which they insert. It was initially believed that TE show no or only low sequence specificity in their target choice. For example, IS630 and the eukaryotic Tc/mariner families[1] both require a TA dinucleotide in the target[2][3], while others such as the IS200/IS605 and IS91 families require short tetra- or pentanucleotide sequences[4][5]. Yet others, such as IS1 and IS186 (of the IS4 family), show some regional specificity (for AT- and GC-rich sequences respectively)[6][7][8].
Although, from a global genome perspective, insertion may appear to occur without significant sequence specificity, the accumulation of more statistically robust data has uncovered subtler insertion patterns, revealing that several TE use rather sophisticated mechanisms in choosing a target. For example, data in the public databases suggest that IS density is generally significantly higher in conjugative bacterial plasmids than in their host chromosomes, with the exception of special cases in which the host has undergone IS expansion. Such plasmids are major vectors of lateral gene transfer and are important in IS transmission (as well as in transmission of accessory traits such as resistance to antibacterials). Some TE, including IS, appear to be attracted to replication forks[9][10][11] and show a strong orientation bias indicating strand preference at the fork[9][12][11][13][14]. Moreover, in certain cases, insertion may target stalled replication forks[15]. A link between replication (in this case, replication origins) and insertion has also now been observed for a eukaryotic TE: the P element of Drosophila[16].
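The two kinds of specificity mentioned above (a required dinucleotide and a regional base-composition preference) can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function names `ta_sites` and `at_fraction` and the window size are hypothetical choices made only for illustration, and the sketch simply lists candidate TA dinucleotides and AT content along a sequence without modelling any transposase.

```python
def ta_sites(seq):
    """Return the 0-based start positions of every TA dinucleotide in seq."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "TA"]


def at_fraction(seq, window):
    """Return the AT fraction of each full-length sliding window along seq."""
    s = seq.upper()
    return [
        sum(base in "AT" for base in s[i:i + window]) / window
        for i in range(len(s) - window + 1)
    ]


# Hypothetical toy sequences, used only to show the calls.
print(ta_sites("GCTAGGTATA"))
print(at_fraction("AATTGGCC", 4))
```

A TA dinucleotide is only a necessary condition for IS630-like insertion, so the positions returned are candidate sites, not predicted insertions.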
For example, transposon Tn7 has two modes of transposition[17][18][19]: in one, which uses the Tn7-encoded target protein TnsD, a specific sequence within the highly conserved glmS is recognized and insertion occurs next to this essential gene[11][17][20][21]; in the second, which uses a more general targeting protein, TnsE, insertion occurs into replication forks directed by interactions with the β-clamp[22][23]. This latter pathway results in a strong orientation bias of Tn7 insertions, consistent with insertion into the lagging strand of the replication fork formed during conjugative transfer. Although studies with IS are less advanced, a similar orientation bias was observed with IS903[13][14], suggesting that it too may use the β-clamp in directing insertions. It seems probable that many other IS use this type of protein-protein interaction.
The second example of a specialized target choice was observed in members of the IS200/IS605 family (see "IS200/IS605-family"). These transpose using a strand-specific single-strand intermediate and insertion occurs 3’ to a tetra- or pentanucleotide on the lagging strand[24][15]. Clear vestiges of this specificity can still be detected in a large number of bacterial genomes where the orientation of insertion is strongly correlated with the direction of replication. There are clearly instances of insertion in the “wrong” orientation but many of these may be explained by post-insertion genome rearrangements involving inversions. This would place the “active” strand of the IS on the lagging rather than on the leading strand. Interestingly, those IS which are not oriented in the “correct” orientation with respect to replication are almost certainly inactive and unable to transpose further[24].
Other examples of sequence-specific target choice have been described. IS1, for example, shows a preference for regions rich in AT whereas the transposon TnGBS (an ICE from Streptococcus agalactiae) and members of the closely related ISLre2 family show a preference for insertion 15-17 bp upstream of σA promoters[25][26]. Targeting of upstream regions of transcription units has also been extensively documented for certain eukaryotic transposons (e.g. [27][28]).
Potential topological characteristics or secondary structures are another feature which can attract certain TE. Changes in topology induced by the nucleoid protein, H-NS, for example, may explain the effects of H-NS mutants on the target choice of IS903 and Tn10 (IS10)[29][30]. Members of the IS110, IS3 and IS4 families are examples of IS which insert into potential secondary structures such as Repeated Extragenic Palindromes (REP)[15][31][32][33], integrons[34][35] or even the ends of other TE[36][37].
Some transposons such as Tn7 in Escherichia coli[38] and Tn917 in Bacillus subtilis[39], Enterococcus faecalis[40] and Streptococcus equi (but not in Listeria monocytogenes or Streptococcus suis)[41] also show a preference for integration into the replication terminus region, and sites of DNA breakage may also attract insertions[38]. Interestingly, an analysis of incorporation of “self-DNA” by a CRISPR system in E. coli showed a preference for trapping DNA from the terminus region of the chromosome[42], mimicking the target preference of Tn7. It remains to be seen whether any IS has adopted these types of target preference.
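The correlation between insertion orientation and the direction of replication can be quantified with a simple count. The following Python sketch is an illustration under stated assumptions, not the analysis used in the cited work: it assumes a circular chromosome with a single origin and terminus at known coordinates, treats each insertion as a hypothetical (position, strand) pair with strand +1 or -1, and uses the invented helper names `replichore` and `orientation_bias`. It reports only the fraction of elements annotated in the same direction as fork movement; it does not decide which strand is the “active” one.

```python
def replichore(pos, ori, ter, length):
    """Return +1 if pos lies on the arc running from ori to ter in the
    direction of increasing coordinates (fork moves 'forward'), else -1."""
    d_pos = (pos - ori) % length
    d_ter = (ter - ori) % length
    return 1 if d_pos < d_ter else -1


def orientation_bias(insertions, ori, ter, length):
    """Fraction of insertions whose annotated strand (+1 or -1) matches the
    direction of replication fork movement at their position."""
    items = list(insertions)
    if not items:
        raise ValueError("no insertions given")
    co_oriented = sum(
        1 for pos, strand in items
        if strand == replichore(pos, ori, ter, length)
    )
    return co_oriented / len(items)


# Hypothetical toy data: 1000 bp circle, origin at 0, terminus at 500.
toy = [(100, 1), (300, -1), (700, -1), (900, -1)]
print(orientation_bias(toy, ori=0, ter=500, length=1000))
```

A value close to 0.5 would indicate no orientation preference, whereas values near 0 or 1 would be consistent with the strand bias described above. Post-insertion inversions would be expected to pull the value back towards 0.5.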
In addition, IS21, IS30 and IS911 have all been observed to insert close to sequences which resemble their own IR[43][44][45][46]. Although these IS are members of different families, they have in common the formation of a dsDNA excised circular transposon intermediate with abutted left and right ends[47]. Insertion next to a resident “target” IR such that IR of the IS are abutted “head-to-head” presumably reflects the capacity of the Tpase to form a synaptic complex between one IR present in the transposon circle and the target IR. This type of structure is extremely active in transposition and will continue to generate genome rearrangements.
It has also been observed that certain transposons, in particular members of the Tn7 family, carry CRISPR-Cas proteins[48][49][50]. The function of CRISPR-Cas systems is generally to provide adaptive immunity against invading bacterial and archaeal viruses and other mobile genetic elements. When incorporated as part of a Tn7 family member, the CRISPR-Cas systems have been subverted and do not retain their defense functions against incoming mobile genetic elements. Instead, they have been “repurposed”[51] to use guide RNAs to target Tn insertion.
These examples represent only a small part of the literature concerning factors influencing target choice but serve to illustrate the impact this can have on genomes.
Bibliography
- [1] [PubMed:26104691]
- [2] [PubMed:17680987]
- [3] [PubMed:8556864]
- [4] [PubMed:2552258]
- [5] [PubMed:19524540]
- [6] [PubMed:6260963]
- [7] [PubMed:6248730]
- [8] [PubMed:3032747]
- [9] Peters JE, Craig NL. Tn7 recognizes transposition target structures associated with DNA replication using the DNA-binding protein TnsE. - Genes Dev: 2001 Mar 15, 15(6);737-47 [PubMed:11274058]
- [10] [PubMed:20691900]
- [11] [PubMed:11715047]
- [12] [PubMed:20691900]
- [13] [PubMed:9620951]
- [14] Hu WY, Thompson W, Lawrence CE, Derbyshire KM. Anatomy of a preferred target site for the bacterial insertion sequence IS903. - J Mol Biol: 2001 Feb 23, 306(3);403-16 [PubMed:11178901]
- [15] He S, Corneloup A, Guynet C, Lavatine L, Caumont-Sarcos A, Siguier P, Marty B, Dyda F, Chandler M, Ton Hoang B. The IS200/IS605 Family and "Peel and Paste" Single-strand Transposition Mechanism. - Microbiol Spectr: 2015 Aug, 3(4); [PubMed:26350330]
- [16] [PubMed:21896744]
- [17] [PubMed:26104363]
- [18] [PubMed:1664019]
- [19] [PubMed:2834269]
- [20] [PubMed:2826397]
- [21] [PubMed:2542960]
- [22] [PubMed:19703395]
- [23] [PubMed:8804309]
- [24] Ton-Hoang B, Pasternak C, Siguier P, Guynet C, Hickman AB, Dyda F, Sommer S, Chandler M. Single-stranded DNA transposition is coupled to host replication. - Cell: 2010 Aug 6, 142(3);398-408 [PubMed:20691900]
- [25] [PubMed:19183283]
- [26] [PubMed:23435978]
- [27] [PubMed:22493285]
- [28] [PubMed:22287102]
- [29] [PubMed:15130124]
- [30] [PubMed:26104553]
- [31] [PubMed:12888493]
- [32] [PubMed:10559158]
- [33] [PubMed:16563168]
- [34] [PubMed:18487340]
- [35] [PubMed:19025573]
- [36] [PubMed:1648561]
- [37] [PubMed:14563872]
- [38] Peters JE, Craig NL. Tn7 transposes proximal to DNA double-strand breaks and into regions where chromosomal DNA replication terminates. - Mol Cell: 2000 Sep, 6(3);573-82 [PubMed:11030337]
- [39] [PubMed:19820088]
- [40] [PubMed:15489440]
- [41] [PubMed:12695044]
- [42] [PubMed:25874675]
- [43] [PubMed:3034717]
- [44] [PubMed:2163395]
- [45] [PubMed:9393723]
- [46] [PubMed:14756780]
- [47] [PubMed:26350305]
- [48] [PubMed:28811374]
- [49] [PubMed:30717668]
- [50] [PubMed:31165781]
- [51] [PubMed:31502713]