Structure of the space of taboo-free sequences.

  • Author(s): Manuel C; von Haeseler A
  • Source:
    Journal of mathematical biology [J Math Biol] 2020 Nov; Vol. 81 (4-5), pp. 1029-1057. Date of Electronic Publication: 2020 Sep 17.
  • Publication Type:
    Journal Article; Research Support, Non-U.S. Gov't
  • Language:
    English
  • Additional Information
    • Source:
      Publisher: Springer Verlag; Country of Publication: Germany; NLM ID: 7502105; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1432-1416 (Electronic); Linking ISSN: 0303-6812; NLM ISO Abbreviation: J Math Biol; Subsets: MEDLINE
    • Publication Information:
      Publication: Berlin : Springer Verlag
      Original Publication: Wien, New York, Springer-Verlag.
    • Abstract:
      Models of sequence evolution typically assume that all sequences are possible. However, restriction enzymes that cut DNA at specific recognition sites provide an example where carrying a recognition site can be lethal. Motivated by this observation, we studied the set of strings over a finite alphabet with taboos, that is, with prohibited substrings. The taboo-set is referred to as [Formula: see text] and any allowed string as a taboo-free string. We consider the so-called Hamming graph [Formula: see text], whose vertices are taboo-free strings of length n and whose edges connect two taboo-free strings if their Hamming distance equals one. Any (random) walk on this graph describes the evolution of a DNA sequence that avoids taboos. We describe the construction of the vertex set of [Formula: see text]. Then we state conditions under which [Formula: see text] and its suffix subgraphs are connected. Moreover, we provide an algorithm that determines if all these graphs are connected for an arbitrary [Formula: see text]. As an application of the algorithm, we show that about [Formula: see text] of bacteria listed in REBASE have a taboo-set that induces connected taboo-free Hamming graphs, because they have less than four type II restriction enzymes. On the other hand, four properly chosen taboos are enough to disconnect one suffix subgraph, and consequently connectivity of taboo-free Hamming graphs could change depending on the composition of restriction sites.
    • Contributed Indexing:
      Keywords: Bacteriophage DNA evolution; Connectivity of Hamming graphs; Endonuclease-dependent evolution; Hamming graph with taboos; Restriction-enzyme dependent evolution; Restriction–modification system
    • Substance (CAS Registry Number):
      9007-49-2 (DNA)
    • Publication Date:
      Date Created: 20200917 Date Completed: 20210729 Latest Revision: 20210729
    • Publication Date:
      20221213
    • PubMed Central ID:
      PMC7560954
    • DOI:
      10.1007/s00285-020-01535-5
    • PMID:
      32940748
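
The abstract defines the taboo-free Hamming graph: its vertices are the length-n strings that contain no taboo as a substring, and two vertices are joined when they differ in exactly one position. The following is a minimal brute-force sketch of that definition for small n, with a breadth-first search to test connectivity. It is not the connectivity algorithm from the paper, which the record does not describe in detail. The function names (`taboo_free_strings`, `is_connected`) are hypothetical, and treating an empty graph as connected is an assumption made here for convenience.

```python
from collections import deque
from itertools import product


def taboo_free_strings(alphabet, taboos, n):
    """Return the set of length-n strings over `alphabet` with no taboo substring."""
    return {
        s
        for s in ("".join(p) for p in product(alphabet, repeat=n))
        if not any(t in s for t in taboos)
    }


def neighbors(s, alphabet, vertices):
    """Yield taboo-free strings at Hamming distance exactly one from `s`."""
    for i, current in enumerate(s):
        for letter in alphabet:
            if letter != current:
                candidate = s[:i] + letter + s[i + 1:]
                if candidate in vertices:
                    yield candidate


def is_connected(alphabet, taboos, n):
    """Breadth-first search over the taboo-free Hamming graph on length-n strings."""
    vertices = taboo_free_strings(alphabet, taboos, n)
    if not vertices:
        return True  # assumption: an empty graph counts as connected
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in neighbors(s, alphabet, vertices):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return len(seen) == len(vertices)


if __name__ == "__main__":
    # A single length-6 taboo over the DNA alphabet removes one vertex only.
    print(len(taboo_free_strings("ACGT", ["GAATTC"], 6)))
    print(is_connected("ACGT", ["GAATTC"], 6))
    # Binary alphabet, taboos "ab" and "ba": only "aa" and "bb" remain.
    print(is_connected("ab", ["ab", "ba"], 2))
```

The enumeration visits all |alphabet|^n strings, so this is only practical for short strings and serves to illustrate the graph definition, not to reproduce the REBASE analysis.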