Statistics

Taxonomic distribution


The tables and pie charts show the taxonomic distribution of experimentally verified natural transformation-associated genes, giving the number of validated genes per taxon. Clicking on a specific taxon (e.g., Streptococcus pneumoniae) opens its strain list and gene information.

Kingdom

Total gene count 1072


Domain

Total gene count 1072


Bacteria

1054

Archaea

18

The tables and pie charts show the taxonomic distribution of natural transformation machinery and regulatory genes predicted in silico by BLASTp (identities * coverage ≥ 0.64), giving the number of predicted genes per taxon. Clicking on a specific taxon (e.g., Streptococcus pneumoniae) opens its strain list and gene information.
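The BLASTp cutoff above can be illustrated with a minimal sketch. It assumes that identity and coverage are both expressed as fractions between 0 and 1 and that coverage is measured on the reference (query) protein; the page does not state either detail. The `Hit` class, its field names, and the example hits are hypothetical and are not NTDB's schema or data.

```python
from dataclasses import dataclass

THRESHOLD = 0.64  # identities * coverage cutoff stated on this page


@dataclass
class Hit:
    """One BLASTp hit. Field names are hypothetical, not NTDB's schema."""
    query_id: str     # reference gene used as the BLASTp query
    subject_id: str   # protein found in the target genome
    identity: float   # fraction of identical residues, 0..1 (assumed scale)
    coverage: float   # fraction of the query covered, 0..1 (assumed definition)


def passes(hit: Hit, threshold: float = THRESHOLD) -> bool:
    """Keep a hit when identity * coverage reaches the threshold."""
    return hit.identity * hit.coverage >= threshold


# Illustrative hits only; "comEA" is used here as a placeholder query label.
hits = [
    Hit("comEA", "protein_1", 0.90, 0.95),  # product 0.855, kept
    Hit("comEA", "protein_2", 0.80, 0.80),  # product 0.64, kept (boundary case)
    Hit("comEA", "protein_3", 0.95, 0.60),  # product 0.57, rejected
]
kept = [h.subject_id for h in hits if passes(h)]
print(kept)
```

One way to read the cutoff: a hit with 80% identity over 80% coverage sits exactly on the boundary, since 0.8 × 0.8 = 0.64, and a drop in either quantity must be compensated by the other.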

Species

Total gene count 666435 (values in parentheses: gene count per strain)


Bacillus subtilis

40139 (110.9)

Bacillus licheniformis

5076 (105.8)

Streptococcus equinus

903 (100.3)

Streptococcus mitis

1427 (95.1)

Streptococcus canis

569 (94.8)

Streptococcus mutans

2840 (94.7)

Streptococcus oralis

1874 (89.2)

Streptococcus suis

12036 (80.2)

Streptococcus pyogenes

28315 (77.6)

Lactococcus lactis

6190 (66.6)

Staphylococcus aureus

99400 (63.2)

Pseudomonas aeruginosa

55184 (58.5)

Vibrio cholerae

7475 (53)

Vibrio campbellii

1245 (51.9)

Vibrio vulnificus

1526 (50.9)

Neisseria gonorrhoeae

10054 (50.3)

Neisseria subflava

401 (50.1)

Escherichia coli

176059 (45)

Kingella kingae

307 (43.9)

Eikenella corrodens

128 (42.7)

Xylella fastidiosa

2730 (40.1)

Thermus thermophilus

1241 (38.8)

Histophilus somni

1182 (36.9)

Campylobacter jejuni

13460 (29.8)

Campylobacter coli

2471 (28.7)

Helicobacter pylori

10515 (28.4)

Faucicola osloensis

438 (<0.1)
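The per-strain values in parentheses can be related to the totals with simple arithmetic. The sketch below assumes the per-strain value is the total predicted gene count divided by the number of strains of that species, rounded to one decimal; the page does not show strain counts, so the numbers it infers are estimates, and both function names are hypothetical.

```python
def per_strain_mean(total_genes: int, n_strains: int) -> float:
    """Mean predicted genes per strain, rounded to one decimal as in the table."""
    return round(total_genes / n_strains, 1)


def implied_strain_count(total_genes: int, per_strain: float) -> int:
    """Estimate the number of strains from a total and a rounded per-strain mean.

    The result is approximate because the per-strain value is already rounded.
    """
    return round(total_genes / per_strain)


# (total gene count, per-strain value) copied from the species table above.
rows = {
    "Bacillus subtilis": (40139, 110.9),
    "Streptococcus equinus": (903, 100.3),
}
for species, (total, mean) in rows.items():
    n = implied_strain_count(total, mean)
    # Recomputing the mean from the estimate checks that the assumption is consistent.
    print(species, n, per_strain_mean(total, n))
```

Under this assumption, a species with many sequenced strains (for example Escherichia coli) can have a very large total while still ranking low on the per-strain value.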

Phylum

Total gene count 666435


Kingdom

Total gene count 666435


Domain

Total gene count 666435


Bacteria

666362

Archaea

73

This section catalogs species with documented natural transformability but without characterized transformation machinery or regulatory genes, alongside corresponding literature references. Clicking on a species name provides access to in silico predicted natural transformation machinery and regulatory genes (where available).

A count of zero predicted genes in this table does not mean that a species lacks natural transformation machinery. It reflects the conservative design of the precomputed pipeline, which relies exclusively on BLASTp searches seeded with experimentally validated reference genes. When no sufficiently similar validated reference gene exists for a species, the default search retrieves no homologs. Users interested in these species are encouraged to run customized searches with the online Genome prediction, BLAST, and HMMER prediction tools on the NTDB website, where detection thresholds can be adjusted to explore more divergent homologs.

Species                                 Number of predicted genes  References (PubMed ID)
Achromobacter arsenitoxydans            0                          16345673
Aggregatibacter actinomycetemcomitans   829                        2229383
Aggregatibacter aphrophilus             194                        2229383
Agrobacterium tumefaciens               963                        11375171
Avibacterium paragallinarum             831                        37212663
Azotobacter vinelandii                  263                        20453151
Bacillus amyloliquefaciens              9914                       5541018
Bacillus licheniformis                  5076                       14127566, 4960110, 5802618, 5929742
Bacillus stearothermophilus             0                          6277855, 6584074
Bradyrhizobium japonicum                456                        5365772
Campylobacter coli                      2471                       16461682
Cardiobacterium hominis                 60                         2939687
Chlorobium limicola                     24                         17997281
Chlorobium tepidum                      0                          11375161
Ectopseudomonas mendocina               191                        6571730
Gallibacterium anatis                   310                        22582057
Haemophilus parainfluenzae              769                        475792
Histophilus somni                       1182                       25218867
Lactobacillus lactis                    0                          24509783
Leuconostoc carnosum                    280                        15184175
Methanobacterium thermoautotrophicum    0                          3422229
Methanococcus voltae                    0                          3034867
Methylobacterium organophilum           14                         401866
Moraxella osloensis                     0                          4589126
Moraxella urethralis                    0                          845247
Nostoc muscorum                         0                          1978772
Pseudomonas alcaligenes                 0                          6571730
Pseudomonas fluorescens                 1058                       11375171
Pseudomonas pseudoalcaligenes           0                          6571730
Pyrococcus furiosus                     0                          21317259, 27824140
Pyrococcus yayanosii                    0                          30504216, 32602260
Riemerella columbina                    20                         33746928
Streptococcus anginosus                 1926                       9352904
Streptococcus constellatus              732                        9352904
Streptococcus cristatus                 372                        8807795
Streptococcus intermedius               728                        9352904
Streptococcus oralis                    1874                       3241622
Streptococcus sanguinis                 665                        2864407
Streptomyces sp.                        0                          24509783
Synechococcus elongatus                 453                        35958137, 7961432
Thermoactinomyces vulgaris              106                        12732975
Thermosynechococcus elongatus           0                          14639476
Thermus aquaticus                       35                         3957870
Thermus caldophilus                     0                          3957870
Thermus flavus                          0                          3957870
Thiobacillus spp.                       0                          6571832
Thiobacillus thioparus                  0                          6571832
Vibrio vulnificus                       1526                       19502446
Xylella fastidiosa                      2730                       21666009

Distribution of DNA uptake sequence (DUS)


Eight distinct DUS dialects were identified by Frye et al. (PMID: 23637627). The distribution and abundance of these eight dialects were analyzed in all completely sequenced Neisseriaceae species by counting perfect matches identified with BLASTn-short.

Go to Download page for detailed tables.


DUS frequencies per million bp


DUS proportion within species (%)

DUS variants Sequence
AT-DUS ATGCCGTCTGAA
AG-DUS AGGCCGTCTGAA
TG-wadDUS TGCCTGTCTGAA
AG-mucDUS AGGTCGTCTGAA
AG-simDUS AGGCTGCCTGAA
AG-kingDUS AGGCAGCCTGAA
AA-king3DUS AAGCAGCCTGCA
AG-eikDUS AGGCTACCTGAA
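The two chart quantities, DUS frequency per million bp and DUS proportion within a species, can be sketched as follows. This is not the NTDB pipeline: it swaps BLASTn-short for plain exact string matching, and it assumes that matches are counted on both strands and that the frequency is normalized by genome length. The function names and the toy genome are hypothetical.

```python
# The eight DUS dialect sequences listed in the table above.
DUS_DIALECTS = {
    "AT-DUS": "ATGCCGTCTGAA",
    "AG-DUS": "AGGCCGTCTGAA",
    "TG-wadDUS": "TGCCTGTCTGAA",
    "AG-mucDUS": "AGGTCGTCTGAA",
    "AG-simDUS": "AGGCTGCCTGAA",
    "AG-kingDUS": "AGGCAGCCTGAA",
    "AA-king3DUS": "AAGCAGCCTGCA",
    "AG-eikDUS": "AGGCTACCTGAA",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_perfect_matches(genome: str, motif: str) -> int:
    """Count exact occurrences of motif on both strands (assumed convention)."""
    genome = genome.upper()
    total = 0
    # A set avoids double counting if a motif equals its own reverse complement.
    for pattern in {motif, reverse_complement(motif)}:
        start = genome.find(pattern)
        while start != -1:
            total += 1
            start = genome.find(pattern, start + 1)
    return total


def dus_profile(genome: str):
    """Return raw counts, counts per million bp, and within-genome proportions (%)."""
    counts = {name: count_perfect_matches(genome, seq)
              for name, seq in DUS_DIALECTS.items()}
    total = sum(counts.values())
    per_mbp = {name: c * 1_000_000 / len(genome) for name, c in counts.items()}
    proportion = {name: (100 * c / total if total else 0.0)
                  for name, c in counts.items()}
    return counts, per_mbp, proportion


# Toy sequence: two forward AT-DUS copies and one AG-DUS on the reverse strand.
toy_genome = ("ATGCCGTCTGAA" + "TTTT"
              + reverse_complement("AGGCCGTCTGAA") + "CCCC"
              + "ATGCCGTCTGAA")
counts, per_mbp, proportion = dus_profile(toy_genome)
print(counts["AT-DUS"], counts["AG-DUS"], round(proportion["AT-DUS"], 1))
```

The same functions apply to the USS variants if the dictionary is replaced with the Hin-USS and Apl-USS sequences.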

Distribution of Uptake Signal Sequence (USS)


Two versions of the USS have been described (PMID: 23637627, PMID: 17038178). Version A (5'-AAGTGCGGT-3'), named Hin-USS, is found in Haemophilus influenzae (PMID: 7542802, PMID: 22753031, PMID: 6285382), Actinobacillus actinomycetemcomitans (PMID: 12057937, PMID: 17074902), and Avibacterium paragallinarum (PMID: 37212663). Version B (5'-ACAAGCGGT-3'), named Apl-USS, is found in Actinobacillus pleuropneumoniae (PMID: 17038178).

The distribution and abundance of these two USS variants were analyzed in all completely sequenced Pasteurellaceae species by counting perfect matches identified with BLASTn-short.

Go to Download page for detailed tables.


USS frequencies per million bp


USS proportion within species (%)

USS variants Sequence
Hin-USS AAGTGCGGT
Apl-USS ACAAGCGGT