Help

Introduction


Natural transformation is a fundamental mechanism of horizontal gene transfer (HGT) in bacteria. It enables the direct uptake and integration of extracellular DNA, driving evolution, adaptation, and the spread of antibiotic resistance (Johnston, et al. Nat Rev Microbiol, 2014). This multi-stage process (competence regulation, DNA binding, DNA uptake, DNA processing and homologous recombination) is governed by specialized machinery and regulatory genes. Furthermore, several naturally transformable species exhibit DNA uptake specificity, preferentially internalizing fragments that contain species-specific motifs (e.g., DUS in Neisseriaceae and USS in Pasteurellaceae) (Frye, et al. PLoS Genet, 2013; Redfield, et al. BMC Evol Biol, 2006).

Mobile genetic elements (MGEs), including insertion sequences (ISs), transposons, genomic islands (GIs), and prophages, exploit transformation for dissemination (e.g., SCCmec acquisition in MRSA; Maree, et al. Nat Commun, 2022). While transformation can purge selfish MGEs (Croucher, et al. PLoS Biol, 2016), MGEs retaliate by suppressing the host transformation machinery via extracellular DNA degradation or gene disruption (Dalia, et al. Proc Natl Acad Sci U S A, 2015; Tuffet, et al. Proc Natl Acad Sci U S A, 2024), thereby ensuring their persistence.


Key features

  • Repository of natural transformation machinery and regulatory data
  • (i) Experimentally validated natural transformation-associated genes. NTDB archives 992 experimentally verified natural transformation-associated genes curated from 782 peer-reviewed publications across 50 prokaryotic species, such as Streptococcus pneumoniae, Vibrio cholerae and Methanococcus maripaludis. Genes are classified into four functional categories based on their roles in natural transformation: (a) Machinery genes (n = 562), which directly mediate DNA binding, DNA uptake, DNA processing, or homologous recombination; (b) Regulatory genes (n = 336), which encode regulators that control competence development through transcriptional control, post-translational modification, targeted proteolysis or other regulatory mechanisms (experimentally validated interactions between regulators and their target genes are also archived); (c) Auxiliary factors (n = 33), comprising restriction-modification (RM) systems or endonucleases that modulate transformation efficiency by degrading foreign DNA, and autolysins that enable DNA release through cell lysis; (d) Genes with unclear function (n = 61), which have been shown to be essential for natural transformation but whose detailed mechanisms remain uncharacterized. Each entry includes annotations such as primary sequence, secondary and tertiary structures, Pfam domains, transmembrane helices, sequence homologs and regulatory network.

    (ii) Naturally transformable species without characterized genes. NTDB also catalogs 49 species that have been shown experimentally to be naturally transformable but whose transformation machinery and regulatory genes remain uncharacterized. For these species, the natural transformation machinery and regulatory genes were predicted and annotated.

    (iii) DNA uptake motifs (DUS/USS). Experimentally validated DNA uptake motifs are cataloged, with computational prediction tools applied across relevant genomes.

    (iv) Genome-scale prediction of natural transformation machinery and regulatory genes. A total of 514,235 predicted genes, comprising 360,601 machinery genes and 153,634 regulatory genes, were identified in the chromosomes of 48,764 complete prokaryotic genomes (48,136 bacterial, 628 archaeal) from the NCBI RefSeq database using BLASTp with an Ha-value (identity × coverage) ≥ 0.48.
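    The Ha-value cutoff in item (iv) can be illustrated with a minimal Python sketch. It is not NTDB's pipeline code. It assumes that identity is the BLASTp percent identity expressed as a fraction and that coverage is the alignment length divided by the query length (the exact coverage definition is not stated here). The `Hit` class, its field names and the example hits are hypothetical.

```python
from dataclasses import dataclass

HA_THRESHOLD = 0.48  # cutoff stated in the text


@dataclass
class Hit:
    """One BLASTp hit (hypothetical container, not an NTDB data type)."""
    query_id: str
    subject_id: str
    percent_identity: float  # 0-100, as in the BLAST "pident" column
    alignment_length: int    # number of aligned positions
    query_length: int        # full length of the query protein


def ha_value(hit: Hit) -> float:
    """Return identity x coverage, both expressed as fractions.

    Assumption: coverage = alignment length / query length, capped at 1.
    """
    identity = hit.percent_identity / 100.0
    coverage = min(hit.alignment_length / hit.query_length, 1.0)
    return identity * coverage


def passes_cutoff(hit: Hit, threshold: float = HA_THRESHOLD) -> bool:
    """True if the hit would be retained under the Ha-value cutoff."""
    return ha_value(hit) >= threshold


# Made-up example hits for illustration only.
hits = [
    Hit("query_1", "ref_A", 80.0, 90, 100),  # 0.80 * 0.90 = 0.72, kept
    Hit("query_2", "ref_B", 60.0, 70, 100),  # 0.60 * 0.70 = 0.42, dropped
]
kept = [h.query_id for h in hits if passes_cutoff(h)]
print(kept)
```

    Under these assumptions, a hit with 80% identity over 90% of the query is retained, while one with 60% identity over 70% of the query is not.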

  • Associations between natural transformation machinery and regulatory genes and MGEs
  • (i) Identification of MGEs associated with natural transformation machinery and regulatory genes. By utilizing VRprofile2 (Wang, et al. Nucleic Acids Research, 2022), MGEs were identified in 48,764 prokaryotic chromosomes from RefSeq and 115 chromosomes harboring the experimentally validated natural transformation machinery and regulatory genes. By screening both experimentally verified and predicted natural transformation machinery and regulatory genes located completely within or in the 1 kb flanking regions of MGEs, we identified 42,850 associations connecting 41,319 genes (29,610 machinery genes; 11,709 regulatory genes) to 31,941 distinct MGEs.

    (ii) Interactive exploration on the website. For each gene involved in natural transformation, the related MGEs are listed in the Browse tables and displayed as gene structure plots on the Detailed Information pages.
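    The positional rule in item (i) above (genes located completely within an MGE or in its 1 kb flanking regions) can be sketched as a simple interval check. This is one possible reading, not NTDB's implementation. It assumes 1-based inclusive coordinates on a linear sequence and treats a gene as associated when it lies entirely inside the MGE extended by 1 kb on each side. How partial overlaps and circular chromosomes are handled is not stated here, so that choice is an assumption. The function name and coordinates are hypothetical.

```python
FLANK_BP = 1000  # 1 kb flanking region, as stated in the text


def is_associated(gene_start: int, gene_end: int,
                  mge_start: int, mge_end: int,
                  flank: int = FLANK_BP) -> bool:
    """Return True if the gene lies entirely within the MGE plus its flanks.

    Assumptions: 1-based inclusive coordinates, start <= end, and a linear
    sequence (wrap-around on circular chromosomes is ignored).
    """
    window_start = max(1, mge_start - flank)
    window_end = mge_end + flank
    return window_start <= gene_start and gene_end <= window_end


# Made-up coordinates for illustration: an MGE spanning 10,000-20,000.
mge = (10_000, 20_000)
genes = {
    "gene_inside": (12_000, 13_500),   # fully within the MGE
    "gene_flank": (20_200, 20_900),    # within the 1 kb downstream flank
    "gene_distant": (25_000, 26_000),  # beyond the flank
}
associated = [name for name, (start, end) in genes.items()
              if is_associated(start, end, *mge)]
print(associated)
```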

  • Integrated prediction and analysis toolkit
  • (i) Genome prediction. Upload prokaryotic genomes to predict natural transformation machinery and regulatory genes using NTDB's reference database (BLASTp-based).

    (ii) BLAST search. Identify machinery and regulatory genes in query sequences via BLASTp (protein queries), BLASTx (nucleotide-translated queries) or BLASTn (nucleotide queries).

    (iii) HMMER search. Screen query protein sequences against 37 curated Pfam HMM profiles related to natural transformation machinery genes for functional domain detection.

    (iv) DNA uptake motif search. Predict DNA uptake motifs (DUS/USS) in user-submitted DNA sequences using experimentally validated DNA uptake motifs as references.


    Key functions

  • Fuzzy search and filters: NTDB supports fuzzy searching within its database to accommodate user search queries. Users can apply filters on the browsing interface and search results to refine and access the required data.
  • Interactive visualization: for the complex content displayed on interfaces such as browse, statistics, and prediction results, NTDB offers interactive charts, graphs, and visualizations to help users understand the corresponding information more intuitively.
  • Online prediction: users can run the prediction tools provided by NTDB (Genome prediction, BLAST search, HMMER search and DNA uptake motif search) on submitted sequence files without registration, and there is no limit on the number of uses.
  • Data and tool sharing: the datasets collected in NTDB can be freely downloaded via the Download page.


    Usage


    How to use NTDB?



    The web-based NTDB database contains several major parts: Home, Browse, Statistics, Tools, Download, References, Help, and a search field. Users can browse and download resources according to their needs on the corresponding pages.



    By entering the name of a gene or species in the search bar, users can quickly retrieve detailed information about the matching gene or species.

    On the "Browse" page


    NTDB provides two complementary browsing modalities:

    (1) Browse by species

    This interface catalogs all species containing experimentally verified or computationally predicted natural transformation machinery and regulatory genes. Users can navigate alphabetically or search via text input. Selecting a species opens its strain list; selecting a strain then displays its associated genes. Clicking any gene retrieves its detailed information.





    (2) Browse by function

    Genes are categorized by their functional roles within the natural transformation process (e.g., competence regulation, DNA binding, DNA uptake, homologous recombination). Selecting a functional category reveals the relevant gene list. Selecting a gene then displays its taxonomic distribution across species and a comprehensive list of entries (experimental/predicted), each of which links to a detailed gene record.






    On the "Statistics" page


    This interface dynamically visualizes the taxonomic distribution of the experimentally validated and in silico predicted genes within NTDB via interactive pie charts. Selecting a specific taxon within these charts provides direct access to the corresponding gene records. Additionally, naturally transformable species without characterized genes are listed in a table, distributions of DNA uptake motifs (DUS/USS) are rendered as interactive heatmaps, and MGE-gene-organism associations are represented as Sankey diagrams.






    On the "Tools" page


    In this interface, NTDB offers four online tools: Genome prediction, BLAST search, HMMER search and DUS/USS search.

  • Genome prediction tool
  • Users can upload a sequence file in GenBank or FASTA format to predict natural transformation machinery and regulatory genes by BLASTp and HMMER.

  • BLAST search
  • Users can perform BLAST search for query DNA/protein sequences against experimentally validated sequences archived in NTDB through this online tool.

  • HMMER search
  • Users can perform hmmscan search against 37 curated HMM profiles of natural transformation machinery and regulatory genes from Pfam through this online tool.

  • DUS/USS search
  • Users can predict DNA uptake motifs (DUS/USS) in submitted DNA sequences by BLASTn, using experimentally validated DNA uptake motifs as references.
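    As a rough illustration of what a DNA uptake motif search reports, the Python sketch below scans both strands of a DNA sequence for exact copies of a reference motif. NTDB's tool uses BLASTn; plain exact string matching is substituted here only to keep the example self-contained, so it will miss imperfect matches. The motif shown is the commonly cited 10-bp Neisseria DUS core (GCCGTCTGAA), the function names are hypothetical, and the test sequence is made up.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Find exact motif occurrences on both strands.

    Returns (1-based start on the forward strand, strand) tuples, sorted
    by position. A palindromic motif is reported once per strand.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = sequence.find(pattern, start + 1)  # allow overlaps
    return sorted(hits)


DUS_CORE = "GCCGTCTGAA"  # 10-bp Neisseria DUS core, used as an example

# Made-up sequence: one forward copy and one reverse-complement copy.
example = "AA" + DUS_CORE + "CC" + reverse_complement(DUS_CORE) + "AA"
print(find_motif(example, DUS_CORE))
```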


    On the "Download" page


    Users can download the key datasets archived in NTDB on this page.





    On the "References" page


    This interface includes all the references related to natural transformation machinery and regulatory genes collected by NTDB, including some reviews. NTDB also supports the search of included references using keywords such as author, article title, journal, year, and PubMed ID.


    FAQs


    Q1. How do I use NTDB?


    A1: You can quickly get to know the core functions of NTDB through the Browse interface, and find more detailed guidance in this Help manual.

    Q2. In the entries collected by NTDB, some content is displayed as "-". What does this mean?


    A2: If "-" appears in an experimentally validated entry, it indicates that we were unable to find corresponding data in the reference literature; if "-" appears in an in silico predicted entry, it means that the corresponding content was not predicted.

    Q3. How can I contact you if I find an error or have suggestions for NTDB?


    A3: Please do not hesitate to contact us by email at hyou@sjtu.edu.cn.

    Q4. Can I submit data to NTDB?


    A4: To discuss data submission or collaboration, please contact Prof. Ou by email at hyou@sjtu.edu.cn.