Detailed information    

In silico (bioinformatically predicted)

Overview


Name: comEC
Type: Machinery gene
Locus tag: clem_RS08600
Genome accession: NZ_CP016397
Coordinates: 1978854..1981070 (-)
Length: 738 a.a.
NCBI ID: WP_094091175.1
UniProt ID: A0A222P393
Organism: Legionella clemsonensis strain CDC-D5610
Function: ssDNA transport through the inner membrane (predicted from homology)
DNA binding and uptake
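
The coordinate range, the nucleotide length, and the protein length reported for this entry can be cross-checked with simple arithmetic. The Python sketch below assumes 1-based, inclusive coordinates and a coding sequence that ends in a single stop codon; the function names are illustrative only and are not part of any database API.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive coordinate range."""
    return end - start + 1


def protein_length(cds_bp: int) -> int:
    """Residues encoded by a CDS, assuming one terminal stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


# Coordinates of comEC (clem_RS08600) as listed in this entry.
cds_bp = span_length(1978854, 1981070)
print(cds_bp, protein_length(cds_bp))  # 2217 738
```

Both results agree with the lengths listed under Sequence (2217 bp and 738 a.a.).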

Related MGE


Note: This gene co-localizes with putative mobile genetic elements (MGEs) in the genome predicted by VRprofile2, as detailed below.

Gene-MGE association summary

MGE type  MGE coordinates   Gene coordinates  Relative position  Distance (bp)
Prophage  1962563..1981070  1978854..1981070  within             0
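
The relationship between the two coordinate ranges and the "Relative position" and "Distance (bp)" columns can be illustrated with a short Python sketch. This is a hypothetical reconstruction, not the procedure used by VRprofile2 or by this database: it assumes 1-based inclusive intervals, calls a gene "within" when it lies entirely inside the MGE, and otherwise reports the gap to the nearest MGE end. The labels other than "within" are assumptions introduced for illustration.

```python
def gene_mge_relation(gene, mge):
    """Classify a gene interval relative to an MGE interval.

    Both arguments are (start, end) tuples, 1-based and inclusive.
    Returns (relative_position, distance_bp).
    """
    g_start, g_end = gene
    m_start, m_end = mge
    if m_start <= g_start and g_end <= m_end:
        return "within", 0
    if g_end < m_start:
        return "upstream", m_start - g_end      # hypothetical label
    if g_start > m_end:
        return "downstream", g_start - m_end    # hypothetical label
    return "overlapping", 0                     # hypothetical label


# comEC versus the predicted prophage region from the table above.
print(gene_mge_relation((1978854, 1981070), (1962563, 1981070)))
# ('within', 0)
```

Note that comEC ends at exactly the last coordinate of the predicted prophage region, so it sits at the boundary of the MGE while still counting as "within".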


Gene organization within MGE regions


Location: 1962563..1981070
Locus tag                  Gene name  Coordinates (strand)  Size (bp)  Protein ID      Product                                                    Description
clem_RS08500 (clem_08660)  -          1962563..1964227 (-)  1665       WP_027265451.1  ATP-dependent endonuclease                                 -
clem_RS08510 (clem_08665)  -          1964908..1965714 (+)  807        WP_027265452.1  phage integrase N-terminal domain-containing protein       -
clem_RS08515 (clem_08670)  -          1965707..1966606 (+)  900        WP_027265453.1  phage integrase N-terminal domain-containing protein       -
clem_RS08520 (clem_08675)  -          1966718..1967743 (+)  1026       WP_027265454.1  hypothetical protein                                       -
clem_RS08525 (clem_08680)  -          1968085..1968279 (+)  195        WP_027265455.1  hypothetical protein                                       -
clem_RS08535 (clem_08685)  -          1968379..1968801 (+)  423        WP_027265456.1  single-stranded DNA-binding protein                        -
clem_RS08540 (clem_08690)  -          1968867..1969373 (+)  507        WP_027265457.1  antirestriction protein ArdA                               -
clem_RS08545 (clem_08695)  -          1969441..1970379 (+)  939        WP_027265458.1  COG2958 family protein                                     -
clem_RS08550 (clem_08700)  -          1970631..1971236 (-)  606        WP_042755010.1  hypothetical protein                                       -
clem_RS15110               -          1971243..1971791 (-)  549        Protein_1712    recombinase family protein                                 -
clem_RS08565 (clem_08715)  -          1972124..1973401 (-)  1278       WP_027266380.1  Y-family DNA polymerase                                    -
clem_RS08570 (clem_08720)  -          1973388..1973894 (-)  507        WP_011215133.1  LexA family transcriptional regulator                      -
clem_RS08575 (clem_08725)  -          1974029..1974697 (-)  669        WP_013101291.1  SOS response-associated peptidase                          -
clem_RS08580 (clem_08730)  -          1975180..1976247 (-)  1068       WP_011213398.1  hypothetical protein                                       -
clem_RS08585 (clem_08735)  -          1976319..1977077 (-)  759        WP_027266381.1  2OG-Fe(II) oxygenase                                       -
clem_RS08595 (clem_08745)  -          1977610..1978767 (-)  1158       WP_094092317.1  multidrug DMT transporter permease                         -
clem_RS08600 (clem_08750)  comEC      1978854..1981070 (-)  2217       WP_094091175.1  DNA internalization-related competence protein ComEC/Rec2  Machinery gene

Sequence


Protein


Length: 738 a.a.    Molecular weight: 82454.11 Da    Isoelectric point: 9.8713

>NTDB_id=187973 clem_RS08600 WP_094091175.1 1978854..1981070(-) (comEC) [Legionella clemsonensis strain CDC-D5610]
MEILCFFAGTVFFYTKSVYAILLVVVAFFLSPRLSFPCCFLAALLWGYAHQWWVADCGMPMHVRIIPHAVLEGEIVSIPA
TTTFKSQFQFKLSKFNGKPASSILLLACYNRCPLFKVGESWRFEAKIKKPANLGNPGSFNYVNWLSARHIHWVGYIKPRT
ATLLKPGQSSSLLRLREQLASTLDKLLPEGKSLGIFEALTLGITNHIDKSQWELFRRTGTTHLMVISGAHIGLVAGLGFM
LMRWLWTRNAWLCLHYPAPQAASIAGLLMAIAYALLAGFAPPAQRSLIACFLLLSRNFSSYRFTGWQAWRYGLLAVLLFE
PHDVLLPGFYLSFLAVASLLLGSQRLRATGFKKSLGLQLICLFGLMPLTLFWFSYGAISGLLANLVAIPFVGFVIVPAAL
ISLLAVQCWDEAWFLIPVHWAIEVLLYFLKLIDSLASFNLSFSFSSILSPLALMLVMLSGLFLPVRAFYPAMIVLGMAAL
YPGYPKVRGGEAEINILDVGQGLAVTVRTANHTLVYDTGMKFYQGGDMAKLAIIPFLKTVGIKKIDKIIISHPDLDHRGG
LVSLETNYPVNALLVNNISYYRRGENCHYYPSWQWDGISFRFLAIHKRFKNKNNNSCILQIGNSRGRILLTGDIERSAED
YLVKMYKEQLASEVLVVPHHGSKTSSSPSFIQQISPKFAVISAGFDNRYHFPHAQTLQTFARQNVKILSTADCGMVTVRL
PANHDRINPFCYKTITSA

Nucleotide


Length: 2217 bp

>NTDB_id=187973 clem_RS08600 WP_094091175.1 1978854..1981070(-) (comEC) [Legionella clemsonensis strain CDC-D5610]
ATGGAAATTCTCTGTTTTTTTGCAGGGACGGTCTTTTTTTATACAAAGTCAGTCTATGCCATTTTACTGGTGGTTGTTGC
ATTTTTTCTCTCCCCTCGGTTGAGTTTTCCCTGTTGCTTTCTTGCAGCTCTGCTGTGGGGATATGCTCACCAGTGGTGGG
TGGCCGATTGTGGCATGCCGATGCACGTTCGCATTATCCCACATGCTGTTTTAGAAGGTGAAATTGTATCCATCCCAGCA
ACAACGACTTTTAAGTCTCAGTTCCAGTTTAAACTGTCGAAGTTTAATGGGAAACCTGCTAGCTCTATCTTGCTATTGGC
CTGTTATAATCGCTGCCCTTTATTTAAAGTGGGAGAATCTTGGCGTTTTGAAGCAAAAATAAAAAAACCAGCGAATCTGG
GAAATCCAGGCAGTTTTAATTATGTAAATTGGTTAAGTGCTCGTCATATTCATTGGGTGGGGTATATTAAACCGCGCACT
GCCACGCTCCTAAAACCTGGTCAATCTTCAAGTCTTTTAAGACTTCGTGAGCAGCTTGCGTCAACTCTCGATAAACTATT
ACCTGAAGGAAAAAGTCTGGGCATTTTTGAAGCATTAACGCTGGGTATTACCAATCATATTGATAAGTCACAGTGGGAGT
TGTTTCGCCGAACAGGAACTACTCATTTGATGGTCATTTCCGGTGCTCACATTGGTTTGGTGGCAGGATTAGGATTTATG
TTAATGCGATGGTTATGGACTCGAAATGCATGGCTATGTCTTCATTATCCAGCACCTCAGGCAGCCAGTATTGCAGGATT
GCTAATGGCTATCGCTTATGCACTACTCGCTGGATTTGCTCCTCCCGCCCAACGTTCACTTATTGCTTGTTTTTTATTGC
TCTCGAGAAATTTTTCAAGTTATCGCTTTACCGGCTGGCAAGCTTGGCGCTATGGCTTATTAGCAGTATTACTTTTTGAA
CCGCATGATGTTTTATTGCCAGGATTCTATTTATCCTTTCTTGCTGTAGCCAGTTTGCTGTTAGGCAGTCAACGATTAAG
AGCTACAGGTTTTAAAAAAAGCCTGGGGTTGCAGCTTATTTGTCTTTTCGGATTAATGCCTCTAACATTGTTTTGGTTCT
CTTATGGGGCAATTAGTGGGTTGCTAGCCAATCTGGTTGCTATTCCGTTTGTAGGATTTGTTATCGTCCCTGCTGCGCTT
ATCAGTTTGTTGGCAGTGCAATGTTGGGATGAGGCCTGGTTTTTAATACCAGTACACTGGGCGATTGAGGTTTTGCTTTA
TTTTTTAAAATTGATTGACTCCTTGGCATCATTTAATTTGAGCTTTTCCTTTAGCAGTATTTTATCGCCTTTAGCATTGA
TGCTGGTCATGTTGTCAGGCTTATTTCTCCCTGTTAGGGCATTTTACCCAGCAATGATTGTGTTGGGTATGGCGGCTCTT
TATCCAGGATATCCTAAAGTGAGGGGAGGGGAGGCTGAAATTAATATTCTTGATGTGGGGCAAGGTTTAGCAGTGACTGT
ACGAACAGCCAATCACACGCTTGTTTATGATACCGGAATGAAATTCTATCAGGGTGGGGATATGGCAAAATTGGCGATTA
TTCCTTTTCTTAAAACTGTGGGGATAAAGAAGATTGATAAAATTATTATTAGCCATCCCGATTTAGATCACCGAGGTGGC
CTTGTTTCGCTGGAAACTAATTATCCAGTCAATGCGCTTTTGGTCAATAATATTTCCTATTATCGTCGAGGAGAAAATTG
CCATTACTATCCCTCGTGGCAATGGGATGGCATTTCTTTTCGTTTTTTAGCTATTCATAAACGGTTTAAAAACAAGAATA
ATAACTCGTGTATTTTGCAAATTGGGAATTCCAGAGGGCGTATTTTATTAACAGGGGATATAGAACGGTCAGCGGAAGAC
TATTTGGTAAAAATGTACAAAGAACAACTTGCATCCGAAGTATTAGTTGTCCCTCACCATGGCAGCAAAACATCTTCCTC
ACCTAGCTTCATTCAACAAATTTCACCAAAATTTGCAGTAATATCCGCAGGCTTTGATAATCGATATCATTTTCCGCATG
CGCAAACATTGCAAACTTTTGCACGACAAAATGTGAAAATATTGAGTACTGCAGATTGTGGCATGGTCACCGTGCGATTG
CCTGCAAACCATGACAGGATAAACCCATTTTGTTATAAAACAATAACAAGTGCATAA
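
Although the gene lies on the minus strand, the nucleotide record above is given in coding orientation: it begins with ATG and ends with the stop codon TAA, so it can be translated directly. The Python sketch below is a minimal illustration using the standard codon assignments (NCBI translation table 11 for bacteria uses the same codon-to-amino-acid mapping and differs only in permitted start codons). The `translate` helper is written for this example and is not a database tool; only the first and last few codons of the record are used as input.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding-orientation DNA string, stopping at the first stop codon."""
    cds = cds.upper()
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 30 nt and last 12 nt of the comEC nucleotide record.
print(translate("ATGGAAATTCTCTGTTTTTTTGCAGGGACG"))  # MEILCFFAGT
print(translate("ACAAGTGCATAA"))                    # TSA
```

The two outputs match the first ten and last three residues of the protein record (MEILCFFAGT... and ...TSA).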


Secondary structure


Protein secondary structures were predicted by S4PRED and visualized by seqviz.



3D structure


Source        ID
AlphaFold DB  A0A222P393

Transmembrane helices


Transmembrane helices of the protein were predicted by TMHMM 2.0 and visualized by seqviz and ECharts.



[Plot: predicted transmembrane helix probability]


Similar proteins


Only experimentally validated proteins are listed.

Protein  Organism                                  Identities (%)  Coverage (%)  Ha-value
comEC    Legionella pneumophila strain Lp02        49.798          100           0.5
comEC    Legionella pneumophila strain ERS1305867  49.324          100           0.495


Multiple sequence alignment