Oral pathogens non-coding small RNA prediction

Yi-Feng Chang, Chien-Chi Lo, Chuan-Hsiung Chang, Gary Xie










1. Introduction

Recent systematic genome searches have revealed that bacteria encode a tremendous number of non-coding small RNAs (sRNAs) that control numerous cellular processes, primarily as regulators of translation and message stability. However, genome-wide annotation of sRNA-encoding genes has been conducted in only a few of the 417 bacterial genomes sequenced to date, and in none of the nine completed oral pathogen genomes. To provide high-quality annotation for the oral pathogen research community, we developed a computational pipeline that locates potential sRNAs in bacteria by searching for highly conserved regions in the intergenic sequences (IGS) among and within genomes. Combining this approach with same-orientation terminator-pair prediction, we conducted a genome-wide survey for putative sRNA-encoding genes in the nine oral pathogens. As more genomes are sequenced, this pipeline can be used to locate more non-coding small RNAs, which will open an avenue to understanding sRNA regulation in bacteria.

2. Materials and Methods

The nine oral pathogens used in this research are listed below:

  1. Actinobacillus actinomycetemcomitans
  2. Fusobacterium nucleatum
  3. Porphyromonas gingivalis
  4. Prevotella intermedia
  5. Streptococcus mutans
  6. Streptococcus mitis
  7. Streptococcus sanguinis
  8. Tannerella forsythensis
  9. Treponema denticola

2.1 IGS (Intergenic Sequence) Database Preparation

  1. Completed prokaryotic genome sequences are downloaded from the NCBI RefSeq database (Feb 14th 2007).
  2. Intergenic sequences longer than 50 bp are extracted from the regions between two adjacent, non-overlapping coding sequences (CDS).
  3. All intergenic sequences are formatted into a BLAST-searchable database, which the small RNA prediction program uses to search for conserved IGS.
  4. To predict possible regulatory mRNA targets of small RNAs, the sequences from 140 bp upstream to 60 bp downstream of the start codon and from 60 bp upstream to 90 bp downstream of the stop codon are extracted for mRNA target prediction.
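
The extraction rules in steps 2 and 4 can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the `CDS` record, the function names, and the choice to anchor the windows at the first base of the start codon and the last base of the stop codon are assumptions made for illustration, and wrap-around on circular genomes is ignored.

```python
from typing import List, NamedTuple, Tuple


class CDS(NamedTuple):
    """Hypothetical CDS record; coordinates are 1-based and inclusive."""
    locus: str
    start: int
    end: int
    strand: str  # "+" or "-"


def intergenic_regions(cds_list: List[CDS], min_len: int = 50) -> List[Tuple[str, str, int, int]]:
    """Return (left_locus, right_locus, start, end) for gaps longer than min_len."""
    ordered = sorted(cds_list, key=lambda c: (c.start, c.end))
    regions: List[Tuple[str, str, int, int]] = []
    if not ordered:
        return regions
    prev = ordered[0]
    for cur in ordered[1:]:
        gap_start, gap_end = prev.end + 1, cur.start - 1
        # Overlapping or nested neighbours give a non-positive length and are skipped.
        if gap_end - gap_start + 1 > min_len:
            regions.append((prev.locus, cur.locus, gap_start, gap_end))
        # Keep the CDS that reaches furthest right as the left neighbour.
        if cur.end > prev.end:
            prev = cur
    return regions


def target_window(anchor: int, strand: str, upstream: int, downstream: int,
                  genome_len: int) -> Tuple[int, int]:
    """Window of `upstream` bases before the anchor and `downstream` bases from it on."""
    if strand == "+":
        lo, hi = anchor - upstream, anchor + downstream - 1
    else:
        # On the minus strand, upstream lies at higher coordinates.
        lo, hi = anchor - downstream + 1, anchor + upstream
    return max(1, lo), min(genome_len, hi)


def mrna_target_windows(cds: CDS, genome_len: int) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return the (start-codon window, stop-codon window) for one CDS."""
    start_anchor = cds.start if cds.strand == "+" else cds.end
    stop_anchor = cds.end if cds.strand == "+" else cds.start
    return (target_window(start_anchor, cds.strand, 140, 60, genome_len),
            target_window(stop_anchor, cds.strand, 60, 90, genome_len))
```

With inclusive coordinates, the IGS length is `end - start + 1`, which matches the Start, End, and IGS Len columns reported in the results (for example 100597, 100785, and 189).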

Figure 1. Database preparation flowchart

2.2 sRNA Prediction

  1. All intergenic sequences are extracted from the GenBank flat file.
  2. Intergenic sequences that are shorter than 50 bp or adjacent to insertion elements (IS), transposases, or prophages are eliminated by a keyword filter.
  3. The nucleotide and amino acid insertion element databases (from IS Finder) are also applied to identify IGSs that escape the keyword filter.
  4. IGSs that pass the second step are submitted to BLAST against the IGS database to search for conserved IGS sequences.
  5. All IGS sequences are also subjected to promoter prediction by PromScan with the RpoD/RpoN weight matrices and to Rho-independent terminator prediction by TransTerm HP.
  6. Conserved IGS segments are submitted to BLAST against the complement coding sequence database to search for possible regulatory targets.
  7. To obtain more sRNA candidates, terminator pairs that lie on the same strand are extracted as well.
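
Steps 2 and 7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the keyword pattern is a hypothetical stand-in for the pipeline's actual keyword list, the `Terminator` record and function names are invented here, and because the text does not say how terminator pairs are delimited, the sketch simply pairs every two terminators that share a strand within one IGS.

```python
import re
from itertools import combinations
from typing import List, NamedTuple, Tuple

# Hypothetical keyword list; the real filter's keywords are not given in the text.
MOBILE_KEYWORDS = re.compile(r"insertion element|insertion sequence|transposase|prophage",
                             re.IGNORECASE)


def passes_keyword_filter(igs_len: int, upstream_product: str, downstream_product: str,
                          min_len: int = 50) -> bool:
    """Reject IGSs shorter than min_len or flanked by a mobile-element gene product."""
    if igs_len < min_len:
        return False
    flanks = (upstream_product, downstream_product)
    return not any(MOBILE_KEYWORDS.search(product) for product in flanks)


class Terminator(NamedTuple):
    """Hypothetical terminator prediction; coordinates are relative to the IGS."""
    start: int
    end: int
    strand: str  # "+" or "-"


def same_strand_pairs(terminators: List[Terminator]) -> List[Tuple[Terminator, Terminator]]:
    """Return all pairs of terminators that lie on the same strand, ordered by position."""
    pairs: List[Tuple[Terminator, Terminator]] = []
    for strand in ("+", "-"):
        on_strand = sorted(t for t in terminators if t.strand == strand)
        pairs.extend(combinations(on_strand, 2))
    return pairs
```

IGSs that pass this keyword filter would still need the IS Finder database check from step 3, which is not modelled here.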

Figure 2. Small RNA prediction flowchart

3. Results

Parameters for small RNA prediction

  1. IGS length >= 100 bp
  2. tRNA and rRNA excluded
  3. IGS BLAST e-value: 1e-5
  4. PromScan score > 75
  5. TransTerm HP confidence > 70
  6. In Table 2, only IGSs whose inter-species conservation number (column 17, Inter Spp. IGS) is larger than 1 (for S. mit, larger than 17) are considered; all information is available in the raw data of Table 1.
  7. The transcribed strand of a predicted small RNA candidate can be inferred from the P-T Pair information (see Definition of table header).
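
The numeric cutoffs in parameters 1, 3, 4, and 5 can be expressed as a small set of predicates. The Python sketch below is illustrative only: the `Thresholds` class and function names are hypothetical, treating the e-value cutoff as inclusive is an assumption, and the way the pipeline combines the individual criteria is not specified here, so each check is kept separate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Cutoffs taken from the parameter list; field names are illustrative."""
    min_igs_len: int = 100            # IGS length >= 100 bp
    max_blast_evalue: float = 1e-5    # BLAST e-value cutoff (assumed inclusive)
    min_promscan_score: float = 75.0  # PromScan score must exceed this
    min_transterm_conf: float = 70.0  # TransTerm HP value must exceed this


DEFAULTS = Thresholds()


def igs_long_enough(igs_len: int, t: Thresholds = DEFAULTS) -> bool:
    return igs_len >= t.min_igs_len


def blast_hit_ok(evalue: float, t: Thresholds = DEFAULTS) -> bool:
    return evalue <= t.max_blast_evalue


def promoter_ok(score: float, t: Thresholds = DEFAULTS) -> bool:
    return score > t.min_promscan_score


def terminator_ok(confidence: float, t: Thresholds = DEFAULTS) -> bool:
    return confidence > t.min_transterm_conf
```

Keeping the cutoffs in one frozen dataclass makes it easy to rerun the predicates with different values, for example `promoter_ok(80, Thresholds(min_promscan_score=85.0))`.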

Table 1. Raw data of prediction results:

Organism  Small RNA Prediction  Rfam Scan: Rfam 8.0
A. act

Table 2. Small RNA prediction results for nine oral pathogens. For a detailed explanation of the table header, please see: Definition of table header. In Table 2, rows with a light green background are known small RNAs predicted from Rfam 8.0; from left to right, they give the organism abbreviation, Rfam ID, bit score of the Rfam scan, sRNA type, and start and end positions.

Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
A. act 35 AA00139 antibiotic maturation factor AA00140 hypothetical protein ->-> 100597 100785 189 30.70% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 1 15 162
A. act 40 AA00166 conserved hypothetical protein (possible acetyltransferase) AA00167 possible amino-acid transporter (sodium-alanine symporters) ->-> 117980 118378 399 44.60% 0 0 0/0  +: 1/3/0
-: 0/0/0 
1 3 58 288
A. act   RF00504 36.95 Glycine     118043 118128                    
A. act   RF00504 46.55 Glycine     118129 118272                    
A. act 84 AA00349 hypothetical protein AA00350 50S ribosomal protein L10 ->-> 233588 233760 173 43.40% 0 0 0/0  +: 0/1/0
-: 0/0/0 
1 4 8 173
A. act 89 AA00370 30S ribosomal protein S10 AA00371 transcriptional regulator ->-> 249069 249320 252 36.10% 0 0 0/0  +: 0/0/0
-: 1/0/0 
1 3 1 174
A. act 105 AA00412 conserved hypothetical protein AA00413 conserved hypothetical protein ->-> 279515 280281 767 53.20% 0 0 0/20  +: 1/1/1
-: 0/3/0 
6 10 292 750
A. act 124 AA00470 conserved hypothetical protein (possible Zn-dependent protease with chaperone function) AA00471 3,4-dihydroxy-2-butanone 4-phosphate synthase; GTP cyclohydrase II ->-> 327720 328401 682 39.00% 0 0 0/0  +: 0/3/0
-: 0/2/0 
1 2 390 588
A. act   RF00050 100.78 FMN     328126 328309                    
A. act 164 AA00621 conserved hypothetical protein AA00622 conserved hypothetical protein ->-> 429068 429831 764 53.30% 0 0 0/20  +: 1/1/1
-: 0/2/0 
6 10 292 747
A. act 309 AA01227 hypothetical protein AA01228 50S ribosomal protein L14 ->-> 838906 839036 131 45.80% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 2 28 131
A. act 310 AA01238 50S ribosomal protein L36 AA01239 30S ribosomal protein S13 ->-> 844499 844642 144 35.40% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 5 5 119
A. act   RF00140 107.65 Alpha_RBS     844559 844675                    
A. act 316 AA01260 hypothetical protein AA01261 conserved hypothetical protein ->-> 855819 856582 764 53.00% 0 0 0/20  +: 0/2/0
-: 1/0/0 
6 10 18 474
A. act 337 AA01321 DNA repair protein AA01323 50S ribosomal protein L28 ->-> 894176 894386 211 36.00% 0 0 0/0  +: 1/0/0
-: 0/0/0 
1 5 55 211
A. act 344 AA01357 5-formyltetrahydrofolate cyclo-ligase-family protein AA01358 conserved hypothetical protein ->-> 919846 920141 296 41.20% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 1 47 215
A. act   RF00013 119.97 6S     919861 920044                    
A. act   RF00023 236.14 tmRNA     1038523 1038887                    
A. act 392 AA01565 conserved hypothetical protein AA01566 conserved hypothetical protein ->-> 1051028 1051950 923 53.60% 0 0 5/18 +: 0/2/0
-: 1/0/0 
6 10 175 683
A. act   RF00522 40.52 PreQ1     1177861 1177905                    
A. act   RF00022 48.25 GcvB     1210848 1210988                    
A. act   RF00059 54.2 TPP     1251735 1251822                    
A. act 481 AA01903 hypothetical protein AA01904 hypothetical protein ->-> 1284276 1284587 312 40.10% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 1 179 308
A. act 542 AA02119 hypothetical protein AA02120 nonheme ferritin ->-> 1446641 1447320 680 27.80% 0 0 0/0  +: 2/0/0
-: 0/0/0 
1 1 133 239
A. act   RF00010 225.15 RNaseP_bact_a     1484725 1485021                    
A. act 575 AA02262 50S ribosomal protein L13 AA02263 conserved hypothetical protein ->-> 1538988 1539231 244 35.70% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 1 1 147
A. act   RF00168 116.89 Lysine     1563630 1563804                    
A. act 590 AA02297 conserved hypothetical protein AA02298 conserved hypothetical protein ->-> 1569961 1570727 767 53.30% 0 0 0/11  +: 1/1/1
-: 0/2/0 
6 10 295 750
A. act 625 AA02421 conserved hypothetical protein AA02422 conserved hypothetical protein ->-> 1667361 1668153 793 53.30% 0 0 0/20  +: 1/1/1
-: 0/2/0 
6 10 321 776
A. act 673 AA02609 ABC transporter, ATP-binding protein/permease AA02610 acyl-CoA thioesterase II ->-> 1820245 1820510 266 45.10% 0 0 0/0  +: 0/0/0
-: 1/0/0 
1 3 23 145
A. act   RF00169 61.04 SRP_bact     1820270 1820369                    
A. act 709 AA02776 conserved hypothetical protein AA02777 tRNA-guanine transglycosylase ->-> 1940967 1941438 472 32.60% 0 0 0/0  +: 1/1/0
-: 0/1/0 
1 2 159 462
A. act 734 AA02879 octaprenyl-diphosphate synthase AA02880 50S ribosomal protein L21 ->-> 2020650 2020896 247 32.40% 0 0 0/0  +: 1/0/0
-: 1/1/0 
1 1 58 200
A. act 754 AA02944 30S ribosomal protein S20 AA02946 virulence factor protein ->-> 2059089 2059355 267 35.20% 0 0 0/0  +: 0/0/0
-: 1/0/0 
1 3 1 247
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
F. nuc 2 FN1497 Multidrug resistance protein 2 FN1498 Integral membrane protein <--> 2229 2702 474 28.10% 0 0 0/0  +: 1/1/0
-: 0/0/0 
2 5 269 373
F. nuc   RF00050 93.36 FMN     2491 2606                    
F. nuc 4 FN1504 Nickel-binding protein FN1505 6,7-dimethyl-8-ribityllumazine synthase <--> 10106 10535 430 24% 0 0 0/0  +: 1/2/0
-: 1/0/0 
2 6 245 351
F. nuc   RF00050 95.23 FMN     10346 10461                    
F. nuc   RF00059 69.19 TPP     254884 254989                    
F. nuc   RF00174 79.9 Cobalamin     471270 471443                    
F. nuc   RF00557 76.51 L10_leader     547805 547935                    
F. nuc   RF00169 52.97 SRP_bact     799407 799503                    
F. nuc   RF00059 68.85 TPP     862319 862422                    
F. nuc   RF00174 104.29 Cobalamin     934497 934678                    
F. nuc   RF00504 54.04 Glycine     963902 963991                    
F. nuc   RF00504 55.84 Glycine     963992 964072                    
F. nuc   RF00162 45.8 SAM     987399 987486                    
F. nuc   RF00515 57.94 PyrR     1059062 1059170                    
F. nuc   RF00556 27.5 L19_leader     1071453 1071488                    
F. nuc   RF00023 148.06 tmRNA     1209222 1209564                    
F. nuc   RF00162 49.36 SAM     1317567 1317653                    
F. nuc   RF00066 20.86 U7     1900736 1900788                    
F. nuc 620 FN1286 30S ribosomal protein S13 FN1287 Bacterial Protein Translation Initiation Factor 1 (IF-1) <-<- 1944165 1944504 340 27.40% 0 0 0/0  +: 1/0/0
-: 0/0/0 
1 1 220 322
F. nuc   RF00010 232.98 RNaseP_bact_a     1968685 1968969                    
F. nuc   RF00174 82.2 Cobalamin     2032821 2032995                    
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
P. int   RF00059 44.63 TPP     47004 47102                    
P. int 111 PI0224 conserved hypothetical protein PI0225 conserved hypothetical protein ->-> 222040 222177 138 37.70% 0 0 0/0 +: 0/0/0
-: 0/0/0
1 1 32 138
P. int 119 PI0242 hypothetical protein PI0243 hypothetical protein ->-> 240288 240645 358 43.90% 0 0 0/0 +: 0/0/0
-: 0/0/0
1 1 134 257
P. int 127 PI0271 hypothetical protein PI0272 conserved hypothetical protein ->-> 262452 262760 309 37.90% 0 0 1/0 +: 0/1/0
-: 0/0/0
2 1 168 309
P. int 128 PI0282 hypothetical protein PI0283 Zn-dependent peptidase ->-> 266887 267693 807 41.10% 0 0 0/2 +: 0/0/0
-: 0/3/0
1 2 302 584
P. int 167 PI0377 DNA primase/mobilizable transposon, excision protein PI0378 conserved hypothetical protein; possible helicase ->-> 347884 348165 282 34.40% 0 0 0/0 +: 0/0/0
-: 0/0/0
3 1 100 282
P. int 168 PI0378 conserved hypothetical protein; possible helicase PI0379 conserved hypothetical protein ->-> 349600 350011 412 38.30% 0 0 1/3 +: 0/0/0
-: 0/0/0
1 1 114 216
P. int 172 PI0387 hypothetical protein PI0388 integrase ->-> 354873 355066 194 38.70% 0 0 0/0 +: 0/0/0
-: 0/0/0
1 2 1 119
P. int 173 PI0390 conserved hypothetical protein PI0391 conserved hypothetical protein ->-> 357816 357946 131 32.10% 0 0 0/0 +: 0/0/0
-: 0/0/0
1 2 4 128
P. int 174 PI0391 conserved hypothetical protein PI0392 possible transcriptional regulator ->-> 358277 358483 207 30.90% 0 0 0/0 +: 0/0/0
-: 1/0/0
1 3 76 207
P. int 177 PI0397 conserved hypothetical protein PI0398 hypothetical protein ->-> 362960 363517 558 40.70% 0 0 0/0 +: 0/1/0
-: 0/1/0
2 2 1 558
P. int 178 PI0398 hypothetical protein PI0399 conserved hypothetical protein ->-> 363602 363724 123 35.80% 0 0 0/0 +: 0/0/0
-: 1/0/0
1 2 1 123
P. int 179 PI0399 conserved hypothetical protein PI0400 conserved hypothetical protein ->-> 364151 364568 418 42.30% 0 0 1/3 +: 0/2/0
-: 0/2/0
1 1 1 302
P. int 180 PI0401 conserved hypothetical protein PI0402 conserved hypothetical protein ->-> 367168 368031 864 43% 0 0 0/1 +: 0/6/0
-: 1/4/0
2 2 247 861
P. int   RF00010 133.46 RNaseP_bact_a     612908 613235                    
P. int 320 PI0743 hypothetical protein PI0744 DNA primase/mobilizable transposon, excision protein ->-> 673087 674310 1224 42.60% 0 0 5/51 +: 0/2/0
-: 1/0/0
3 1 972 1081
P. int 321 PI0745 conserved hypothetical protein PI0746 hypothetical protein ->-> 675731 676110 380 43.20% 0 0 0/0 +: 0/1/0
-: 0/1/0
3 1 180 281
P. int 322 PI0749 hypothetical protein PI0750 zinc protease ->-> 677530 677885 356 34.80% 0 0 0/0  +: 0/0/0
-: 1/0/0 
1 1 67 178
P. int   RF00059 49.49 TPP     1372953 1373051                    
P. int 664 PI1543 hypothetical protein PI1544 hypothetical protein ->-> 1463342 1463646 305 39.70% 0 0 0/0  +: 0/0/0
-: 0/1/0 
2 1 1 281
P. int   RF00023 123.42 tmRNA     1491830 1492228                    
P. int   RF00174 93.62 Cobalamin     1724499 1724681                    
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
P. gin 158 PG0453 hypothetical protein PG0456 PHP N-terminal domain protein ->-> 496731 497854 1124 37.80% 0 0 0/0  +: 0/10/0
-: 0/4/0 
5 2 565 747
P. gin   RF00174 88.16 Cobalamin     701523 701705                    
P. gin 233 PG0665 beta-galactosidase PG0668 TonB-dependent receptor ->-> 715306 717590 2285 47% 0 0 0/35  +: 2/9/7
-: 0/3/0 
2 2 1368 1581
P. gin   RF00174 73.6 Cobalamin     717154 717394                    
P. gin   RF00174 103.15 Cobalamin     749392 749595                    
P. gin 258 PG0742 antigen PgaA PG0744 RNA methyltransferase, TrmH family <-<- 790728 791222 495 41% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 1 113 217
P. gin 285 PG0816 hypothetical protein PG0819 integrase <--> 876950 877465 516 40.10% 0 35 0/0  +: 0/0/0
-: 0/0/0 
2 1 398 516
P. gin 286 PG0821 lipoprotein, putative PG0822 hypothetical protein -><- 880293 880417 125 33.60% 0 0 0/0  +: 0/0/0
-: 0/0/0 
2 1 1 125
P. gin 287 PG0822 hypothetical protein PG0823 hypothetical protein <--> 880826 880957 132 25.80% 0 0 0/0  +: 0/0/0
-: 1/0/0 
2 2 1 132
P. gin 291 PG0829 hypothetical protein PG0831 hypothetical protein <--> 886837 887542 706 40.50% 0 0 0/0  +: 0/1/0
-: 0/1/0 
2 2 1 695
P. gin 292 PG0831 hypothetical protein PG0832 hypothetical protein -><- 887972 888769 798 42.00% 0 0 0/1  +: 0/2/0
-: 0/4/0 
2 3 121 651
P. gin 305 PG0856 hypothetical protein PG0857 transcriptional regulator, putative ->-> 917236 917458 223 39.00% 0 0 0/0  +: 0/3/0
-: 0/3/0 
1 1 10 210
P. gin   RF00010 166.37 RNaseP_bact_a     1019704 1020048                    
P. gin 440 PG1203 transcriptional regulator, putative PG1205 DNA-binding protein, histone-like family <--> 1283537 1284092 556 36.70% 0 0 0/1  +: 0/2/0
-: 0/1/0 
1 1 22 130
P. gin   RF00174 87.3 Cobalamin     1338074 1338333                    
P. gin   RF00174 92.13 Cobalamin     1495774 1495987                    
P. gin 540 PG1436 ATPase, putative PG1439 hypothetical protein -><- 1523815 1524674 860 42.90% 0 0 0/1  +: 1/3/0
-: 0/7/0 
2 2 1 617
P. gin 541 PG1442 hypothetical protein PG1444 hypothetical protein <--> 1528121 1528885 765 39.90% 0 0 0/0  +: 1/1/0
-: 0/1/0 
2 3 1 765
P. gin 546 PG1450 hypothetical protein PG1451 hypothetical protein <--> 1534705 1534836 132 25.80% 0 0 0/0  +: 1/0/0
-: 0/0/0 
2 2 1 132
P. gin 547 PG1451 hypothetical protein PG1452 lipoprotein, putative -><- 1535245 1535369 125 33.60% 0 0 0/0  +: 0/0/0
-: 0/0/0 
2 1 1 125
P. gin 548 PG1454 integrase PG1457 hypothetical protein <--> 1538197 1538789 593 39.00% 0 35 0/0  +: 0/0/0
-: 0/0/0 
2 1 1 119
P. gin 550 PG1463 hypothetical protein PG1465 hypothetical protein ->-> 1541532 1542121 590 43.40% 0 0 0/0  +: 0/0/0
-: 0/2/0 
1 2 102 590
P. gin 558 PG1496 hypothetical protein PG1497 DNA-binding protein, histone-like family ->-> 1573758 1574224 467 48.40% 0 0 0/1  +: 0/2/0
-: 0/4/0 
3 2 121 272
P. gin 564 PG1519 hypothetical protein PG1521 O-succinylbenzoic acid--CoA ligase -><- 1597486 1598143 658 39.50% 0 0 0/3  +: 0/2/0
-: 0/0/0 
2 1 535 646
P. gin 565 PG1526 hypothetical protein PG1527 hypothetical protein <-<- 1605462 1605656 195 25% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 1 1 195
P. gin   RF00521 38.37 SAM_alpha     1994574 1994648                    
P. gin   RF00059 54.48 TPP     1999744 1999857                    
P. gin   RF00059 58.49 TPP     2221664 2221774                    
P. gin 768 PG2117 30S ribosomal protein S16 PG2119 oxidoreductase, Gfo/Idh/MocA family <--> 2225904 2226686 783 46% 0 0 0/0  +: 0/4/0
-: 0/1/0 
1 3 416 559
P. gin   RF00023 123.65 tmRNA     2226057 2226462                    
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
S. mut 25 SMU.58 hypothetical protein SMU.59 adenylosuccinate lyase ->-> 58935 59243 309 35.60% 0 0 0/0  +: 0/1/0
-: 0/0/0 
1 6 160 307
S. mut 35 SMU.81 heat shock protein GrpE (HSP-70 cofactor) SMU.82 molecular chaperone DnaK ->-> 85269 85641 373 37.80% 0 0 0/0  +: 0/0/0
-: 0/1/0 
1 1 89 222
S. mut 36 SMU.82 molecular chaperone DnaK SMU.83 heat shock protein DnaJ (HSP-40) ->-> 87481 88007 527 36.20% 0 0 0/0  +: 0/1/0
-: 0/2/0 
1 2 420 526
S. mut 85 SMU.168 putative transcriptional regulator SMU.169 50S ribosomal protein L13 ->-> 169406 169797 392 33.90% 0 0 0/0  +: 0/1/0
-: 0/0/0 
1 3 292 392
S. mut   RF00555 38.88 L13_leader     169699 169778                    
S. mut 140 SMU.305 hypothetical protein SMU.307 glucose-6-phosphate isomerase ->-> 293111 293666 556 29.30% 0 0 1/1 +: 2/0/0
-: 0/0/0 
1 3 18 121
S. mut   RF00169 57.53 SRP_bact     293137 293237                    
S. mut 162 SMU.356 purine operon repressor SMU.357 30S ribosomal protein S12 ->-> 333836 334094 259 32.40% 0 0 0/0  +: 1/0/0
-: 0/1/0 
1 2 121 221
S. mut 213 SMU.471 hypothetical protein SMU.472 conserved hypothetical protein; possible N6-adenine-specific DNA methylase ->-> 439785 440233 449 40.80% 0 0 0/0  +: 0/1/0
-: 0/0/0 
1 21 14 432
S. mut   RF00011 291.74 RNaseP_bact_b     439800 440179                    
S. mut 233 SMU.530c hypothetical protein SMU.531 putative chorismate mutase <--> 497463 498133 671 31.30% 0 0 0/0  +: 2/2/1
-: 3/0/0 
1 2 517 652
S. mut   RF00230 70.79 T-box     497851 498089                    
S. mut   RF00230 53.08 T-box     608205 608375                    
S. mut 305 SMU.696 cytidylate kinase SMU.697 translation initiation factor IF-3 ->-> 660468 660633 166 36% 0 0 0/0  +: 0/1/0
-: 1/0/0 
1 11 24 149
S. mut   RF00558 68.64 L20_leader     660491 660617                    
S. mut   RF00059 79.56 TPP     666070 666170                    
S. mut 311 SMU.713 putative cell division protein FtsW SMU.714 elongation factor Tu ->-> 672650 672867 218 22.50% 0 0 0/0  +: 1/0/0
-: 0/0/0 
1 8 112 218
S. mut   RF00080 54.21 yybP-ykoY     681838 681940                    
S. mut   RF00559 45.54 L21_leader     796925 797000                    
S. mut   RF00515 68.64 PyrR     803283 803394                    
S. mut 411 SMU.956 putative Clp-like ATP-dependent protease, ATP-binding subunit SMU.957 50S ribosomal protein L10 <--> 907600 907993 394 28.20% 0 0 0/0  +: 0/0/0
-: 2/0/0 
1 6 241 390
S. mut   RF00557 77.6 L10_leader     907831 907962                    
S. mut   RF00504 62.24 Glycine     1115962 1116047                    
S. mut 497 SMU.1196c hypothetical protein SMU.1197 hypothetical protein <--> 1139292 1139763 472 35.60% 0 0 0/0  +: 0/1/0
-: 1/0/0 
1 17 45 394
S. mut   RF00023 162.4 tmRNA     1139337 1139684                    
S. mut 504 SMU.1204 DNA topoisomerase IV subunit A SMU.1205c hypothetical protein <-<- 1146604 1146795 192 32.80% 0 0 0/0  +: 0/0/0
-: 0/0/0 
2 22 1 134
S. mut   RF00515 67.05 PyrR     1166002 1166118                    
S. mut 556 SMU.1326 peptide chain release factor 2 SMU.1327c conserved hypothetical protein; possible 4Fe-4S ferredoxin <-<- 1250624 1250790 167 32.90% 0 0 0/0  +: 1/0/0
-: 0/0/0 
1 2 1 111
S. mut 586 SMU.1405c hypothetical protein SMU.1406c hypothetical protein <-<- 1334980 1335359 380 27.60% 0 0 0/0  +: 1/1/0
-: 1/0/0 
1 3 52 165
S. mut 607 SMU.1457 putative dTDP-glucose-4,6-dehydratase SMU.1459c hypothetical protein <-<- 1388588 1388990 403 36.00% 0 0 0/0  +: 0/2/0
-: 0/0/0 
2 10 3 124
S. mut   RF00230 64.43 T-box     1518253 1518453                    
S. mut 693 SMU.1703c hypothetical protein SMU.1704 hypothetical protein <--> 1614997 1615454 458 33.80% 0 0 0/0  +: 1/1/1
-: 1/0/0 
1 19 120 345
S. mut   RF00050 77.55 FMN     1615112 1615325                    
S. mut   RF00230 61.88 T-box     1778786 1778993                    
S. mut 811 SMU.1990 DNA-directed RNA polymerase beta subunit SMU.1991 putative membrane carboxypeptidase, penicillin-binding protein 1b <-<- 1864236 1864495 260 27.30% 0 0 0/0  +: 0/0/0
-: 1/2/0 
1 1 113 218
S. mut 824 SMU.2011 50S ribosomal protein L6 SMU.2012 30S ribosomal protein S8 <-<- 1887284 1887695 412 31.10% 0 0 0/0  +: 0/0/0
-: 1/1/0 
1 5 61 408
S. mut 828 SMU.2026c 30S ribosomal protein S10 SMU.2027 putative transcriptional regulator <--> 1893131 1893647 517 35.80% 0 1 0/0  +: 0/0/0
-: 1/0/0 
1 21 199 329
S. mut   RF00013 86.37 6S     1927946 1928139                    
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
S. mit 30 SMT_129 putative glycosyltransferase SMT_130 capsular polysaccharide biosynthesis protein ->-> 80944 81134 191 35.60% 0 0 0/0  +: 2/0/0
-: 0/0/0 
31 29 49 159
S. mit 48 SMT_168 anaerobic ribonucleoside-triphosphate reductase activating protein SMT_169 30S ribosomal protein S10 ->-> 114401 114654 254 37.00% 0 0 0/0  +: 1/2/0
-: 1/2/0 
1 22 76 254
S. mit 89 SMT_259 dipeptide/tripeptide permease SMT_260 phage transcriptional repressor ->-> 199242 199609 368 32.60% 0 0 0/0  +: 1/2/1
-: 0/2/0 
37 53 96 214
S. mit 103 SMT_301 hypothetical protein SMT_302 K07444 putative N6-adenine-specific DNA methylase ->-> 244463 244947 485 43.90% 0 0 0/0  +: 0/2/0
-: 0/1/0 
1 25 5 482
S. mit   RF00011 275.74 RNaseP_bact_b     244481 244867                    
S. mit   RF00050 45.09 FMN     265097 265217                    
S. mit   RF00230 85.76 T-box     298076 298301                    
S. mit 137 SMT_384 hypothetical protein SMT_385 hypothetical protein ->-> 319921 320034 114 42.10% 0 0 0/0  +: 0/0/0
-: 0/0/0 
39 53 1 114
S. mit   RF00515 53.18 PyrR     376092 376196                    
S. mit   RF00059 63.86 TPP     381488 381576                    
S. mit 161 SMT_442 transcriptional regulator, putative SMT_443 hydroxyethylthiazole kinase ->-> 382461 382694 234 32.10% 0 0 0/0  +: 0/0/0
-: 0/0/0 
23 18 12 234
S. mit   RF00059 62.75 TPP     384204 384292                    
S. mit   RF00059 51.6 TPP     390742 390839                    
S. mit 183 SMT_482 hypothetical protein predicted by Glimmer/Critica SMT_483 transcriptional activator TipA, putative <--> 418627 418966 340 37.60% 0 0 0/0  +: 1/0/0
-: 1/1/0 
39 50 222 331
S. mit 216 SMT_568 hypothetical protein SMT_569 Mn2+ and Fe2+ transporter of the NRAMP family ->-> 486450 487278 829 32% 0 0 0/0  +: 1/2/0
-: 0/4/0 
1 17 13 408
S. mit   RF00023 167.97 tmRNA     486511 486857                    
S. mit 248 SMT_662 A/G-specific adenine glycosylase SMT_663 phosphate acetyltransferase <-<- 573918 574105 188 39.40% 0 0 0/0  +: 0/0/0
-: 0/0/0 
38 52 49 157
S. mit   RF00515 61.2 PyrR     602270 602402                    
S. mit 257 SMT_698 uracil permease SMT_699 K03106 signal recognition particle, subunit SRP54 -><- 609830 610030 201 40.30% 0 0 0/0  +: 0/2/0
-: 0/2/0 
37 55 1 107
S. mit 308 SMT_826 hypothetical protein SMT_827 hypothetical protein <--> 750113 750528 416 34.40% 0 0 0/0  +: 1/0/0
-: 1/0/0 
39 44 1 416
S. mit   RF00080 44.26 yybP-ykoY     782839 782955                    
S. mit 326 SMT_872 hypothetical protein SMT_873 hypothetical protein <-<- 820709 820822 114 42.10% 0 0 0/0  +: 0/0/0
-: 0/0/0 
39 53 1 114
S. mit 329 SMT_890 K02029 polar amino acid transport system permease protein SMT_891 oxidoreductase, putative ->-> 837753 837989 237 32.90% 0 0 0/0  +: 1/1/1
-: 1/0/0 
38 54 1 230
S. mit 339 SMT_915 sucrose operon repressor SMT_916 3-hydroxy-3-methylglutaryl-CoA reductase -><- 868680 868819 140 42.90% 0 0 0/0  +: 0/0/0
-: 0/0/0 
44 131 1 114
S. mit   RF00230 66.46 T-box     915608 915812                    
S. mit 403 SMT_1068 ATP-dependent DNA helicase RecG SMT_1069 acetyl xylan esterase, putative ->-> 1009047 1009469 423 36.20% 0 0 0/0  +: 0/3/0
-: 0/1/0 
38 57 1 423
S. mit 437 SMT_1138 hypothetical protein SMT_1139 rRNA (guanine-N1-)-methyltransferase <--> 1079638 1079894 257 33% 0 0 0/0  +: 0/0/0
-: 2/0/0 
39 69 34 257
S. mit 438 SMT_1139 rRNA (guanine-N1-)-methyltransferase SMT_1140 glycogen phosphorylase -><- 1080744 1081405 662 32% 0 0 0/3  +: 0/1/0
-: 1/1/1 
38 30 316 662
S. mit 457 SMT_1179 aldose 1-epimerase SMT_1180 putative tagatose-6-phosphate aldose/ketose isomerase <-<- 1137945 1138125 181 35.90% 0 0 0/0  +: 0/0/0
-: 0/0/0 
38 43 74 181
S. mit 482 SMT_1232 hypothetical protein predicted by Glimmer/Critica SMT_1233 cadmium resistance protein ->-> 1193270 1194445 1176 31.30% 0 0 1/3 +: 1/3/3
-: 1/3/0 
1 18 896 1176
S. mit   RF00059 49.31 TPP     1253895 1253997                    
S. mit   RF00230 92.36 T-box     1314947 1315164                    
S. mit   RF00558 60.1 L20_leader     1344534 1344665                    
S. mit 535 SMT_1393 hypothetical protein SMT_1394 hypothetical protein predicted by Glimmer/Critica ->-> 1369621 1370215 595 28.20% 0 0 0/0  +: 2/1/2
-: 1/0/0 
39 65 20 464
S. mit 567 SMT_1476 thiol peroxidase SMT_1477 K02077 zinc/manganese transport system substrate-binding protein <-<- 1459310 1459536 227 40.50% 0 0 0/0  +: 0/0/0
-: 1/0/0 
43 139 3 227
S. mit 572 SMT_1487 hypothetical protein SMT_1488 K06147 ATP-binding cassette, subfamily B, bacterial ->-> 1467137 1467296 160 43.10% 0 0 0/0  +: 0/0/0
-: 0/0/0 
33 39 1 160
S. mit 573 SMT_1489 K06147 ATP-binding cassette, subfamily B, bacterial SMT_1490 putative undecaprenyl-phosphate galactose phosphotransferase ->-> 1470775 1470992 218 33.00% 0 0 0/0  +: 0/0/0
-: 0/0/0 
11 60 1 213
S. mit   RF00504 58.19 Glycine     1512848 1512936                    
S. mit 590 SMT_1534 signal peptidase I SMT_1535 exodeoxyribonuclease V alpha chain ->-> 1517973 1518206 234 39.70% 0 0 0/0  +: 0/0/0
-: 0/0/0 
38 40 4 209
S. mit 591 SMT_1535 exodeoxyribonuclease V alpha chain SMT_1536 trigger factor -><- 1520574 1520725 152 39.50% 0 0 0/0  +: 0/0/0
-: 0/0/0 
49 146 38 152
S. mit   RF00230 72.45 T-box     1536930 1537134                    
S. mit 595 SMT_1554 isoleucyl-tRNA synthetase SMT_1555 hypothetical protein -><- 1539973 1540312 340 32.10% 0 0 0/0  +: 0/1/0
-: 1/2/0 
19 49 3 257
S. mit 616 SMT_1601 hypothetical protein predicted by Glimmer/Critica SMT_1602 hypothetical protein ->-> 1592204 1592524 321 35.50% 0 0 0/0  +: 0/1/0
-: 0/0/0 
39 65 181 298
S. mit 617 SMT_1604 K05833 putative ABC transport system ATP-binding protein SMT_1605 hypothetical protein ->-> 1595224 1595545 322 40.10% 0 0 0/0  +: 1/2/1
-: 0/2/0 
15 90 64 322
S. mit 642 SMT_1670 K02030 polar amino acid transport system substrate-binding protein SMT_1671 hypothetical protein -><- 1659356 1659481 126 41.30% 0 0 0/0  +: 0/0/0
-: 0/0/0 
37 55 1 109
S. mit 649 SMT_1689 PTS enzyme, maltose-and glucose-specific, factor II homologue SMT_1690 K06148 ATP-binding cassette, subfamily C, bacterial -><- 1673200 1673459 260 36.50% 0 0 0/0  +: 0/0/0
-: 0/0/0 
43 115 1 228
S. mit   RF00380 57.62 ykoK     1694837 1694988                    
S. mit   RF00380 60.02 ykoK     1696314 1696465                    
S. mit 674 SMT_1766 sucrose-6-phosphate hydrolase SMT_1767 oxidoreductase, short chain dehydrogenase/reductase family -><- 1743162 1743332 171 34.50% 0 0 0/0  +: 0/0/0
-: 0/0/0 
38 48 2 116
S. mit   RF00013 86.26 6S     1751188 1751381                    
S. mit 683 SMT_1781 transcriptional regulator, xre family SMT_1782 hypothetical protein ->-> 1759357 1759659 303 33.30% 0 0 0/0  +: 1/0/0
-: 0/0/0 
35 46 173 287
S. mit 698 SMT_1809 MutT/NUDIX family protein SMT_1810 hypothetical protein ->-> 1785195 1785358 164 37% 0 0 0/0  +: 0/0/0
-: 1/0/0 
38 42 1 161
S. mit 714 SMT_1849 hypothetical protein SMT_1850 K01421 putative membrane protein ->-> 1813865 1815201 1337 47.00% 0 1 3/1 +: 1/0/0
-: 0/3/0 
66 121 251 873
S. mit   RF00029 39.35 Intron_gpII     1851346 1851412                    
S. mit   RF00230 85.76 T-box     1892109 1892334                    
S. mit 776 SMT_1963 ribonuclease BN, putative SMT_1964 sodium/hydrogen exchanger family protein <-<- 1896553 1896685 133 40.60% 0 0 0/0  +: 0/0/0
-: 0/0/0 
39 54 1 133
S. mit 865 SMT_2108 adenylosuccinate lyase SMT_2109 hypothetical protein predicted by Glimmer/Critica <-<- 1998942 1999150 209 36.40% 0 0 0/0  +: 0/0/0
-: 0/0/0 
38 43 102 209
S. mit 878 SMT_2133 hypothetical protein SMT_2134 hypothetical protein <-<- 2014098 2014211 114 42.10% 0 0 0/0  +: 0/0/0
-: 0/0/0 
39 53 1 114
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
S. san 39 SSA_0105 Uridine kinase, putative SSA_0106 30S ribosomal protein S10, putative ->-> 107540 107911 372 34.10% 0 0 0/0  +: 2/0/0
-: 1/0/0 
1 19 198 372
S. san 41 SSA_0120 30S ribosomal protein S8, putative SSA_0121 hypothetical protein ->-> 115157 115322 166 34.90% 0 0 0/0  +: 0/0/0
-: 0/0/0 
1 11 5 114
S. san 67 SSA_0175 Penicillin-binding protein 1B, putative SSA_0176 DNA-directed RNA polymerase I, beta chain (140 kDa subunit), putative ->-> 174545 175120 576 34.40% 0 0 0/98  +: 1/4/0
-: 0/1/0 
1 1 84 195
S. san 74 SSA_0205 NisK (sensor-receptor histidine kinase domain), putative SSA_2393 Transcriptional regulator, XRE family, putative ->-> 203900 204195 296 41.20% 0 0 0/0  +: 1/1/0
-: 0/2/0 
2 1 30 142
S. san 85 SSA_0230 hypothetical protein SSA_0231 Conserved uncharacterized protein <--> 225048 225852 805 31.10% 0 0 0/0  +: 1/5/4
-: 1/6/6 
1 2 188 805
S. san   RF00013 75.64 6S     231004 231197                    
S. san 89 SSA_0240 Acetyltransferase, GNAT family, putative SSA_0241 Ribosomal protein L11 methyltransferase, putative ->-> 234021 234160 140 36.40% 0 0 0/0  +: 0/0/0
-: 0/1/0 
1 4 7 137
S. san   RF00504 57.55 Glycine     353820 353912                    
S. san 159 SSA_0391 Pyruvate oxidase, putative SSA_0392 hypothetical protein ->-> 387862 388015 154 41.60% 0 0 0/0 +: 0/2/0 -: 0/0/0 1 1 1 154
S. san 162 SSA_0394 hypothetical protein SSA_0395 6-phospho-beta-glucosidase, putative ->-> 390992 391200 209 27.30% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 68 206
S. san   RF00230 81.03 T-box     619717 619950                    
S. san   RF00230 68.46 T-box     644747 644964                    
S. san 288 SSA_0753 Foldase protein prsA precursor, putative SSA_0755 hypothetical protein ->-> 736116 736351 236 40.30% 0 0 0/0 +: 1/1/0 -: 0/0/0 1 4 26 194
S. san 317 SSA_0826 hypothetical protein SSA_0827 Conserved uncharacterized cytosolic protein, putative ->-> 805577 806172 596 39.30% 0 0 0/0 +: 2/1/0 -: 0/0/0 1 17 62 430
S. san   RF00023 165.29 tmRNA     805658 806005                    
S. san   RF00080 62.27 yybP-ykoY     856983 857085                    
S. san 329 SSA_0868 hypothetical protein SSA_0869 Peptide chain release factor 2, putative <--> 861626 862053 428 34.10% 0 0 0/0 +: 2/2/2 -: 1/0/0 1 3 305 424
S. san   RF00230 73.66 T-box     919480 919713                    
S. san 344 SSA_0911 hypothetical protein SSA_0912 Phenylalanyl-tRNA synthetase alpha chain, putative ->-> 922959 923239 281 42.70% 0 0 0/0 +: 1/1/0 -: 0/0/0 1 3 77 243
S. san 377 SSA_1008 Galactokinase, putative SSA_1009 Galactose-1-phosphate-uridylyltransferase, putative ->-> 1011175 1011393 219 44.30% 0 0 0/0 +: 0/1/0 -: 0/0/0 2 1 79 191
S. san 395 SSA_1060 hypothetical protein SSA_1061 50S ribosomal protein L21, putative ->-> 1077508 1077731 224 41.50% 0 0 0/0 +: 1/0/0 -: 0/0/0 1 3 75 224
S. san   RF00559 30.43 L21_leader     1077610 1077697                    
S. san   RF00557 74.44 L10_leader     1124710 1124845                    
S. san 442 SSA_1175 Acetoin dehydrogenase complex, E2 component, dihydrolipoamide acetyltransferase, putative SSA_1176 Acetoin dehydrogenase, E1 component, beta subunit, putative <-<- 1204067 1204526 460 32.60% 0 0 0/0 +: 1/3/3 -: 1/7/0 1 1 129 236
S. san 461 SSA_1226 ParC, putative SSA_1227 Aminoglycoside adenylyltransferase, putative <-<- 1251324 1251568 245 30.20% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 7 5 231
S. san   RF00515 47.06 PyrR     1266382 1266485                    
S. san   RF00558 61.09 L20_leader     1505028 1505158                    
S. san 559 SSA_1520 Elongation factor Tu, putative SSA_1521 Phosphoenolpyruvate carboxylase, putative <-<- 1527672 1527926 255 28.20% 0 0 0/0 +: 0/0/0 -: 1/2/0 1 5 7 107
S. san 639 SSA_1734 Cation (Mg/Ni uptake) transport ATPase, putative SSA_1735 UreX, putative <-<- 1720394 1720550 157 38.20% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 2 30 142
S. san 640 SSA_1735 UreX, putative SSA_1736 L-cysteine desulfhydrase, putative <-<- 1721163 1721542 380 38.90% 0 0 0/0 +: 0/1/0 -: 1/3/0 1 2 163 297
S. san   RF00380 59.92 ykoK     1721312 1721468                    
S. san 657 SSA_1773 hypothetical protein SSA_1774 RRNA methylase, putative <-<- 1760183 1760578 396 40.20% 0 0 0/0 +: 0/0/0 -: 1/1/0 1 19 135 277
S. san   RF00050 100.25 FMN     1760315 1760462                    
S. san 658 SSA_1774 RRNA methylase, putative SSA_1775 Potassium uptake protein, Trk family, putative <--> 1761134 1761423 290 30.70% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 2 1 108
S. san 670 SSA_1821 Acetyltransferase (N-acetylase of ribosomal proteins), putative SSA_1822 hypothetical protein <-<- 1810701 1810960 260 43.10% 0 0 0/0 +: 0/0/0 -: 0/1/0 1 4 7 226
S. san 686 SSA_1856 hypothetical protein SSA_1857 Conserved DivIVA-like protein, putative <-<- 1852299 1852814 516 44.40% 0 0 0/0 +: 0/1/0 -: 0/1/0 1 21 102 501
S. san   RF00011 299.46 RNaseP_bact_b     1852423 1852795                    
S. san 692 SSA_1881 hypothetical protein SSA_1882 Subtilisin-like serine proteases, putative <-<- 1876055 1876366 312 39.40% 0 0 0/0 +: 0/1/0 -: 1/2/0 1 3 4 144
S. san   RF00230 67.9 T-box     1913445 1913613                    
S. san 742 SSA_2005 Chaperone protein dnaJ, putative SSA_2006 4-methyl-5(B-hydroxyethyl)-thiazole monophosphate biosynthesis enzyme, putative <-<- 2006475 2006672 198 38.90% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 3 152
S. san 744 SSA_2007 Chaperone protein dnaK/HSP70, putative SSA_2008 Molecular chaperone GrpE (HSP-70 cofactor), putative <-<- 2009363 2009615 253 39.50% 0 0 0/0 +: 1/1/0 -: 0/1/0 1 3 30 234
S. san 745 SSA_2009 Heat shock transcription repressor HrcA, putative SSA_2010 ABC-type multidrug transport system, permease component, putative <-<- 2011223 2011385 163 38.70% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 29 163
S. san 753 SSA_2025 Conserved hypothetical GTPase protein SSA_2388 hypothetical protein -><- 2031073 2031756 684 30.80% 0 0 0/1 +: 2/0/0 -: 0/2/0 1 3 55 209
S. san 759 SSA_2032 Integrase/recombinase, phage integrase family, putative SSA_2033 30S ribosomal protein S9, putative -><- 2038368 2038476 109 32.10% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 3 3 109
S. san 760 SSA_2034 50S ribosomal protein L13, putative SSA_2035 Conserved uncharacterized protein <-<- 2039335 2039514 180 37.80% 0 0 0/0 +: 0/0/0 -: 1/0/0 1 3 18 119
S. san   RF00555 38.76 L13_leader     2039355 2039434                    
S. san 768 SSA_2056 Cinnamoyl ester hydrolase, putative SSA_2058 30S ribosomal protein S15, putative -><- 2059654 2060147 494 39.70% 0 0 0/0 +: 0/0/0 -: 0/6/0 1 5 151 494
S. san 775 SSA_2076 Conserved uncharacterized protein SSA_2077 hypothetical protein <-<- 2080572 2081441 870 39.20% 1 0 0/0 +: 2/5/4 -: 1/4/0 1 2 719 870
S. san   RF00169 54.81 SRP_bact     2183253 2183339                    
S. san 857 SSA_2287 50S ribosomal protein L32, putative SSA_2288 Cadmium resistance transporter, putative ->-> 2290144 2290600 457 33.90% 0 1 0/0 +: 2/4/7 -: 0/3/0 1 21 168 457
S. san 858 SSA_2290 hypothetical protein SSA_2291 hypothetical protein ->-> 2293025 2293343 319 31.30% 0 0 0/0 +: 0/2/0 -: 0/0/0 1 2 85 319
S. san 859 SSA_2291 hypothetical protein SSA_2292 DNA segregation ATPase FtsK/SpoIIIE-like protein, putative ->-> 2293794 2294312 519 37% 0 0 0/9 +: 0/0/0 -: 1/0/0 1 2 3 107
S. san 860 SSA_2293 hypothetical protein SSA_2294 hypothetical protein ->-> 2296016 2296221 206 38.30% 0 0 0/0 +: 0/0/0 -: 2/0/0 1 3 3 206
S. san 862 SSA_2295 Integrase/recombinase, phage integrase family, putative SSA_2296 Transcriptional regulator, XRE family, putative -><- 2298424 2298638 215 27.90% 0 0 0/0 +: 0/2/0 -: 0/3/0 1 12 1 147
S. san   RF00059 43.71 TPP     2355509 2355610                    
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
T. for 7 TF0021 outer membrane phospholipase A TF0022 two-component system sensor histidine kinase ->-> 16980 17434 455 41.80% 0 0 3/0 +: 0/0/0 -: 0/1/0 2 1 240 346
T. for 74 TF0254 hypothetical protein TF0255 conserved hypothetical protein ->-> 285745 285946 202 33.20% 0 0 0/0 +: 1/0/0 -: 0/1/0 1 1 1 181
T. for   RF00059 62.83 TPP     309091 309186                    
T. for 157 TF0527 prolyl endopeptidase TF0528 two-component system sensor histidine kinase ->-> 552011 552496 486 47.30% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 4 73 265
T. for   RF00023 111.74 tmRNA     552085 552483                    
T. for   RF00010 169.52 RNaseP_bact_a     874173 874493                    
T. for 259 TF0862 conserved hypothetical protein fragment TF0864 O-succinylbenzoate--CoA ligase ->-> 914520 914709 190 52.60% 0 0 0/0 +: 0/4/0 -: 0/0/0 2 3 9 186
T. for 260 TF0869 conserved hypothetical protein; possible transcriptional regulator TF0870 conserved hypothetical protein ->-> 922075 922265 191 23.60% 0 0 0/0 +: 0/0/0 -: 1/0/0 1 1 1 188
T. for 267 TF0883 hypothetical protein TF0884 hypothetical protein ->-> 937533 937742 210 51.40% 0 0 0/0 +: 0/5/0 -: 0/0/0 2 3 1 210
T. for 269 TF0891 hypothetical protein TF0892 conserved hypothetical protein ->-> 940856 940975 120 30% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 3 107
T. for 276 TF0904 conserved hypothetical protein TF0906 conserved hypothetical protein ->-> 949217 949814 598 44.10% 0 0 0/0 +: 0/4/0 -: 0/1/0 2 1 1 511
T. for 279 TF0914 conserved hypothetical protein; possible ketosteroid isomerase-related protein TF0915 possible transcriptional regulator ->-> 955874 956006 133 27.80% 0 0 0/0 +: 0/0/0 -: 0/1/0 1 3 6 129
T. for 284 TF0926 conserved hypothetical protein; possible TraB protein TF0927 conserved hypothetical protein; possible ATPase of the AAA superfamily ->-> 962998 963595 598 38.30% 0 0 0/2 +: 0/1/0 -: 0/1/0 1 1 1 276
T. for 286 TF0934 hypothetical protein TF0936 subtilisin-like serine protease ->-> 975268 976007 740 37.70% 0 0 1/0 +: 0/2/0 -: 0/2/0 1 3 98 587
T. for   RF00174 99.63 Cobalamin     986206 986411                    
T. for 405 TF1303 conserved hypothetical protein TF1304 conserved hypothetical protein ->-> 1384759 1385147 389 43.70% 0 0 0/0 +: 0/2/0 -: 0/0/0 1 1 113 237
T. for 480 TF1543 conserved hypothetical protein TF1544 possible transcriptional regulator ->-> 1650835 1651049 215 37.70% 0 0 0/0 +: 0/3/0 -: 0/2/0 1 1 3 203
T. for 502 TF1629 hypothetical protein TF1630 mobilization protein ->-> 1741976 1742161 186 31.20% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 5 117
T. for 503 TF1631 hypothetical protein TF1632 conserved hypothetical protein ->-> 1743594 1743999 406 39.20% 0 0 0/0 +: 1/0/0 -: 0/3/0 1 2 66 332
T. for 504 TF1633 mobilizable transposon, excision protein TF1635 conserved hypothetical protein; possible helicase ->-> 1745409 1745592 184 27.20% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 1 184
T. for 505 TF1634 hypothetical protein TF1637 hypothetical protein ->-> 1747139 1747245 107 42.10% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 1 107
T. for 532 TF1700 conserved hypothetical protein TF1701 transcriptional regulator ->-> 1821522 1821632 111 30.60% 0 0 0/0 +: 0/1/0 -: 0/1/0 1 1 1 111
T. for 689 TF2237 two-component system response regulator involved in CTn TF2238 conserved hypothetical protein ->-> 2413954 2414237 284 46.50% 0 0 0/0 +: 0/1/0 -: 0/0/0 1 1 4 283
T. for 690 TF2250 transfer region-related protein, TraD TF2251 conjugative transposon protein, TraE ->-> 2424234 2424411 178 48.90% 0 0 0/0 +: 0/0/0 -: 0/0/0 1 1 79 178
T. for 692 TF2281 hypothetical protein TF2283 hypothetical protein ->-> 2440056 2440405 350 33.10% 0 0 0/0 +: 0/0/0 -: 0/1/0 1 3 49 196
T. for   RF00174 82.27 Cobalamin     2660443 2660654                    
T. for   RF00174 91.77 Cobalamin     2664883 2665093                    
T. for   RF00521 33.19 SAM_alpha     2729234 2729308                    
                                   
Organism IGS# Up stream Locus Up stream Product Down Stream Locus Down Stream Product Gene Dir type Start End IGS Len GC% IS NT IS AA NR PT-Pair Intra Spp. IGS Inter Spp. IGS Conserved Inter-spp IGS Start Conserved Inter-spp IGS End
T. den   RF00059 50.99 TPP     164049 164165                    
T. den   RF00023 131.38 tmRNA     246213 246562                    
T. den   RF00169 73.85 SRP_bact     250588 250687                    
T. den   RF00174 111.18 Cobalamin     799174 799355                    
T. den   RF00174 114.57 Cobalamin     1025683 1025870                    
T. den   RF00010 184.81 RNaseP_bact_a     1388137 1388443                    
T. den   RF00174 114.26 Cobalamin     2086585 2086789                    
T. den 625 TDE2330 hypothetical protein TDE2332 hypothetical protein -><- 2366759 2367627 869 32.10% 0 0 0/3 +: 1/1/0 -: 1/0/0 1 1 1 165

 

References

  1. Argaman, L., Hershberg, R., Vogel, J., Bejerano, G., Wagner, E. G., Margalit, H., and Altuvia, S. (2001). Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol 11, 941-950.
  2. Axmann, I. M., Kensche, P., Vogel, J., Kohl, S., Herzel, H., and Hess, W. R. (2005). Identification of cyanobacterial non-coding RNAs by comparative genome analysis. Genome Biol 6, R73.
  3. Barrick, J. E., Sudarsan, N., Weinberg, Z., Ruzzo, W. L., and Breaker, R. R. (2005). 6S RNA is a widespread regulator of eubacterial RNA polymerase that resembles an open promoter. RNA 11, 774-784.
  4. Chen, S., Lesnik, E. A., Hall, T. A., Sampath, R., Griffey, R. H., Ecker, D. J., and Blyn, L. B. (2002). A bioinformatics based approach to discover small RNA genes in the Escherichia coli genome. Biosystems 65, 157-177.
  5. Eddy, S. R. (2002). A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure. BMC Bioinformatics 3.
  6. Ermolaeva, M. D., Khalak, H. G., White, O., Smith, H. O., and Salzberg, S. L. (2000). Prediction of transcription terminators in bacterial genomes. J Mol Biol 301, 27-33.
  7. Huttenhofer, A., and Vogel, J. (2006). Experimental approaches to identify non-coding RNAs. Nucleic Acids Res 34, 635-646.
  8. Kulkarni, P. R., Cui, X., Williams, J. W., Stevens, A. M., and Kulkarni, R. V. (2006). Prediction of CsrA-regulating small RNAs in bacteria and their experimental verification in Vibrio fischeri. Nucleic Acids Res 34, 3361-3369.
  9. Lenz, D. H., Mok, K. C., Lilley, B. N., Kulkarni, R. V., Wingreen, N. S., and Bassler, B. L. (2004). The small RNA chaperone Hfq and multiple small RNAs control quorum sensing in Vibrio harveyi and Vibrio cholerae. Cell 118, 69-82.
  10. Li, W-H., and Godzik, A. (2006). Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658-1659.
  11. Livny, J., Brencic, A., Lory, S., and Waldor, M. K. (2006). Identification of 17 Pseudomonas aeruginosa sRNAs and prediction of sRNA-encoding genes in 10 diverse pathogens using the bioinformatic tool sRNAPredict2. Nucleic Acids Res 34, 3484-3493.
  12. Majdalani, N., Vanderpool, C. K., and Gottesman, S. (2005). Bacterial small RNA regulators. Crit Rev Biochem Mol Biol 40, 93-113.
  13. Rivas, E., Klein, R. J., Jones, T. A., and Eddy, S. R. (2001). Computational identification of noncoding RNAs in E. coli by comparative genomics. Curr Biol 11, 1369-1373.
  14. Studholme, D. J., and Dixon, R. (2003). Domain architectures of sigma54-dependent transcriptional activators. J Bacteriol 185, 1757-1767.
  15. van Helden, J. (2003). Regulatory sequence analysis tools. Nucleic Acids Res 31, 3593-3596.
  16. Vogel, J., and Sharma, C. M. (2005). How to find small non-coding RNAs in bacteria. Biol Chem 386, 1219-1238.
  17. Wang, C., Ding, C., Meraz, R. F., and Holbrook, S. R. (2006). PSoL: a positive sample only learning algorithm for finding non-coding RNA genes. Bioinformatics 22, 2590-2596.
  18. Yachie, N., Numata, K., Saito, R., Kanai, A., and Tomita, M. (2006). Prediction of non-coding and antisense RNA genes in Escherichia coli with Gapped Markov Model. Gene 372, 171-181.
  19. Zhang, A., Wassarman, K. M., Rosenow, C., Tjaden, B. C., Storz, G., and Gottesman, S. (2003). Global analysis of small RNA and mRNA targets of Hfq. Mol Microbiol 50, 1111-1124.
  20. Zhang, Y., Sun, S., Wu, T., Wang, J., Liu, C., Chen, L., Zhu, X., Zhao, Y., Zhang, Z., Shi, B., et al. (2006). Identifying Hfq-binding small RNA targets in Escherichia coli. Biochem Biophys Res Commun 343, 950-955.
  21. http://www-is.biotoul.fr/