bitscore colors: <40, 40-50, 50-80, 80-200, >200
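
The legend above lists five bit-score ranges. As a minimal sketch, the function below maps a bit score to the label of the range it falls in. The function name `bitscore_bin` is hypothetical, and the boundary handling is an assumption because the legend does not say which bin a score of exactly 40, 50, 80 or 200 belongs to. Here each lower bound is treated as inclusive and a score of exactly 200 goes into the top bin.

```python
def bitscore_bin(bits: float) -> str:
    """Return the legend label of the range that contains `bits`.

    Assumption: lower bounds are inclusive, upper bounds exclusive,
    and a score of exactly 200 is placed in the top bin.
    """
    if bits < 40:
        return "<40"
    if bits < 50:
        return "40-50"
    if bits < 80:
        return "50-80"
    if bits < 200:
        return "80-200"
    return ">200"


if __name__ == "__main__":
    # Bit scores of the first, sixth and tenth hits in this report.
    for score in (402, 256, 166):
        print(score, bitscore_bin(score))
```

Under these assumptions, the first six hits in this report fall in the top range and the remaining four in the 80-200 range.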

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-3_CDS_annotation_glimmer3.pl_2_6
Length=630
                                                                     Score  E
Sequences producing significant alignments:                         (Bits)  Value
gi|575094354|emb|CDL65742.1| unnamed protein product                   402  6e-128
gi|490418709|ref|WP_004291032.1| hypothetical protein                  392  9e-125
gi|496050829|ref|WP_008775336.1| hypothetical protein                  387  1e-122
gi|547226430|ref|WP_021963493.1| putative uncharacterized protein      367  4e-115
gi|494822885|ref|WP_007558293.1| hypothetical protein                  350  6e-108
gi|575094321|emb|CDL65708.1| unnamed protein product                   256  3e-72
gi|494308783|ref|WP_007173938.1| hypothetical protein                  181  6e-46
gi|647452987|ref|WP_025792807.1| hypothetical protein                  179  3e-45
gi|517172762|ref|WP_018361580.1| hypothetical protein                  171  1e-42
gi|496521299|ref|WP_009229582.1| capsid protein                        166  4e-41
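
Each row of the summary table above carries a sequence identifier, a free-text description, a bit score and an E-value. The following is a minimal sketch of how such a row could be split into typed fields. The names `Hit`, `HIT_RE` and `parse_hit_line` are hypothetical, and the regular expression assumes that the last two whitespace-separated tokens are always the bit score and the E-value, which holds for the rows shown here but is not guaranteed for every BLAST report.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    seq_id: str
    description: str
    bits: float
    evalue: float


# identifier, description (lazy), bit score, E-value starting with a digit
HIT_RE = re.compile(r"^(\S+)\s+(.*?)\s+(\d+(?:\.\d+)?)\s+(\d\S*)\s*$")


def parse_hit_line(line: str) -> Optional[Hit]:
    """Parse one summary row; return None for headers and other lines."""
    match = HIT_RE.match(line)
    if match is None:
        return None
    seq_id, description, bits, evalue = match.groups()
    return Hit(seq_id, description, float(bits), float(evalue))


if __name__ == "__main__":
    row = "gi|496521299|ref|WP_009229582.1| capsid protein 166 4e-41"
    print(parse_hit_line(row))
```

The pattern tolerates any amount of whitespace between columns, so it does not depend on how the table is aligned.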
>gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium]
Length=615
Score = 402 bits (1033), Expect = 6e-128, Method: Compositional matrix adjust.
Identities = 242/658 (37%), Positives = 367/658 (56%), Gaps = 79/658 (12%)
Query 7 LKQLQNHPHRSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYDIDIQYFTRTRPVQTAAY 66
+ ++N P R+GFD+ K F+AK GELLPV + +PG +++I+++ FTRT+P+ T+A+
Sbjct 3 MADIKNRPSRNGFDLSFKKNFTAKAGELLPVMTKVVLPGDSFNINLRSFTRTQPLNTSAF 62
Query 67 TRIREYFDFYAVPIDLIWKSFDASVIQMGETAPVQAKDILTALT-VSGDLPYCSLSDLGL 125
R+REY+DFY VP + +W FD+ + QM + L T +SG +PY +
Sbjct 63 ARMREYYDFYFVPFEQMWNKFDSCITQMNANVQHASGPTLDDNTPLSGRMPYFT------ 116
Query 126 SCFFASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPNNMPALNIGNS 185
S + + QA A N FG+ R + KL+ L YG+ N +S
Sbjct 117 -------SEQIADYLNDQATAARKNPFGFNRSTLTCKLLQYLGYGD--------YNSFDS 161
Query 186 NYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQWEKADPTSYNF 245
W+ + L +N+ ++ FPL YQKIY DF+R++QWEK +P+++N
Sbjct 162 ETNTWSAKPLL-------------YNLELSPFPLLAYQKIYSDFYRYTQWEKTNPSTFNL 208
Query 246 DWYQGSGNLFGGTID-TSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNSQFGDIAVID 304
D+ +G+ +L +D T LP+ + N F +RYCN+ KD+F GVLP +Q+G +V+
Sbjct 209 DYIKGTSDL---QMDLTGLPSDDN-----NFFDIRYCNYQKDMFHGVLPVAQYGSASVVP 260
Query 305 IEGGLNIPASRIS--LSSNNRPTIGIKVGAQVSSPNNCSITNSSGNLSTGDILSVGIPA- 361
I G LN+ ++ S + + P G + V+ N + N S +S G L+VG A
Sbjct 261 INGQLNVISNGDSGPIFKTSTPDPGTPGTSYVTVGGNIGVDNRSFGVS-GSTLNVGKSAD 319
Query 362 -ASYKLQSSFN----------------------VLALRQAESLQKYREITQSVDTNYRDQ 398
+ Y S+ + +LALRQAE LQK++E++ S + +Y+ Q
Sbjct 320 PSGYGFPSNASTRSLLWENPNLIIENNQGFYVPILALRQAEFLQKWKEVSVSGEEDYKSQ 379
Query 399 IKAHFGVNVPASDSHMAQYIGGIARNLDISEVVNNNLQGDGEAVIYGKGVGTGTGSMRYT 458
I+ H+G+ V SH A+Y+GG A +LDI+EV+NNN+ GD A I GKG TG GS+R+
Sbjct 380 IEKHWGIKVSDFLSHQARYLGGCATSLDINEVINNNITGDNAADIAGKGTFTGNGSIRFE 439
Query 459 TGSKYCILMCIYHCMPVLDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNL 518
+ +Y I+MCIYH +P++DY SG PIPE D IGME VPLV+ +N
Sbjct 440 SKGEYGIIMCIYHVLPIVDYVGSGVDHSCTLVDATSFPIPELDQIGMESVPLVRAMNP-- 497
Query 519 YKTNKSVKIDSILGYNPRYYAWKSNIDRIHGAFTTTLQDWVSPVDDSFLYS--TFGTPSS 576
K + + D+ LGY PRY WK+++DR G F +L+ W PV D L S + PS+
Sbjct 498 VKESDTPSADTFLGYAPRYIDWKTSVDRSVGDFADSLRTWCLPVGDKELTSANSLNFPSN 557
Query 577 GSF----VTWPFFKVNPNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVPY 630
+ + FFKVNP+ +D +FAV +DST ++D+FL +S+ KVVR L +G+PY
Sbjct 558 PNVEPDSIAAGFFKVNPSIVDPLFAVVADSTVKTDEFLCSSFFDVKVVRNLDVNGLPY 615
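
Every alignment in this report opens with a "Score = ..." line and an "Identities = ..." line, such as the pair shown for CDL65742.1 above. As a minimal sketch, the function below extracts the numbers from those two lines and recomputes the fractions from the printed counts. The names `SCORE_RE`, `FRACTION_RE` and `parse_alignment_stats` are hypothetical, and the patterns assume the exact wording used in this report.

```python
import re

SCORE_RE = re.compile(r"Score = (\S+) bits \((\d+)\),\s+Expect = ([^,\s]+)")
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def parse_alignment_stats(score_line: str, counts_line: str) -> dict:
    """Return bit score, raw score, E-value and the three count fractions."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a BLAST 'Score = ...' line")
    stats = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(match.group(3)),
    }
    # Recompute each fraction from its numerator and denominator
    # instead of trusting the rounded percentage in parentheses.
    for name, num, den in FRACTION_RE.findall(counts_line):
        stats[name.lower()] = int(num) / int(den)
    return stats


if __name__ == "__main__":
    stats = parse_alignment_stats(
        "Score = 402 bits (1033), Expect = 6e-128, Method: Compositional matrix adjust.",
        "Identities = 242/658 (37%), Positives = 367/658 (56%), Gaps = 79/658 (12%)",
    )
    for key, value in stats.items():
        print(key, value)
```

Recomputing the fractions keeps more precision than the whole-number percentages printed in the report, which can matter when hits are filtered on an identity threshold.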
>gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii]
gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM
20697]
Length=578
Score = 392 bits (1008), Expect = 9e-125, Method: Compositional matrix adjust.
Identities = 242/648 (37%), Positives = 352/648 (54%), Gaps = 88/648 (14%)
Query 1 MAHFTGLKQLQNHPHRSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYDIDIQYFTRTRP 60
MA+ LK ++N P R+GFD+ K F+AK GELLPV +PG T+ I+++ FTRT+P
Sbjct 1 MANIMSLKSIRNKPSRNGFDLSFKKNFTAKAGELLPVMVKEVLPGDTFKINLKAFTRTQP 60
Query 61 VQTAAYTRIREYFDFYAVPIDLIWKSFDASVIQMGETAPVQAKDI--LTALTVSGDLPYC 118
V TAA+ RIREY+DF+ VP DL+W + + QM + P A I +SG++PY
Sbjct 61 VNTAAFARIREYYDFFFVPYDLLWNKANTVLTQMYDN-PQHAVSIDPTRNFVLSGEMPYM 119
Query 119 SLSDLGLSCFFASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGN---IIPN 175
+ + S + ++ KS N FGY R + KL+ L YGN + +
Sbjct 120 TSEAIASYINALSTASALADYKS--------NYFGYNRSKSSVKLLEYLGYGNYESFLTD 171
Query 176 NMPALNIGNSNYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQW 235
+ WN APL + N+N N+F L YQKIY DF+R SQW
Sbjct 172 D-------------WN-TAPLMA------------NLNHNIFGLLAYQKIYSDFYRDSQW 205
Query 236 EKADPTSYNFDWYQGSGNLFGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNS 295
E+ P+++N D+ GS +++ S+++++ N F LRYCNW KDLF GVLP+
Sbjct 206 ERVSPSTFNVDYLDGS------SMNLDNAYSTEFYQNYNFFDLRYCNWQKDLFHGVLPHQ 259
Query 296 QFGDIAVIDIEGGLNIPASRISLSSNNRPTIGIKVGAQVSSPNNCSITNSSGNLSTGDIL 355
Q+G+ AV I P L+ +N T+G +SP S T ++ NL D
Sbjct 260 QYGETAVASI-----TPDVTGKLTLSNFSTVG-------TSPTTASGT-ATKNLPAFD-- 304
Query 356 SVGIPAASYKLQSSFNVLALRQAESLQKYREITQSVDTNYRDQIKAHFGVNVPASDSHMA 415
+VG ++L LRQAE LQK++EITQS + +Y+DQ++ H+GV+V S +
Sbjct 305 TVG----------DLSILVLRQAEFLQKWKEITQSGNKDYKDQLEKHWGVSVGDGFSELC 354
Query 416 QYIGGIARNLDISEVVNNNLQGDGEAVIYGKGVGTGTGSMRYTTGSKYCILMCIYHCMPV 475
Y+GG++ ++DI+EV+N N+ G A I GKGVG G + + + +Y ++MCIYHC+P+
Sbjct 355 TYLGGVSSSIDINEVINTNITGSAAADIAGKGVGVANGEINFNSNGRYGLIMCIYHCLPL 414
Query 476 LDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNKSVKIDSILGYNP 535
LDY P L + + IPEFD +GM+ +PLVQL+N N S +LGY P
Sbjct 415 LDYTTDMLDPAFLKVNSTDYAIPEFDRVGMQSMPLVQLMNPLRSFANAS---GLVLGYVP 471
Query 536 RYYAWKSNIDRIHGAFTTTLQDWV-------------SPVDDSFLYSTFGTPSSGSFVTW 582
RY +K+++D+ G F TL WV P D + + PS + +
Sbjct 472 RYIDYKTSVDQSVGGFKRTLNSWVISYGNISVLKQVTLPNDAPPIEPSEPVPSVAP-MNF 530
Query 583 PFFKVNPNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVPY 630
FFKVNP+ LD IFAV++ +DQFL +S+ K VR L DG+PY
Sbjct 531 TFFKVNPDCLDPIFAVQAGDDTNTDQFLCSSFFDIKAVRNLDTDGLPY 578
>gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4]
gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4]
Length=580
Score = 387 bits (994), Expect = 1e-122, Method: Compositional matrix adjust.
Identities = 240/637 (38%), Positives = 347/637 (54%), Gaps = 64/637 (10%)
Query 1 MAHFTGLKQLQNHPHRSGFDIGAKNVFSAKCGELLPVY-WDLGIPGCTYDIDIQYFTRTR 59
MA+ LK L+N R+GFD+ +K F+AK GELLPV W++ +PG + ID++ FTRT+
Sbjct 1 MANIMSLKSLRNKTSRNGFDLSSKRNFTAKPGELLPVKCWEV-LPGDKWSIDLKSFTRTQ 59
Query 60 PVQTAAYTRIREYFDFYAVPIDLIWKSFDASVIQMGETAPVQAKDILTA-LTVSGDLPYC 118
P+ TAA+ R+REY+DFY VP +L+W + + QM + I +A ++G +P
Sbjct 60 PLNTAAFARMREYYDFYFVPYNLLWNKANTVLTQMYDNPQHATSYIPSANQALAGVMPNV 119
Query 119 SLSDLGLSCFFASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPNNMP 178
+ G++ + + V + S++ N FGY R KL+ L YGN
Sbjct 120 TCK--GIADYLNLVAPDVTTTNSYEKN-----YFGYSRSLGTAKLLEYLGYGNF------ 166
Query 179 ALNIGNSNYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQWEKA 238
S W + +PL+S N+ +N++ + YQKIY D R SQWEK
Sbjct 167 -YTYATSKNNTWTK-SPLSS------------NLQLNIYGVLAYQKIYADHIRDSQWEKV 212
Query 239 DPTSYNFDWYQGSGNLFGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNSQFG 298
P+ +N D+ G+ + TID S+ + N+F LRYCNW KDLF GVLP Q+G
Sbjct 213 SPSCFNVDYLSGTVDS-AMTID-SMITGQGFAPFYNMFDLRYCNWQKDLFHGVLPRQQYG 270
Query 299 DIAVIDIEGGLNIPASRISLSSNNRPTIGIKVGAQVSSPNNCSITNSSGNLSTGDILSVG 358
D A +++ + A + + + P G S+ N N SG
Sbjct 271 DTAAVNVNLSNVLSAQYMVQTPDGDPVGGSPFS---STGVNLQTVNGSG----------- 316
Query 359 IPAASYKLQSSFNVLALRQAESLQKYREITQSVDTNYRDQIKAHFGVNVPASDSHMAQYI 418
+F VLALRQAE LQK++EITQS + +Y+DQI+ H+ V+V + S M+ Y+
Sbjct 317 ----------TFTVLALRQAEFLQKWKEITQSGNKDYKDQIEKHWNVSVGEAYSEMSLYL 366
Query 419 GGIARNLDISEVVNNNLQGDGEAVIYGKGVGTGTGSMRYTTGSKYCILMCIYHCMPVLDY 478
GG +LDI+EVVNNN+ G A I GKGV G G + + G +Y ++MCIYH +P+LDY
Sbjct 367 GGTTASLDINEVVNNNITGSNAADIAGKGVVVGNGRISFDAGERYGLIMCIYHSLPLLDY 426
Query 479 DISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNKSVKIDSILGYNPRYY 538
+P + + IPEFD +GME VPLV L+N N SILGY PRY
Sbjct 427 TTDLVNPAFTKINSTDFAIPEFDRVGMESVPLVSLMNPLQSSYNVG---SSILGYAPRYI 483
Query 539 AWKSNIDRIHGAFTTTLQDWVSPVDDSFL-----YSTFGTPSSGSFVTWPFFKVNPNTLD 593
++K+++D GAF TTL+ WV D+ + Y S G+ V + FKVNPN +D
Sbjct 484 SYKTDVDSSVGAFKTTLKSWVMSYDNQSVINQLNYQDDPNNSPGTLVNYTNFKVNPNCVD 543
Query 594 NIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVPY 630
+FAV + ++ ++DQFL +S+ KVVR L DG+PY
Sbjct 544 PLFAVAASNSIDTDQFLCSSFFDVKVVRNLDTDGLPY 580
>gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
Length=573
Score = 367 bits (943), Expect = 4e-115, Method: Compositional matrix adjust.
Identities = 244/644 (38%), Positives = 352/644 (55%), Gaps = 85/644 (13%)
Query 1 MAHFTGLKQLQNHPHRSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYDIDIQYFTRTRP 60
M+ L L+N R+GFD+ KN F+AK GELLP+ PG ++I Q FTRT+P
Sbjct 1 MSSVMSLTALKNSVKRNGFDLSFKNAFTAKVGELLPIMCKEVYPGDKFNIRGQAFTRTQP 60
Query 61 VQTAAYTRIREYFDFYAVPIDLIWKSFDASVIQMGETAPVQAKDILTALTVSGDLPYCSL 120
V +AAY+R+REY+DFY VP L+W M + P A D+++++ +S P+ +
Sbjct 61 VNSAAYSRLREYYDFYFVPYRLLWNMAPTFFTNMPD--PHHAADLVSSVNLSQRHPWFTF 118
Query 121 SDLGLSCFFASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPNNMPAL 180
D+ + S+S + + +Q N FG+ R +++ KL++ LNYG
Sbjct 119 FDI-MEYLGNLNSLS-GAYEKYQKN-----FFGFSRVELSVKLLNYLNYG---------- 161
Query 181 NIGNSNYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQWEKADP 240
G + + + P SD +++ + FPL YQKI +D+FR QW+ A P
Sbjct 162 -FGKD---YESVKVPSDSDDIVL-----------SPFPLLAYQKICEDYFRDDQWQSAAP 206
Query 241 TSYNFDWYQGSGNLFGGTIDTSLPASS---DYWKRDNLFSLRYCNWNKDLFMGVLPNSQF 297
YN D+ L+G + +P SS D +K +F L YCN+ KD F G+LP +Q+
Sbjct 207 YRYNLDY------LYGKSSGFHIPMSSFTNDAFKNPTMFDLNYCNFQKDYFTGMLPRAQY 260
Query 298 GDIAVID-IEGGLNIPASRISLSSNNRPTIG---IKVGAQVSSPNNCSITNSSGNLSTGD 353
GD++V I G L+I S SL+ + P G I+ G V + N +N++ LS
Sbjct 261 GDVSVASPIFGDLDIGDSS-SLTFASAPQQGANTIQSGVLVVNNN----SNTTAGLS--- 312
Query 354 ILSVGIPAASYKLQSSFNVLALRQAESLQKYREITQSVDTNYRDQIKAHFGVNVPASDSH 413
VLALRQAE LQK+REI QS +Y+ Q++ HF V+ A+ S
Sbjct 313 ------------------VLALRQAECLQKWREIAQSGKMDYQTQMQKHFNVSPSATLSG 354
Query 414 MAQYIGGIARNLDISEVVNNNLQGDGEAVIYGKGVGTGTGSMRYTTGSKYCILMCIYHCM 473
+Y+GG NLDISEVVN NL GD +A I GKG GT G+ S++ I+MCIYHC+
Sbjct 355 HCKYLGGWTSNLDISEVVNTNLTGDNQADIQGKGTGTLNGNKVDFESSEHGIIMCIYHCL 414
Query 474 PVLDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNKSVKID--SI- 530
P+LD+ I+ Q T+ + IPEFD++GM+ QL S + + + D SI
Sbjct 415 PLLDWSINRIARQNFKTTFTDYAIPEFDSVGMQ-----QLYPSEMIFGLEDLPSDPSSIN 469
Query 531 LGYNPRYYAWKSNIDRIHGAFTTTLQDWVSPVDDSFLYSTFGTPSSGSF----VTWPFFK 586
+GY PRY K++ID IHG+F TL WVSP+ DS++ + F +T+ FFK
Sbjct 470 MGYVPRYADLKTSIDEIHGSFIDTLVSWVSPLTDSYISAYRQACKDAGFSDITMTYNFFK 529
Query 587 VNPNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVPY 630
VNP+ +DNIF VK+DST +DQ L+NSY K VR +G+PY
Sbjct 530 VNPHIVDNIFGVKADSTINTDQLLINSYFDIKAVRNFDYNGLPY 573
>gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius]
gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM
17135]
Length=613
Score = 350 bits (897), Expect = 6e-108, Method: Compositional matrix adjust.
Identities = 223/652 (34%), Positives = 343/652 (53%), Gaps = 68/652 (10%)
Query 1 MAHFTGLKQLQNHPHRSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYDIDIQYFTRTRP 60
MA+ +K ++N P R+G+D+ K F+AK G L+PV+W +P + ++ F RT+P
Sbjct 8 MANIMSMKSVRNKPTRAGYDLTQKINFTAKAGSLIPVWWTPVLPFDDLNATVKSFVRTQP 67
Query 61 VQTAAYTRIREYFDFYAVPIDLIWKSFDASVIQM-----GETAPVQAKDILTALTVSGDL 115
+ TAA+ R+R YFDFY VP +W F ++ QM + PV A ++ +S +L
Sbjct 68 LNTAAFARMRGYFDFYFVPFRQMWNKFPTAITQMRTNLLHASGPVLADNV----PLSDEL 123
Query 116 PYCSLSDLGLSCFFASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPN 175
PY + + A +S+ K N FGY R + ++ L YG+ P
Sbjct 124 PYFTAEQV------ADYIVSLADSK---------NQFGYYRAWLVCIILEYLGYGDFYPY 168
Query 176 NMPALNIGNSNYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQW 235
+ A G W R M N N+ + FPL YQKIY DF R++QW
Sbjct 169 IVEA--AGGEGATWATRP-------------MLN-NLKFSPFPLFAYQKIYADFNRYTQW 212
Query 236 EKADPTSYNFDWYQGSGNLFGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNS 295
E+++P+++N D+ GS + +D ++ D + NLF +RY NW +DL G +P +
Sbjct 213 ERSNPSTFNIDYISGSADSL--QLDFTVEGFKDSF---NLFDMRYSNWQRDLLHGTIPQA 267
Query 296 QFGDIAVIDIEGGLNIPASRISLSSNNRPTIGIKVGAQVSSPNNCSITNSSGNL----ST 351
Q+G+ + + + G + + + P N +I SSG L S
Sbjct 268 QYGEASAVPVSGSMQV------VEGPTPPAFTTGQDGVAFLNGNVTIQGSSGYLQAQTSV 321
Query 352 GD--ILSVGIPAASYKLQ--SSFNV--LALRQAESLQKYREITQSVDTNYRDQIKAHFGV 405
G+ IL + ++ SSF V LALR+AE+ QK++E+ + + +Y QI+AH+G
Sbjct 322 GESRILRFNNTNSGLIVEGDSSFGVSILALRRAEAAQKWKEVALASEEDYPSQIEAHWGQ 381
Query 406 NVPASDSHMAQYIGGIARNLDISEVVNNNLQGDGEAVIYGKGVGTGTGSMRYTTGSKYCI 465
+V + S M Q++G I +L I+EVVNNN+ G+ A I GKG +G GS+ + G +Y I
Sbjct 382 SVNKAYSDMCQWLGSINIDLSINEVVNNNITGENAADIAGKGTMSGNGSINFNVGGQYGI 441
Query 466 LMCIYHCMPVLDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNK-S 524
+MC++H +P LDY S H T+V + PIPEFD IGME VP+++ +N K
Sbjct 442 VMCVFHVLPQLDYITSAPHFGTTLTNVLDFPIPEFDKIGMEQVPVIRGLNPVKPKDGDFK 501
Query 525 VKIDSILGYNPRYYAWKSNIDRIHGAFTTTLQDWVSPVDDSFLYS--TFGTPS----SGS 578
V + GY P+YY WK+ +D+ G F +L+ W+ P DD L + + P
Sbjct 502 VSPNLYFGYAPQYYNWKTTLDKSMGEFRRSLKTWIIPFDDEALLAADSVDFPDNPNVEAD 561
Query 579 FVTWPFFKVNPNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVPY 630
V FFKV+P+ LDN+FAVK++S +DQFL ++ VVR L +G+PY
Sbjct 562 SVKAGFFKVSPSVLDNLFAVKANSDLNTDQFLCSTLFDVNVVRSLDPNGLPY 613
>gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium]
Length=642
Score = 256 bits (654), Expect = 3e-72, Method: Compositional matrix adjust.
Identities = 192/663 (29%), Positives = 310/663 (47%), Gaps = 61/663 (9%)
Query 2 AHFTGLKQLQNHPHRSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYDIDIQYFTRTRPV 61
++ GL L+N P R+ FD+ +N+F+AK GELLP + PG + + YFTRT P+
Sbjct 5 SNIMGLHGLKNKPSRNSFDLSHRNMFTAKVGELLPCFVQELNPGDSVKVSSSYFTRTAPL 64
Query 62 QTAAYTRIREYFDFYAVPIDLIWKSFDASVIQMGETA-----PVQAKDILTALTVSGDLP 116
Q+ A+TR+RE ++ VP +WK FD+ V+ M + A A ++ V+ +P
Sbjct 65 QSNAFTRLRENVQYFFVPYSALWKYFDSQVLNMTKNANGGDISRIASSLVGNQKVTTQMP 124
Query 117 YCSLSDLG--LSCFFASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIP 174
+ L L F ++ + N G R + KL+ +L YGN P
Sbjct 125 CVNYKTLHAYLLKFINRSTVGSDGSVGPEFNR------GCYRHAESAKLLQLLGYGNF-P 177
Query 175 NNMPALNIGNSNYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQ 234
+ N + N+ D YN + +++F L Y KI D + + Q
Sbjct 178 EQFANFKVNNDKH---NQSGQNFKDV------TYNNSPYLSIFRLLAYHKICNDHYLYRQ 228
Query 235 WEKADPTSYNFDWYQGSGNLFGGTIDT--SLPASSDYWKRDNLFSLRYCNWNKDLFMGVL 292
W+ + + N D+ + + D S+P S ++ NL +R+ N D F GVL
Sbjct 229 WQPYNASLCNVDYLTPNSSSLLSIDDALLSIPDDSIKAEKLNLLDMRFSNLPLDYFTGVL 288
Query 293 PNSQFGDIAVIDIEGG-------LNIPASRISLSSNNRPTIG---IKVGAQVSSPNNCSI 342
P SQFG +V+++ G LN S+ S R T G ++ S+ N +
Sbjct 289 PTSQFGSESVVNLNLGNASGSAVLNGTTSKDS--GRWRTTTGEWEMEQRVASSANGNLKL 346
Query 343 TNSSGNLSTGDILSVGIPAASYKLQSSFNVLALRQAESLQKYREITQSVDTNYRDQIKAH 402
NS+G + D G A + L + +++ALR A + QKY+EI + D +++ Q++AH
Sbjct 347 DNSNGTFISHDHTFSGNVAINTSLSGNLSIIALRNALAAQKYKEIQLANDVDFQSQVEAH 406
Query 403 FGVNVPASDSHMAQYIGGIARNLDISEVVNNNLQGDGEAVIYGKGVGTGTGSMRYTTGSK 462
FG+ P + + +IGG + ++I+E +N NL GD +A G G+ S+++T +
Sbjct 407 FGIK-PDEKNENSLFIGGSSSMININEQINQNLSGDNKATYGAAPQGNGSASIKFTAKT- 464
Query 463 YCILMCIYHCMPVLDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTN 522
Y +++ IY C PVLD+ G L T + IPE D+IGM+ ++ Y
Sbjct 465 YGVVIGIYRCTPVLDFAHLGIDRTLFKTDASDFVIPEMDSIGMQQTFRCEVAAPAPYNDE 524
Query 523 ---------KSVKIDSILGYNPRYYAWKSNIDRIHGAFTTTLQDWVSPVDDSFLYSTFGT 573
S + GY PRY +K++ DR +GAF +L+ WV+ ++ F
Sbjct 525 FKAFRVGDGSSPDMSETYGYAPRYSEFKTSYDRYNGAFCHSLKSWVTGIN-------FDA 577
Query 574 PSSGSFVTWP------FFKVNPNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDG 627
+ + TW F P+ + N+F V S + + DQ V C R LSR G
Sbjct 578 IQNNVWNTWAGINAPNMFACRPDIVKNLFLVSSTNNSDDDQLYVGMVNMCYATRNLSRYG 637
Query 628 VPY 630
+PY
Sbjct 638 LPY 640
>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=553
Score = 181 bits (459), Expect = 6e-46, Method: Compositional matrix adjust.
Identities = 172/639 (27%), Positives = 277/639 (43%), Gaps = 111/639 (17%)
Query 7 LKQLQNHPHRSGFDIGAKNVFSAKCGELLPVY-WDLGIPGCTYDIDIQYFTRTRPVQTAA 65
+K + + +R+ FD+ +++F+A G LLPV DL IP +I+ Q F RT P+ TAA
Sbjct 8 IKATRPNRNRNAFDLSQRHLFTAHAGMLLPVLNLDL-IPHDHVEINAQDFMRTLPMNTAA 66
Query 66 YTRIREYFDFYAVPIDLIWKSFDASVIQMGETAPVQAKDILTALTVSGDLPYCSLSDL-- 123
+ +R ++F+ VP +W FD + M + K I T +PY ++ +
Sbjct 67 FASMRGVYEFFFVPYHQLWAQFDQFITGMNDFHSSANKSIQGG-TSPLQVPYFNVDSVFN 125
Query 124 GLSCFFASGSMSVPSLKSWQANNAYA--NIFGYIRGDVNYKLIHMLNYGNIIPNNMPALN 181
L+ SGS S L+ A+ ++ GY R ++G P+N+ L
Sbjct 126 SLNTGKESGSGSTDDLQYKFKYGAFRLLDLLGYGR--------KFDSFGTAYPDNVSGLK 177
Query 182 IGNSNYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQWEKADPT 241
N + N ++F + Y KIYQD++R S +E D
Sbjct 178 --------------------------NNLDYNCSVFRILAYNKIYQDYYRNSNYENFDTD 211
Query 242 SYNFDWYQGSGNLFGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNSQFGDIA 301
S+NFD ++G G +D + A +LF LRY N D F + + F
Sbjct 212 SFNFDKFKG------GLVDAKVVA--------DLFKLRYRNAQTDYFTNLRQSQLFSFTT 257
Query 302 VIDIEGGLNIPASRISLSSNNRPTIGIKVGAQVSSPNNCSITNSSGNLSTGDILSVGIPA 361
+ +NI A R + S+ + G S S GD
Sbjct 258 AFEDVDNINI-APRDYVKSDGSNFTRVNFGVDTDS-------------SEGD-------- 295
Query 362 ASYKLQSSFNVLALRQAESLQKYREITQSVDTNYRDQIKAHFGVNVPASDSHMAQYIGGI 421
F+V +LR A ++ K +T ++DQ++AH+GV +P S Y+GG
Sbjct 296 --------FSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGF 347
Query 422 ARNLDISEVVNNN----LQGDGEAVIYGKGVGTGTGSMR---YTTGSKYCILMCIYHCMP 474
++ +S+V + + EA G+ G GTGS R ++ +LMCIY +P
Sbjct 348 DSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGRGRIVFDAKEHGVLMCIYSLVP 407
Query 475 VLDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNKSVKIDSILGYN 534
+ YD + P + + PEF+N+GM+ PL S+ T+ + +LGY
Sbjct 408 QIQYDCTRLDPMVDKLDRFDYFTPEFENLGMQ--PLNSSYISSFCTTDPK---NPVLGYQ 462
Query 535 PRYYAWKSNIDRIHGAFTTTLQDWVSPVDDSFLYSTFGTPSSGSFVTWPFFKVNPNTLDN 594
PRY +K+ +D HG F + D +S S+ S F ++ + FK++P L++
Sbjct 463 PRYSEYKTALDVNHGQFAQS--DALS----SWSVSRFRRWTTFPQLEIADFKIDPGCLNS 516
Query 595 IFAVKSDSTWESDQFLVNSYVGCKV----VRPLSRDGVP 629
IF V + T +D Y GC V +S DG+P
Sbjct 517 IFPVDYNGTEANDCV----YGGCNFNIVKVSDMSVDGMP 551
>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584
Score = 179 bits (455), Expect = 3e-45, Method: Compositional matrix adjust.
Identities = 183/650 (28%), Positives = 285/650 (44%), Gaps = 115/650 (18%)
Query 16 RSGFDIGAKNVFSAKCGELLPV-YWDLGIPGCTYDIDIQYFTRTRPVQTAAYTRIREYFD 74
R+GFD+ ++ +FSAK G+LLP+ W++ P + +Q RT + TA+Y R++EY+
Sbjct 10 RNGFDLSSRRIFSAKAGQLLPIGCWEVN-PSEHFKFSVQDLVRTTTLNTASYARMKEYYH 68
Query 75 FYAVPIDLIWKSFDASVIQMGETAPVQAKDILTALTVSGDLPYCSLSDLGLSCFFASGSM 134
F+ V +W+ FD ++ G P A L + +G Y +
Sbjct 69 FFFVSYRSLWQWFDQFIV--GTNNPHSA---LNGVKKNGTTNYNQICS------------ 111
Query 135 SVPS------LKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPNNMPALNIGNSNYR 188
SVP+ + + ++ + F Y G KL++MLNYG + N +N+ N
Sbjct 112 SVPTFDLGKLITRLKTSDMDSQGFNYSEGAA--KLLNMLNYG--VTNKGKFMNLEN---- 163
Query 189 WWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQWEKADPTSYNFDWY 248
+ L S S +Y V+ F L YQKI+ DF+R W +D S+N D Y
Sbjct 164 LITSTSYLPSKDDKEPSSIYA--CKVSPFRLLAYQKIFNDFYRNQDWTPSDVRSFNVDDY 221
Query 249 QGSGNLFGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNSQFGDIAVIDIEGG 308
NL TI+ + +RY + KD + P + D G
Sbjct 222 ADDSNL---TIEPDVALK--------FCQMRYRPYAKDWLTSMKPTPNYSD-------GI 263
Query 309 LNIPASRISLSSNNRPTIGIKVGAQVSSPNNCSITNS-SGNLSTGDILSVGIPAASYKLQ 367
N+P V N +TN+ SG++S L G + S
Sbjct 264 FNLPE-------------------YVRGNGNVILTNNKSGSVS----LDSGTVSPS---- 296
Query 368 SSFNVLALRQAESLQKYREITQSVD-TNYRDQIKAHFGVNVPASDSHMAQYIGGIARNLD 426
SF+V LR A +L K E T+ + +Y QI+AHFG VP S ++ A+++GG ++
Sbjct 297 -SFSVNDLRAAFALDKMLEATRRANGLDYASQIEAHFGFKVPESRANDARFLGGFDNSIV 355
Query 427 ISEVV--NNNLQGDGEAVIYGKGVGTGTGSMRYTT----GSKYCILMCIYHCMPVLDYDI 480
+SEVV N N DG G G G GSM T +++ I+MCIY P +Y+
Sbjct 356 VSEVVSTNGNAASDGSHASIGDLGGKGIGSMSSGTIEFDSTEHGIIMCIYSVAPQSEYNA 415
Query 481 SGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNKSVKIDSI------LGYN 534
S P + ++ PEF ++G + + L+ S L K I LGY
Sbjct 416 SYLDPFNRKLTREQFYQPEFADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNNLLGYQ 475
Query 535 PRYYAWKSNIDRIHGAFTT--TLQDWVSPVDDSFLY--------------STFGTPSSGS 578
RY +K+ D + G F + +L W +P D F Y + + + S
Sbjct 476 VRYNEYKTARDLVFGDFESGKSLSYWCTPRFD-FGYGDTEKKIAPENKGGADYRKKGNRS 534
Query 579 FVTWPFFKVNPNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGV 628
+ F +NPN ++ IF S ++D F+VNS++ K VRP+S G+
Sbjct 535 HWSSRNFYINPNLVNPIFLT---SAVQADHFIVNSFLDVKAVRPMSVTGL 581
>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568
Score = 171 bits (434), Expect = 1e-42, Method: Compositional matrix adjust.
Identities = 165/634 (26%), Positives = 256/634 (40%), Gaps = 105/634 (17%)
Query 16 RSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYDIDIQYFTRTRPVQTAAYTRIREYFDF 75
R+ FDI +++F+A G LLPV +P +I+ F RT P+ +AA+ +R ++F
Sbjct 18 RNAFDISQRHLFTAPAGALLPVLSLDLLPHDHVEINASDFMRTLPMNSAAFMSMRGVYEF 77
Query 76 YAVPIDLIWKSFDASVIQMGETAPVQAKDILTALTVSGDLPYCSLSDLGLSCFFASGSMS 135
Y VP +W FD + M + Y SC S
Sbjct 78 YFVPYKQLWSGFDQFITGMSD--------------YKSSFMYAFKGKTPPSCV----SFD 119
Query 136 VPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPNNMPALNIGNSNYRWWNREAP 195
V L W N +I G+ + Y+++ +L YG
Sbjct 120 VQKLVDWCKTNTAKDIHGFDKNKGVYRILDLLGYGK------------------------ 155
Query 196 LASDSLIVYSQMYNFNM-NVNLFPLATYQKIYQDFFRWSQWEKADPTSYNFDWYQGSGNL 254
A+ + + Y+ + M F YQKIY DF+R + +E+ S+N D + GSG
Sbjct 156 YANSAGVPYTNPTSTTMGKCTPFRGLAYQKIYNDFYRNTTYEEYQLESFNVDMFYGSGK- 214
Query 255 FGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNSQFGDIAVIDIEGGLNIPAS 314
+ ++P ++ W D F+LRY N KDL V P F ++ D S
Sbjct 215 ----VKETIP--NEPWDYD-WFTLRYRNAQKDLLTNVRPTPLF---SIDDFNPQFFTGGS 264
Query 315 RISLSSNNRPTIGIKVGAQVSSPNNCSITNSSGNLSTGDILSVGIPAASYKLQSSFNVLA 374
I + T G + S+ NL + S ++ +V
Sbjct 265 DIVMEKGPNVTGG-------THEYRDSVVIVGKNLKENGVDS---------KRTMISVAD 308
Query 375 LRQAESLQKYREITQSVDTNYRDQIKAHFGVNVPASDSHMAQYIGGIARNLDISEVVNNN 434
+R A +L+K +T Y++Q++AHFG++V YIGG N+ + +V
Sbjct 309 IRNAFALEKLASVTMRAGKTYKEQMEAHFGISVEEGRDGRCTYIGGFDSNIQVGDVT--- 365
Query 435 LQGDGEAV--------------IYGKGVGTGTGSMRYTTGSKYCILMCIYHCMPVLDYDI 480
Q G V GK G+G+G +R+ ++ ILMCIY +P + YD
Sbjct 366 -QSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRF-DAKEHGILMCIYSLVPDVQYDS 423
Query 481 SGQHPQLLATSVDELPIPEFDNIGMEGVPLVQLVNSNLYKTNKS---VKIDSILGYNPRY 537
P + + +PEF+N+GM+ PL S Y N + +K G+ PRY
Sbjct 424 KRVDPFVQKIERGDFFVPEFENLGMQ--PLFAKNISYKYNNNTANSRIKNLGAFGWQPRY 481
Query 538 YAWKSNIDRIHGAFT--TTLQDWVSPVDDSFLYSTFGTPSSGSFVTWPFFKVNPNTLDNI 595
+K+ +D HG F L W S F + FK+NP LD++
Sbjct 482 SEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNIST---------FKINPKWLDDV 532
Query 596 FAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVP 629
FAV + T +DQ Y V +S DG+P
Sbjct 533 FAVNYNGTELTDQVFGGCYFNIVKVSDMSIDGMP 566
>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon
317 str. F0108]
Length=541
Score = 166 bits (421), Expect = 4e-41, Method: Compositional matrix adjust.
Identities = 165/641 (26%), Positives = 274/641 (43%), Gaps = 139/641 (22%)
Query 12 NHPHRSGFDIGAKNVFSAKCGELLPVYWDLGIPGCTYD---IDIQYFTRTRPVQTAAYTR 68
N P RS FD+ K++++A G LLPV L + +D I Q F RT P+ +AA+
Sbjct 15 NRP-RSAFDLSQKHLYTAPAGALLPV---LSVDLMFHDHIRIQAQDFMRTMPMNSAAFIS 70
Query 69 IREYFDFYAVPIDLIWKSFDASVIQMGETAPVQAKDILTALTVSGDLPYCSLSDLGLSCF 128
+R ++F+ VP +W +D + M + ++ + +A +GD S+ ++ L+
Sbjct 71 MRGVYEFFFVPYSQLWHPYDQFITSMND---YRSSVVSSA---AGDKALDSVPNVKLADM 124
Query 129 FASGSMSVPSLKSWQANNAYANIFGYIRGDVNYKLIHMLNYGNIIPNN---MPALNIGNS 185
+ + +IFGY + + +L+ +L YG I ++ +P L G
Sbjct 125 Y-----------KFVRERTDKDIFGYPHSNNSCRLMDLLGYGKPITSSKTPVPLLYTG-- 171
Query 186 NYRWWNREAPLASDSLIVYSQMYNFNMNVNLFPLATYQKIYQDFFRWSQWEKADPTSYNF 245
NVNLF L Y KIY D++R + +E D S+N
Sbjct 172 ---------------------------NVNLFRLLAYNKIYSDYYRNTTYEGVDVYSFNI 204
Query 246 DWYQGSGNLFGGTIDTSLPASSDYWKRDNLFSLRYCNWNKDLFMGVLPNSQFGDIAVIDI 305
D +G T +P + ++ K N L Y N D + + P F
Sbjct 205 DHKKG----------TFVPTADEFKKYLN---LHYRNAPLDFYTNLRPTPLF-------- 243
Query 306 EGGLNIPASRISLSSNNRPTIGIKVGAQVSSPNNCSITNSSGNLSTGDILSVGIPAASYK 365
++ S++ ++ Q+S P + ++ GN + ++ S +
Sbjct 244 -----------TIGSDSFSSV-----LQLSDPTGSAGFSADGNSAKLNMASPDV------ 281
Query 366 LQSSFNVLALRQAESLQKYREITQSVDTNYRDQIKAHFGVNVPASDSHMAQYIGGIARNL 425
NV A+R A +L K I+ Y +QI+AHFGV V Y+GG N+
Sbjct 282 ----LNVSAIRSAFALDKLLSISMRAGKTYAEQIEAHFGVTVSEGRDGQVYYLGGFDSNV 337
Query 426 DISEV------VNNNLQGDGEA-------VIYGKGVGTGTGSMRYTTGSKYCILMCIYHC 472
+ +V N N+ G A I GKG G+G G +++ + +LMCIY
Sbjct 338 QVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGYGEIQF-DAKEPGVLMCIYSV 396
Query 473 MPVLDYDISGQHPQLLATSVDELPIPEFDNIGMEGVPLV-QLVNSNLYKTNKSVKIDSIL 531
+P + YD P + + + IPEF+N+GM+ P+V V+ N K N
Sbjct 397 VPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQ--PIVPAFVSLNRAKDNS-------Y 447
Query 532 GYNPRYYAWKSNIDRIHGAFT--TTLQDW-VSPVDDSFLYSTFGTPSSGSFVTWPFFKVN 588
G+ PRY +K+ D HG F L W ++ S +TF + K+N
Sbjct 448 GWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGSDTLNTFNVAA---------LKIN 498
Query 589 PNTLDNIFAVKSDSTWESDQFLVNSYVGCKVVRPLSRDGVP 629
P+ LD++FAV + T +D ++ + V ++ DG+P
Sbjct 499 PHWLDSVFAVNYNGTEVTDCMFGYAHFNIEKVSDMTEDGMP 539
Lambda      K        H        a         alpha
   0.319    0.136    0.422    0.792     4.96
Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6
Effective search space used: 4767413412972
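
The gapped Lambda and K values in the statistics footer above relate the raw scores shown in parentheses in each "Score =" line to the bit scores printed before them, through the standard Karlin-Altschul conversion S' = (Lambda * S - ln K) / ln 2. The sketch below applies that conversion with the rounded footer values. The names `GAPPED_LAMBDA`, `GAPPED_K` and `bit_score` are hypothetical. Because the footer prints the parameters to three significant figures and the search used compositional matrix adjustment, the results should be read as approximate. Comparing against the truncated integer is also an assumption, although it appears to match the values in this report.

```python
import math

# Gapped parameters as printed (rounded) in the statistics footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score: int,
              lam: float = GAPPED_LAMBDA,
              k: float = GAPPED_K) -> float:
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # (raw score, bit score reported in this document)
    for raw, reported in ((1033, 402), (897, 350), (654, 256)):
        print(raw, round(bit_score(raw), 2), reported)
```

This sketch deliberately stops at the bit score. It does not attempt to reproduce the reported E-values, which depend on further corrections that are not printed in the report.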