bitscore colors: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-31_CDS_annotation_glimmer3.pl_2_2
Length=322
                                                              Score   E
Sequences producing significant alignments:                   (Bits)  Value
gi|649557305|gb|KDS63784.1| capsid family protein             83.6    1e-15
gi|492501782|ref|WP_005867318.1| hypothetical protein         84.7    6e-15
gi|649569140|gb|KDS75238.1| capsid family protein             83.6    8e-15
gi|649555287|gb|KDS61824.1| capsid family protein             83.2    2e-14
gi|547920049|ref|WP_022322420.1| capsid protein VP1           82.0    4e-14
gi|494610271|ref|WP_007368517.1| capsid protein               81.6    6e-14
gi|494308783|ref|WP_007173938.1| hypothetical protein         75.5    7e-12
gi|494306153|ref|WP_007173049.1| hypothetical protein         74.7    1e-11
gi|609718276|emb|CDN73650.1| conserved hypothetical protein   74.3    2e-11
gi|496521299|ref|WP_009229582.1| capsid protein               71.6    1e-10
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 83.6 bits (205), Expect = 1e-15, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)
Query 79 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 137
R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S
Sbjct 18 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS 73
Query 138 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 196
P ++AG G G EE IM + SI PR Y QG K + + D M DF+
Sbjct 74 PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 130
Query 197 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 256
P +G QE+ ++E++ +N + + + G P + EY + NE +GDF
Sbjct 131 FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN-- 183
Query 257 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 316
+ + NR++ P L TT+++ N+ FA + +W+QI D+ R+
Sbjct 184 MAFWHLNRIF--KEKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK 239
Query 317 REIPNL 322
P L
Sbjct 240 YGTPML 245
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 84.7 bits (208), Expect = 6e-15, Method: Compositional matrix adjust.
Identities = 72/246 (29%), Positives = 115/246 (47%), Gaps = 20/246 (8%)
Query 79 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 137
R A SG Y + + FGVR S A + P ++GG + I EV+ T+A +S
Sbjct 311 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS----TS 366
Query 138 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 196
P ++AG G G K EE I+ + SI PR Y QG K + + D M DF+
Sbjct 367 PQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 423
Query 197 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 256
P +G QE+ ++E++ + + + + G P + EY ++NE +GDF
Sbjct 424 FPEFAHLGEQEIKNEEVY-----LQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFRGN-- 476
Query 257 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 316
+ + NR++ + P L TT+++ N+ FA + +WIQ+ DV R+
Sbjct 477 MAFWHLNRIF--SESPNLN--TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPK 532
Query 317 REIPNL 322
P L
Sbjct 533 YGTPML 538
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 83.6 bits (205), Expect = 8e-15, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)
Query 79 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 137
R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S
Sbjct 163 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS 218
Query 138 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 196
P ++AG G G EE IM + SI PR Y QG K + + D M DF+
Sbjct 219 PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 275
Query 197 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 256
P +G QE+ ++E++ +N + + + G P + EY + NE +GDF
Sbjct 276 FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN-- 328
Query 257 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 316
+ + NR++ P L TT+++ N+ FA + +W+QI D+ R+
Sbjct 329 MAFWHLNRIFK--EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK 384
Query 317 REIPNL 322
P L
Sbjct 385 YGTPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 83.2 bits (204), Expect = 2e-14, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%)
Query 79 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 137
R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S
Sbjct 314 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS 369
Query 138 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 196
P ++AG G G EE IM + SI PR Y QG K + + D M DF+
Sbjct 370 PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 426
Query 197 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 256
P +G QE+ ++E++ +N + + + G P + EY + NE +GDF
Sbjct 427 FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN-- 479
Query 257 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 316
+ + NR++ P L TT+++ N+ FA + +W+QI D+ R+
Sbjct 480 MAFWHLNRIFK--EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK 535
Query 317 REIPNL 322
P L
Sbjct 536 YGTPML 541
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 82.0 bits (201), Expect = 4e-14, Method: Compositional matrix adjust.
Identities = 73/246 (30%), Positives = 115/246 (47%), Gaps = 20/246 (8%)
Query 79 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 137
R A G Y + + FGVR S A + P ++GG I EV+ T++ + ET
Sbjct 326 FERNARGGSRYIEQILSHFGVRSSDARLQRPQFLGGGRMPISVSEVLQTSS--TDETS-- 381
Query 138 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 196
P ++AG G G K EE I+ + SI PR Y QG + +T+ D M DF+
Sbjct 382 PQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQGVPRDFTKFDNM-DFY 438
Query 197 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 256
P + QE+ +QE+ V+ + + + G P + EY +E +GDF
Sbjct 439 FPEFAHLSEQEIKNQELF-----VSEDAAYNNGTFGYTPRYAEYKYHPSEAHGDFRGN-- 491
Query 257 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 316
L + NR+++ + P L TT+++ + N+ FA S + FW+Q+ DV R+
Sbjct 492 LSFWHLNRIFE--DKPNLN--TTFVECKPSNRVFATSETEDDKFWVQMYQDVKALRLMPK 547
Query 317 REIPNL 322
P L
Sbjct 548 YGTPML 553
>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM
16608]
Length=531
Score = 81.6 bits (200), Expect = 6e-14, Method: Compositional matrix adjust.
Identities = 74/272 (27%), Positives = 117/272 (43%), Gaps = 44/272 (16%)
Query 86 GGSYRDWQEAVFGVRV--SRAAESPIYVGGYASEIVFDEVVSTAAFESGETGQEPLGSLA 143
G Y EA FG RV SRA ++ ++GG+ + +V EVV+ + F+ G LG L
Sbjct 269 GLDYSSQIEAHFGFRVPESRAGDA-RFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLG 327
Query 144 GRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTMN------DFHK 197
G+G +I +E +IM + S+VP+ +Y+ T D N DF +
Sbjct 328 GKG--VGSLNSSSIDFDVKEHGIIMCIYSVVPQTEYNG-----TYFDPFNRKLRREDFFQ 380
Query 198 PNLDQIGFQELLSQEMHG------------RAWRVNANYKTTDFS-----VGKQPAWTEY 240
P +G+Q +++ ++ + R+ A Y + +G Q + EY
Sbjct 381 PEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEY 440
Query 241 TTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKLKD----------ATTYIDPQIFNKAF 290
T+ + +G+F +G L Y R YD D K D A Y++P I N F
Sbjct 441 KTSRDLVFGEFESGLSLSYWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIF 499
Query 291 ANSNLDAKNFWIQIGFDVIGRRVKSAREIPNL 322
S + A +F + FDV R S + L
Sbjct 500 LVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL 531
>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=553
Score = 75.5 bits (184), Expect = 7e-12, Method: Compositional matrix adjust.
Identities = 67/259 (26%), Positives = 120/259 (46%), Gaps = 30/259 (12%)
Query 46 DGTN--GINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSR 103
DG+N +N D + G ++ +L V +L+ +G +++D A +GV +
Sbjct 276 DGSNFTRVNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPD 335
Query 104 AAESPI-YVGGYASEIVFDEVVSTAAFESGETGQEP--LGSLAGRGRETSKRGGKNIKIR 160
+ + + Y+GG+ S++ +V T+ + E E LG +AG+G + G I
Sbjct 336 SRDGRVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKG---TGSGRGRIVFD 392
Query 161 CEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM------NDFHKPNLDQIGFQELLSQEMH 214
+E ++M + S+VP++ Y TR+D M D+ P + +G Q L S +
Sbjct 393 AKEHGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYI- 446
Query 215 GRAWRVNANYKTTDFS---VGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVAND 271
+++ TTD +G QP ++EY T ++ +G FA + L + +R
Sbjct 447 -------SSFCTTDPKNPVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTF 499
Query 272 PKLKDATTYIDPQIFNKAF 290
P+L+ A IDP N F
Sbjct 500 PQLEIADFKIDPGCLNSIF 518
>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=519
Score = 74.7 bits (182), Expect = 1e-11, Method: Compositional matrix adjust.
Identities = 71/285 (25%), Positives = 126/285 (44%), Gaps = 32/285 (11%)
Query 22 SQCGLGIRTYLSDRFNNWLNTEWIDGTNG----INEITSVDVTSGLLTMDALILQKKVYD 77
SQ I + D N+ ++ D + +N VD G ++ +L V
Sbjct 216 SQLFTFIPEFSDDEHLNFDRDQYADQSKSNFTQLNFPVDVDNNLGYFSVSSLRSAFAVDK 275
Query 78 MLNRIAVSGGSYRDWQEAVFGVRVSRAAESPI-YVGGYASEIVFDEVVSTAAFESGETGQ 136
+L+ +G +++D A +GV + + + + Y+GG+ S++ +V T+ + E
Sbjct 276 LLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKP 335
Query 137 EP--LGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM-- 192
E LG +AG+G + G I +E ++M + S+VP++ Y TR+D M
Sbjct 336 EAGYLGRIAGKG---TGSGRGRIVFDAKEHGVLMCIYSLVPQIQYD-----CTRLDPMVD 387
Query 193 ----NDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFS---VGKQPAWTEYTTTVN 245
DF P + +G Q L S + +++ T D +G QP ++EY T ++
Sbjct 388 KLDRFDFFTPEFENLGMQPLNSSYI--------SSFCTPDPKNPVLGYQPRYSEYKTALD 439
Query 246 ETYGDFAAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAF 290
+G FA + L + +R P+L+ A IDP N F
Sbjct 440 INHGQFAQNDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSVF 484
>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537
Score = 74.3 bits (181), Expect = 2e-11, Method: Compositional matrix adjust.
Identities = 62/244 (25%), Positives = 114/244 (47%), Gaps = 16/244 (7%)
Query 74 KVYDMLNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESG 132
K+ + L + A +G Y + + FGV+ S + P ++GG S I+ EV+ +A +S
Sbjct 299 KLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMISEVLQQSATDS- 357
Query 133 ETGQEPLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDT 191
P G++AG G K GG EE ++ L S++P+ YSQG + +++ D
Sbjct 358 ---TTPQGNMAGHGIGIGKDGG--FSRFFEEHGYVIGLMSVIPKTSYSQGIPRHFSKSDK 412
Query 192 MNDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDF 251
D+ P + IG Q + ++E+ + N + ++ G P ++EY + + +GDF
Sbjct 413 F-DYFWPQFEHIGEQPVYNKEIFAK----NIDAFDSEAVFGYLPRYSEYKFSPSTVHGDF 467
Query 252 AAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGR 311
+ L + R++D P L + D ++ FA + D F+ + + +
Sbjct 468 K--DDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVED-DTDKFYCHLYQKITAK 524
Query 312 RVKS 315
R S
Sbjct 525 RKMS 528
>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon
317 str. F0108]
Length=541
Score = 71.6 bits (174), Expect = 1e-10, Method: Compositional matrix adjust.
Identities = 70/257 (27%), Positives = 110/257 (43%), Gaps = 28/257 (11%)
Query 46 DGTNGINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSRAA 105
DG + + S DV + A L K +L+ +G +Y + EA FGV VS
Sbjct 268 DGNSAKLNMASPDVLNVSAIRSAFALDK----LLSISMRAGKTYAEQIEAHFGVTVSEGR 323
Query 106 ESPIY-VGGYASEIVFDEVVSTAAFESGETGQEPLGSLAGR-GRETSKRGGK---NIKIR 160
+ +Y +GG+ S + +V T+ + + LAG G+ T K G I+
Sbjct 324 DGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGYGEIQFD 383
Query 161 CEEPSLIMILGSIVPRVDYSQGNKWWTRVD------TMNDFHKPNLDQIGFQELLSQEMH 214
+EP ++M + S+VP + Y R+D T D+ P + +G Q ++
Sbjct 384 AKEPGVLMCIYSVVPAMQYD-----CMRLDPFVAKQTRGDYFIPEFENLGMQPIVPA--- 435
Query 215 GRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKL 274
V+ N + D S G QP ++EY T + +G FA GEPL Y + R
Sbjct 436 ----FVSLN-RAKDNSYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGSDTLNTF 490
Query 275 KDATTYIDPQIFNKAFA 291
A I+P + FA
Sbjct 491 NVAALKINPHWLDSVFA 507
Lambda   K        H        a        alpha
0.317    0.134    0.404    0.792    4.96
Gapped
Lambda   K        H        a        alpha    sigma
0.267    0.0410   0.140    1.90     42.6     43.6
Effective search space used: 1793877651450
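
The per-alignment statistics lines in this report ("Score = ... bits (...), Expect = ..." and "Identities = .../... (..%), ...") follow a regular pattern, so they can be pulled out with a short script. The following is a minimal sketch, not part of BLAST itself: the function names parse_hsp_stats, percent and bitscore_bin are hypothetical. Two things are assumptions, because the report does not state them: percentages are rounded to the nearest integer, and each boundary value of the bitscore color legend (40, 50, 80, 200) is placed in the higher bin.

```python
import re

# Patterns for the two statistics lines that precede each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
COUNT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hsp_stats(score_line, counts_line):
    """Extract bit score, raw score, E-value and match counts from one hit."""
    m = SCORE_RE.search(score_line)
    if m is None:
        raise ValueError("not a 'Score = ...' line: %r" % score_line)
    stats = {
        "bits": float(m.group(1)),
        "raw": int(m.group(2)),
        "evalue": float(m.group(3)),
    }
    # Each count is stored as (matches, alignment_length).
    for name, num, den in COUNT_RE.findall(counts_line):
        stats[name.lower()] = (int(num), int(den))
    return stats


def percent(num, den):
    """Integer percentage; rounding to nearest is an assumption."""
    return round(100.0 * num / den)


def bitscore_bin(bits):
    """Map a bit score onto the legend bins from the first line of the report.

    Assumption: a score equal to a boundary falls into the higher bin.
    """
    if bits < 40:
        return "<40"
    if bits < 50:
        return "40-50"
    if bits < 80:
        return "50-80"
    if bits < 200:
        return "80-200"
    return ">200"


if __name__ == "__main__":
    stats = parse_hsp_stats(
        "Score = 83.6 bits (205), Expect = 1e-15, "
        "Method: Compositional matrix adjust.",
        "Identities = 71/246 (29%), Positives = 113/246 (46%), "
        "Gaps = 20/246 (8%)",
    )
    print(stats, bitscore_bin(stats["bits"]))
```

Applied to the first hit above (KDS63784.1), the identity count 71/246 works out to the 29% shown in the report, and the bit score of 83.6 falls in the 80-200 bin.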