bitscore colors: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-23_CDS_annotation_glimmer3.pl_2_1
Length=199
                                                                    Score     E
Sequences producing significant alignments:                        (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein                  79.0    1e-14
gi|649569140|gb|KDS75238.1|  capsid family protein                  78.6    5e-14
gi|649555287|gb|KDS61824.1|  capsid family protein                  78.2    1e-13
gi|492501782|ref|WP_005867318.1|  hypothetical protein              74.3    3e-12
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                73.2    8e-12
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein  68.2    2e-10
gi|494610271|ref|WP_007368517.1|  capsid protein                    64.7    4e-09
gi|647452987|ref|WP_025792807.1|  hypothetical protein              60.5    1e-07
gi|565841287|ref|WP_023924568.1|  hypothetical protein              59.3    3e-07
gi|648626869|ref|WP_026318620.1|  hypothetical protein              50.8    1e-04
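As a rough illustration of how the bitscore color ranges (<40, 40-50, 50-80, 80-200, >200) apply to the hits listed above, the sketch below assigns each bit score to a range. The name `score_bin` is hypothetical, and treating each lower bound as inclusive is an assumption: the color key does not say which range a score of exactly 40, 50, 80, or 200 belongs to.

```python
from bisect import bisect_right

# Range edges and labels follow the color key: <40, 40-50, 50-80, 80-200, >200.
EDGES = [40, 50, 80, 200]
LABELS = ["<40", "40-50", "50-80", "80-200", ">200"]


def score_bin(bits: float) -> str:
    """Return the color-key range for a bit score.

    Assumption: a score equal to an edge falls in the higher range.
    """
    return LABELS[bisect_right(EDGES, bits)]


# Bit scores copied from the hit list above, in report order.
hit_scores = [79.0, 78.6, 78.2, 74.3, 73.2, 68.2, 64.7, 60.5, 59.3, 50.8]
bins = [score_bin(s) for s in hit_scores]

for score, label in zip(hit_scores, bins):
    print(f"{score:6.1f}  {label}")
```

Under that assumption, all ten hits in this report fall in the 50-80 range.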
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 79.0 bits (193), Expect = 1e-14, Method: Compositional matrix adjust.
Identities = 62/198 (31%), Positives = 90/198 (45%), Gaps = 18/198 (9%)
Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S P +AG GI+AG G E GY+M I SI PR Y QG D
Sbjct 66 TSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRK 123
Query 62 ISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTY 121
D + P +G Q+ N +++Y+N + A + T G T + +Y + N +
Sbjct 124 FDNMDFYFPEFAHLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVH 177
Query 122 GNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQI 181
G+F M AF LNR ++ K P ++ TT+++ N +FA +WVQI
Sbjct 178 GDFRGNM--AFWHLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQI 227
Query 182 KFDITARRLMSAKQIPNL 199
DI A RLM P L
Sbjct 228 YQDIKALRLMPKYGTPML 245
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 78.6 bits (192), Expect = 5e-14, Method: Compositional matrix adjust.
Identities = 62/198 (31%), Positives = 90/198 (45%), Gaps = 18/198 (9%)
Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S P +AG GI+AG G E GY+M I SI PR Y QG D
Sbjct 211 TSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRK 268
Query 62 ISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTY 121
D + P +G Q+ N +++Y+N + A + T G T + +Y + N +
Sbjct 269 FDNMDFYFPEFAHLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVH 322
Query 122 GNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQI 181
G+F M AF LNR ++ K P ++ TT+++ N +FA +WVQI
Sbjct 323 GDFRGNM--AFWHLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQI 372
Query 182 KFDITARRLMSAKQIPNL 199
DI A RLM P L
Sbjct 373 YQDIKALRLMPKYGTPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 78.2 bits (191), Expect = 1e-13, Method: Compositional matrix adjust.
Identities = 62/198 (31%), Positives = 90/198 (45%), Gaps = 18/198 (9%)
Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S P +AG GI+AG G E GY+M I SI PR Y QG D
Sbjct 362 TSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRK 419
Query 62 ISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTY 121
D + P +G Q+ N +++Y+N + A + T G T + +Y + N +
Sbjct 420 FDNMDFYFPEFAHLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVH 473
Query 122 GNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQI 181
G+F M AF LNR ++ K P ++ TT+++ N +FA +WVQI
Sbjct 474 GDFRGNM--AFWHLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQI 523
Query 182 KFDITARRLMSAKQIPNL 199
DI A RLM P L
Sbjct 524 YQDIKALRLMPKYGTPML 541
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 74.3 bits (181), Expect = 3e-12, Method: Compositional matrix adjust.
Identities = 60/200 (30%), Positives = 92/200 (46%), Gaps = 19/200 (10%)
Query 1 VSNSATEN-EPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDS 59
+ SAT++ P +AG GI+AG G K E GY++ I SI PR Y QG D
Sbjct 357 LQTSATDSTSPQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGYQQGVPKDF 414
Query 60 DWISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNR 119
D + P +G Q+ N ++VY+ T + T G T + +Y ++N
Sbjct 415 RKFDNMDFYFPEFAHLGEQEIKN------EEVYLQQTPASNNGTFGYTPRYAEYKYSMNE 468
Query 120 TYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWV 179
+G+F M AF LNR + +P ++ TT+++ N +FA +W+
Sbjct 469 VHGDFRGNM--AFWHLNRIFS------ESPNLN--TTFVECNPSNRVFATAETSDDKYWI 518
Query 180 QIKFDITARRLMSAKQIPNL 199
Q+ D+ A RLM P L
Sbjct 519 QLYQDVKALRLMPKYGTPML 538
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 73.2 bits (178), Expect = 8e-12, Method: Compositional matrix adjust.
Identities = 62/200 (31%), Positives = 92/200 (46%), Gaps = 22/200 (11%)
Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S E P +AG GI+AG G K E GY++ I SITPR Y QG D+
Sbjct 374 TSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQG--VPRDF 429
Query 62 ISLDDM--HKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNR 119
D+M + P + Q+ N +D NN T G T + +Y + +
Sbjct 430 TKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNG------TFGYTPRYAEYKYHPSE 483
Query 120 TYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWV 179
+G+F +S F LNR +E K P ++ TT+++ N +FA + + FWV
Sbjct 484 AHGDFRGNLS--FWHLNRIFEDK------PNLN--TTFVECKPSNRVFATSETEDDKFWV 533
Query 180 QIKFDITARRLMSAKQIPNL 199
Q+ D+ A RLM P L
Sbjct 534 QMYQDVKALRLMPKYGTPML 553
>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338
Score = 68.2 bits (165), Expect = 2e-10, Method: Compositional matrix adjust.
Identities = 62/220 (28%), Positives = 92/220 (42%), Gaps = 25/220 (11%)
Query 3 NSATENEPLGTLAG---RGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDS 59
+++ E+ LG LA R + G I A EPG M IT + P YSQG D
Sbjct 119 SASGEDANLGQLAACVDRYCDFSGHSG--IDYYAKEPGTFMLITMLVPEPAYSQGLHPDL 176
Query 60 DWISLDDMHKPALDGIGYQDSVN------------SGRAWWDDVYINNTGKAA-----KR 102
IS D P L+GIG+Q +G + +TG
Sbjct 177 ASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMV 236
Query 103 TAGKTVAWIDYMTNVNRTYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISD---LTTYID 159
+ G+ VAW T+ +R +G+FA + + VL R + + D TYI+
Sbjct 237 SVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEYTGTYIN 296
Query 160 PVKYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL 199
P+ + Y+F D ++ A NF FD+ +SA +P L
Sbjct 297 PLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL 336
>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM
16608]
Length=531
Score = 64.7 bits (156), Expect = 4e-09, Method: Compositional matrix adjust.
Identities = 58/218 (27%), Positives = 96/218 (44%), Gaps = 26/218 (12%)
Query 5 ATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFD--SDWI 62
A E+ LG L G+G+ G I E G +MCI S+ P+ +Y+ G FD + +
Sbjct 317 ADESPCLGDLGGKGV--GSLNSSSIDFDVKEHGIIMCIYSVVPQTEYN-GTYFDPFNRKL 373
Query 63 SLDDMHKPALDGIGYQDSVNSG--RAWWDDVYINNTGKAAKRTAGKTVAWID-------- 112
+D +P +GYQ V S + D+ + K + AG ++ I+
Sbjct 374 RREDFFQPEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGW 433
Query 113 ------YMTNVNRTYGNFAAGMSEAFMVLNR-NYEMKYEAG----TNPKISDLTTYIDPV 161
Y T+ + +G F +G+S ++ R ++ +AG N S Y++P
Sbjct 434 QVRYNEYKTSRDLVFGEFESGLSLSYWCSPRYDFGFDGKAGDKKLVNSPWSPAHFYVNPS 493
Query 162 KYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL 199
N IF +++ A +F V FD+ A R MS + L
Sbjct 494 ILNTIFLVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL 531
>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584
Score = 60.5 bits (145), Expect = 1e-07, Method: Compositional matrix adjust.
Identities = 55/220 (25%), Positives = 95/220 (43%), Gaps = 41/220 (19%)
Query 11 LGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDY--SQGNDFDSDWISLDDMH 68
+G L G+GI G G I+ +TE G +MCI S+ P+ +Y S + F+ ++ + +
Sbjct 375 IGDLGGKGI--GSMSSGTIEFDSTEHGIIMCIYSVAPQSEYNASYLDPFNRK-LTREQFY 431
Query 69 KPALDGIGYQDSV-----------NSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNV 117
+P +GYQ + N +A + D+ +NN G V + +Y T
Sbjct 432 QPEFADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNN------LLGYQVRYNEYKTAR 485
Query 118 NRTYGNFAAGMSEAFMVLNRNYEMKY------------------EAGTNPKISDLTTYID 159
+ +G+F +G S ++ R ++ Y + G S YI+
Sbjct 486 DLVFGDFESGKSLSYWCTPR-FDFGYGDTEKKIAPENKGGADYRKKGNRSHWSSRNFYIN 544
Query 160 PVKYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL 199
P N IF +++ A +F V D+ A R MS + +L
Sbjct 545 PNLVNPIFLTSAVQADHFIVNSFLDVKAVRPMSVTGLSSL 584
>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens
CC14M]
Length=656
Score = 59.3 bits (142), Expect = 3e-07, Method: Compositional matrix adjust.
Identities = 50/197 (25%), Positives = 87/197 (44%), Gaps = 14/197 (7%)
Query 1 VSNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYS--QGNDFD 58
V +A+ +G + G+GI G G I E G +MCI SI P++DY + + F+
Sbjct 460 VDGTASTGSVVGQVFGKGI--GAMNSGHISYDVKEHGLIMCIYSIAPQVDYDARELDPFN 517
Query 59 SDWISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVN 118
+ S +D +P + +G Q + S + +++ G + +++Y T +
Sbjct 518 RKF-SREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQHNNVLGYSARYLEYKTARD 576
Query 119 RTYGNFAAGMS-EAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFA---DTSIDA 174
+G F +G S A+ NY ++ + P + +DP IFA + S+
Sbjct 577 IIFGEFMSGGSLSAWATPKNNYTFEFGKLSLPDL-----LVDPKVLEPIFAVKYNGSMST 631
Query 175 MNFWVQIKFDITARRLM 191
F V FD+ A R M
Sbjct 632 DQFLVNSYFDVKAIRPM 648
>gi|648626869|ref|WP_026318620.1| hypothetical protein [Alistipes onderdonkii]
Length=231
Score = 50.8 bits (120), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 56/217 (26%), Positives = 79/217 (36%), Gaps = 56/217 (26%)
Query 35 EPGYMMCITSITPRIDYSQGNDFDSDWISLDDMHKPALDGIGYQDSV----------NSG 84
E GY M ITS+ P + Y + +L + PALD I Q N+G
Sbjct 16 ESGYFMEITSVVPTVMYPNYLNPTLLQTNLGQRYAPALDNIQMQPLTVPTLLGNAYFNTG 75
Query 85 RAWWDDVYINNTGKAAKRTA-------------GKTVAWIDYMTNVNRTYGNFAAGMSEA 131
+ V +N+ G RT G AW + MT V++ +G +
Sbjct 76 SGSYSHV-LNHMGTGELRTVAVDKLSAAEGIAVGYQPAWAELMTGVSKPHGRLCNDLD-- 132
Query 132 FMVLNRNY------------------EMKYEAGT-----------NPKIS-DLTTYIDPV 161
+ R Y E+ E T N +S D YI P
Sbjct 133 YWAFQRRYGTVLYSSNDAQDASVFLEELGNEVDTLDVETFNAWLKNTYVSTDFVPYILPA 192
Query 162 KYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPN 198
YNY+FADT +A NF + +I+ R S +PN
Sbjct 193 MYNYVFADTDPNAQNFVLDNSAEISVYREKSKVNVPN 229
Lambda      K        H        a         alpha
   0.316    0.133    0.402    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6
Effective search space used: 631402151361
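The bit scores in the alignment headers can be related to the raw scores shown in parentheses through the standard Karlin-Altschul normalization, S' = (Lambda * S - ln K) / ln 2. The sketch below applies that formula with the gapped Lambda and K printed above; the function and constant names are hypothetical, and using the gapped parameters (rather than the first Lambda/K pair) is an assumption. It does not attempt to reproduce the Expect values, which in this report come from the "Compositional matrix adjust" method.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block above.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score: int,
              lam: float = GAPPED_LAMBDA,
              k: float = GAPPED_K) -> float:
    """Convert a raw alignment score to a bit score: (lam * S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, bit score as printed in the report) for a few of the hits.
reported = [(193, 79.0), (192, 78.6), (191, 78.2), (181, 74.3), (120, 50.8)]

for raw, bits in reported:
    print(f"raw={raw:4d}  computed={bit_score(raw):6.2f}  reported={bits:5.1f}")
```

For the pairs listed, the computed values agree with the reported bit scores after rounding to one decimal place.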