bitscore colors: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-40_CDS_annotation_glimmer3.pl_2_1
Length=204
                                                         Score   E
Sequences producing significant alignments:              (Bits)  Value
gi|649557305|gb|KDS63784.1|       capsid family protein  77.4    5e-14
gi|492501782|ref|WP_005867318.1|  hypothetical protein   79.0    9e-14
gi|649569140|gb|KDS75238.1|       capsid family protein  77.0    2e-13
gi|649555287|gb|KDS61824.1|       capsid family protein  77.0    5e-13
gi|547920049|ref|WP_022322420.1|  capsid protein VP1     76.6    6e-13
gi|494610271|ref|WP_007368517.1|  capsid protein         68.6    3e-10
gi|647452987|ref|WP_025792807.1|  hypothetical protein   66.2    2e-09
gi|565841287|ref|WP_023924568.1|  hypothetical protein   55.5    5e-06
gi|496521299|ref|WP_009229582.1|  capsid protein         53.5    2e-05
gi|494306153|ref|WP_007173049.1|  hypothetical protein   50.1    3e-04
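The summary rows above can be read programmatically. The following is a minimal sketch, not part of the BLAST output: it assumes each row ends with the bit score and the E-value as its two rightmost whitespace-separated fields, and the names `Hit` and `parse_hit_line` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    seq_id: str
    description: str
    bits: float
    evalue: float


def parse_hit_line(line: str) -> Hit:
    """Split one summary row into identifier, description, bit score, E-value."""
    # The two rightmost fields are the score and the E-value; everything
    # before them is "<identifier> <description>".
    left, bits, evalue = line.rsplit(None, 2)
    parts = left.split(None, 1)
    seq_id = parts[0]
    description = parts[1].strip() if len(parts) > 1 else ""
    return Hit(seq_id, description, float(bits), float(evalue))


# Three rows copied from the table above.
rows = [
    "gi|649557305|gb|KDS63784.1| capsid family protein 77.4 5e-14",
    "gi|492501782|ref|WP_005867318.1| hypothetical protein 79.0 9e-14",
    "gi|494306153|ref|WP_007173049.1| hypothetical protein 50.1 3e-04",
]
hits = [parse_hit_line(r) for r in rows]

# The table is ordered by E-value, so the highest bit score need not be first.
best = max(hits, key=lambda h: h.bits)
print(best.seq_id, best.bits)
```

Splitting from the right keeps the parser indifferent to how many spaces separate the columns and to spaces inside the description.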
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 77.4 bits (189), Expect = 5e-14, Method: Compositional matrix adjust.
Identities = 60/205 (29%), Positives = 98/205 (48%), Gaps = 16/205 (8%)
Query 2 QSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRID 60
++ I+ E++ S+T+ P +AG G++ G E IM + SI PR
Sbjct 55 RTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGFTRYFEEHGYIMGIMSIRPRTG 112
Query 61 YSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSW 119
Y QG K + + NMD F+ P +G QE+ EE ++A ++ + G P +
Sbjct 113 YQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLNESDAANE-----GTFGYTPRY 166
Query 120 IEYTTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNASTYIDPTIYNSIFAESRLSS 179
EY NE +G+F M AF LNR+++E + +T+++ N +FA + S
Sbjct 167 AEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSD 220
Query 180 QNFWVQVAFDVTARRVMSAKQIPNL 204
+WVQ+ D+ A R+M P L
Sbjct 221 DKYWVQIYQDIKALRLMPKYGTPML 245
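The Identities / Positives / Gaps line of an alignment such as the one above carries counts over the alignment length, with percentages printed as whole numbers. As a rough sketch (the regular expression and the helper name `parse_alignment_stats` are assumptions made for illustration, not part of BLAST), the counts can be extracted and the percentages recomputed at full precision:

```python
import re

# Matches lines of the form:
# "Identities = 60/205 (29%), Positives = 98/205 (48%), Gaps = 16/205 (8%)"
STATS = re.compile(
    r"Identities = (\d+)/(\d+) \(\d+%\), "
    r"Positives = (\d+)/(\d+) \(\d+%\), "
    r"Gaps = (\d+)/(\d+) \(\d+%\)"
)


def parse_alignment_stats(line: str) -> dict:
    """Return counts, alignment length, and exact percentages for one hit."""
    m = STATS.search(line)
    if m is None:
        raise ValueError("not an Identities/Positives/Gaps line")
    ident, n_ident, pos, n_pos, gaps, n_gaps = map(int, m.groups())
    if not (n_ident == n_pos == n_gaps):
        raise ValueError("inconsistent alignment lengths")
    length = n_ident
    return {
        "length": length,
        "identities": ident,
        "positives": pos,
        "gaps": gaps,
        "identity_pct": 100.0 * ident / length,
        "positive_pct": 100.0 * pos / length,
        "gap_pct": 100.0 * gaps / length,
    }


stats = parse_alignment_stats(
    "Identities = 60/205 (29%), Positives = 98/205 (48%), Gaps = 16/205 (8%)"
)
print(stats["length"], round(stats["identity_pct"], 2))
```

For this line, rounding the recomputed values to whole numbers gives back the printed 29%, 48% and 8%.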
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 79.0 bits (193), Expect = 9e-14, Method: Compositional matrix adjust.
Identities = 62/205 (30%), Positives = 100/205 (49%), Gaps = 16/205 (8%)
Query 2 QSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRID 60
++ I+ E++ SAT+ P +AG G++ G K E I+ + SI PR
Sbjct 348 RTPISVSEVLQTSATDSTSPQANMAGHGISA--GVNHGFKRYFEEHGYIIGIMSIRPRTG 405
Query 61 YSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSW 119
Y QG K + + NMD F+ P +G QE+ EE T A+++ + G P +
Sbjct 406 YQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEEVYLQQTPASNN-----GTFGYTPRY 459
Query 120 IEYTTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNASTYIDPTIYNSIFAESRLSS 179
EY +NE +G+F M AF LNR++ E+ + +T+++ N +FA + S
Sbjct 460 AEYKYSMNEVHGDFRGNM--AFWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSD 513
Query 180 QNFWVQVAFDVTARRVMSAKQIPNL 204
+W+Q+ DV A R+M P L
Sbjct 514 DKYWIQLYQDVKALRLMPKYGTPML 538
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 77.0 bits (188), Expect = 2e-13, Method: Compositional matrix adjust.
Identities = 60/205 (29%), Positives = 98/205 (48%), Gaps = 16/205 (8%)
Query 2 QSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRID 60
++ I+ E++ S+T+ P +AG G++ G E IM + SI PR
Sbjct 200 RTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGFTRYFEEHGYIMGIMSIRPRTG 257
Query 61 YSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSW 119
Y QG K + + NMD F+ P +G QE+ EE ++A ++ + G P +
Sbjct 258 YQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLNESDAANE-----GTFGYTPRY 311
Query 120 IEYTTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNASTYIDPTIYNSIFAESRLSS 179
EY NE +G+F M AF LNR+++E + +T+++ N +FA + S
Sbjct 312 AEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSD 365
Query 180 QNFWVQVAFDVTARRVMSAKQIPNL 204
+WVQ+ D+ A R+M P L
Sbjct 366 DKYWVQIYQDIKALRLMPKYGTPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 77.0 bits (188), Expect = 5e-13, Method: Compositional matrix adjust.
Identities = 60/205 (29%), Positives = 98/205 (48%), Gaps = 16/205 (8%)
Query 2 QSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRID 60
++ I+ E++ S+T+ P +AG G++ G E IM + SI PR
Sbjct 351 RTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGFTRYFEEHGYIMGIMSIRPRTG 408
Query 61 YSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSW 119
Y QG K + + NMD F+ P +G QE+ EE ++A ++ + G P +
Sbjct 409 YQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLNESDAANE-----GTFGYTPRY 462
Query 120 IEYTTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNASTYIDPTIYNSIFAESRLSS 179
EY NE +G+F M AF LNR+++E + +T+++ N +FA + S
Sbjct 463 AEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSD 516
Query 180 QNFWVQVAFDVTARRVMSAKQIPNL 204
+WVQ+ D+ A R+M P L
Sbjct 517 DKYWVQIYQDIKALRLMPKYGTPML 541
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 76.6 bits (187), Expect = 6e-13, Method: Compositional matrix adjust.
Identities = 59/202 (29%), Positives = 95/202 (47%), Gaps = 16/202 (8%)
Query 5 IAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDYSQ 63
I+ E++ S+T+E P +AG G++ +G K E I+ + SITPR Y Q
Sbjct 366 ISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQ 423
Query 64 G-NKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSWIEY 122
G + +T+ NMD F+ P + QE+ ++D + G P + EY
Sbjct 424 GVPRDFTKFDNMD-FYFPEFAHLSEQEI-----KNQELFVSEDAAYNNGTFGYTPRYAEY 477
Query 123 TTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNASTYIDPTIYNSIFAESRLSSQNF 182
+E +G+F L+F LNR++E+ + +T+++ N +FA S F
Sbjct 478 KYHPSEAHGDFRGN--LSFWHLNRIFEDKPNLN----TTFVECKPSNRVFATSETEDDKF 531
Query 183 WVQVAFDVTARRVMSAKQIPNL 204
WVQ+ DV A R+M P L
Sbjct 532 WVQMYQDVKALRLMPKYGTPML 553
>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM
16608]
Length=531
Score = 68.6 bits (166), Expect = 3e-10, Method: Compositional matrix adjust.
Identities = 61/238 (26%), Positives = 101/238 (42%), Gaps = 38/238 (16%)
Query 1 MQSEIAFDEIVSNS-----ATEEEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSI 55
+ + E+V+ S A E LG L G+GV ++ S +K E +IM + S+
Sbjct 298 FDNPVVISEVVNQSEFDRGADESPCLGDLGGKGVGSLNSSSIDFDVK--EHGIIMCIYSV 355
Query 56 TPRIDYSQGNKW--WTRLQNMDDFHKPTLDAIGFQELIae-----------------eaa 96
P+ +Y G + + R +DF +P +G+Q ++ +
Sbjct 356 VPQTEY-NGTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKR 414
Query 97 awtteatDDHELVYQSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTI 156
E + LG Q + EY T + +GEF +G+ L++ C R Y+ D
Sbjct 415 LAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLSYWCSPR-YDFGFDGKA 473
Query 157 GN----------ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 204
G+ A Y++P+I N+IF S + + +F V FDV A R MS + L
Sbjct 474 GDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL 531
>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584
Score = 66.2 bits (160), Expect = 2e-09, Method: Compositional matrix adjust.
Identities = 66/241 (27%), Positives = 105/241 (44%), Gaps = 43/241 (18%)
Query 1 MQSEIAFDEIVS---NSATE--EEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSI 55
+ I E+VS N+A++ +G L G+G+ +M S ++ TE +IM + S+
Sbjct 350 FDNSIVVSEVVSTNGNAASDGSHASIGDLGGKGIGSM--SSGTIEFDSTEHGIIMCIYSV 407
Query 56 TPRIDYSQG-----NKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatD------ 104
P+ +Y+ N+ TR Q F++P +G+Q LI + T +
Sbjct 408 APQSEYNASYLDPFNRKLTREQ----FYQPEFADLGYQALIGSDLICSTLGMNEKQAGFS 463
Query 105 DHELVYQSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNR-----------VYEENSD 153
D EL LG Q + EY T + +G+F +G L++ C R + EN
Sbjct 464 DIELNNNLLGYQVRYNEYKTARDLVFGDFESGKSLSYWCTPRFDFGYGDTEKKIAPENKG 523
Query 154 ----HTIGNAST------YIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPN 203
GN S YI+P + N IF S + + +F V DV A R MS + +
Sbjct 524 GADYRKKGNRSHWSSRNFYINPNLVNPIFLTSAVQADHFIVNSFLDVKAVRPMSVTGLSS 583
Query 204 L 204
L
Sbjct 584 L 584
>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens
CC14M]
Length=656
Score = 55.5 bits (132), Expect = 5e-06, Method: Compositional matrix adjust.
Identities = 54/216 (25%), Positives = 94/216 (44%), Gaps = 20/216 (9%)
Query 1 MQSEIAFDEIVSNS-------ATEEEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALG 53
++I+ E+V+ S A+ +G + G+G+ M SG + E +IM +
Sbjct 443 FDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAM-NSGH-ISYDVKEHGLIMCIY 500
Query 54 SITPRIDY-SQGNKWWTRLQNMDDFHKPTLDAIGFQELIaeeaaawtteatDDHELVYQS 112
SI P++DY ++ + R + +D+ +P + +G Q +I + A D + +
Sbjct 501 SIAPQVDYDARELDPFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQHNN 560
Query 113 -LGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNAS---TYIDPTIY 168
LG ++EY T + +GEF +G L+ + N G S +DP +
Sbjct 561 VLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPK---NNYTFEFGKLSLPDLLVDPKVL 617
Query 169 NSIFA---ESRLSSQNFWVQVAFDVTARRVMSAKQI 201
IFA +S+ F V FDV A R M +
Sbjct 618 EPIFAVKYNGSMSTDQFLVNSYFDVKAIRPMQVNDM 653
>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon
317 str. F0108]
Length=541
Score = 53.5 bits (127), Expect = 2e-05, Method: Compositional matrix adjust.
Identities = 37/154 (24%), Positives = 69/154 (45%), Gaps = 12/154 (8%)
Query 21 LGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDYSQGN-KWWTRLQNMDDFHK 79
LG + G+G + Y ++ EP ++M + S+ P + Y + Q D+
Sbjct 365 LGKITGKGTGSGYGE---IQFDAKEPGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYFI 421
Query 80 PTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSWIEYTTDVNETYGEFAAGMPL 139
P + +G Q ++ + + S G QP + EY T + +G+FA G PL
Sbjct 422 PEFENLGMQPIVPAFVSLNRAKD--------NSYGWQPRYSEYKTAFDINHGQFANGEPL 473
Query 140 AFMCLNRVYEENSDHTIGNASTYIDPTIYNSIFA 173
++ + R ++ +T A+ I+P +S+FA
Sbjct 474 SYWSIARARGSDTLNTFNVAALKINPHWLDSVFA 507
>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=519
Score = 50.1 bits (118), Expect = 3e-04, Method: Compositional matrix adjust.
Identities = 50/170 (29%), Positives = 75/170 (44%), Gaps = 25/170 (15%)
Query 14 SATEEEP----LGTLAGRGVATMYKSGRG-LKIKCTEPSMIMALGSITPRIDYSQGNKWW 68
+ATE +P LG +AG+G SGRG + E ++M + S+ P+I Y
Sbjct 329 TATEYKPEAGYLGRIAGKGTG----SGRGRIVFDAKEHGVLMCIYSLVPQIQYD-----C 379
Query 69 TRLQNMDD------FHKPTLDAIGFQELIaeeaaawtteatDDHELVYQSLGKQPSWIEY 122
TRL M D F P + +G Q L + + D V LG QP + EY
Sbjct 380 TRLDPMVDKLDRFDFFTPEFENLGMQPL--NSSYISSFCTPDPKNPV---LGYQPRYSEY 434
Query 123 TTDVNETYGEFAAGMPLAFMCLNRVYEENSDHTIGNASTYIDPTIYNSIF 172
T ++ +G+FA L+ ++R + + A IDP NS+F
Sbjct 435 KTALDINHGQFAQNDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSVF 484
Lambda   K        H        a        alpha
0.317    0.132    0.393    0.792    4.96
Gapped
Lambda   K        H        a        alpha    sigma
0.267    0.0410   0.140    1.90     42.6     43.6
Effective search space used: 671121370458
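The gapped Lambda and K values above connect the raw scores (the numbers in parentheses on each "Score =" line) to the reported bit scores through the standard normalization S' = (Lambda * S - ln K) / ln 2. The sketch below checks that relation against this report; the function name `bit_score` is a hypothetical helper, and because Lambda and K are printed to only three significant figures, agreement is expected to within about 0.1 bit rather than exactly.

```python
import math

# Gapped Karlin-Altschul parameters as printed in the report footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score: float, lam: float = GAPPED_LAMBDA, k: float = GAPPED_K) -> float:
    """Normalize a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw score -> bit score pairs copied from the "Score =" lines of the report.
reported = {
    189: 77.4, 193: 79.0, 188: 77.0, 187: 76.6, 166: 68.6,
    160: 66.2, 132: 55.5, 127: 53.5, 118: 50.1,
}

for raw, bits in reported.items():
    computed = bit_score(raw)
    print(f"raw={raw:4d}  reported={bits:5.1f}  computed={computed:6.2f}")
```

The E-values are left alone here: the summary table shows they are not a function of the bit score alone (79.0 bits is listed at 9e-14, while 77.4 bits is listed at 5e-14), so reproducing them would need more than these two constants.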