BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-23_CDS_annotation_glimmer3.pl_2_1
Length=199

                                                                  Score     E
Sequences producing significant alignments:                       (Bits)  Value

gi|649557305|gb|KDS63784.1| capsid family protein                  79.0    1e-14
gi|649569140|gb|KDS75238.1| capsid family protein                  78.6    5e-14
gi|649555287|gb|KDS61824.1| capsid family protein                  78.2    1e-13
gi|492501782|ref|WP_005867318.1| hypothetical protein              74.3    3e-12
gi|547920049|ref|WP_022322420.1| capsid protein VP1                73.2    8e-12
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein  68.2    2e-10
gi|494610271|ref|WP_007368517.1| capsid protein                    64.7    4e-09
gi|647452987|ref|WP_025792807.1| hypothetical protein              60.5    1e-07
gi|565841287|ref|WP_023924568.1| hypothetical protein              59.3    3e-07
gi|648626869|ref|WP_026318620.1| hypothetical protein              50.8    1e-04

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

Score = 79.0 bits (193), Expect = 1e-14, Method: Compositional matrix adjust.
Identities = 62/198 (31%), Positives = 90/198 (45%), Gaps = 18/198 (9%)

Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S P +AG GI+AG G E GY+M I SI PR Y QG D
Sbjct 66 TSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRK 123

Query 62 ISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTY 121
D + P +G Q+ N +++Y+N + A + T G T + +Y + N +
Sbjct 124 FDNMDFYFPEFAHLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVH 177

Query 122 GNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQI 181
G+F M AF LNR ++ K P ++ TT+++ N +FA +WVQI
Sbjct 178 GDFRGNM--AFWHLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQI 227

Query 182 KFDITARRLMSAKQIPNL 199
DI A RLM P L
Sbjct 228 YQDIKALRLMPKYGTPML 245

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

Score = 78.6 bits (192), Expect = 5e-14, Method: Compositional matrix adjust.
Identities = 62/198 (31%), Positives = 90/198 (45%), Gaps = 18/198 (9%)

Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S P +AG GI+AG G E GY+M I SI PR Y QG D
Sbjct 211 TSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRK 268

Query 62 ISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTY 121
D + P +G Q+ N +++Y+N + A + T G T + +Y + N +
Sbjct 269 FDNMDFYFPEFAHLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVH 322

Query 122 GNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQI 181
G+F M AF LNR ++ K P ++ TT+++ N +FA +WVQI
Sbjct 323 GDFRGNM--AFWHLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQI 372

Query 182 KFDITARRLMSAKQIPNL 199
DI A RLM P L
Sbjct 373 YQDIKALRLMPKYGTPML 390

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

Score = 78.2 bits (191), Expect = 1e-13, Method: Compositional matrix adjust.
Identities = 62/198 (31%), Positives = 90/198 (45%), Gaps = 18/198 (9%)

Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S P +AG GI+AG G E GY+M I SI PR Y QG D
Sbjct 362 TSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRK 419

Query 62 ISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNRTY 121
D + P +G Q+ N +++Y+N + A + T G T + +Y + N +
Sbjct 420 FDNMDFYFPEFAHLGEQEIKN------EELYLNESDAANEGTFGYTPRYAEYKYSQNEVH 473

Query 122 GNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWVQI 181
G+F M AF LNR ++ K P ++ TT+++ N +FA +WVQI
Sbjct 474 GDFRGNM--AFWHLNRIFKEK------PNLN--TTFVECNPSNRVFATAETSDDKYWVQI 523

Query 182 KFDITARRLMSAKQIPNL 199
DI A RLM P L
Sbjct 524 YQDIKALRLMPKYGTPML 541

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

Score = 74.3 bits (181), Expect = 3e-12, Method: Compositional matrix adjust.
Identities = 60/200 (30%), Positives = 92/200 (46%), Gaps = 19/200 (10%)

Query 1 VSNSATEN-EPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDS 59
+ SAT++ P +AG GI+AG G K E GY++ I SI PR Y QG D
Sbjct 357 LQTSATDSTSPQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGYQQGVPKDF 414

Query 60 DWISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNR 119
D + P +G Q+ N ++VY+ T + T G T + +Y ++N
Sbjct 415 RKFDNMDFYFPEFAHLGEQEIKN------EEVYLQQTPASNNGTFGYTPRYAEYKYSMNE 468

Query 120 TYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWV 179
+G+F M AF LNR + +P ++ TT+++ N +FA +W+
Sbjct 469 VHGDFRGNM--AFWHLNRIFS------ESPNLN--TTFVECNPSNRVFATAETSDDKYWI 518

Query 180 QIKFDITARRLMSAKQIPNL 199
Q+ D+ A RLM P L
Sbjct 519 QLYQDVKALRLMPKYGTPML 538

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

Score = 73.2 bits (178), Expect = 8e-12, Method: Compositional matrix adjust.
Identities = 62/200 (31%), Positives = 92/200 (46%), Gaps = 22/200 (11%)

Query 2 SNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDSDW 61
++S E P +AG GI+AG G K E GY++ I SITPR Y QG D+
Sbjct 374 TSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQG--VPRDF 429

Query 62 ISLDDM--HKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVNR 119
D+M + P + Q+ N +D NN T G T + +Y + +
Sbjct 430 TKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNG------TFGYTPRYAEYKYHPSE 483

Query 120 TYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFADTSIDAMNFWV 179
+G+F +S F LNR +E K P ++ TT+++ N +FA + + FWV
Sbjct 484 AHGDFRGNLS--FWHLNRIFEDK------PNLN--TTFVECKPSNRVFATSETEDDKFWV 533

Query 180 QIKFDITARRLMSAKQIPNL 199
Q+ D+ A RLM P L
Sbjct 534 QMYQDVKALRLMPKYGTPML 553

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

Score = 68.2 bits (165), Expect = 2e-10, Method: Compositional matrix adjust.
Identities = 62/220 (28%), Positives = 92/220 (42%), Gaps = 25/220 (11%)

Query 3 NSATENEPLGTLAG---RGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFDS 59
+++ E+ LG LA R + G I A EPG M IT + P YSQG D
Sbjct 119 SASGEDANLGQLAACVDRYCDFSGHSG--IDYYAKEPGTFMLITMLVPEPAYSQGLHPDL 176

Query 60 DWISLDDMHKPALDGIGYQDSVN------------SGRAWWDDVYINNTGKAA-----KR 102
IS D P L+GIG+Q +G + +TG
Sbjct 177 ASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMV 236

Query 103 TAGKTVAWIDYMTNVNRTYGNFAAGMSEAFMVLNRNYEMKYEAGTNPKISD---LTTYID 159
+ G+ VAW T+ +R +G+FA + + VL R + + D TYI+
Sbjct 237 SVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEYTGTYIN 296

Query 160 PVKYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL 199
P+ + Y+F D ++ A NF FD+ +SA +P L
Sbjct 297 PLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL 336

>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608]
Length=531

Score = 64.7 bits (156), Expect = 4e-09, Method: Compositional matrix adjust.
Identities = 58/218 (27%), Positives = 96/218 (44%), Gaps = 26/218 (12%)

Query 5 ATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYSQGNDFD--SDWI 62
A E+ LG L G+G+ G I E G +MCI S+ P+ +Y+ G FD + +
Sbjct 317 ADESPCLGDLGGKGV--GSLNSSSIDFDVKEHGIIMCIYSVVPQTEYN-GTYFDPFNRKL 373

Query 63 SLDDMHKPALDGIGYQDSVNSG--RAWWDDVYINNTGKAAKRTAGKTVAWID-------- 112
+D +P +GYQ V S + D+ + K + AG ++ I+
Sbjct 374 RREDFFQPEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGW 433

Query 113 ------YMTNVNRTYGNFAAGMSEAFMVLNR-NYEMKYEAG----TNPKISDLTTYIDPV 161
Y T+ + +G F +G+S ++ R ++ +AG N S Y++P
Sbjct 434 QVRYNEYKTSRDLVFGEFESGLSLSYWCSPRYDFGFDGKAGDKKLVNSPWSPAHFYVNPS 493

Query 162 KYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL 199
N IF +++ A +F V FD+ A R MS + L
Sbjct 494 ILNTIFLVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL 531

>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584

Score = 60.5 bits (145), Expect = 1e-07, Method: Compositional matrix adjust.
Identities = 55/220 (25%), Positives = 95/220 (43%), Gaps = 41/220 (19%)

Query 11 LGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDY--SQGNDFDSDWISLDDMH 68
+G L G+GI G G I+ +TE G +MCI S+ P+ +Y S + F+ ++ + +
Sbjct 375 IGDLGGKGI--GSMSSGTIEFDSTEHGIIMCIYSVAPQSEYNASYLDPFNRK-LTREQFY 431

Query 69 KPALDGIGYQDSV-----------NSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNV 117
+P +GYQ + N +A + D+ +NN G V + +Y T
Sbjct 432 QPEFADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNN------LLGYQVRYNEYKTAR 485

Query 118 NRTYGNFAAGMSEAFMVLNRNYEMKY------------------EAGTNPKISDLTTYID 159
+ +G+F +G S ++ R ++ Y + G S YI+
Sbjct 486 DLVFGDFESGKSLSYWCTPR-FDFGYGDTEKKIAPENKGGADYRKKGNRSHWSSRNFYIN 544

Query 160 PVKYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPNL 199
P N IF +++ A +F V D+ A R MS + +L
Sbjct 545 PNLVNPIFLTSAVQADHFIVNSFLDVKAVRPMSVTGLSSL 584

>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens CC14M]
Length=656

Score = 59.3 bits (142), Expect = 3e-07, Method: Compositional matrix adjust.
Identities = 50/197 (25%), Positives = 87/197 (44%), Gaps = 14/197 (7%)

Query 1 VSNSATENEPLGTLAGRGINAGKQKGGKIKIRATEPGYMMCITSITPRIDYS--QGNDFD 58
V +A+ +G + G+GI G G I E G +MCI SI P++DY + + F+
Sbjct 460 VDGTASTGSVVGQVFGKGI--GAMNSGHISYDVKEHGLIMCIYSIAPQVDYDARELDPFN 517

Query 59 SDWISLDDMHKPALDGIGYQDSVNSGRAWWDDVYINNTGKAAKRTAGKTVAWIDYMTNVN 118
+ S +D +P + +G Q + S + +++ G + +++Y T +
Sbjct 518 RKF-SREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQHNNVLGYSARYLEYKTARD 576

Query 119 RTYGNFAAGMS-EAFMVLNRNYEMKYEAGTNPKISDLTTYIDPVKYNYIFA---DTSIDA 174
+G F +G S A+ NY ++ + P + +DP IFA + S+
Sbjct 577 IIFGEFMSGGSLSAWATPKNNYTFEFGKLSLPDL-----LVDPKVLEPIFAVKYNGSMST 631

Query 175 MNFWVQIKFDITARRLM 191
F V FD+ A R M
Sbjct 632 DQFLVNSYFDVKAIRPM 648

>gi|648626869|ref|WP_026318620.1| hypothetical protein [Alistipes onderdonkii]
Length=231

Score = 50.8 bits (120), Expect = 1e-04, Method: Compositional matrix adjust.
Identities = 56/217 (26%), Positives = 79/217 (36%), Gaps = 56/217 (26%)

Query 35 EPGYMMCITSITPRIDYSQGNDFDSDWISLDDMHKPALDGIGYQDSV----------NSG 84
E GY M ITS+ P + Y + +L + PALD I Q N+G
Sbjct 16 ESGYFMEITSVVPTVMYPNYLNPTLLQTNLGQRYAPALDNIQMQPLTVPTLLGNAYFNTG 75

Query 85 RAWWDDVYINNTGKAAKRTA-------------GKTVAWIDYMTNVNRTYGNFAAGMSEA 131
+ V +N+ G RT G AW + MT V++ +G +
Sbjct 76 SGSYSHV-LNHMGTGELRTVAVDKLSAAEGIAVGYQPAWAELMTGVSKPHGRLCNDLD-- 132

Query 132 FMVLNRNY------------------EMKYEAGT-----------NPKIS-DLTTYIDPV 161
+ R Y E+ E T N +S D YI P
Sbjct 133 YWAFQRRYGTVLYSSNDAQDASVFLEELGNEVDTLDVETFNAWLKNTYVSTDFVPYILPA 192

Query 162 KYNYIFADTSIDAMNFWVQIKFDITARRLMSAKQIPN 198
YNY+FADT +A NF + +I+ R S +PN
Sbjct 193 MYNYVFADTDPNAQNFVLDNSAEISVYREKSKVNVPN 229

Lambda      K        H        a        alpha
0.316       0.133    0.402    0.792    4.96

Gapped
Lambda      K        H        a        alpha    sigma
0.267       0.0410   0.140    1.90     42.6     43.6

Effective search space used: 631402151361
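
The bit scores in this report can be related to the raw scores shown in parentheses through the gapped Lambda and K values in the statistics footer. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion, bits = (Lambda * S - ln K) / ln 2. The function name bit_score and the constant names are illustrative and are not part of BLAST. E-values are deliberately left out, since they also depend on search-space adjustments this sketch does not model.

```python
import math

# Gapped Karlin-Altschul parameters copied from the report footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores (the numbers in parentheses) paired with the bit scores
# reported for the same hits in this output.
reported = {193: 79.0, 192: 78.6, 191: 78.2, 181: 74.3, 120: 50.8}

for raw, bits in reported.items():
    # Print raw score, recomputed bit score, and the value BLAST reported.
    print(raw, round(bit_score(raw), 1), bits)
```

Because the footer parameters are themselves rounded, the recomputed values should only be expected to agree with the report to the one decimal place it prints.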