Bit score colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-46_CDS_annotation_glimmer3.pl_2_2
Length=461

                                                              Score     E
Sequences producing significant alignments:                  (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein             87.4   4e-16
gi|492501782|ref|WP_005867318.1|  hypothetical protein         89.7   7e-16
gi|649569140|gb|KDS75238.1|  capsid family protein             86.7   3e-15
gi|649555287|gb|KDS61824.1|  capsid family protein             87.0   5e-15
gi|494610271|ref|WP_007368517.1|  capsid protein               87.0   5e-15
gi|547920049|ref|WP_022322420.1|  capsid protein VP1           85.9   1e-14
gi|647452987|ref|WP_025792807.1|  hypothetical protein         70.5   1e-09
gi|565841287|ref|WP_023924568.1|  hypothetical protein         65.5   4e-08
gi|494308783|ref|WP_007173938.1|  hypothetical protein         61.2   8e-07
gi|599087863|gb|AHN52857.1|  major capsid protein              57.8   2e-06

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 87.4 bits (215),  Expect = 4e-16, Method: Compositional matrix adjust.
Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%) Query 239 ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL 296 + +G+RS+ + P F GG ++ I+ E++ S+T+ P +AG G++ G Sbjct 34 SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF 91 Query 297 KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw 355 E IM + SI PR Y QG K + + NMD F+ P +G QE+ EE Sbjct 92 TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN 150 Query 356 ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN 415 ++ + G P + EY NE +G+F M AF LNR+++E + Sbjct 151 ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN--- 200 Query 416 ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461 +T+++ N +FA + S +WVQ+ D+ A R+M P L Sbjct 201 -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML 245 >gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis] gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24] Length=538 Score = 89.7 bits (221), Expect = 7e-16, Method: Compositional matrix adjust. 
Identities = 75/264 (28%), Positives = 124/264 (47%), Gaps = 17/264 (6%) Query 201 VDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIFCGGMQ 259 V+V + ++++ L + R A + Y + +G+RS+ + P F GG + Sbjct 289 VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR 348 Query 260 SEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDY 318 + I+ E++ SAT+ P +AG G++ G K E I+ + SI PR Y Sbjct 349 TPISVSEVLQTSATDSTSPQANMAGHGISA--GVNHGFKRYFEEHGYIIGIMSIRPRTGY 406 Query 319 SQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGKQPSWI 377 QG K + + NMD F+ P +G QE+ EE T + N + G P + Sbjct 407 QQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEEVYLQQTPASNN-----GTFGYTPRYA 460 Query 378 EYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAESRLSSQ 437 EY +NE +G+F M AF LNR++ E+ + +T+++ N +FA + S Sbjct 461 EYKYSMNEVHGDFRGNM--AFWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSDD 514 Query 438 NFWVQVAFDVTARRVMSAKQIPNL 461 +W+Q+ DV A R+M P L Sbjct 515 KYWIQLYQDVKALRLMPKYGTPML 538 >gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6] Length=390 Score = 86.7 bits (213), Expect = 3e-15, Method: Compositional matrix adjust. 
Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%) Query 239 ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL 296 + +G+RS+ + P F GG ++ I+ E++ S+T+ P +AG G++ G Sbjct 179 SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF 236 Query 297 KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw 355 E IM + SI PR Y QG K + + NMD F+ P +G QE+ EE Sbjct 237 TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN 295 Query 356 ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN 415 ++ + G P + EY NE +G+F M AF LNR+++E + Sbjct 296 ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN--- 345 Query 416 ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461 +T+++ N +FA + S +WVQ+ D+ A R+M P L Sbjct 346 -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML 390 >gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6] Length=541 Score = 87.0 bits (214), Expect = 5e-15, Method: Compositional matrix adjust. 
Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%) Query 239 ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL 296 + +G+RS+ + P F GG ++ I+ E++ S+T+ P +AG G++ G Sbjct 330 SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF 387 Query 297 KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw 355 E IM + SI PR Y QG K + + NMD F+ P +G QE+ EE Sbjct 388 TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN 446 Query 356 ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN 415 ++ + G P + EY NE +G+F M AF LNR+++E + Sbjct 447 ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN--- 496 Query 416 ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461 +T+++ N +FA + S +WVQ+ D+ A R+M P L Sbjct 497 -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML 541 >gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis] gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608] Length=531 Score = 87.0 bits (214), Expect = 5e-15, Method: Compositional matrix adjust. 
Identities = 83/314 (26%), Positives = 138/314 (44%), Gaps = 55/314 (18%) Query 188 IDGTTGGINAITAVDVTD--GKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS 245 + G + IN ++ + V D +D ++ + N L+ Y + EA +G R Sbjct 233 VSGASTFINGVSVLSVNDLRAAFALDKMLEATRRANGLD--------YSSQIEAHFGFR- 283 Query 246 ATLPESPI----FCGGMQSEIAFDEIVSNS-----ATEEEPLGTLAGRGVATMYKSGRGL 296 +PES F GG + + E+V+ S A E LG L G+GV ++ S Sbjct 284 --VPESRAGDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGVGSLNSSSIDF 341 Query 297 KIKCTEPSMIMALGSITPRIDYSQGNKW--WTRLQNMDDFHKPTLDAIGFQ-----ELIt 349 +K E +IM + S+ P+ +Y+ G + + R +DF +P +G+Q +LI+ Sbjct 342 DVK--EHGIIMCIYSVVPQTEYN-GTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLIS 398 Query 350 eeaaawntettenYKHIYSS------------LGKQPSWIEYTTDVNETYGEFAAGMPLA 397 + E K + + LG Q + EY T + +GEF +G+ L+ Sbjct 399 TYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLS 458 Query 398 FMCLNRVYEENTDHTIGN----------ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDV 447 + C R Y+ D G+ A Y++P+I N+IF S + + +F V FDV Sbjct 459 YWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV 517 Query 448 TARRVMSAKQIPNL 461 A R MS + L Sbjct 518 KAVRPMSVSGLAGL 531 >gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48] gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48] Length=553 Score = 85.9 bits (211), Expect = 1e-14, Method: Compositional matrix adjust. 
Identities = 73/269 (27%), Positives = 121/269 (45%), Gaps = 17/269 (6%) Query 196 NAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIF 254 N V+V + + ++ L + R A Y + +G+RS+ + P F Sbjct 299 NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQF 358 Query 255 CGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSIT 313 GG + I+ E++ S+T+E P +AG G++ +G K E I+ + SIT Sbjct 359 LGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSIT 416 Query 314 PRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGK 372 PR Y QG + +T+ NMD F+ P + QE+ + Y + + G Sbjct 417 PRSGYQQGVPRDFTKFDNMD-FYFPEFAHLSEQEI---KNQELFVSEDAAYNN--GTFGY 470 Query 373 QPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAES 432 P + EY +E +G+F L+F LNR++E+ + +T+++ N +FA S Sbjct 471 TPRYAEYKYHPSEAHGDFRGN--LSFWHLNRIFEDKPNLN----TTFVECKPSNRVFATS 524 Query 433 RLSSQNFWVQVAFDVTARRVMSAKQIPNL 461 FWVQ+ DV A R+M P L Sbjct 525 ETEDDKFWVQMYQDVKALRLMPKYGTPML 553 >gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola] Length=584 Score = 70.5 bits (171), Expect = 1e-09, Method: Compositional matrix adjust. 
Identities = 74/270 (27%), Positives = 118/270 (44%), Gaps = 50/270 (19%) Query 233 YQAWREATYGIRSATLPESPI----FCGGMQSEIAFDEIVS---NSATE--EEPLGTLAG 283 Y + EA +G + +PES F GG + I E+VS N+A++ +G L G Sbjct 324 YASQIEAHFGFK---VPESRANDARFLGGFDNSIVVSEVVSTNGNAASDGSHASIGDLGG 380 Query 284 RGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDYSQG-----NKWWTRLQNMDDFHKPT 338 +G+ +M S ++ TE +IM + S+ P+ +Y+ N+ TR Q F++P Sbjct 381 KGIGSM--SSGTIEFDSTEHGIIMCIYSVAPQSEYNASYLDPFNRKLTREQ----FYQPE 434 Query 339 LDAIGFQELIte---eaaawntettenYKHIYSS---LGKQPSWIEYTTDVNETYGEFAA 392 +G+Q LI + E + I + LG Q + EY T + +G+F + Sbjct 435 FADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNNLLGYQVRYNEYKTARDLVFGDFES 494 Query 393 GMPLAFMCLNR-----------VYEENTD----HTIGNAST------YIDPTIYNSIFAE 431 G L++ C R + EN GN S YI+P + N IF Sbjct 495 GKSLSYWCTPRFDFGYGDTEKKIAPENKGGADYRKKGNRSHWSSRNFYINPNLVNPIFLT 554 Query 432 SRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461 S + + +F V DV A R MS + +L Sbjct 555 SAVQADHFIVNSFLDVKAVRPMSVTGLSSL 584 >gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens] gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens CC14M] Length=656 Score = 65.5 bits (158), Expect = 4e-08, Method: Compositional matrix adjust. 
Identities = 83/338 (25%), Positives = 138/338 (41%), Gaps = 49/338 (14%) Query 144 GTIELPNYNRTKVYRSSNAWFSQAGLAVKTYLSDRFNN---WLNTEWIDGTTGGINAITA 200 G ELP+Y G A D NN L + +D + G N I+ Sbjct 342 GIFELPDYIN-----------GNTGFATTEVKRDVVNNRGSQLEIKSMDAGSLGSNNISY 390 Query 201 VDVTDGKLTMDALILQKKIFNMLNRVAITDGT-YQAWREATYGIRSATLPES----PIFC 255 + D + A+ +K ML R +G Y A +G + +PES F Sbjct 391 ISPND----IRAMFALEK---MLERTRAANGLDYSNQIAAHFGFK---VPESRKNCASFI 440 Query 256 GGMQSEIAFDEIVSNS-------ATEEEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMA 308 GG ++I+ E+V+ S A+ +G + G+G+ M SG + E +IM Sbjct 441 GGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAM-NSGH-ISYDVKEHGLIMC 498 Query 309 LGSITPRIDY-SQGNKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIY 367 + SI P++DY ++ + R + +D+ +P + +G Q +I + + + Sbjct 499 IYSIAPQVDYDARELDPFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQH 558 Query 368 SS-LGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNAS---TYIDPT 423 ++ LG ++EY T + +GEF +G L+ + N G S +DP Sbjct 559 NNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPK---NNYTFEFGKLSLPDLLVDPK 615 Query 424 IYNSIFA---ESRLSSQNFWVQVAFDVTARRVMSAKQI 458 + IFA +S+ F V FDV A R M + Sbjct 616 VLEPIFAVKYNGSMSTDQFLVNSYFDVKAIRPMQVNDM 653 >gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis] gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361] Length=553 Score = 61.2 bits (147), Expect = 8e-07, Method: Compositional matrix adjust. 
Identities = 67/253 (26%), Positives = 111/253 (44%), Gaps = 35/253 (14%) Query 195 INAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLPESPI- 253 +N D ++G ++ +L + +L+ T+Q A YG+ +P+S Sbjct 283 VNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVE---IPDSRDG 339 Query 254 ---FCGGMQSEIAFDEIVSNS---ATEEEP----LGTLAGRGVATMYKSGRG-LKIKCTE 302 + GG S++ ++ S ATE +P LG +AG+G SGRG + E Sbjct 340 RVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTG----SGRGRIVFDAKE 395 Query 303 PSMIMALGSITPRIDYSQGNKWWTRLQNMDD------FHKPTLDAIGFQELIteeaaawn 356 ++M + S+ P+I Y TRL M D + P + +G Q L + +++ Sbjct 396 HGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFC 450 Query 357 tettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNA 416 T +N LG QP + EY T ++ +G+FA L+ ++R T + A Sbjct 451 TTDPKN-----PVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIA 505 Query 417 STYIDPTIYNSIF 429 IDP NSIF Sbjct 506 DFKIDPGCLNSIF 518 >gi|599087863|gb|AHN52857.1| major capsid protein, partial [uncultured Gokushovirinae] Length=219 Score = 57.8 bits (138), Expect = 2e-06, Method: Compositional matrix adjust. Identities = 48/168 (29%), Positives = 81/168 (48%), Gaps = 11/168 (7%) Query 188 IDGTTGGINAITAVDVTDG-KLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS- 245 + G T ++ + D+T+ T++ L +I ML R A Y ++ +G+ S Sbjct 51 VSGDTSAVSNVMYADLTEATAATINQLRQAFQIQKMLERDARGGTRYTEIIKSHFGVTSP 110 Query 246 -ATLPESPIFCGGMQSEIAFDEIVSNSATEEE---PLGTLAGRGVATMYKSGRGLKIKCT 301 A L + P + GG + + + + S T+++ P GTLA G A + G G T Sbjct 111 DARL-QRPEYLGGGSTPVIINPVAQTSGTDQQSDTPQGTLAAIGTAQV--RGHGFTKSFT 167 Query 302 EPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELI 348 E +I+ L S+ + Y QG N+ W R Q D++ P L +G QE++ Sbjct 168 EHCIILGLVSVRADLTYQQGLNRMWNR-QTRYDYYFPALSHLGEQEIL 214 Lambda K H a alpha 0.316 0.131 0.388 0.792 4.96 Gapped Lambda K H a alpha sigma 0.267 0.0410 0.140 1.90 42.6 43.6 Effective search space used: 3125101418307