bitscore colors: <40, 40-50, 50-80, 80-200, >200
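The legend above defines five bit-score ranges. A minimal sketch of how a score could be mapped to its range follows. The helper name `bitscore_bin` is hypothetical, and the boundary handling (a value equal to an edge falls into the higher bin) is an assumption, since the legend does not say where 40, 50, 80 or 200 belong.

```python
from bisect import bisect_right

# Bin edges and labels copied from the legend: <40, 40-50, 50-80, 80-200, >200.
BIN_EDGES = [40, 50, 80, 200]
BIN_LABELS = ["<40", "40-50", "50-80", "80-200", ">200"]


def bitscore_bin(bits: float) -> str:
    """Return the legend label for a bit score (hypothetical helper).

    Assumption: a score equal to an edge is placed in the higher bin,
    so 40 maps to "40-50" and 200 maps to ">200".
    """
    return BIN_LABELS[bisect_right(BIN_EDGES, bits)]


if __name__ == "__main__":
    # Bit scores taken from the hit table of this report.
    for score in (84.3, 83.2, 76.3, 73.2):
        print(score, bitscore_bin(score))
```

Under this binning, every hit in the report falls into either the 50-80 or the 80-200 range.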
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-39_CDS_annotation_glimmer3.pl_2_2

Length=716
                                                                Score     E
Sequences producing significant alignments:                    (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein              83.2    3e-14
gi|492501782|ref|WP_005867318.1|  hypothetical protein          84.3    1e-13
gi|649569140|gb|KDS75238.1|  capsid family protein              83.2    1e-13
gi|649555287|gb|KDS61824.1|  capsid family protein              83.2    2e-13
gi|494610271|ref|WP_007368517.1|  capsid protein                82.4    3e-13
gi|547920049|ref|WP_022322420.1|  capsid protein VP1            81.3    9e-13
gi|494308783|ref|WP_007173938.1|  hypothetical protein          76.3    3e-11
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein    75.1    7e-11
gi|494306153|ref|WP_007173049.1|  hypothetical protein          75.1    8e-11
gi|496521299|ref|WP_009229582.1|  capsid protein                73.2    3e-10

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

 Score = 83.2 bits (204),  Expect = 3e-14, Method: Compositional matrix adjust.
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%) Query 473 LNRIAVSGGSYRDWQEAVFGVRVSRAA-ESPIYVGGYASEIVFDEVVSTAAFESGETGQE 531 R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S Sbjct 18 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS 73 Query 532 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 590 P ++AG G G EE IM + SI PR Y QG K + + D M DF+ Sbjct 74 PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 130 Query 591 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 650 P +G QE+ ++E++ +N + + + G P + EY + NE +GDF Sbjct 131 FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN-- 183 Query 651 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 710 + + NR++ P L TT+++ N+ FA + +W+QI D+ R+ Sbjct 184 MAFWHLNRIF--KEKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK 239 Query 711 REIPNL 716 P L Sbjct 240 YGTPML 245 >gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis] gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24] Length=538 Score = 84.3 bits (207), Expect = 1e-13, Method: Compositional matrix adjust. 
Identities = 72/246 (29%), Positives = 115/246 (47%), Gaps = 20/246 (8%) Query 473 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 531 R A SG Y + + FGVR S A + P ++GG + I EV+ T+A +S Sbjct 311 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS----TS 366 Query 532 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 590 P ++AG G G K EE I+ + SI PR Y QG K + + D M DF+ Sbjct 367 PQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 423 Query 591 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 650 P +G QE+ ++E++ + + + + G P + EY ++NE +GDF Sbjct 424 FPEFAHLGEQEIKNEEVY-----LQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFRGN-- 476 Query 651 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 710 + + NR++ + P L TT+++ N+ FA + +WIQ+ DV R+ Sbjct 477 MAFWHLNRIF--SESPNLN--TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPK 532 Query 711 REIPNL 716 P L Sbjct 533 YGTPML 538 >gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6] Length=390 Score = 83.2 bits (204), Expect = 1e-13, Method: Compositional matrix adjust. 
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%) Query 473 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 531 R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S Sbjct 163 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS 218 Query 532 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 590 P ++AG G G EE IM + SI PR Y QG K + + D M DF+ Sbjct 219 PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 275 Query 591 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 650 P +G QE+ ++E++ +N + + + G P + EY + NE +GDF Sbjct 276 FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN-- 328 Query 651 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 710 + + NR++ P L TT+++ N+ FA + +W+QI D+ R+ Sbjct 329 MAFWHLNRIF--KEKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK 384 Query 711 REIPNL 716 P L Sbjct 385 YGTPML 390 >gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6] Length=541 Score = 83.2 bits (204), Expect = 2e-13, Method: Compositional matrix adjust. 
Identities = 71/246 (29%), Positives = 113/246 (46%), Gaps = 20/246 (8%) Query 473 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 531 R A SG Y + + FGVR S A + P ++GG + I EV+ T++ +S Sbjct 314 FERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDS----TS 369 Query 532 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 590 P ++AG G G EE IM + SI PR Y QG K + + D M DF+ Sbjct 370 PQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNM-DFY 426 Query 591 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 650 P +G QE+ ++E++ +N + + + G P + EY + NE +GDF Sbjct 427 FPEFAHLGEQEIKNEELY-----LNESDAANEGTFGYTPRYAEYKYSQNEVHGDFRGN-- 479 Query 651 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 710 + + NR++ P L TT+++ N+ FA + +W+QI D+ R+ Sbjct 480 MAFWHLNRIF--KEKPNLN--TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPK 535 Query 711 REIPNL 716 P L Sbjct 536 YGTPML 541 >gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis] gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 16608] Length=531 Score = 82.4 bits (202), Expect = 3e-13, Method: Compositional matrix adjust. 
Identities = 74/272 (27%), Positives = 117/272 (43%), Gaps = 44/272 (16%) Query 480 GGSYRDWQEAVFGVRV--SRAAESPIYVGGYASEIVFDEVVSTAAFESGETGQEPLGSLA 537 G Y EA FG RV SRA ++ ++GG+ + +V EVV+ + F+ G LG L Sbjct 269 GLDYSSQIEAHFGFRVPESRAGDA-RFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLG 327 Query 538 GRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQGNKWWTRVDTMN------DFHK 591 G+G +I +E +IM + S+VP+ +Y+ T D N DF + Sbjct 328 GKG--VGSLNSSSIDFDVKEHGIIMCIYSVVPQTEYNG-----TYFDPFNRKLRREDFFQ 380 Query 592 PNLDQIGFQELLSQEMHG------------RAWRVNANYKTTDFS-----VGKQPAWTEY 634 P +G+Q +++ ++ + R+ A Y + +G Q + EY Sbjct 381 PEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEY 440 Query 635 TTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKLKD----------ATTYIDPQIFNKAF 684 T+ + +G+F +G L Y R YD D K D A Y++P I N F Sbjct 441 KTSRDLVFGEFESGLSLSYWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIF 499 Query 685 ANSNLDAKNFWIQIGFDVIGRRVKSAREIPNL 716 S + A +F + FDV R S + L Sbjct 500 LVSAVKADHFLVNSFFDVKAVRPMSVSGLAGL 531 >gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48] gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48] Length=553 Score = 81.3 bits (199), Expect = 9e-13, Method: Compositional matrix adjust. 
Identities = 73/246 (30%), Positives = 115/246 (47%), Gaps = 20/246 (8%) Query 473 LNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEVVSTAAFESGETGQE 531 R A G Y + + FGVR S A + P ++GG I EV+ T++ + ET Sbjct 326 FERNARGGSRYIEQILSHFGVRSSDARLQRPQFLGGGRMPISVSEVLQTSS--TDET--S 381 Query 532 PLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG-NKWWTRVDTMNDFH 590 P ++AG G G K EE I+ + SI PR Y QG + +T+ D M DF+ Sbjct 382 PQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGYQQGVPRDFTKFDNM-DFY 438 Query 591 KPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEP 650 P + QE+ +QE+ V+ + + + G P + EY +E +GDF Sbjct 439 FPEFAHLSEQEIKNQELF-----VSEDAAYNNGTFGYTPRYAEYKYHPSEAHGDFRGN-- 491 Query 651 LEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWIQIGFDVIGRRVKSA 710 L + NR+++ + P L TT+++ + N+ FA S + FW+Q+ DV R+ Sbjct 492 LSFWHLNRIFE--DKPNLN--TTFVECKPSNRVFATSETEDDKFWVQMYQDVKALRLMPK 547 Query 711 REIPNL 716 P L Sbjct 548 YGTPML 553 >gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis] gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361] Length=553 Score = 76.3 bits (186), Expect = 3e-11, Method: Compositional matrix adjust. 
Identities = 67/259 (26%), Positives = 120/259 (46%), Gaps = 30/259 (12%) Query 440 DGTN--GINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSR 497 DG+N +N D + G ++ +L V +L+ +G +++D A +GV + Sbjct 276 DGSNFTRVNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPD 335 Query 498 AAESPI-YVGGYASEIVFDEVVSTAAFESGETGQEP--LGSLAGRGRETSKRGGKNIKIR 554 + + + Y+GG+ S++ +V T+ + E E LG +AG+G + G I Sbjct 336 SRDGRVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKG---TGSGRGRIVFD 392 Query 555 CEEPSLIMILGSIVPRVDYSQGNKWWTRVDTM------NDFHKPNLDQIGFQELLSQEMH 608 +E ++M + S+VP++ Y TR+D M D+ P + +G Q L S + Sbjct 393 AKEHGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYI- 446 Query 609 GRAWRVNANYKTTDFS---VGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVAND 665 +++ TTD +G QP ++EY T ++ +G FA + L + +R Sbjct 447 -------SSFCTTDPKNPVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTF 499 Query 666 PKLKDATTYIDPQIFNKAF 684 P+L+ A IDP N F Sbjct 500 PQLEIADFKIDPGCLNSIF 518 >gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis] Length=537 Score = 75.1 bits (183), Expect = 7e-11, Method: Compositional matrix adjust. 
Identities = 64/253 (25%), Positives = 118/253 (47%), Gaps = 16/253 (6%) Query 459 TMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSRA-AESPIYVGGYASEIVFDEV 517 T++ L K+ + L + A +G Y + + FGV+ S + P ++GG S I+ EV Sbjct 290 TVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMISEV 349 Query 518 VSTAAFESGETGQEPLGSLAGRGRETSKRGGKNIKIRCEEPSLIMILGSIVPRVDYSQG- 576 + +A +S P G++AG G K GG EE ++ L S++P+ YSQG Sbjct 350 LQQSATDS----TTPQGNMAGHGIGIGKDGG--FSRFFEEHGYVIGLMSVIPKTSYSQGI 403 Query 577 NKWWTRVDTMNDFHKPNLDQIGFQELLSQEMHGRAWRVNANYKTTDFSVGKQPAWTEYTT 636 + +++ D D+ P + IG Q + ++E+ + N + ++ G P ++EY Sbjct 404 PRHFSKSDKF-DYFWPQFEHIGEQPVYNKEIFAK----NIDAFDSEAVFGYLPRYSEYKF 458 Query 637 TVNETYGDFAAGEPLEYMAFNRVYDVANDPKLKDATTYIDPQIFNKAFANSNLDAKNFWI 696 + + +GDF + L + R++D P L + D ++ FA + D F+ Sbjct 459 SPSTVHGDFK--DDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVED-DTDKFYC 515 Query 697 QIGFDVIGRRVKS 709 + + +R S Sbjct 516 HLYQKITAKRKMS 528 >gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis] gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361] Length=519 Score = 75.1 bits (183), Expect = 8e-11, Method: Compositional matrix adjust. 
Identities = 65/252 (26%), Positives = 115/252 (46%), Gaps = 28/252 (11%) Query 445 INEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSRAAESPI- 503 +N VD G ++ +L V +L+ +G +++D A +GV + + + + Sbjct 249 LNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVN 308 Query 504 YVGGYASEIVFDEVVSTAAFESGETGQEP--LGSLAGRGRETSKRGGKNIKIRCEEPSLI 561 Y+GG+ S++ +V T+ + E E LG +AG+G + G I +E ++ Sbjct 309 YLGGFDSDLQVSDVTQTSGTTATEYKPEAGYLGRIAGKG---TGSGRGRIVFDAKEHGVL 365 Query 562 MILGSIVPRVDYSQGNKWWTRVDTM------NDFHKPNLDQIGFQELLSQEMHGRAWRVN 615 M + S+VP++ Y TR+D M DF P + +G Q L S + Sbjct 366 MCIYSLVPQIQYD-----CTRLDPMVDKLDRFDFFTPEFENLGMQPLNSSYI-------- 412 Query 616 ANYKTTDFS---VGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKLKDAT 672 +++ T D +G QP ++EY T ++ +G FA + L + +R P+L+ A Sbjct 413 SSFCTPDPKNPVLGYQPRYSEYKTALDINHGQFAQNDALSSWSVSRFRRWTTFPQLEIAD 472 Query 673 TYIDPQIFNKAF 684 IDP N F Sbjct 473 FKIDPGCLNSVF 484 >gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317] gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon 317 str. F0108] Length=541 Score = 73.2 bits (178), Expect = 3e-10, Method: Compositional matrix adjust. 
Identities = 70/257 (27%), Positives = 110/257 (43%), Gaps = 28/257 (11%) Query 440 DGTNGINEITSVDVTSGLLTMDALILQKKVYDMLNRIAVSGGSYRDWQEAVFGVRVSRAA 499 DG + + S DV + A L K +L+ +G +Y + EA FGV VS Sbjct 268 DGNSAKLNMASPDVLNVSAIRSAFALDK----LLSISMRAGKTYAEQIEAHFGVTVSEGR 323 Query 500 ESPIY-VGGYASEIVFDEVVSTAAFESGETGQEPLGSLAGR-GRETSKRGGKN---IKIR 554 + +Y +GG+ S + +V T+ + + LAG G+ T K G I+ Sbjct 324 DGQVYYLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGYGEIQFD 383 Query 555 CEEPSLIMILGSIVPRVDYSQGNKWWTRVD------TMNDFHKPNLDQIGFQELLSQEMH 608 +EP ++M + S+VP + Y R+D T D+ P + +G Q ++ Sbjct 384 AKEPGVLMCIYSVVPAMQYD-----CMRLDPFVAKQTRGDYFIPEFENLGMQPIVPA--- 435 Query 609 GRAWRVNANYKTTDFSVGKQPAWTEYTTTVNETYGDFAAGEPLEYMAFNRVYDVANDPKL 668 V+ N + D S G QP ++EY T + +G FA GEPL Y + R Sbjct 436 ----FVSLN-RAKDNSYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGSDTLNTF 490 Query 669 KDATTYIDPQIFNKAFA 685 A I+P + FA Sbjct 491 NVAALKINPHWLDSVFA 507

Lambda      K        H        a         alpha
   0.316    0.134    0.390    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 5553829103760
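Each "Score =" line in the report gives a bit score followed by a raw score in parentheses, and the statistics footer gives the gapped Lambda and K. The sketch below applies the standard Karlin-Altschul conversion, bits = (Lambda * S - ln K) / ln 2, to check that the two are consistent. The function name `bit_score` is hypothetical. The footer values are rounded, so agreement to one decimal place is expected here but not guaranteed in general.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics footer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score: int,
              lam: float = GAPPED_LAMBDA,
              k: float = GAPPED_K) -> float:
    """Convert a raw alignment score S to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs read from the "Score =" lines.
REPORTED = [
    (204, 83.2), (207, 84.3), (202, 82.4), (199, 81.3),
    (186, 76.3), (183, 75.1), (178, 73.2),
]

if __name__ == "__main__":
    for raw, reported in REPORTED:
        print(f"raw={raw}  computed={bit_score(raw):.2f}  reported={reported}")
```

The sketch deliberately stops at the bit score. Hits with the same bit score carry different E-values in this report (83.2 bits is listed with 3e-14, 1e-13 and 2e-13), so the E-values cannot be reproduced from the bit score and the single "Effective search space used" figure alone.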