bitscore colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-23_CDS_annotation_glimmer3.pl_2_3

Length=658

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

gi|494308783|ref|WP_007173938.1|  hypothetical protein             71.2    1e-09
gi|547920049|ref|WP_022322420.1|  capsid protein VP1               70.1    3e-09
gi|494306153|ref|WP_007173049.1|  hypothetical protein             68.9    5e-09
gi|492501782|ref|WP_005867318.1|  hypothetical protein             68.9    5e-09
gi|649557305|gb|KDS63784.1|  capsid family protein                 65.5    1e-08
gi|649569140|gb|KDS75238.1|  capsid family protein                 65.9    4e-08
gi|649555287|gb|KDS61824.1|  capsid family protein                 65.9    5e-08
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein       65.1    1e-07
gi|517172762|ref|WP_018361580.1|  hypothetical protein             63.9    2e-07
gi|639237429|ref|WP_024568106.1|  hypothetical protein             60.8    2e-06

>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=553

 Score = 71.2 bits (173), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 57/200 (29%), Positives = 94/200 (47%), Gaps = 25/200 (13%) Query 426 DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHIETPI-------Y 478 D+S G F++ +L A V +L+ + ++ ++ Y +E P Y Sbjct 290 DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY 343 Query 479 LGGSSLEIEFQEVVNNSGT---EDQP----LGTLAGRGVATNHKGGNIVFKADEPGYLFC 531 LGG +++ +V SGT E +P LG +AG+G + G IVF A E G L C Sbjct 344 LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC 401 Query 532 ITSITPRVDYFQGNEWDMYLESLD--DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVG 589 I S+ P++ Y D ++ LD D P+ + +G Q +I++ + N +G Sbjct 402 IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG 460 Query 590 KQPAWIEYMTNVNKTYGNFA 609 QP + EY T ++ +G FA Sbjct 461 YQPRYSEYKTALDVNHGQFA 480 >gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48] gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48] Length=553 Score = 70.1 bits (170), Expect = 3e-09, Method: Compositional matrix adjust. Identities = 57/185 (31%), Positives = 87/185 (47%), Gaps = 9/185 (5%) Query 472 HIETPIYLGGSSLEIEFQEVVNNSGT-EDQPLGTLAGRGVATNHKGGNIVFKA--DEPGY 528 ++ P +LGG + I EV+ S T E P +AG G++ G N FK +E GY Sbjct 352 RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISA---GINNGFKHYFEEHGY 408 Query 529 LFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV 588 + I SITPR Y QG D D + P+ + Q+ + + + D+ N T Sbjct 409 IIGIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTF 468 Query 589 GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDV 647 G P + EY + ++ +G+F N + LNRIF D N TT++ N +FA ++ Sbjct 469 GYTPRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSET 526 Query 648 TSLKL 652 K Sbjct 527 EDDKF 531 Score = 50.1 bits (118), Expect = 0.004, Method: Compositional matrix adjust. 
Identities = 46/177 (26%), Positives = 77/177 (44%), Gaps = 19/177 (11%) Query 19 RLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGSF 78 R+ R+ +LS+ + ++ G LVP + M ++ GD F VKT P V P+ Sbjct 9 RMKRPRRNAFNLSYESKLTLNMGELVPIMCMPVVSGDKFRVKTESLVRLAPLVAPMMHRV 68 Query 79 KQQNDFFFCPIRL-YNAMLHNNALNIGLDMKKVKLPIVRIIASDLDLTKKMKGSNGTLKK 137 +FF P RL +N + + G+D + +P+ I + D + S +K+ Sbjct 69 NVFTHYFFVPNRLVWNEW--EDFITKGVDGE--DMPMFPKIQINQD--SHLVSSASLIKE 122 Query 138 MIHPSSLVKTLGLSNLEKNNSQQWD------------WNAIPILAYFDIFKNYYANK 182 SSL LGL L ++ +D +A+P AY I+ YY ++ Sbjct 123 YFGDSSLWDYLGLPTLSACGNKSYDVVNGVKVPSGFQVSALPFRAYQLIYNEYYRDQ 179 >gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis] gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361] Length=519 Score = 68.9 bits (167), Expect = 5e-09, Method: Compositional matrix adjust. Identities = 60/234 (26%), Positives = 104/234 (44%), Gaps = 29/234 (12%) Query 396 LKTYQSDVNTNWVNTEWLDGDSG----INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIA 451 + + D + N+ ++ D +N +D + G F++ +L A V +L+ Sbjct 222 IPEFSDDEHLNFDRDQYADQSKSNFTQLNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTM 281 Query 452 ISDGSYNAWIQTVYTSGGLNHIETPI-------YLGGSSLEIEFQEVVNNSGT---EDQP 501 + ++ ++ Y +E P YLGG +++ +V SGT E +P Sbjct 282 RAGKTFQDQMRAHYG------VEIPDSRDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKP 335 Query 502 ----LGTLAGRGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMYLESLD-- 555 LG +AG+G + G IVF A E G L CI S+ P++ Y D ++ LD Sbjct 336 EAGYLGRIAGKGTGSGR--GRIVFDAKEHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRF 392 Query 556 DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFA 609 D P+ + +G Q +I++ N +G QP + EY T ++ +G FA Sbjct 393 DFFTPEFENLGMQPLNSSYISSFCTPDPKNPVLGYQPRYSEYKTALDINHGQFA 446 >gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis] gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24] Length=538 Score = 68.9 bits (167), Expect = 5e-09, Method: Compositional matrix adjust. 
Identities = 55/184 (30%), Positives = 85/184 (46%), Gaps = 9/184 (5%) Query 472 HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKA--DEPGY 528 ++ P +LGG I EV+ S T+ P +AG G++ G N FK +E GY Sbjct 337 RLQRPQFLGGGRTPISVSEVLQTSATDSTSPQANMAGHGISA---GVNHGFKRYFEEHGY 393 Query 529 LFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV 588 + I SI PR Y QG D D + P+ +G Q+ + + N T Sbjct 394 IIGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNGTF 453 Query 589 GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDV 647 G P + EY ++N+ +G+F N + LNRIF + N TT++ + N +FA + Sbjct 454 GYTPRYAEYKYSMNEVHGDFR--GNMAFWHLNRIFSESPNLNTTFVECNPSNRVFATAET 511 Query 648 TSLK 651 + K Sbjct 512 SDDK 515 Score = 39.7 bits (91), Expect = 8.1, Method: Compositional matrix adjust. Identities = 22/74 (30%), Positives = 33/74 (45%), Gaps = 0/74 (0%) Query 18 TRLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGS 77 +L R+ +LS+ + + G LVP + ++PGD F V T P V P+ Sbjct 8 VKLKRPRRNVFNLSYENKLTANAGELVPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR 67 Query 78 FKQQNDFFFCPIRL 91 +FF P RL Sbjct 68 VDVFTHYFFVPNRL 81 >gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6] Length=245 Score = 65.5 bits (158), Expect = 1e-08, Method: Compositional matrix adjust. 
Identities = 52/181 (29%), Positives = 82/181 (45%), Gaps = 5/181 (3%) Query 473 IETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLFC 531 ++ P +LGG I EV+ S T+ P +AG G++ G + +E GY+ Sbjct 45 LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIMG 103 Query 532 ITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGKQ 591 I SI PR Y QG D D + P+ +G Q+ + + N T G Sbjct 104 IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYT 163 Query 592 PAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGDI-NTYTTYIFPHLYNNIFADTDVTSL 650 P + EY + N+ +G+F N + LNRIF + N TT++ + N +FA + + Sbjct 164 PRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSDD 221 Query 651 K 651 K Sbjct 222 K 222 >gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6] Length=390 Score = 65.9 bits (159), Expect = 4e-08, Method: Compositional matrix adjust. Identities = 52/182 (29%), Positives = 82/182 (45%), Gaps = 5/182 (3%) Query 472 HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLF 530 ++ P +LGG I EV+ S T+ P +AG G++ G + +E GY+ Sbjct 189 RLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIM 247 Query 531 CITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGK 590 I SI PR Y QG D D + P+ +G Q+ + + N T G Sbjct 248 GIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGY 307 Query 591 QPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDVTS 649 P + EY + N+ +G+F N + LNRIF + N TT++ + N +FA + + Sbjct 308 TPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSD 365 Query 650 LK 651 K Sbjct 366 DK 367 >gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4] gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 
3999B T(B) 6] Length=541 Score = 65.9 bits (159), Expect = 5e-08, Method: Compositional matrix adjust. Identities = 52/182 (29%), Positives = 82/182 (45%), Gaps = 5/182 (3%) Query 472 HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLF 530 ++ P +LGG I EV+ S T+ P +AG G++ G + +E GY+ Sbjct 340 RLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIM 398 Query 531 CITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGK 590 I SI PR Y QG D D + P+ +G Q+ + + N T G Sbjct 399 GIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGY 458 Query 591 QPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDVTS 649 P + EY + N+ +G+F N + LNRIF + N TT++ + N +FA + + Sbjct 459 TPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSD 516 Query 650 LK 651 K Sbjct 517 DK 518 Score = 40.0 bits (92), Expect = 5.7, Method: Compositional matrix adjust. Identities = 21/74 (28%), Positives = 34/74 (46%), Gaps = 0/74 (0%) Query 18 TRLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGS 77 +L R+ +LS+ + ++ G L+P + ++PGD F V T P V P+ Sbjct 8 VKLKRPRRNVFNLSYENKLTVNAGELIPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR 67 Query 78 FKQQNDFFFCPIRL 91 +FF P RL Sbjct 68 VDVFTHYFFVPNRL 81 >gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis] Length=537 Score = 65.1 bits (157), Expect = 1e-07, Method: Compositional matrix adjust. 
Identities = 72/254 (28%), Positives = 111/254 (44%), Gaps = 23/254 (9%) Query 377 EDISGRHIPNCAYPMVGLALKTYQSDVNTNW--VNTEWLDGDSGINSITAIDTSGGSFTL 434 +D++G PN K +SDVN N V+ + L D N + + S T+ Sbjct 243 KDMAGNPAPN----------KDLRSDVNGNLQDVSGQPLSLDPSKNLKLNMASENVS-TV 291 Query 435 DTLNLAKKVYTMLNRIAISDGSYNAWIQTVY---TSGGLNHIETPIYLGGSSLEIEFQEV 491 + L A K+ L + A + Y I + + TS G ++ P +LGG+ I EV Sbjct 292 NDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDG--RLQRPEFLGGNKSPIMISEV 349 Query 492 VNNSGTEDQ-PLGTLAGRGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMY 550 + S T+ P G +AG G+ GG F +E GY+ + S+ P+ Y QG Sbjct 350 LQQSATDSTTPQGNMAGHGIGIGKDGGFSRF-FEEHGYVIGLMSVIPKTSYSQGIPRHFS 408 Query 551 LESLDDLHKPQLDGIGFQDRLYKHINA-NTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFA 609 D PQ + IG Q K I A N D+ + G P + EY + + +G+F Sbjct 409 KSDKFDYFWPQFEHIGEQPVYNKEIFAKNIDAFDSEAVFGYLPRYSEYKFSPSTVHGDFK 468 Query 610 LVENEGWMCLNRIF 623 ++ + L RIF Sbjct 469 --DDLYFWHLGRIF 480 >gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis] Length=568 Score = 63.9 bits (154), Expect = 2e-07, Method: Compositional matrix adjust. Identities = 53/185 (29%), Positives = 83/185 (45%), Gaps = 20/185 (11%) Query 478 YLGGSSLEIEFQEVVNNSGT-----EDQPLGTLAGR--GVATNHKGGNIVFKADEPGYLF 530 Y+GG I+ +V +SGT +D G GR G AT G+I F A E G L Sbjct 351 YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM 410 Query 531 CITSITPRVDYFQGNEWDMYLESLD--DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV 588 CI S+ P V Y D +++ ++ D P+ + +G Q K+I+ ++ N + Sbjct 411 CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI 469 Query 589 ------GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD----INTYTTYIFPHLY 638 G QP + EY T ++ +G F E + + R G+ N T I P Sbjct 470 KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL 529 Query 639 NNIFA 643 +++FA Sbjct 530 DDVFA 534 >gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis] Length=546 Score = 60.8 bits (146), Expect = 2e-06, Method: Compositional matrix adjust. 
Identities = 67/237 (28%), Positives = 104/237 (44%), Gaps = 16/237 (7%) Query 392 VGLALKTYQSDVNTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKVYTMLNRIA 451 +G + S N+N VN D+ N + T+ GS T++ L A K+ L + A Sbjct 264 IGHLMVETSSTGNSNPVNI-----DNSSNLGVDLKTASGS-TINDLRRAFKLQEWLEKNA 317 Query 452 ISDGSYNAWIQTVY---TSGGLNHIETPIYLGGSSLEIEFQEVVNNSGTEDQ-PLGTLAG 507 + Y I + + TS G ++ P +LGG+ I EV+ S T+ P G +AG Sbjct 318 RAGSRYAESILSFFGVKTSDG--RLQRPEFLGGNKTPILISEVLQQSSTDSTTPQGNMAG 375 Query 508 RGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGF 567 G++ +GG F +E GY+ + S+ P+ Y QG D PQ + IG Sbjct 376 HGISVGKEGGFSKF-FEEHGYVIGLMSVIPKTSYSQGIPRHFSKFDKFDYFWPQFEHIGE 434 Query 568 QDRLYKHINA-NTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIF 623 Q K I A N + G P + EY + + +G+F + + L RIF Sbjct 435 QPVYNKEIFAKNVGDYDSGGVFGYVPRYSEYKYSPSTIHGDFK--DTLYFWHLGRIF 489

Lambda      K        H        a         alpha
   0.317    0.135    0.400    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 5013181281552