BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-23_CDS_annotation_glimmer3.pl_2_3
Length=658
                                                               Score    E
Sequences producing significant alignments:                    (Bits) Value
gi|494308783|ref|WP_007173938.1| hypothetical protein           71.2  1e-09
gi|547920049|ref|WP_022322420.1| capsid protein VP1             70.1  3e-09
gi|494306153|ref|WP_007173049.1| hypothetical protein           68.9  5e-09
gi|492501782|ref|WP_005867318.1| hypothetical protein           68.9  5e-09
gi|649557305|gb|KDS63784.1| capsid family protein               65.5  1e-08
gi|649569140|gb|KDS75238.1| capsid family protein               65.9  4e-08
gi|649555287|gb|KDS61824.1| capsid family protein               65.9  5e-08
gi|609718276|emb|CDN73650.1| conserved hypothetical protein     65.1  1e-07
gi|517172762|ref|WP_018361580.1| hypothetical protein           63.9  2e-07
gi|639237429|ref|WP_024568106.1| hypothetical protein           60.8  2e-06
>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=553
Score = 71.2 bits (173), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 57/200 (29%), Positives = 94/200 (47%), Gaps = 25/200 (13%)
Query 426 DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHIETPI-------Y 478
D+S G F++ +L A V +L+ + ++ ++ Y +E P Y
Sbjct 290 DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY 343
Query 479 LGGSSLEIEFQEVVNNSGT---EDQP----LGTLAGRGVATNHKGGNIVFKADEPGYLFC 531
LGG +++ +V SGT E +P LG +AG+G + G IVF A E G L C
Sbjct 344 LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC 401
Query 532 ITSITPRVDYFQGNEWDMYLESLD--DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVG 589
I S+ P++ Y D ++ LD D P+ + +G Q +I++ + N +G
Sbjct 402 IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG 460
Query 590 KQPAWIEYMTNVNKTYGNFA 609
QP + EY T ++ +G FA
Sbjct 461 YQPRYSEYKTALDVNHGQFA 480
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 70.1 bits (170), Expect = 3e-09, Method: Compositional matrix adjust.
Identities = 57/185 (31%), Positives = 87/185 (47%), Gaps = 9/185 (5%)
Query 472 HIETPIYLGGSSLEIEFQEVVNNSGT-EDQPLGTLAGRGVATNHKGGNIVFKA--DEPGY 528
++ P +LGG + I EV+ S T E P +AG G++ G N FK +E GY
Sbjct 352 RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISA---GINNGFKHYFEEHGY 408
Query 529 LFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV 588
+ I SITPR Y QG D D + P+ + Q+ + + + D+ N T
Sbjct 409 IIGIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTF 468
Query 589 GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDV 647
G P + EY + ++ +G+F N + LNRIF D N TT++ N +FA ++
Sbjct 469 GYTPRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSET 526
Query 648 TSLKL 652
K
Sbjct 527 EDDKF 531
Score = 50.1 bits (118), Expect = 0.004, Method: Compositional matrix adjust.
Identities = 46/177 (26%), Positives = 77/177 (44%), Gaps = 19/177 (11%)
Query 19 RLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGSF 78
R+ R+ +LS+ + ++ G LVP + M ++ GD F VKT P V P+
Sbjct 9 RMKRPRRNAFNLSYESKLTLNMGELVPIMCMPVVSGDKFRVKTESLVRLAPLVAPMMHRV 68
Query 79 KQQNDFFFCPIRL-YNAMLHNNALNIGLDMKKVKLPIVRIIASDLDLTKKMKGSNGTLKK 137
+FF P RL +N + + G+D + +P+ I + D + S +K+
Sbjct 69 NVFTHYFFVPNRLVWNEW--EDFITKGVDGE--DMPMFPKIQINQD--SHLVSSASLIKE 122
Query 138 MIHPSSLVKTLGLSNLEKNNSQQWD------------WNAIPILAYFDIFKNYYANK 182
SSL LGL L ++ +D +A+P AY I+ YY ++
Sbjct 123 YFGDSSLWDYLGLPTLSACGNKSYDVVNGVKVPSGFQVSALPFRAYQLIYNEYYRDQ 179
>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=519
Score = 68.9 bits (167), Expect = 5e-09, Method: Compositional matrix adjust.
Identities = 60/234 (26%), Positives = 104/234 (44%), Gaps = 29/234 (12%)
Query 396 LKTYQSDVNTNWVNTEWLDGDSG----INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIA 451
+ + D + N+ ++ D +N +D + G F++ +L A V +L+
Sbjct 222 IPEFSDDEHLNFDRDQYADQSKSNFTQLNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTM 281
Query 452 ISDGSYNAWIQTVYTSGGLNHIETPI-------YLGGSSLEIEFQEVVNNSGT---EDQP 501
+ ++ ++ Y +E P YLGG +++ +V SGT E +P
Sbjct 282 RAGKTFQDQMRAHYG------VEIPDSRDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKP 335
Query 502 ----LGTLAGRGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMYLESLD-- 555
LG +AG+G + G IVF A E G L CI S+ P++ Y D ++ LD
Sbjct 336 EAGYLGRIAGKGTGSGR--GRIVFDAKEHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRF 392
Query 556 DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFA 609
D P+ + +G Q +I++ N +G QP + EY T ++ +G FA
Sbjct 393 DFFTPEFENLGMQPLNSSYISSFCTPDPKNPVLGYQPRYSEYKTALDINHGQFA 446
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 68.9 bits (167), Expect = 5e-09, Method: Compositional matrix adjust.
Identities = 55/184 (30%), Positives = 85/184 (46%), Gaps = 9/184 (5%)
Query 472 HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKA--DEPGY 528
++ P +LGG I EV+ S T+ P +AG G++ G N FK +E GY
Sbjct 337 RLQRPQFLGGGRTPISVSEVLQTSATDSTSPQANMAGHGISA---GVNHGFKRYFEEHGY 393
Query 529 LFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV 588
+ I SI PR Y QG D D + P+ +G Q+ + + N T
Sbjct 394 IIGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNGTF 453
Query 589 GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDV 647
G P + EY ++N+ +G+F N + LNRIF + N TT++ + N +FA +
Sbjct 454 GYTPRYAEYKYSMNEVHGDFR--GNMAFWHLNRIFSESPNLNTTFVECNPSNRVFATAET 511
Query 648 TSLK 651
+ K
Sbjct 512 SDDK 515
Score = 39.7 bits (91), Expect = 8.1, Method: Compositional matrix adjust.
Identities = 22/74 (30%), Positives = 33/74 (45%), Gaps = 0/74 (0%)
Query 18 TRLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGS 77
+L R+ +LS+ + + G LVP + ++PGD F V T P V P+
Sbjct 8 VKLKRPRRNVFNLSYENKLTANAGELVPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR 67
Query 78 FKQQNDFFFCPIRL 91
+FF P RL
Sbjct 68 VDVFTHYFFVPNRL 81
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 65.5 bits (158), Expect = 1e-08, Method: Compositional matrix adjust.
Identities = 52/181 (29%), Positives = 82/181 (45%), Gaps = 5/181 (3%)
Query 473 IETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLFC 531
++ P +LGG I EV+ S T+ P +AG G++ G + +E GY+
Sbjct 45 LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIMG 103
Query 532 ITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGKQ 591
I SI PR Y QG D D + P+ +G Q+ + + N T G
Sbjct 104 IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYT 163
Query 592 PAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGDI-NTYTTYIFPHLYNNIFADTDVTSL 650
P + EY + N+ +G+F N + LNRIF + N TT++ + N +FA + +
Sbjct 164 PRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSDD 221
Query 651 K 651
K
Sbjct 222 K 222
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 65.9 bits (159), Expect = 4e-08, Method: Compositional matrix adjust.
Identities = 52/182 (29%), Positives = 82/182 (45%), Gaps = 5/182 (3%)
Query 472 HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLF 530
++ P +LGG I EV+ S T+ P +AG G++ G + +E GY+
Sbjct 189 RLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIM 247
Query 531 CITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGK 590
I SI PR Y QG D D + P+ +G Q+ + + N T G
Sbjct 248 GIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGY 307
Query 591 QPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDVTS 649
P + EY + N+ +G+F N + LNRIF + N TT++ + N +FA + +
Sbjct 308 TPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSD 365
Query 650 LK 651
K
Sbjct 366 DK 367
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 65.9 bits (159), Expect = 5e-08, Method: Compositional matrix adjust.
Identities = 52/182 (29%), Positives = 82/182 (45%), Gaps = 5/182 (3%)
Query 472 HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLF 530
++ P +LGG I EV+ S T+ P +AG G++ G + +E GY+
Sbjct 340 RLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIM 398
Query 531 CITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGK 590
I SI PR Y QG D D + P+ +G Q+ + + N T G
Sbjct 399 GIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGY 458
Query 591 QPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDVTS 649
P + EY + N+ +G+F N + LNRIF + N TT++ + N +FA + +
Sbjct 459 TPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSD 516
Query 650 LK 651
K
Sbjct 517 DK 518
Score = 40.0 bits (92), Expect = 5.7, Method: Compositional matrix adjust.
Identities = 21/74 (28%), Positives = 34/74 (46%), Gaps = 0/74 (0%)
Query 18 TRLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGS 77
+L R+ +LS+ + ++ G L+P + ++PGD F V T P V P+
Sbjct 8 VKLKRPRRNVFNLSYENKLTVNAGELIPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR 67
Query 78 FKQQNDFFFCPIRL 91
+FF P RL
Sbjct 68 VDVFTHYFFVPNRL 81
>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537
Score = 65.1 bits (157), Expect = 1e-07, Method: Compositional matrix adjust.
Identities = 72/254 (28%), Positives = 111/254 (44%), Gaps = 23/254 (9%)
Query 377 EDISGRHIPNCAYPMVGLALKTYQSDVNTNW--VNTEWLDGDSGINSITAIDTSGGSFTL 434
+D++G PN K +SDVN N V+ + L D N + + S T+
Sbjct 243 KDMAGNPAPN----------KDLRSDVNGNLQDVSGQPLSLDPSKNLKLNMASENVS-TV 291
Query 435 DTLNLAKKVYTMLNRIAISDGSYNAWIQTVY---TSGGLNHIETPIYLGGSSLEIEFQEV 491
+ L A K+ L + A + Y I + + TS G ++ P +LGG+ I EV
Sbjct 292 NDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDG--RLQRPEFLGGNKSPIMISEV 349
Query 492 VNNSGTEDQ-PLGTLAGRGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMY 550
+ S T+ P G +AG G+ GG F +E GY+ + S+ P+ Y QG
Sbjct 350 LQQSATDSTTPQGNMAGHGIGIGKDGGFSRF-FEEHGYVIGLMSVIPKTSYSQGIPRHFS 408
Query 551 LESLDDLHKPQLDGIGFQDRLYKHINA-NTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFA 609
D PQ + IG Q K I A N D+ + G P + EY + + +G+F
Sbjct 409 KSDKFDYFWPQFEHIGEQPVYNKEIFAKNIDAFDSEAVFGYLPRYSEYKFSPSTVHGDFK 468
Query 610 LVENEGWMCLNRIF 623
++ + L RIF
Sbjct 469 --DDLYFWHLGRIF 480
>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568
Score = 63.9 bits (154), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 53/185 (29%), Positives = 83/185 (45%), Gaps = 20/185 (11%)
Query 478 YLGGSSLEIEFQEVVNNSGT-----EDQPLGTLAGR--GVATNHKGGNIVFKADEPGYLF 530
Y+GG I+ +V +SGT +D G GR G AT G+I F A E G L
Sbjct 351 YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM 410
Query 531 CITSITPRVDYFQGNEWDMYLESLD--DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV 588
CI S+ P V Y D +++ ++ D P+ + +G Q K+I+ ++ N +
Sbjct 411 CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI 469
Query 589 ------GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD----INTYTTYIFPHLY 638
G QP + EY T ++ +G F E + + R G+ N T I P
Sbjct 470 KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL 529
Query 639 NNIFA 643
+++FA
Sbjct 530 DDVFA 534
>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546
Score = 60.8 bits (146), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 67/237 (28%), Positives = 104/237 (44%), Gaps = 16/237 (7%)
Query 392 VGLALKTYQSDVNTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKVYTMLNRIA 451
+G + S N+N VN D+ N + T+ GS T++ L A K+ L + A
Sbjct 264 IGHLMVETSSTGNSNPVNI-----DNSSNLGVDLKTASGS-TINDLRRAFKLQEWLEKNA 317
Query 452 ISDGSYNAWIQTVY---TSGGLNHIETPIYLGGSSLEIEFQEVVNNSGTEDQ-PLGTLAG 507
+ Y I + + TS G ++ P +LGG+ I EV+ S T+ P G +AG
Sbjct 318 RAGSRYAESILSFFGVKTSDG--RLQRPEFLGGNKTPILISEVLQQSSTDSTTPQGNMAG 375
Query 508 RGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGF 567
G++ +GG F +E GY+ + S+ P+ Y QG D PQ + IG
Sbjct 376 HGISVGKEGGFSKF-FEEHGYVIGLMSVIPKTSYSQGIPRHFSKFDKFDYFWPQFEHIGE 434
Query 568 QDRLYKHINA-NTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIF 623
Q K I A N + G P + EY + + +G+F + + L RIF
Sbjct 435 QPVYNKEIFAKNVGDYDSGGVFGYVPRYSEYKYSPSTIHGDFK--DTLYFWHLGRIF 489
Lambda      K        H        a     alpha
 0.317    0.135    0.400    0.792    4.96
Gapped
Lambda      K        H        a     alpha    sigma
 0.267    0.0410   0.140    1.90     42.6     43.6
Effective search space used: 5013181281552
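Note on the scores above: each "Score = X bits (S)" line pairs a bit score X with a raw score S in parentheses. Under the standard Karlin-Altschul conversion, the bit score is (Lambda * S - ln K) / ln 2. The sketch below assumes that the gapped Lambda and K printed in the statistics block (0.267 and 0.0410) are the values used for that conversion in this report; the function and constant names are illustrative and are not part of BLAST.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this
# report. Treating them as the values behind the reported bit scores is an
# assumption of this sketch.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score: (lam*S - ln k) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # (raw score, bit score as printed in the report) for a few HSPs above.
    reported = [(173, 71.2), (170, 70.1), (167, 68.9), (91, 39.7)]
    for raw, bits in reported:
        print(f"raw={raw:4d}  computed={bit_score(raw):6.2f}  reported={bits}")
```

The computed values should agree with the reported bit scores to within rounding, since Lambda and K are themselves printed to only three significant digits.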