bitscore colors: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-26_CDS_annotation_glimmer3.pl_2_1
Length=567
                                                                     Score    E
Sequences producing significant alignments:                         (Bits)  Value
gi|649557305|gb|KDS63784.1| capsid family protein                     82.4  3e-14
gi|649569140|gb|KDS75238.1| capsid family protein                     82.8  1e-13
gi|649555287|gb|KDS61824.1| capsid family protein                     82.8  2e-13
gi|492501782|ref|WP_005867318.1| hypothetical protein                 80.5  1e-12
gi|547920049|ref|WP_022322420.1| capsid protein VP1                   78.6  4e-12
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein     68.6  2e-09
gi|494308783|ref|WP_007173938.1| hypothetical protein                 61.6  8e-07
gi|496521299|ref|WP_009229582.1| capsid protein                       61.6  9e-07
gi|494306153|ref|WP_007173049.1| hypothetical protein                 59.3  4e-06
gi|517172762|ref|WP_018361580.1| hypothetical protein                 59.7  4e-06
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 82.4 bits (202), Expect = 3e-14, Method: Compositional matrix adjust.
Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)
Query 366 VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC 424
++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct 45 LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG 103
Query 425 ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS 480
I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct 104 IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT 159
Query 481 IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD 539
G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct 160 FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE 217
Query 540 VAAQNFWVQIAFNVEARRVMSAKVIPNL 567
+ +WVQI +++A R+M P L
Sbjct 218 TSDDKYWVQIYQDIKALRLMPKYGTPML 245
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 82.8 bits (203), Expect = 1e-13, Method: Compositional matrix adjust.
Identities = 62/208 (30%), Positives = 100/208 (48%), Gaps = 13/208 (6%)
Query 366 VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC 424
++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct 190 LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG 248
Query 425 ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS 480
I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct 249 IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT 304
Query 481 IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD-IDTYTTYIQPHLYNNIFADTD 539
G P + +Y S N +G+F N + LNR+F + + TT+++ + N +FA +
Sbjct 305 FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE 362
Query 540 VAAQNFWVQIAFNVEARRVMSAKVIPNL 567
+ +WVQI +++A R+M P L
Sbjct 363 TSDDKYWVQIYQDIKALRLMPKYGTPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 82.8 bits (203), Expect = 2e-13, Method: Compositional matrix adjust.
Identities = 62/208 (30%), Positives = 99/208 (48%), Gaps = 13/208 (6%)
Query 366 VETPIYLGGSSMEIEFQEVVNNSGTED-QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFC 424
++ P +LGG I EV+ S T+ P ++AG G++ G +Y +E GYI
Sbjct 341 LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRYF-EEHGYIMG 399
Query 425 ITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQD----RLYKNINSSAKREDLIKS 480
I SI PR Y QG D D + P+ +G Q+ LY N + +A +
Sbjct 400 IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANE----GT 455
Query 481 IGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTD 539
G P + +Y S N +G+F N + LNR+F + TT+++ + N +FA +
Sbjct 456 FGYTPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAE 513
Query 540 VAAQNFWVQIAFNVEARRVMSAKVIPNL 567
+ +WVQI +++A R+M P L
Sbjct 514 TSDDKYWVQIYQDIKALRLMPKYGTPML 541
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 80.5 bits (197), Expect = 1e-12, Method: Compositional matrix adjust.
Identities = 78/297 (26%), Positives = 130/297 (44%), Gaps = 24/297 (8%)
Query 277 RPSCSYPLVGLALKTYQSDINTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKV 336
RP+ + LVG AL +D +L+ D N +D G S ++ L + +
Sbjct 260 RPAGAMQLVGGALIAGGTD-------GAYLEPD---NFQVNVDELGVS--INDLRTSNAL 307
Query 337 YTMLNRIAISDGSYNAWIQTVYTSGGLN----HVETPIYLGGSSMEIEFQEVVNNSGTED 392
R A S Y I+ + + G+ ++ P +LGG I EV+ S T+
Sbjct 308 QRWFERNARSGSRY---IEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSATDS 364
Query 393 -QPLGSLAGRGVTDNHKGGVIKYKPDEPGYIFCITSITPRVDYYQGNDWDLEIETLDDLH 451
P ++AG G++ G +Y +E GYI I SI PR Y QG D D +
Sbjct 365 TSPQANMAGHGISAGVNHGFKRYF-EEHGYIIGIMSIRPRTGYQQGVPKDFRKFDNMDFY 423
Query 452 KPQLDGIGFQDRLYKNINSSAKREDLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMC 511
P+ +G Q+ + + + G P + +Y S N +G+F N +
Sbjct 424 FPEFAHLGEQEIKNEEVYLQQTPASNNGTFGYTPRYAEYKYSMNEVHGDFR--GNMAFWH 481
Query 512 LNRVFGDIDTY-TTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAKVIPNL 567
LNR+F + TT+++ + N +FA + + +W+Q+ +V+A R+M P L
Sbjct 482 LNRIFSESPNLNTTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPKYGTPML 538
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 78.6 bits (192), Expect = 4e-12, Method: Compositional matrix adjust.
Identities = 61/205 (30%), Positives = 95/205 (46%), Gaps = 5/205 (2%)
Query 365 HVETPIYLGGSSMEIEFQEVVNNSGT-EDQPLGSLAGRGVTDNHKGGVIKYKPDEPGYIF 423
++ P +LGG M I EV+ S T E P ++AG G++ G K+ +E GYI
Sbjct 352 RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYII 410
Query 424 CITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIGK 483
I SITPR Y QG D D + P+ + Q+ + + S + G
Sbjct 411 GIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTFGY 470
Query 484 QPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTY-TTYIQPHLYNNIFADTDVAA 542
P + +Y + +G+F N + LNR+F D TT+++ N +FA ++
Sbjct 471 TPRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSETED 528
Query 543 QNFWVQIAFNVEARRVMSAKVIPNL 567
FWVQ+ +V+A R+M P L
Sbjct 529 DKFWVQMYQDVKALRLMPKYGTPML 553
>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338
Score = 68.6 bits (166), Expect = 2e-09, Method: Compositional matrix adjust.
Identities = 79/305 (26%), Positives = 123/305 (40%), Gaps = 53/305 (17%)
Query 312 INSITAID---TSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVY-TSGGLNHVE 367
I + A+D ++G S + L L K+ ++R+ +S G +T++ T +V
Sbjct 36 IEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVN 95
Query 368 TPIYLGGSSMEIE---FQEVVNNSGT-EDQPLGSLAGRGVTD------NHKGGVIKYKPD 417
P +LG I + + N S + ED LG LA D H G I Y
Sbjct 96 KPDFLGVWQASINPSNVRAMANGSASGEDANLGQLAA--CVDRYCDFSGHSG--IDYYAK 151
Query 418 EPGYIFCITSITPRVDYYQGNDWDLEIETLDDLHKPQLDGIGFQ-------DRLYKNINS 470
EPG IT + P Y QG DL + D P+L+GIGFQ + + N
Sbjct 152 EPGTFMLITMLVPEPAYSQGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNF 211
Query 471 SAKREDL----------------IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNR 514
+ ++ + S+G++ AW T ++R +G+FA N + L R
Sbjct 212 TGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTR 271
Query 515 VF------------GDIDTYTTYIQPHLYNNIFADTDVAAQNFWVQIAFNVEARRVMSAK 562
F D + TYI P + +F D + A NF F++ +SA
Sbjct 272 RFTTYFPDDGTGFYQDGEYTGTYINPLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSAN 331
Query 563 VIPNL 567
+P L
Sbjct 332 YMPYL 336
>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=553
Score = 61.6 bits (148), Expect = 8e-07, Method: Compositional matrix adjust.
Identities = 61/238 (26%), Positives = 103/238 (43%), Gaps = 30/238 (13%)
Query 319 DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI-------Y 371
D+S G F++ +L A V +L+ + ++ ++ Y VE P Y
Sbjct 290 DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY 343
Query 372 LGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPDEPGYIFC 424
LGG +++ +V SGT E +P LG +AG+G G I + E G + C
Sbjct 344 LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC 401
Query 425 ITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKREDLIKSIG 482
I S+ P++ Y D ++ LD D P+ + +G Q I+S + +G
Sbjct 402 IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG 460
Query 483 KQPAWLDYMTSFNRNYGNFALIENEGWMCLNR-----VFGDIDTYTTYIQPHLYNNIF 535
QP + +Y T+ + N+G FA + ++R F ++ I P N+IF
Sbjct 461 YQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIADFKIDPGCLNSIF 518
>gi|496521299|ref|WP_009229582.1| capsid protein [Prevotella sp. oral taxon 317]
gi|288330570|gb|EFC69154.1| putative capsid protein (F protein) [Prevotella sp. oral taxon
317 str. F0108]
Length=541
Score = 61.6 bits (148), Expect = 9e-07, Method: Compositional matrix adjust.
Identities = 58/220 (26%), Positives = 98/220 (45%), Gaps = 30/220 (14%)
Query 371 YLGGSSMEIEFQEVVNNSGTEDQP------------LGSLAGRGVTDNHKGGVIKYKPDE 418
YLGG ++ +V SGT + LG + G+G + G I++ E
Sbjct 329 YLGGFDSNVQVGDVTQTSGTTNPNVSEVGNAKLAGYLGKITGKGTGSGY--GEIQFDAKE 386
Query 419 PGYIFCITSITPRVDY-YQGNDWDLEIETLDDLHKPQLDGIGFQDRLYKNINSSAKREDL 477
PG + CI S+ P + Y D + +T D P+ + +G Q + ++ + +++
Sbjct 387 PGVLMCIYSVVPAMQYDCMRLDPFVAKQTRGDYFIPEFENLGMQPIVPAFVSLNRAKDN- 445
Query 478 IKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGDIDTYTTY------IQPHLY 531
S G QP + +Y T+F+ N+G FA E + + R G DT T+ I PH
Sbjct 446 --SYGWQPRYSEYKTAFDINHGQFANGEPLSYWSIARARGS-DTLNTFNVAALKINPHWL 502
Query 532 NNIFA----DTDVAAQNFWVQIAFNVEARRVMSAKVIPNL 567
+++FA T+V F FN+E M+ +P +
Sbjct 503 DSVFAVNYNGTEVTDCMFGYA-HFNIEKVSDMTEDGMPRV 541
>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=519
Score = 59.3 bits (142), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 54/207 (26%), Positives = 92/207 (44%), Gaps = 25/207 (12%)
Query 312 INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHVETPI- 370
+N +D + G F++ +L A V +L+ + ++ ++ Y VE P
Sbjct 249 LNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDS 302
Query 371 ------YLGGSSMEIEFQEVVNNSGT---EDQP----LGSLAGRGVTDNHKGGVIKYKPD 417
YLGG +++ +V SGT E +P LG +AG+G G I +
Sbjct 303 RDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKPEAGYLGRIAGKGTGSGR--GRIVFDAK 360
Query 418 EPGYIFCITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNINSSAKRE 475
E G + CI S+ P++ Y D ++ LD D P+ + +G Q I+S +
Sbjct 361 EHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRFDFFTPEFENLGMQPLNSSYISSFCTPD 419
Query 476 DLIKSIGKQPAWLDYMTSFNRNYGNFA 502
+G QP + +Y T+ + N+G FA
Sbjct 420 PKNPVLGYQPRYSEYKTALDINHGQFA 446
>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568
Score = 59.7 bits (143), Expect = 4e-06, Method: Compositional matrix adjust.
Identities = 49/185 (26%), Positives = 81/185 (44%), Gaps = 20/185 (11%)
Query 371 YLGGSSMEIEFQEVVNNSGT-----EDQPLGSLAGR--GVTDNHKGGVIKYKPDEPGYIF 423
Y+GG I+ +V +SGT +D G GR G G I++ E G +
Sbjct 351 YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM 410
Query 424 CITSITPRVDYYQGNDWDLEIETLD--DLHKPQLDGIGFQDRLYKNI------NSSAKRE 475
CI S+ P V Y D ++ ++ D P+ + +G Q KNI N++ R
Sbjct 411 CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI 469
Query 476 DLIKSIGKQPAWLDYMTSFNRNYGNFALIENEGWMCLNRVFGD----IDTYTTYIQPHLY 531
+ + G QP + +Y T+ + N+G F E + + R G+ + T I P
Sbjct 470 KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL 529
Query 532 NNIFA 536
+++FA
Sbjct 530 DDVFA 534
Lambda      K        H        a         alpha
   0.317    0.136    0.403    0.792     4.96
Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6
Effective search space used: 4166738442540