Bit-score color key: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-46_CDS_annotation_glimmer3.pl_2_2
Length=461
                                                                  Score      E
Sequences producing significant alignments:                       (Bits)   Value

gi|649557305|gb|KDS63784.1|       capsid family protein            87.4    4e-16
gi|492501782|ref|WP_005867318.1|  hypothetical protein             89.7    7e-16
gi|649569140|gb|KDS75238.1|       capsid family protein            86.7    3e-15
gi|649555287|gb|KDS61824.1|       capsid family protein            87.0    5e-15
gi|494610271|ref|WP_007368517.1|  capsid protein                   87.0    5e-15
gi|547920049|ref|WP_022322420.1|  capsid protein VP1               85.9    1e-14
gi|647452987|ref|WP_025792807.1|  hypothetical protein             70.5    1e-09
gi|565841287|ref|WP_023924568.1|  hypothetical protein             65.5    4e-08
gi|494308783|ref|WP_007173938.1|  hypothetical protein             61.2    8e-07
gi|599087863|gb|AHN52857.1|       major capsid protein             57.8    2e-06
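
The hit summary above can be read programmatically. The following is a minimal Python sketch, not part of the BLAST output: the names `HIT_LINE`, `parse_hit`, and `color_bin` are hypothetical, the line layout is inferred from this report alone, and the treatment of the bin boundaries (lower bound inclusive, 200 counted in the 80-200 range) is an assumption, since the color key at the top of the report does not say which side each boundary belongs to.

```python
import re

# Hypothetical pattern: sequence id, free-text description, bit score, E-value.
HIT_LINE = re.compile(
    r"^(?P<id>\S+)\s+(?P<desc>.*?)\s+(?P<bits>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)$"
)


def parse_hit(line):
    """Return a dict for one hit-summary line, or None if it does not match."""
    m = HIT_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "id": m.group("id"),
        "description": m.group("desc"),
        "bits": float(m.group("bits")),
        "evalue": float(m.group("evalue")),
    }


def color_bin(bits):
    """Map a bit score to one of the five ranges in the color key.

    Assumption: lower bounds are inclusive and 200 falls in "80-200".
    """
    if bits < 40:
        return "<40"
    if bits < 50:
        return "40-50"
    if bits < 80:
        return "50-80"
    if bits <= 200:
        return "80-200"
    return ">200"


if __name__ == "__main__":
    sample = "gi|649557305|gb|KDS63784.1| capsid family protein 87.4 4e-16"
    hit = parse_hit(sample)
    print(hit["id"], hit["bits"], hit["evalue"], color_bin(hit["bits"]))
```

Under these assumptions, the first six hits fall in the 80-200 range and the last four in the 50-80 range.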
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 87.4 bits (215), Expect = 4e-16, Method: Compositional matrix adjust.
Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)
Query 239 ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL 296
+ +G+RS+ + P F GG ++ I+ E++ S+T+ P +AG G++ G
Sbjct 34 SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF 91
Query 297 KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw 355
E IM + SI PR Y QG K + + NMD F+ P +G QE+ EE
Sbjct 92 TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN 150
Query 356 ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN 415
++ + G P + EY NE +G+F M AF LNR+++E +
Sbjct 151 ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN--- 200
Query 416 ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461
+T+++ N +FA + S +WVQ+ D+ A R+M P L
Sbjct 201 -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML 245
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 89.7 bits (221), Expect = 7e-16, Method: Compositional matrix adjust.
Identities = 75/264 (28%), Positives = 124/264 (47%), Gaps = 17/264 (6%)
Query 201 VDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIFCGGMQ 259
V+V + ++++ L + R A + Y + +G+RS+ + P F GG +
Sbjct 289 VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR 348
Query 260 SEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDY 318
+ I+ E++ SAT+ P +AG G++ G K E I+ + SI PR Y
Sbjct 349 TPISVSEVLQTSATDSTSPQANMAGHGISA--GVNHGFKRYFEEHGYIIGIMSIRPRTGY 406
Query 319 SQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGKQPSWI 377
QG K + + NMD F+ P +G QE+ EE T + N + G P +
Sbjct 407 QQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEEVYLQQTPASNN-----GTFGYTPRYA 460
Query 378 EYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAESRLSSQ 437
EY +NE +G+F M AF LNR++ E+ + +T+++ N +FA + S
Sbjct 461 EYKYSMNEVHGDFRGNM--AFWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSDD 514
Query 438 NFWVQVAFDVTARRVMSAKQIPNL 461
+W+Q+ DV A R+M P L
Sbjct 515 KYWIQLYQDVKALRLMPKYGTPML 538
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 86.7 bits (213), Expect = 3e-15, Method: Compositional matrix adjust.
Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)
Query 239 ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL 296
+ +G+RS+ + P F GG ++ I+ E++ S+T+ P +AG G++ G
Sbjct 179 SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF 236
Query 297 KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw 355
E IM + SI PR Y QG K + + NMD F+ P +G QE+ EE
Sbjct 237 TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN 295
Query 356 ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN 415
++ + G P + EY NE +G+F M AF LNR+++E +
Sbjct 296 ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN--- 345
Query 416 ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461
+T+++ N +FA + S +WVQ+ D+ A R+M P L
Sbjct 346 -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 87.0 bits (214), Expect = 5e-15, Method: Compositional matrix adjust.
Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)
Query 239 ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL 296
+ +G+RS+ + P F GG ++ I+ E++ S+T+ P +AG G++ G
Sbjct 330 SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF 387
Query 297 KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw 355
E IM + SI PR Y QG K + + NMD F+ P +G QE+ EE
Sbjct 388 TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN 446
Query 356 ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN 415
++ + G P + EY NE +G+F M AF LNR+++E +
Sbjct 447 ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN--- 496
Query 416 ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461
+T+++ N +FA + S +WVQ+ D+ A R+M P L
Sbjct 497 -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML 541
>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM
16608]
Length=531
Score = 87.0 bits (214), Expect = 5e-15, Method: Compositional matrix adjust.
Identities = 83/314 (26%), Positives = 138/314 (44%), Gaps = 55/314 (18%)
Query 188 IDGTTGGINAITAVDVTD--GKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS 245
+ G + IN ++ + V D +D ++ + N L+ Y + EA +G R
Sbjct 233 VSGASTFINGVSVLSVNDLRAAFALDKMLEATRRANGLD--------YSSQIEAHFGFR- 283
Query 246 ATLPESPI----FCGGMQSEIAFDEIVSNS-----ATEEEPLGTLAGRGVATMYKSGRGL 296
+PES F GG + + E+V+ S A E LG L G+GV ++ S
Sbjct 284 --VPESRAGDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGVGSLNSSSIDF 341
Query 297 KIKCTEPSMIMALGSITPRIDYSQGNKW--WTRLQNMDDFHKPTLDAIGFQ-----ELIt 349
+K E +IM + S+ P+ +Y+ G + + R +DF +P +G+Q +LI+
Sbjct 342 DVK--EHGIIMCIYSVVPQTEYN-GTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLIS 398
Query 350 eeaaawntettenYKHIYSS------------LGKQPSWIEYTTDVNETYGEFAAGMPLA 397
+ E K + + LG Q + EY T + +GEF +G+ L+
Sbjct 399 TYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLS 458
Query 398 FMCLNRVYEENTDHTIGN----------ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDV 447
+ C R Y+ D G+ A Y++P+I N+IF S + + +F V FDV
Sbjct 459 YWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV 517
Query 448 TARRVMSAKQIPNL 461
A R MS + L
Sbjct 518 KAVRPMSVSGLAGL 531
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 85.9 bits (211), Expect = 1e-14, Method: Compositional matrix adjust.
Identities = 73/269 (27%), Positives = 121/269 (45%), Gaps = 17/269 (6%)
Query 196 NAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIF 254
N V+V + + ++ L + R A Y + +G+RS+ + P F
Sbjct 299 NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQF 358
Query 255 CGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSIT 313
GG + I+ E++ S+T+E P +AG G++ +G K E I+ + SIT
Sbjct 359 LGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSIT 416
Query 314 PRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGK 372
PR Y QG + +T+ NMD F+ P + QE+ + Y + + G
Sbjct 417 PRSGYQQGVPRDFTKFDNMD-FYFPEFAHLSEQEI---KNQELFVSEDAAYNN--GTFGY 470
Query 373 QPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAES 432
P + EY +E +G+F L+F LNR++E+ + +T+++ N +FA S
Sbjct 471 TPRYAEYKYHPSEAHGDFRGN--LSFWHLNRIFEDKPNLN----TTFVECKPSNRVFATS 524
Query 433 RLSSQNFWVQVAFDVTARRVMSAKQIPNL 461
FWVQ+ DV A R+M P L
Sbjct 525 ETEDDKFWVQMYQDVKALRLMPKYGTPML 553
>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584
Score = 70.5 bits (171), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 74/270 (27%), Positives = 118/270 (44%), Gaps = 50/270 (19%)
Query 233 YQAWREATYGIRSATLPESPI----FCGGMQSEIAFDEIVS---NSATE--EEPLGTLAG 283
Y + EA +G + +PES F GG + I E+VS N+A++ +G L G
Sbjct 324 YASQIEAHFGFK---VPESRANDARFLGGFDNSIVVSEVVSTNGNAASDGSHASIGDLGG 380
Query 284 RGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDYSQG-----NKWWTRLQNMDDFHKPT 338
+G+ +M S ++ TE +IM + S+ P+ +Y+ N+ TR Q F++P
Sbjct 381 KGIGSM--SSGTIEFDSTEHGIIMCIYSVAPQSEYNASYLDPFNRKLTREQ----FYQPE 434
Query 339 LDAIGFQELIte---eaaawntettenYKHIYSS---LGKQPSWIEYTTDVNETYGEFAA 392
+G+Q LI + E + I + LG Q + EY T + +G+F +
Sbjct 435 FADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNNLLGYQVRYNEYKTARDLVFGDFES 494
Query 393 GMPLAFMCLNR-----------VYEENTD----HTIGNAST------YIDPTIYNSIFAE 431
G L++ C R + EN GN S YI+P + N IF
Sbjct 495 GKSLSYWCTPRFDFGYGDTEKKIAPENKGGADYRKKGNRSHWSSRNFYINPNLVNPIFLT 554
Query 432 SRLSSQNFWVQVAFDVTARRVMSAKQIPNL 461
S + + +F V DV A R MS + +L
Sbjct 555 SAVQADHFIVNSFLDVKAVRPMSVTGLSSL 584
>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens
CC14M]
Length=656
Score = 65.5 bits (158), Expect = 4e-08, Method: Compositional matrix adjust.
Identities = 83/338 (25%), Positives = 138/338 (41%), Gaps = 49/338 (14%)
Query 144 GTIELPNYNRTKVYRSSNAWFSQAGLAVKTYLSDRFNN---WLNTEWIDGTTGGINAITA 200
G ELP+Y G A D NN L + +D + G N I+
Sbjct 342 GIFELPDYIN-----------GNTGFATTEVKRDVVNNRGSQLEIKSMDAGSLGSNNISY 390
Query 201 VDVTDGKLTMDALILQKKIFNMLNRVAITDGT-YQAWREATYGIRSATLPES----PIFC 255
+ D + A+ +K ML R +G Y A +G + +PES F
Sbjct 391 ISPND----IRAMFALEK---MLERTRAANGLDYSNQIAAHFGFK---VPESRKNCASFI 440
Query 256 GGMQSEIAFDEIVSNS-------ATEEEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMA 308
GG ++I+ E+V+ S A+ +G + G+G+ M SG + E +IM
Sbjct 441 GGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAM-NSGH-ISYDVKEHGLIMC 498
Query 309 LGSITPRIDY-SQGNKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIY 367
+ SI P++DY ++ + R + +D+ +P + +G Q +I + + +
Sbjct 499 IYSIAPQVDYDARELDPFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQH 558
Query 368 SS-LGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNAS---TYIDPT 423
++ LG ++EY T + +GEF +G L+ + N G S +DP
Sbjct 559 NNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPK---NNYTFEFGKLSLPDLLVDPK 615
Query 424 IYNSIFA---ESRLSSQNFWVQVAFDVTARRVMSAKQI 458
+ IFA +S+ F V FDV A R M +
Sbjct 616 VLEPIFAVKYNGSMSTDQFLVNSYFDVKAIRPMQVNDM 653
>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM
17361]
Length=553
Score = 61.2 bits (147), Expect = 8e-07, Method: Compositional matrix adjust.
Identities = 67/253 (26%), Positives = 111/253 (44%), Gaps = 35/253 (14%)
Query 195 INAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLPESPI- 253
+N D ++G ++ +L + +L+ T+Q A YG+ +P+S
Sbjct 283 VNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVE---IPDSRDG 339
Query 254 ---FCGGMQSEIAFDEIVSNS---ATEEEP----LGTLAGRGVATMYKSGRG-LKIKCTE 302
+ GG S++ ++ S ATE +P LG +AG+G SGRG + E
Sbjct 340 RVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTG----SGRGRIVFDAKE 395
Query 303 PSMIMALGSITPRIDYSQGNKWWTRLQNMDD------FHKPTLDAIGFQELIteeaaawn 356
++M + S+ P+I Y TRL M D + P + +G Q L + +++
Sbjct 396 HGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFC 450
Query 357 tettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNA 416
T +N LG QP + EY T ++ +G+FA L+ ++R T + A
Sbjct 451 TTDPKN-----PVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIA 505
Query 417 STYIDPTIYNSIF 429
IDP NSIF
Sbjct 506 DFKIDPGCLNSIF 518
>gi|599087863|gb|AHN52857.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=219
Score = 57.8 bits (138), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 48/168 (29%), Positives = 81/168 (48%), Gaps = 11/168 (7%)
Query 188 IDGTTGGINAITAVDVTDG-KLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS- 245
+ G T ++ + D+T+ T++ L +I ML R A Y ++ +G+ S
Sbjct 51 VSGDTSAVSNVMYADLTEATAATINQLRQAFQIQKMLERDARGGTRYTEIIKSHFGVTSP 110
Query 246 -ATLPESPIFCGGMQSEIAFDEIVSNSATEEE---PLGTLAGRGVATMYKSGRGLKIKCT 301
A L + P + GG + + + + S T+++ P GTLA G A + G G T
Sbjct 111 DARL-QRPEYLGGGSTPVIINPVAQTSGTDQQSDTPQGTLAAIGTAQV--RGHGFTKSFT 167
Query 302 EPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELI 348
E +I+ L S+ + Y QG N+ W R Q D++ P L +G QE++
Sbjct 168 EHCIILGLVSVRADLTYQQGLNRMWNR-QTRYDYYFPALSHLGEQEIL 214
Lambda      K        H        a         alpha
   0.316    0.131    0.388    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6
Effective search space used: 3125101418307
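
The gapped Lambda and K values in the statistics block relate each raw score (the number in parentheses on a "Score =" line) to its bit score through the standard Karlin-Altschul normalization, S' = (Lambda * S - ln K) / ln 2. The Python sketch below checks that relation against three alignments from this report. It is an illustration only: the function name `bit_score` is hypothetical, the printed parameters are rounded, and it assumes the gapped values shown here are the ones that were applied, so agreement is expected only to about one decimal place. It does not attempt to reproduce the E-values.

```python
import math

# Gapped Karlin-Altschul parameters as printed in this report (rounded).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # (raw score, reported bit score) pairs copied from the alignments above.
    reported = [(215, 87.4), (221, 89.7), (138, 57.8)]
    for raw, bits in reported:
        print(f"raw={raw}  reported={bits}  computed={bit_score(raw):.2f}")
```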