bitscore colors: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-20_CDS_annotation_glimmer3.pl_2_5
Length=648
                                                                    Score   E
Sequences producing significant alignments:                         (Bits)  Value
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein   60.1    2e-06
gi|639237429|ref|WP_024568106.1| hypothetical protein               52.4    0.001
gi|444298000|dbj|GAC77839.1| major capsid protein                   51.2    0.002
gi|492501782|ref|WP_005867318.1| hypothetical protein               49.7    0.006
gi|494610271|ref|WP_007368517.1| capsid protein                     47.4    0.029
gi|649555287|gb|KDS61824.1| capsid family protein                   46.2    0.078
gi|609718276|emb|CDN73650.1| conserved hypothetical protein         45.8    0.088
gi|649569140|gb|KDS75238.1| capsid family protein                   45.1    0.13
gi|444298142|dbj|GAC77768.1| major capsid protein                   44.7    0.15
gi|649557305|gb|KDS63784.1| capsid family protein                   42.7    0.46
>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338
Score = 60.1 bits (144), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 70/269 (26%), Positives = 106/269 (39%), Gaps = 48/269 (18%)
Query 357 CPSSPDRFSRLMPPGDSNS---------DVDF-TGVKT-IPQLAVATRLQEYKDLIGASG 405
P SPD F ++ G S + D++ TG +P+L + T++Q + D + SG
Sbjct 15 VPYSPDLFGNIIKQGSSPAVEIEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSG 74
Query 406 SRYSDWLYTFFASKIEHVDRPKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSI 465
R D T + +K + K F + +A +G A GE A LGQ+ +
Sbjct 75 GRVGDVFRTLWGTKSSAIYVNKPDFLGVWQASINPSNVRAMANGSASGEDANLGQLAACV 134
Query 466 ----AFNTVLGREQTYYFKEPG--YIFDMLTIRPVYFWTGIRPDYLEYRGPDYFNPIYND 519
F+ G + YY KEPG + ML P Y G+ PD D FNP N
Sbjct 135 DRYCDFSGHSGID--YYAKEPGTFMLITMLVPEPAYS-QGLHPDLASISFGDDFNPELNG 191
Query 520 IGYQDVPFWRI-----GYGWKAASESQS----------------MTVAKEPCYNEFRSSY 558
IG+Q VP R G+ + + S ++V +E ++ R+ Y
Sbjct 192 IGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDY 251
Query 559 DEVLGSLQSTLTPKASVPLQSYWVQQRDF 587
+ G + YWV R F
Sbjct 252 SRLHGDFAQNGNYQ-------YWVLTRRF 273
>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546
Score = 52.4 bits (124), Expect = 0.001, Method: Compositional matrix adjust.
Identities = 67/273 (25%), Positives = 112/273 (41%), Gaps = 36/273 (13%)
Query 373 SNSDVDFTGVK--TIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKL 428
SN VD TI L A +LQE+ + +GSRY++ + +FF K + RP+
Sbjct 286 SNLGVDLKTASGSTINDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEF 345
Query 429 LFsssvmvnsqvvmnqAGQ-----SGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPG 483
L + + V+ Q+ G G ++G+ GG F F+E G
Sbjct 346 LGGNKTPILISEVLQQSSTDSTTPQGNMAGHGISVGKEGGFSKF-----------FEEHG 394
Query 484 YIFDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQS 542
Y+ ++++ P + GI + ++ DYF P + IG Q V I +
Sbjct 395 YVIGLMSVIPKTSYSQGIPRHFSKFDKFDYFWPQFEHIGEQPVYNKEI-FAKNVGDYDSG 453
Query 543 MTVAKEPCYNEFRSSYDEVLGSLQSTLTPKASVPLQSYWVQQRDFYMIGLSSNPNEISPS 602
P Y+E++ S + G + TL +W R F SS P +++
Sbjct 454 GVFGYVPRYSEYKYSPSTIHGDFKDTLY---------FWHLGRIFD----SSAPPKLNRD 500
Query 603 MLFTNLSTVNNPFA-SDMEDNFFVNMSYKVVVK 634
+ N S ++ FA D D F+ ++ K+ K
Sbjct 501 FIEVNKSGLSRIFAVEDNSDKFYCHLYQKITAK 533
>gi|444298000|dbj|GAC77839.1| major capsid protein [uncultured marine virus]
Length=480
Score = 51.2 bits (121), Expect = 0.002, Method: Compositional matrix adjust.
Identities = 61/278 (22%), Positives = 111/278 (40%), Gaps = 33/278 (12%)
Query 375 SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWL-YTFFASKIEHVDRPKLLFsss 433
+D+ TI + A +Q Y++ GSRY+++L Y K + RP+ + +
Sbjct 228 ADLQAATGGTINDIRRAFAIQRYQEARSRYGSRYTEYLRYLGVNPKDARLQRPEYMGGGT 287
Query 434 vmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTY--YFKEPGYIFDMLTI 491
+N V+ + + G + + +G R Y Y +E GYI ML++
Sbjct 288 TQINFSEVLQTSPE--IPGEDQVSQFGVGDMYGHGIAAMRSNKYRRYIEEHGYIISMLSV 345
Query 492 RPVYFWT-GIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQSMTVAKEPC 550
RP +T GI +L DY+ IG Q++ I A +E T
Sbjct 346 RPKTMYTNGIHRSWLRLTKEDYYQKELEHIGQQEIMNNEIYADEGAGTE----TFGYNDR 401
Query 551 YNEFRSSYDEVLGSLQSTLTPKASVPLQSYWVQQRDFYMIGLSSNPNEISPSML---FTN 607
Y+E+R + V + L +YW R+F +P +L F +
Sbjct 402 YSEYRETPSHVSAEFRGIL---------NYWHMAREFE-----------APPVLNQSFVD 441
Query 608 LSTVNNPFASDMEDNFFVNMSYKVVVKNLVNKSFATRL 645
+D ++ + +K+V + L++++ A R+
Sbjct 442 CDATKRIHNEQTQDALWIMIQHKMVARRLLSRNAAPRI 479
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 49.7 bits (117), Expect = 0.006, Method: Compositional matrix adjust.
Identities = 50/206 (24%), Positives = 90/206 (44%), Gaps = 12/206 (6%)
Query 368 MPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA--SKIEHVDR 425
+ P + +VD GV +I L + LQ + + SGSRY + + + F S + R
Sbjct 282 LEPDNFQVNVDELGV-SINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQR 340
Query 426 PKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYI 485
P+ L ++ V+ + + A + G S N R YF+E GYI
Sbjct 341 PQFLGGGRTPISVSEVLQTSATD--STSPQANMAGHGISAGVNHGFKR----YFEEHGYI 394
Query 486 FDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQSMT 544
+++IRP + G+ D+ ++ D++ P + +G Q++ + AS + T
Sbjct 395 IGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNG--T 452
Query 545 VAKEPCYNEFRSSYDEVLGSLQSTLT 570
P Y E++ S +EV G + +
Sbjct 453 FGYTPRYAEYKYSMNEVHGDFRGNMA 478
>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM
16608]
Length=531
Score = 47.4 bits (111), Expect = 0.029, Method: Compositional matrix adjust.
Identities = 57/216 (26%), Positives = 92/216 (43%), Gaps = 39/216 (18%)
Query 447 QSGFAGG--EAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRPVYFWTG--IRP 502
QS F G E+ LG +GG ++ + KE G I + ++ P + G P
Sbjct 310 QSEFDRGADESPCLGDLGGK-GVGSLNSSSIDFDVKEHGIIMCIYSVVPQTEYNGTYFDP 368
Query 503 DYLEYRGPDYFNPIYNDIGYQ------------DVPF-------WRIGYGWKAAS-ESQS 542
+ R D+F P + D+GYQ D P R+ G+ +S E+ +
Sbjct 369 FNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANN 428
Query 543 MTVAKEPCYNEFRSSYDEVLGSLQSTLTPKASVPLQSYWVQQR-DFYMIGLSSNPNEI-- 599
+ + YNE+++S D V G +S L+ SYW R DF G + + +
Sbjct 429 RLLGWQVRYNEYKTSRDLVFGEFESGLS-------LSYWCSPRYDFGFDGKAGDKKLVNS 481
Query 600 --SPSMLFTNLSTVNNPF--ASDMEDNFFVNMSYKV 631
SP+ + N S +N F ++ D+F VN + V
Sbjct 482 PWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV 517
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 46.2 bits (108), Expect = 0.078, Method: Compositional matrix adjust.
Identities = 64/275 (23%), Positives = 111/275 (40%), Gaps = 25/275 (9%)
Query 312 AYPYDVQEPKVNWNNGSGTDTGTPSKVYFAATLNVPFLAAHPMA---VCPSSPDRFS--- 365
A P+ + P+V G + K FAA F P++ V S+P S
Sbjct 216 ALPWVQRGPEVTVPINGGGEIPVEMKEGFAAQKITTFPDRKPISGSEVLYSAPSVLSYGQ 275
Query 366 -------RLMPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA- 417
L+ P + + D GV I + + LQ + + SGSRY + + + F
Sbjct 276 IGSIKGQALIEPDNFVVNTDQMGV-NINDIRTSNALQRWFERNARSGSRYIEQILSHFGV 334
Query 418 -SKIEHVDRPKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQT 476
S + RP+ L ++ V+ + S + A + G S N R
Sbjct 335 RSSDARLQRPQFLGGGRTPISVSEVLQTS--STDSTSPQANMAGHGISAGVNHGFTR--- 389
Query 477 YYFKEPGYIFDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWK 535
YF+E GYI +++IRP + G+ D+ ++ D++ P + +G Q++ +
Sbjct 390 -YFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNES 448
Query 536 AASESQSMTVAKEPCYNEFRSSYDEVLGSLQSTLT 570
A+ T P Y E++ S +EV G + +
Sbjct 449 DAANEG--TFGYTPRYAEYKYSQNEVHGDFRGNMA 481
>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537
Score = 45.8 bits (107), Expect = 0.088, Method: Compositional matrix adjust.
Identities = 61/262 (23%), Positives = 111/262 (42%), Gaps = 34/262 (13%)
Query 382 VKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKLLFsssvmvnsq 439
V T+ L A +LQE+ + +GSRY++ + +FF K + RP+ L + +
Sbjct 288 VSTVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMIS 347
Query 440 vvmnqAGQ-----SGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRP- 493
V+ Q+ G G +G+ GG + +F+E GY+ ++++ P
Sbjct 348 EVLQQSATDSTTPQGNMAGHGIGIGKDGGF-----------SRFFEEHGYVIGLMSVIPK 396
Query 494 VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQSMTVAKEPCYNE 553
+ GI + + DYF P + IG Q V I A +S+++ P Y+E
Sbjct 397 TSYSQGIPRHFSKSDKFDYFWPQFEHIGEQPVYNKEIFAKNIDAFDSEAV-FGYLPRYSE 455
Query 554 FRSSYDEVLGSLQSTLTPKASVPLQSYWVQQRDFYMIGLSSNPNEISPSMLFTNLSTVNN 613
++ S V G + L +W R F + P ++ S + + + ++
Sbjct 456 YKFSPSTVHGDFKDDLY---------FWHLGRIFD----TDKPPVLNQSFIECDKNALSR 502
Query 614 PFA-SDMEDNFFVNMSYKVVVK 634
FA D D F+ ++ K+ K
Sbjct 503 IFAVEDDTDKFYCHLYQKITAK 524
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 45.1 bits (105), Expect = 0.13, Method: Compositional matrix adjust.
Identities = 64/275 (23%), Positives = 111/275 (40%), Gaps = 25/275 (9%)
Query 312 AYPYDVQEPKVNWNNGSGTDTGTPSKVYFAATLNVPFLAAHPMA---VCPSSPDRFS--- 365
A P+ + P+V G + K FAA F P++ V S+P S
Sbjct 65 ALPWVQRGPEVTVPINGGGEIPVEMKEGFAAQKITTFPDRKPISGSEVLYSAPSVLSYGQ 124
Query 366 -------RLMPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA- 417
L+ P + + D GV I + + LQ + + SGSRY + + + F
Sbjct 125 IGSIKGQALIEPDNFVVNTDQMGV-NINDIRTSNALQRWFERNARSGSRYIEQILSHFGV 183
Query 418 -SKIEHVDRPKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQT 476
S + RP+ L ++ V+ + S + A + G S N R
Sbjct 184 RSSDARLQRPQFLGGGRTPISVSEVLQTS--STDSTSPQANMAGHGISAGVNHGFTR--- 238
Query 477 YYFKEPGYIFDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWK 535
YF+E GYI +++IRP + G+ D+ ++ D++ P + +G Q++ +
Sbjct 239 -YFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNES 297
Query 536 AASESQSMTVAKEPCYNEFRSSYDEVLGSLQSTLT 570
A + T P Y E++ S +EV G + +
Sbjct 298 DA--ANEGTFGYTPRYAEYKYSQNEVHGDFRGNMA 330
>gi|444298142|dbj|GAC77768.1| major capsid protein [uncultured marine virus]
Length=299
Score = 44.7 bits (104), Expect = 0.15, Method: Compositional matrix adjust.
Identities = 39/153 (25%), Positives = 70/153 (46%), Gaps = 6/153 (4%)
Query 375 SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWL-YTFFASKIEHVDRPKLLFsss 433
+D+ G I L A LQ Y++ G+R++++L Y +S + RP+++ +
Sbjct 111 ADLSQAGAININDLREAFALQRYQEARNLYGARFTEYLRYLGISSSXGRLQRPEMISTGK 170
Query 434 vmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRP 493
+N V+N G SG + LG+MGG V Y+ +E G+I ++++RP
Sbjct 171 SNINFSEVLNTTGPSGV---DDHPLGEMGGH-GIAGVKSNRARYFCEEHGHIISLMSVRP 226
Query 494 -VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDV 525
+ T + DY+ IG ++V
Sbjct 227 KTIYMTTQHKQFDRESKEDYWQKELQAIGMEEV 259
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 42.7 bits (99), Expect = 0.46, Method: Compositional matrix adjust.
Identities = 46/189 (24%), Positives = 83/189 (44%), Gaps = 13/189 (7%)
Query 385 IPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA--SKIEHVDRPKLLFsssvmvnsqvvm 442
I + + LQ + + SGSRY + + + F S + RP+ L ++ V+
Sbjct 5 INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL 64
Query 443 nqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRP-VYFWTGIR 501
+ S + A + G S N R YF+E GYI +++IRP + G+
Sbjct 65 QTS--STDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQQGVP 118
Query 502 PDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWK-AASESQSMTVAKEPCYNEFRSSYDE 560
D+ ++ D++ P + +G Q++ + AA+E T P Y E++ S +E
Sbjct 119 KDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEG---TFGYTPRYAEYKYSQNE 175
Query 561 VLGSLQSTL 569
V G + +
Sbjct 176 VHGDFRGNM 184
Ungapped
Lambda   K        H        a        alpha
0.320    0.136    0.425    0.792    4.96
Gapped
Lambda   K        H        a        alpha    sigma
0.267    0.0410   0.140    1.90     42.6     43.6
Effective search space used: 4913515649712
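
Note on the scores: each "Score = ..." line above gives a bit score followed by the raw score in parentheses. The two appear to be related by the usual Karlin-Altschul conversion, bits = (Lambda * raw - ln K) / ln 2. The sketch below is illustrative only. It assumes that formula and the gapped Lambda and K from the statistics block; the function name bit_score is hypothetical and not part of BLAST. It recomputes the bit score for each hit so it can be compared with the reported value.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this report.
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410


def bit_score(raw_score, lam=LAMBDA_GAPPED, k=K_GAPPED):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs copied from the "Score = ..." lines above.
reported = [
    (144, 60.1), (124, 52.4), (121, 51.2), (117, 49.7), (111, 47.4),
    (108, 46.2), (107, 45.8), (105, 45.1), (104, 44.7), (99, 42.7),
]

for raw, bits in reported:
    # Print raw score, recomputed bit score, and the value BLAST reported.
    print(f"raw={raw:4d}  recomputed={bit_score(raw):6.2f}  reported={bits}")
```

With the rounded parameters printed in this report, the recomputed values agree with the reported bit scores to within 0.1 bit. The E-values are not reproduced here, because the composition-based adjustment named in the "Method:" field affects how they are derived.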