BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-37_CDS_annotation_glimmer3.pl_2_1
Length=251
                                                                   Score     E
Sequences producing significant alignments:                        (Bits)  Value

gi|492501782|ref|WP_005867318.1|  hypothetical protein              77.4    6e-13
gi|649557305|gb|KDS63784.1|  capsid family protein                  74.7    7e-13
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                75.5    3e-12
gi|649569140|gb|KDS75238.1|  capsid family protein                  74.7    3e-12
gi|649555287|gb|KDS61824.1|  capsid family protein                  73.9    7e-12
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein        70.1    2e-10
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein  67.4    8e-10
gi|639237429|ref|WP_024568106.1|  hypothetical protein              66.2    3e-09
gi|599087551|gb|AHN52701.1|  major capsid protein                   55.1    5e-06
gi|565841287|ref|WP_023924568.1|  hypothetical protein              55.8    9e-06
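
The summary rows above can be read programmatically. The following is a minimal Python sketch, assuming each row ends with the bit score and the E-value as its last two whitespace-separated fields and begins with the sequence identifier. The function name parse_hit_line is hypothetical and is not part of BLAST.

```python
def parse_hit_line(line):
    """Split one summary row into identifier, description, bit score, E-value.

    Assumption: the last two whitespace-separated fields are the bit score
    and the E-value, and the first field is the sequence identifier.
    """
    head, bits, evalue = line.rsplit(None, 2)
    seq_id, description = head.split(None, 1)
    return {
        "id": seq_id,
        "description": description.strip(),
        "bits": float(bits),
        "evalue": float(evalue),
    }


rows = [
    "gi|492501782|ref|WP_005867318.1| hypothetical protein 77.4 6e-13",
    "gi|649557305|gb|KDS63784.1| capsid family protein 74.7 7e-13",
    "gi|547920049|ref|WP_022322420.1| capsid protein VP1 75.5 3e-12",
]
hits = [parse_hit_line(row) for row in rows]

# The report orders hits by E-value, so bit scores need not be monotonic.
for hit in hits:
    print(hit["id"], hit["bits"], hit["evalue"])
```

Splitting from the right keeps multi-word descriptions such as "capsid family protein" intact without needing a fixed column layout.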
>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 77.4 bits (189), Expect = 6e-13, Method: Compositional matrix adjust.
Identities = 71/244 (29%), Positives = 110/244 (45%), Gaps = 22/244 (9%)
Query 14 LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL 69
R A SG Y + I + V ++D R + P + GG + I EV+ SAT++ P
Sbjct 311 FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSATDSTSPQ 368
Query 70 GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP 127
+AG G + G+ G K +E YIIGI+SI PR Y QG D D D + P
Sbjct 369 ANMAGHGISAGVNHG-FKRYFEEHGYIIGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP 425
Query 128 ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM 187
+G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct 426 EFAHLGEQEIKNEEVYLQQTPASNNG-----TFGYTPRYAEYKYSMNEVHGDF--RGNMA 478
Query 188 FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV 247
F LNR + N + TT+++ N VFA + +W+QL K R M
Sbjct 479 FWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSDDKYWIQLYQDVKALRLMPKYG 534
Query 248 IPNL 251
P L
Sbjct 535 TPML 538
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 74.7 bits (182), Expect = 7e-13, Method: Compositional matrix adjust.
Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)
Query 14 LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL 69
R A SG Y + I + V ++D R + P + GG + I EV+ S+T++ P
Sbjct 18 FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ 75
Query 70 GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP 127
+AG G + G+ G + +E YI+GI+SI PR Y QG D D D + P
Sbjct 76 ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP 132
Query 128 ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM 187
+G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct 133 EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA 185
Query 188 FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV 247
F LNR ++ N + TT+++ N VFA + +WVQ+ K R M
Sbjct 186 FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG 241
Query 248 IPNL 251
P L
Sbjct 242 TPML 245
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 75.5 bits (184), Expect = 3e-12, Method: Compositional matrix adjust.
Identities = 62/214 (29%), Positives = 96/214 (45%), Gaps = 13/214 (6%)
Query 39 RSETPVYEGGFSSEIIFQEVISNSAT-ENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYII 97
R + P + GG I EV+ S+T E P +AG G + G+ G K +E YII
Sbjct 352 RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG-FKHYFEEHGYII 410
Query 98 GIVSITPRIDYSQGNRFDVDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQL 157
GI+SITPR Y QG D D + P + Q++ ++ ++ D
Sbjct 411 GIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQEL-----FVSEDAAYNN 465
Query 158 KSVGKQPAWLDYMTNYNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNY 217
+ G P + +Y + ++ G+F + + F LNR +E N + TT+++ + N
Sbjct 466 GTFGYTPRYAEYKYHPSEAHGDF--RGNLSFWHLNRIFEDKPNLN----TTFVECKPSNR 519
Query 218 VFADTSLNAMNFWVQLGIGAKVRRKMSAKVIPNL 251
VFA + FWVQ+ K R M P L
Sbjct 520 VFATSETEDDKFWVQMYQDVKALRLMPKYGTPML 553
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 74.7 bits (182), Expect = 3e-12, Method: Compositional matrix adjust.
Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)
Query 14 LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL 69
R A SG Y + I + V ++D R + P + GG + I EV+ S+T++ P
Sbjct 163 FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ 220
Query 70 GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP 127
+AG G + G+ G + +E YI+GI+SI PR Y QG D D D + P
Sbjct 221 ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP 277
Query 128 ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM 187
+G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct 278 EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA 330
Query 188 FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV 247
F LNR ++ N + TT+++ N VFA + +WVQ+ K R M
Sbjct 331 FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG 386
Query 248 IPNL 251
P L
Sbjct 387 TPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 73.9 bits (180), Expect = 7e-12, Method: Compositional matrix adjust.
Identities = 68/244 (28%), Positives = 111/244 (45%), Gaps = 22/244 (9%)
Query 14 LNRIAVSGGTYQDWIQT---VYTNDYIERSETPVYEGGFSSEIIFQEVISNSATEN-EPL 69
R A SG Y + I + V ++D R + P + GG + I EV+ S+T++ P
Sbjct 314 FERNARSGSRYIEQILSHFGVRSSD--ARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQ 371
Query 70 GTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLDTLD--DLHKP 127
+AG G + G+ G + +E YI+GI+SI PR Y QG D D D + P
Sbjct 372 ANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQG--VPKDFRKFDNMDFYFP 428
Query 128 ALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEM 187
+G Q++ ++ + +G + G P + +Y + N+V G+F + +
Sbjct 429 EFAHLGEQEIKNEELYLNESDAANEG-----TFGYTPRYAEYKYSQNEVHGDF--RGNMA 481
Query 188 FMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKV 247
F LNR ++ N + TT+++ N VFA + +WVQ+ K R M
Sbjct 482 FWHLNRIFKEKPNLN----TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYG 537
Query 248 IPNL 251
P L
Sbjct 538 TPML 541
>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537
Score = 70.1 bits (170), Expect = 2e-10, Method: Compositional matrix adjust.
Identities = 71/251 (28%), Positives = 111/251 (44%), Gaps = 20/251 (8%)
Query 1 MDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY---TNDYIERSETPVYEGGFSSEIIFQE 57
++ L A K+ + L + A +G Y + I + + T+D R + P + GG S I+ E
Sbjct 291 VNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSD--GRLQRPEFLGGNKSPIMISE 348
Query 58 VISNSATENE-PLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDV 116
V+ SAT++ P G +AG G G GG + +E Y+IG++S+ P+ YSQG
Sbjct 349 VLQQSATDSTTPQGNMAGHGIGIGKDGGFSRF-FEEHGYVIGLMSVIPKTSYSQGIPRHF 407
Query 117 DLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSVGKQPAWLDYMTNYNKV 176
D P + IG Q + NK + D E G P + +Y + + V
Sbjct 408 SKSDKFDYFWPQFEHIGEQPV-YNKEIFAKNIDAFDSEAVF---GYLPRYSEYKFSPSTV 463
Query 177 FGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFA---DTSLNAMNFWVQL 233
G+F KD F L R ++ D+ + D + +FA DT F+ L
Sbjct 464 HGDF--KDDLYFWHLGRIFDTDKPPVLNQSFIECDKNALSRIFAVEDDTD----KFYCHL 517
Query 234 GIGAKVRRKMS 244
+RKMS
Sbjct 518 YQKITAKRKMS 528
>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338
Score = 67.4 bits (163), Expect = 8e-10, Method: Compositional matrix adjust.
Identities = 72/286 (25%), Positives = 113/286 (40%), Gaps = 44/286 (15%)
Query 4 LNLAKKVYDMLNRIAVSGGTYQDWIQTVY-TNDYIERSETPVYEGGFSSEIIFQEVISNS 62
L L K+ + ++R+ VSGG D +T++ T P + G ++Q I+ S
Sbjct 57 LRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLG------VWQASINPS 110
Query 63 ATENEPLGTLAGRGQNTGMKGGTVKIKID------------EPSYIIGIVSITPRIDYSQ 110
G+ +G N G V D EP + I + P YSQ
Sbjct 111 NVRAMANGSASGEDANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYSQ 170
Query 111 GNRFDVDLDTLDDLHKPALDAIGFQDLTTNKMA-----------------WWDETITAD- 152
G D+ + D P L+ IGFQ + ++ + W+ T T
Sbjct 171 GLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVL 230
Query 153 GEKQLKSVGKQPAWLDYMTNYNKVFGNFAIKDSEMFMTLNRN---YEMDENKSIAD---- 205
+ + SVG++ AW T+Y+++ G+FA + + L R Y D+
Sbjct 231 VDPNMVSVGEEVAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEY 290
Query 206 LTTYIDPEKYNYVFADTSLNAMNFWVQLGIGAKVRRKMSAKVIPNL 251
TYI+P + YVF D +L A NF V +SA +P L
Sbjct 291 TGTYINPLDWQYVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL 336
>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546
Score = 66.2 bits (160), Expect = 3e-09, Method: Compositional matrix adjust.
Identities = 67/252 (27%), Positives = 116/252 (46%), Gaps = 22/252 (9%)
Query 1 MDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY---TNDYIERSETPVYEGGFSSEIIFQE 57
++ L A K+ + L + A +G Y + I + + T+D R + P + GG + I+ E
Sbjct 300 INDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSD--GRLQRPEFLGGNKTPILISE 357
Query 58 VISNSATENE-PLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQG-NRFD 115
V+ S+T++ P G +AG G + G +GG K +E Y+IG++S+ P+ YSQG R
Sbjct 358 VLQQSSTDSTTPQGNMAGHGISVGKEGGFSKF-FEEHGYVIGLMSVIPKTSYSQGIPRHF 416
Query 116 VDLDTLDDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKS---VGKQPAWLDYMTN 172
D D P + IG Q + +++ I A S G P + +Y +
Sbjct 417 SKFDKFDYFW-PQFEHIGEQPV-------YNKEIFAKNVGDYDSGGVFGYVPRYSEYKYS 468
Query 173 YNKVFGNFAIKDSEMFMTLNRNYEMDENKSIADLTTYIDPEKYNYVFADTSLNAMNFWVQ 232
+ + G+F KD+ F L R ++ + ++ + +FA N+ F+
Sbjct 469 PSTIHGDF--KDTLYFWHLGRIFDSSAPPKLNRDFIEVNKSGLSRIFA-VEDNSDKFYCH 525
Query 233 LGIGAKVRRKMS 244
L +RKMS
Sbjct 526 LYQKITAKRKMS 537
>gi|599087551|gb|AHN52701.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=220
Score = 55.1 bits (131), Expect = 5e-06, Method: Compositional matrix adjust.
Identities = 44/136 (32%), Positives = 62/136 (46%), Gaps = 2/136 (1%)
Query 1 MDTLNLAKKVYDMLNRIAVSGGTYQDWIQTVY-TNDYIERSETPVYEGGFSSEIIFQEVI 59
++ L A ++ +L R A G Y + IQ + R + P Y GG ++ II +V
Sbjct 78 INQLRQAFQIQKLLERDARGGTRYTEIIQAHFGVTSPDARLQRPEYLGGGTTPIIISQVP 137
Query 60 SNSATENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDYSQGNRFDVDLD 119
S ++ P GTLA G T K G K E IIG+ S+ + Y QG
Sbjct 138 QTSESDGTPQGTLAAYGTATMRKAGFTK-SFTEHCVIIGLASVRADLTYQQGLERMWSRQ 196
Query 120 TLDDLHKPALDAIGFQ 135
T D++ PAL IG Q
Sbjct 197 TRYDVYWPALAMIGEQ 212
>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens
CC14M]
Length=656
Score = 55.8 bits (133), Expect = 9e-06, Method: Compositional matrix adjust.
Identities = 63/248 (25%), Positives = 107/248 (43%), Gaps = 21/248 (8%)
Query 13 MLNRI-AVSGGTYQDWIQTVYTNDYIE-RSETPVYEGGFSSEIIFQEVISNS-------A 63
ML R A +G Y + I + E R + GGF ++I EV++ S A
Sbjct 405 MLERTRAANGLDYSNQIAAHFGFKVPESRKNCASFIGGFDNQISISEVVTTSNGSVDGTA 464
Query 64 TENEPLGTLAGRGQNTGMKGGTVKIKIDEPSYIIGIVSITPRIDY--SQGNRFDVDLDTL 121
+ +G + G+G M G + + E I+ I SI P++DY + + F+ +
Sbjct 465 STGSVVGQVFGKGIG-AMNSGHISYDVKEHGLIMCIYSIAPQVDYDARELDPFNRKF-SR 522
Query 122 DDLHKPALDAIGFQDLTTNKMAWWDETITADGEKQLKSV-GKQPAWLDYMTNYNKVFGNF 180
+D +P + +G Q + + + + +D Q +V G +L+Y T + +FG F
Sbjct 523 EDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQHNNVLGYSARYLEYKTARDIIFGEF 582
Query 181 AIKDS-EMFMTLNRNYEMDENK-SIADLTTYIDPEKYNYVFA---DTSLNAMNFWVQLGI 235
S + T NY + K S+ DL +DP+ +FA + S++ F V
Sbjct 583 MSGGSLSAWATPKNNYTFEFGKLSLPDLL--VDPKVLEPIFAVKYNGSMSTDQFLVNSYF 640
Query 236 GAKVRRKM 243
K R M
Sbjct 641 DVKAIRPM 648
Lambda      K        H        a         alpha
   0.316    0.134    0.392    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6
Effective search space used: 1124108458389
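
The bit scores reported in the alignments can be related to the raw scores shown in parentheses through the gapped Lambda and K values above, using the standard Karlin-Altschul normalization S' = (Lambda * S - ln K) / ln 2. The following is a minimal Python sketch of that relation; the function name bit_score is hypothetical, and because Lambda and K are printed to only three significant figures, the results should be expected to agree with the report only after rounding.

```python
import math

# Gapped statistical parameters as printed at the end of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score S to a bit score S'.

    Uses S' = (lambda * S - ln K) / ln 2.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores taken from the "Score = ... bits (raw)" lines of this report.
for raw in (189, 182, 131):
    print(raw, round(bit_score(raw), 1))
# 189 -> 77.4, 182 -> 74.7, 131 -> 55.1
```

The E-values in this report are not reproduced by this sketch: with compositional matrix adjustment they also depend on per-subject corrections, which is why two hits with the same bit score (74.7) carry different E-values.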