bitscore colors: <40, 40-50, 50-80, 80-200, >200

BLASTP 2.2.30+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
49,011,213 sequences; 17,563,301,199 total letters
Query= Contig-9_CDS_annotation_glimmer3.pl_2_7
Length=638
                                                                     Score  E
Sequences producing significant alignments:                         (Bits)  Value

gi|492501782|ref|WP_005867318.1| hypothetical protein                 85.5  4e-14
gi|547312923|ref|WP_022044635.1| putative uncharacterized protein     76.6  7e-12
gi|649557305|gb|KDS63784.1| capsid family protein                     75.1  1e-11
gi|547920049|ref|WP_022322420.1| capsid protein VP1                   76.6  2e-11
gi|649569140|gb|KDS75238.1| capsid family protein                     75.1  4e-11
gi|649555287|gb|KDS61824.1| capsid family protein                     75.1  6e-11
gi|494610271|ref|WP_007368517.1| capsid protein                       66.2  4e-08
gi|599088027|gb|AHN52939.1| major capsid protein                      62.4  1e-07
gi|599088021|gb|AHN52936.1| major capsid protein                      61.6  2e-07
gi|565841287|ref|WP_023924568.1| hypothetical protein                 63.9  2e-07

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis
CL09T03C24]
Length=538
Score = 85.5 bits (210), Expect = 4e-14, Method: Compositional matrix adjust.
Identities = 76/265 (29%), Positives = 118/265 (45%), Gaps = 16/265 (6%)
Query 375 VDVTDGTLSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVS 434
V+V + +S++ L S + + R A SG Y + + + + + R + P F GG
Sbjct 289 VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR 348
Query 435 QEIVFQEVISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDY 493
I EV+ SAT++ P +AG G++ G G + E YI+ I SI PR Y
Sbjct 349 TPISVSEVLQTSATDSTSPQANMAGHGISAGVNHG--FKRYFEEHGYIIGIMSIRPRTGY 406
Query 494 GQGNTWDTYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWIN 553
QG D D++ P +G Q+ N E YL P + G T +
Sbjct 407 QQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEV-----YLQQTPASNNGTFGYTPRYAE 461
Query 554 YMTNVNRTFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDA 613
Y ++N G+F M +F LNR + S SP + TT+++ N +FA A
Sbjct 462 YKYSMNEVHGDFRGNM--AFWHLNRIF----SESPNLN--TTFVECNPSNRVFATAETSD 513
Query 614 MNFWVQTKFEIKARRLISAKQIPNL 638
+W+Q ++KA RL+ P L
Sbjct 514 DKYWIQLYQDVKALRLMPKYGTPML 538
>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338
Score = 76.6 bits (187), Expect = 7e-12, Method: Compositional matrix adjust.
Identities = 87/335 (26%), Positives = 140/335 (42%), Gaps = 48/335 (14%)
Query 343 GLCLKTYNSDLYQNWINTEWIEGVDGINEASAVDV---TDGTLSMDALNLSQKVYNFLNR 399
GL Y+ DL+ N I V+ I +A+D+ T ++++ L L K+ N+++R
Sbjct 11 GLLSVPYSPDLFGNIIKQGSSPAVE-IEVMNALDLNISTGFSVAVPELRLRTKIQNWMDR 69
Query 400 IAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQEVISNSATENEPLGTLAGR 459
+ VSGG D T++ + P F G V+Q I+ S G+ +G
Sbjct 70 LFVSGGRVGDVFRTLWGTKSSAIYVNKPDFLG------VWQASINPSNVRAMANGSASGE 123
Query 460 GVTTGRQKG---------GH--IRIKITEPCYIMCICSITPRIDYGQGNTWDTYLETMDD 508
G+ GH I EP M I + P Y QG D + D
Sbjct 124 DANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLITMLVPEPAYSQGLHPDLASISFGD 183
Query 509 WHKPALDGIGYQ----------------DSLNGERAWWTDY----LTADPDLKRTSAGKT 548
P L+GIG+Q L+ E + W + + DP++ S G+
Sbjct 184 DFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNM--VSVGEE 241
Query 549 VAWINYMTNVNRTFGNFAPGMSESFMVLNRNYSM----NNSASPQIEDLT-TYIDPVKFN 603
VAW T+ +R G+FA + + VL R ++ + + Q + T TYI+P+ +
Sbjct 242 VAWSWLRTDYSRLHGDFAQNGNYQYWVLTRRFTTYFPDDGTGFYQDGEYTGTYINPLDWQ 301
Query 604 YIFADANIDAMNFWVQTKFEIKARRLISAKQIPNL 638
Y+F D + A NF F++ +SA +P L
Sbjct 302 YVFVDQTLMAGNFAYYGTFDLNVTSSLSANYMPYL 336
>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=245
Score = 75.1 bits (183), Expect = 1e-11, Method: Compositional matrix adjust.
Identities = 71/258 (28%), Positives = 109/258 (42%), Gaps = 16/258 (6%)
Query 382 LSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQE 441
++++ + S + + R A SG Y + + + + + R + P F GG I E
Sbjct 3 VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE 62
Query 442 VISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWD 500
V+ S+T++ P +AG G++ G G E YIM I SI PR Y QG D
Sbjct 63 VLQTSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKD 120
Query 501 TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWINYMTNVNR 560
D++ P +G Q+ N E YL + G T + Y + N
Sbjct 121 FRKFDNMDFYFPEFAHLGEQEIKNEEL-----YLNESDAANEGTFGYTPRYAEYKYSQNE 175
Query 561 TFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDAMNFWVQT 620
G+F M+ F LNR + P + TT+++ N +FA A +WVQ
Sbjct 176 VHGDFRGNMA--FWHLNRIFK----EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQI 227
Query 621 KFEIKARRLISAKQIPNL 638
+IKA RL+ P L
Sbjct 228 YQDIKALRLMPKYGTPML 245
>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553
Score = 76.6 bits (187), Expect = 2e-11, Method: Compositional matrix adjust.
Identities = 72/266 (27%), Positives = 118/266 (44%), Gaps = 18/266 (7%)
Query 375 VDVTDGTLSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVS 434
V+V + ++++ L S + + R A G Y + + + + + R + P F GG
Sbjct 304 VNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQFLGGGR 363
Query 435 QEIVFQEVISNSAT-ENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDY 493
I EV+ S+T E P +AG G++ G G + E YI+ I SITPR Y
Sbjct 364 MPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSITPRSGY 421
Query 494 GQGNTWD-TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWI 552
QG D T + MD ++ P + Q+ N E +++ D + G T +
Sbjct 422 QQGVPRDFTKFDNMD-FYFPEFAHLSEQEIKNQEL-----FVSEDAAYNNGTFGYTPRYA 475
Query 553 NYMTNVNRTFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANID 612
Y + + G+F +S F LNR + P + TT+++ N +FA + +
Sbjct 476 EYKYHPSEAHGDFRGNLS--FWHLNRIFE----DKPNLN--TTFVECKPSNRVFATSETE 527
Query 613 AMNFWVQTKFEIKARRLISAKQIPNL 638
FWVQ ++KA RL+ P L
Sbjct 528 DDKFWVQMYQDVKALRLMPKYGTPML 553
>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str.
3999B T(B) 6]
Length=390
Score = 75.1 bits (183), Expect = 4e-11, Method: Compositional matrix adjust.
Identities = 71/258 (28%), Positives = 109/258 (42%), Gaps = 16/258 (6%)
Query 382 LSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQE 441
++++ + S + + R A SG Y + + + + + R + P F GG I E
Sbjct 148 VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE 207
Query 442 VISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWD 500
V+ S+T++ P +AG G++ G G E YIM I SI PR Y QG D
Sbjct 208 VLQTSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKD 265
Query 501 TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWINYMTNVNR 560
D++ P +G Q+ N E YL + G T + Y + N
Sbjct 266 FRKFDNMDFYFPEFAHLGEQEIKNEEL-----YLNESDAANEGTFGYTPRYAEYKYSQNE 320
Query 561 TFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDAMNFWVQT 620
G+F M+ F LNR + P + TT+++ N +FA A +WVQ
Sbjct 321 VHGDFRGNMA--FWHLNRIFK----EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQI 372
Query 621 KFEIKARRLISAKQIPNL 638
+IKA RL+ P L
Sbjct 373 YQDIKALRLMPKYGTPML 390
>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 4]
gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B
T(B) 6]
Length=541
Score = 75.1 bits (183), Expect = 6e-11, Method: Compositional matrix adjust.
Identities = 71/258 (28%), Positives = 109/258 (42%), Gaps = 16/258 (6%)
Query 382 LSMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQE 441
++++ + S + + R A SG Y + + + + + R + P F GG I E
Sbjct 299 VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE 358
Query 442 VISNSATEN-EPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWD 500
V+ S+T++ P +AG G++ G G E YIM I SI PR Y QG D
Sbjct 359 VLQTSSTDSTSPQANMAGHGISAGVNHG--FTRYFEEHGYIMGIMSIRPRTGYQQGVPKD 416
Query 501 TYLETMDDWHKPALDGIGYQDSLNGERAWWTDYLTADPDLKRTSAGKTVAWINYMTNVNR 560
D++ P +G Q+ N E YL + G T + Y + N
Sbjct 417 FRKFDNMDFYFPEFAHLGEQEIKNEEL-----YLNESDAANEGTFGYTPRYAEYKYSQNE 471
Query 561 TFGNFAPGMSESFMVLNRNYSMNNSASPQIEDLTTYIDPVKFNYIFADANIDAMNFWVQT 620
G+F M+ F LNR + P + TT+++ N +FA A +WVQ
Sbjct 472 VHGDFRGNMA--FWHLNRIFK----EKPNLN--TTFVECNPSNRVFATAETSDDKYWVQI 523
Query 621 KFEIKARRLISAKQIPNL 638
+IKA RL+ P L
Sbjct 524 YQDIKALRLMPKYGTPML 541
>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM
16608]
Length=531
Score = 66.2 bits (160), Expect = 4e-08, Method: Compositional matrix adjust.
Identities = 63/248 (25%), Positives = 106/248 (43%), Gaps = 34/248 (14%)
Query 422 ERCETPMFEGGVSQEIVFQEVISNS-----ATENEPLGTLAGRGVTTGRQKGGHIRIKIT 476
R F GG +V EV++ S A E+ LG L G+GV G I +
Sbjct 287 SRAGDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGV--GSLNSSSIDFDVK 344
Query 477 EPCYIMCICSITPRIDYGQGNTWDTYLETM--DDWHKPALDGIGYQDSLNGE--RAWWTD 532
E IMCI S+ P+ +Y G +D + + +D+ +P +GYQ + + + +
Sbjct 345 EHGIIMCIYSVVPQTEY-NGTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDN 403
Query 533 YLTADPD-LKRTSAGKTVAWIN--------------YMTNVNRTFGNFAPGMSESFMVLN 577
+ P+ KR +AG ++ I Y T+ + FG F G+S S+
Sbjct 404 PVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLSYWCSP 463
Query 578 R-NYSMNNSASPQI------EDLTTYIDPVKFNYIFADANIDAMNFWVQTKFEIKARRLI 630
R ++ + A + Y++P N IF + + A +F V + F++KA R +
Sbjct 464 RYDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDVKAVRPM 523
Query 631 SAKQIPNL 638
S + L
Sbjct 524 SVSGLAGL 531
>gi|599088027|gb|AHN52939.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=219
Score = 62.4 bits (150), Expect = 1e-07, Method: Compositional matrix adjust.
Identities = 47/144 (33%), Positives = 65/144 (45%), Gaps = 3/144 (2%)
Query 383 SMDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPMFEGGVSQEIVFQEV 442
+++ L + ++ L R A SG Y + ++ + G N+M+ P F GG S I V
Sbjct 77 TINQLRQAFQIQKLLERDARSGTRYAEIVKAHF-GVNFMDVTYRPEFLGGTSTPINVTSV 135
Query 443 ISNSATENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSITPRIDYGQGNTWDTY 502
S + P GTLA G T GG TE C +M I S+ + Y QG
Sbjct 136 PQTSESGTTPQGTLAAFGTAT--VNGGGFTKSFTEHCIVMGIASVRADLTYQQGLNRMFS 193
Query 503 LETMDDWHKPALDGIGYQDSLNGE 526
T D++ PAL IG Q LN E
Sbjct 194 RSTRYDFYFPALAHIGEQAVLNKE 217
>gi|599088021|gb|AHN52936.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=220
Score = 61.6 bits (148), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 50/158 (32%), Positives = 71/158 (45%), Gaps = 4/158 (3%)
Query 370 NEASAVDVTDGTLS-MDALNLSQKVYNFLNRIAVSGGSYRDWLETVYTGGNYMERCETPM 428
N A D++ T + ++ L + ++ L R A SG Y + ++ + G N+M+ P
Sbjct 64 NRALYADLSSATAATINQLRQAFQIQKLLERDARSGTRYSEIVKAHF-GVNFMDVTYRPE 122
Query 429 FEGGVSQEIVFQEVISNSATENEPLGTLAGRGVTTGRQKGGHIRIKITEPCYIMCICSIT 488
F GG S + V S + P GTLA G T GG TE C +M I S+
Sbjct 123 FLGGSSTPVNVTSVPQTSESGTTPQGTLAAFGTAT--INGGGFTKSFTEHCIVMGIASVR 180
Query 489 PRIDYGQGNTWDTYLETMDDWHKPALDGIGYQDSLNGE 526
+ Y QG T D++ PAL IG Q LN E
Sbjct 181 ADLTYQQGLNRMFSRSTRYDFYFPALAHIGEQSVLNKE 218
>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens
CC14M]
Length=656
Score = 63.9 bits (154), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 56/219 (26%), Positives = 91/219 (42%), Gaps = 18/219 (8%)
Query 423 RCETPMFEGGVSQEIVFQEVISNS-------ATENEPLGTLAGRGVTTGRQKGGHIRIKI 475
R F GG +I EV++ S A+ +G + G+G+ G GHI +
Sbjct 433 RKNCASFIGGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGI--GAMNSGHISYDV 490
Query 476 TEPCYIMCICSITPRIDYGQGNTWDTYLE--TMDDWHKPALDGIGYQDSLNGERAWWTDY 533
E IMCI SI P++DY D + + +D+ +P + +G Q + + +
Sbjct 491 KEHGLIMCIYSIAPQVDY-DARELDPFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINS 549
Query 534 LTAD-PDLKRTSAGKTVAWINYMTNVNRTFGNFAPGMSESFMVLNRNYSMNNSASPQIED 592
+D D G + ++ Y T + FG F G S S +N + D
Sbjct 550 AKSDSSDQHNNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPKNNYTFEFGKLSLPD 609
Query 593 LTTYIDPVKFNYIFA---DANIDAMNFWVQTKFEIKARR 628
L +DP IFA + ++ F V + F++KA R
Sbjct 610 LL--VDPKVLEPIFAVKYNGSMSTDQFLVNSYFDVKAIR 646

Lambda      K        H        a        alpha
 0.317    0.134    0.401    0.792     4.96

Gapped
Lambda      K        H        a        alpha    sigma
 0.267    0.0410   0.140    1.90     42.6     43.6

Effective search space used: 4813850017872
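
Note on the reported scores: each "Score =" line gives a bit score followed by the raw score in parentheses. The two are related by the standard Karlin-Altschul conversion, bits = (Lambda * raw - ln K) / ln 2. The sketch below applies that formula with the Gapped Lambda and K values from the statistics block of this report. The function and constant names are illustrative and are not part of BLAST. Because the alignments use compositional matrix adjustment, the per-alignment parameters BLAST applies internally may differ, so agreement with the reported values should be treated as a consistency check rather than a guarantee.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bit score) pairs taken from "Score =" lines in this report.
PAIRS = [(210, 85.5), (187, 76.6), (183, 75.1), (160, 66.2)]

for raw, reported in PAIRS:
    print(f"raw={raw:4d}  computed={bit_score(raw):6.2f}  reported={reported}")
```

The E-values are not recomputed here: they also depend on the effective search space and on the composition-based adjustments, which this sketch does not model.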