           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-20_CDS_annotation_glimmer3.pl_2_5

Length=648
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein    60.1    2e-06
gi|639237429|ref|WP_024568106.1|  hypothetical protein                52.4    0.001
gi|444298000|dbj|GAC77839.1|  major capsid protein                    51.2    0.002
gi|492501782|ref|WP_005867318.1|  hypothetical protein                49.7    0.006
gi|494610271|ref|WP_007368517.1|  capsid protein                      47.4    0.029
gi|649555287|gb|KDS61824.1|  capsid family protein                    46.2    0.078
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein          45.8    0.088
gi|649569140|gb|KDS75238.1|  capsid family protein                    45.1    0.13
gi|444298142|dbj|GAC77768.1|  major capsid protein                    44.7    0.15
gi|649557305|gb|KDS63784.1|  capsid family protein                    42.7    0.46


>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

 Score = 60.1 bits (144),  Expect = 2e-06, Method: Compositional matrix adjust.
 Identities = 70/269 (26%), Positives = 106/269 (39%), Gaps = 48/269 (18%)

Query  357  CPSSPDRFSRLMPPGDSNS---------DVDF-TGVKT-IPQLAVATRLQEYKDLIGASG  405
             P SPD F  ++  G S +         D++  TG    +P+L + T++Q + D +  SG
Sbjct  15   VPYSPDLFGNIIKQGSSPAVEIEVMNALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSG  74

Query  406  SRYSDWLYTFFASKIEHVDRPKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSI  465
             R  D   T + +K   +   K  F      +      +A  +G A GE A LGQ+   +
Sbjct  75   GRVGDVFRTLWGTKSSAIYVNKPDFLGVWQASINPSNVRAMANGSASGEDANLGQLAACV  134

Query  466  ----AFNTVLGREQTYYFKEPG--YIFDMLTIRPVYFWTGIRPDYLEYRGPDYFNPIYND  519
                 F+   G +  YY KEPG   +  ML   P Y   G+ PD       D FNP  N 
Sbjct  135  DRYCDFSGHSGID--YYAKEPGTFMLITMLVPEPAYS-QGLHPDLASISFGDDFNPELNG  191

Query  520  IGYQDVPFWRI-----GYGWKAASESQS----------------MTVAKEPCYNEFRSSY  558
            IG+Q VP  R      G+ +    +  S                ++V +E  ++  R+ Y
Sbjct  192  IGFQLVPRHRFSMMPRGFNFTGLDQEASPWFGHTGTGVLVDPNMVSVGEEVAWSWLRTDY  251

Query  559  DEVLGSLQSTLTPKASVPLQSYWVQQRDF  587
              + G        +       YWV  R F
Sbjct  252  SRLHGDFAQNGNYQ-------YWVLTRRF  273


>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 52.4 bits (124),  Expect = 0.001, Method: Compositional matrix adjust.
 Identities = 67/273 (25%), Positives = 112/273 (41%), Gaps = 36/273 (13%)

Query  373  SNSDVDFTGVK--TIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKL  428
            SN  VD       TI  L  A +LQE+ +    +GSRY++ + +FF  K     + RP+ 
Sbjct  286  SNLGVDLKTASGSTINDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEF  345

Query  429  LFsssvmvnsqvvmnqAGQ-----SGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPG  483
            L  +   +    V+ Q+        G   G   ++G+ GG   F           F+E G
Sbjct  346  LGGNKTPILISEVLQQSSTDSTTPQGNMAGHGISVGKEGGFSKF-----------FEEHG  394

Query  484  YIFDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQS  542
            Y+  ++++ P   +  GI   + ++   DYF P +  IG Q V    I +          
Sbjct  395  YVIGLMSVIPKTSYSQGIPRHFSKFDKFDYFWPQFEHIGEQPVYNKEI-FAKNVGDYDSG  453

Query  543  MTVAKEPCYNEFRSSYDEVLGSLQSTLTPKASVPLQSYWVQQRDFYMIGLSSNPNEISPS  602
                  P Y+E++ S   + G  + TL          +W   R F     SS P +++  
Sbjct  454  GVFGYVPRYSEYKYSPSTIHGDFKDTLY---------FWHLGRIFD----SSAPPKLNRD  500

Query  603  MLFTNLSTVNNPFA-SDMEDNFFVNMSYKVVVK  634
             +  N S ++  FA  D  D F+ ++  K+  K
Sbjct  501  FIEVNKSGLSRIFAVEDNSDKFYCHLYQKITAK  533


>gi|444298000|dbj|GAC77839.1| major capsid protein [uncultured marine virus]
Length=480

 Score = 51.2 bits (121),  Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 61/278 (22%), Positives = 111/278 (40%), Gaps = 33/278 (12%)

Query  375  SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWL-YTFFASKIEHVDRPKLLFsss  433
            +D+      TI  +  A  +Q Y++     GSRY+++L Y     K   + RP+ +   +
Sbjct  228  ADLQAATGGTINDIRRAFAIQRYQEARSRYGSRYTEYLRYLGVNPKDARLQRPEYMGGGT  287

Query  434  vmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTY--YFKEPGYIFDMLTI  491
              +N   V+  + +    G +  +   +G          R   Y  Y +E GYI  ML++
Sbjct  288  TQINFSEVLQTSPE--IPGEDQVSQFGVGDMYGHGIAAMRSNKYRRYIEEHGYIISMLSV  345

Query  492  RPVYFWT-GIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQSMTVAKEPC  550
            RP   +T GI   +L     DY+      IG Q++    I     A +E    T      
Sbjct  346  RPKTMYTNGIHRSWLRLTKEDYYQKELEHIGQQEIMNNEIYADEGAGTE----TFGYNDR  401

Query  551  YNEFRSSYDEVLGSLQSTLTPKASVPLQSYWVQQRDFYMIGLSSNPNEISPSML---FTN  607
            Y+E+R +   V    +  L         +YW   R+F            +P +L   F +
Sbjct  402  YSEYRETPSHVSAEFRGIL---------NYWHMAREFE-----------APPVLNQSFVD  441

Query  608  LSTVNNPFASDMEDNFFVNMSYKVVVKNLVNKSFATRL  645
                        +D  ++ + +K+V + L++++ A R+
Sbjct  442  CDATKRIHNEQTQDALWIMIQHKMVARRLLSRNAAPRI  479


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 49.7 bits (117),  Expect = 0.006, Method: Compositional matrix adjust.
 Identities = 50/206 (24%), Positives = 90/206 (44%), Gaps = 12/206 (6%)

Query  368  MPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA--SKIEHVDR  425
            + P +   +VD  GV +I  L  +  LQ + +    SGSRY + + + F   S    + R
Sbjct  282  LEPDNFQVNVDELGV-SINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQR  340

Query  426  PKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYI  485
            P+ L      ++   V+  +     +    A +   G S   N    R    YF+E GYI
Sbjct  341  PQFLGGGRTPISVSEVLQTSATD--STSPQANMAGHGISAGVNHGFKR----YFEEHGYI  394

Query  486  FDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQSMT  544
              +++IRP   +  G+  D+ ++   D++ P +  +G Q++    +      AS +   T
Sbjct  395  IGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNG--T  452

Query  545  VAKEPCYNEFRSSYDEVLGSLQSTLT  570
                P Y E++ S +EV G  +  + 
Sbjct  453  FGYTPRYAEYKYSMNEVHGDFRGNMA  478


>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 
16608]
Length=531

 Score = 47.4 bits (111),  Expect = 0.029, Method: Compositional matrix adjust.
 Identities = 57/216 (26%), Positives = 92/216 (43%), Gaps = 39/216 (18%)

Query  447  QSGFAGG--EAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRPVYFWTG--IRP  502
            QS F  G  E+  LG +GG     ++      +  KE G I  + ++ P   + G    P
Sbjct  310  QSEFDRGADESPCLGDLGGK-GVGSLNSSSIDFDVKEHGIIMCIYSVVPQTEYNGTYFDP  368

Query  503  DYLEYRGPDYFNPIYNDIGYQ------------DVPF-------WRIGYGWKAAS-ESQS  542
               + R  D+F P + D+GYQ            D P         R+  G+  +S E+ +
Sbjct  369  FNRKLRREDFFQPEFADLGYQPVVTSDLISTYLDNPVPDGPEKQKRLAAGYPLSSIEANN  428

Query  543  MTVAKEPCYNEFRSSYDEVLGSLQSTLTPKASVPLQSYWVQQR-DFYMIGLSSNPNEI--  599
              +  +  YNE+++S D V G  +S L+        SYW   R DF   G + +   +  
Sbjct  429  RLLGWQVRYNEYKTSRDLVFGEFESGLS-------LSYWCSPRYDFGFDGKAGDKKLVNS  481

Query  600  --SPSMLFTNLSTVNNPF--ASDMEDNFFVNMSYKV  631
              SP+  + N S +N  F  ++   D+F VN  + V
Sbjct  482  PWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV  517


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 46.2 bits (108),  Expect = 0.078, Method: Compositional matrix adjust.
 Identities = 64/275 (23%), Positives = 111/275 (40%), Gaps = 25/275 (9%)

Query  312  AYPYDVQEPKVNWNNGSGTDTGTPSKVYFAATLNVPFLAAHPMA---VCPSSPDRFS---  365
            A P+  + P+V      G +     K  FAA     F    P++   V  S+P   S   
Sbjct  216  ALPWVQRGPEVTVPINGGGEIPVEMKEGFAAQKITTFPDRKPISGSEVLYSAPSVLSYGQ  275

Query  366  -------RLMPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA-  417
                    L+ P +   + D  GV  I  +  +  LQ + +    SGSRY + + + F  
Sbjct  276  IGSIKGQALIEPDNFVVNTDQMGV-NINDIRTSNALQRWFERNARSGSRYIEQILSHFGV  334

Query  418  -SKIEHVDRPKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQT  476
             S    + RP+ L      ++   V+  +  S  +    A +   G S   N    R   
Sbjct  335  RSSDARLQRPQFLGGGRTPISVSEVLQTS--STDSTSPQANMAGHGISAGVNHGFTR---  389

Query  477  YYFKEPGYIFDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWK  535
             YF+E GYI  +++IRP   +  G+  D+ ++   D++ P +  +G Q++    +     
Sbjct  390  -YFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNES  448

Query  536  AASESQSMTVAKEPCYNEFRSSYDEVLGSLQSTLT  570
             A+     T    P Y E++ S +EV G  +  + 
Sbjct  449  DAANEG--TFGYTPRYAEYKYSQNEVHGDFRGNMA  481


>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 45.8 bits (107),  Expect = 0.088, Method: Compositional matrix adjust.
 Identities = 61/262 (23%), Positives = 111/262 (42%), Gaps = 34/262 (13%)

Query  382  VKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFASKIE--HVDRPKLLFsssvmvnsq  439
            V T+  L  A +LQE+ +    +GSRY++ + +FF  K     + RP+ L  +   +   
Sbjct  288  VSTVNDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDGRLQRPEFLGGNKSPIMIS  347

Query  440  vvmnqAGQ-----SGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRP-  493
             V+ Q+        G   G    +G+ GG            + +F+E GY+  ++++ P 
Sbjct  348  EVLQQSATDSTTPQGNMAGHGIGIGKDGGF-----------SRFFEEHGYVIGLMSVIPK  396

Query  494  VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWKAASESQSMTVAKEPCYNE  553
              +  GI   + +    DYF P +  IG Q V    I      A +S+++     P Y+E
Sbjct  397  TSYSQGIPRHFSKSDKFDYFWPQFEHIGEQPVYNKEIFAKNIDAFDSEAV-FGYLPRYSE  455

Query  554  FRSSYDEVLGSLQSTLTPKASVPLQSYWVQQRDFYMIGLSSNPNEISPSMLFTNLSTVNN  613
            ++ S   V G  +  L          +W   R F     +  P  ++ S +  + + ++ 
Sbjct  456  YKFSPSTVHGDFKDDLY---------FWHLGRIFD----TDKPPVLNQSFIECDKNALSR  502

Query  614  PFA-SDMEDNFFVNMSYKVVVK  634
             FA  D  D F+ ++  K+  K
Sbjct  503  IFAVEDDTDKFYCHLYQKITAK  524


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 45.1 bits (105),  Expect = 0.13, Method: Compositional matrix adjust.
 Identities = 64/275 (23%), Positives = 111/275 (40%), Gaps = 25/275 (9%)

Query  312  AYPYDVQEPKVNWNNGSGTDTGTPSKVYFAATLNVPFLAAHPMA---VCPSSPDRFS---  365
            A P+  + P+V      G +     K  FAA     F    P++   V  S+P   S   
Sbjct  65   ALPWVQRGPEVTVPINGGGEIPVEMKEGFAAQKITTFPDRKPISGSEVLYSAPSVLSYGQ  124

Query  366  -------RLMPPGDSNSDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA-  417
                    L+ P +   + D  GV  I  +  +  LQ + +    SGSRY + + + F  
Sbjct  125  IGSIKGQALIEPDNFVVNTDQMGV-NINDIRTSNALQRWFERNARSGSRYIEQILSHFGV  183

Query  418  -SKIEHVDRPKLLFsssvmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQT  476
             S    + RP+ L      ++   V+  +  S  +    A +   G S   N    R   
Sbjct  184  RSSDARLQRPQFLGGGRTPISVSEVLQTS--STDSTSPQANMAGHGISAGVNHGFTR---  238

Query  477  YYFKEPGYIFDMLTIRP-VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWK  535
             YF+E GYI  +++IRP   +  G+  D+ ++   D++ P +  +G Q++    +     
Sbjct  239  -YFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNES  297

Query  536  AASESQSMTVAKEPCYNEFRSSYDEVLGSLQSTLT  570
             A  +   T    P Y E++ S +EV G  +  + 
Sbjct  298  DA--ANEGTFGYTPRYAEYKYSQNEVHGDFRGNMA  330


>gi|444298142|dbj|GAC77768.1| major capsid protein [uncultured marine virus]
Length=299

 Score = 44.7 bits (104),  Expect = 0.15, Method: Compositional matrix adjust.
 Identities = 39/153 (25%), Positives = 70/153 (46%), Gaps = 6/153 (4%)

Query  375  SDVDFTGVKTIPQLAVATRLQEYKDLIGASGSRYSDWL-YTFFASKIEHVDRPKLLFsss  433
            +D+   G   I  L  A  LQ Y++     G+R++++L Y   +S    + RP+++ +  
Sbjct  111  ADLSQAGAININDLREAFALQRYQEARNLYGARFTEYLRYLGISSSXGRLQRPEMISTGK  170

Query  434  vmvnsqvvmnqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRP  493
              +N   V+N  G SG    +   LG+MGG      V      Y+ +E G+I  ++++RP
Sbjct  171  SNINFSEVLNTTGPSGV---DDHPLGEMGGH-GIAGVKSNRARYFCEEHGHIISLMSVRP  226

Query  494  -VYFWTGIRPDYLEYRGPDYFNPIYNDIGYQDV  525
               + T     +      DY+      IG ++V
Sbjct  227  KTIYMTTQHKQFDRESKEDYWQKELQAIGMEEV  259


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 42.7 bits (99),  Expect = 0.46, Method: Compositional matrix adjust.
 Identities = 46/189 (24%), Positives = 83/189 (44%), Gaps = 13/189 (7%)

Query  385  IPQLAVATRLQEYKDLIGASGSRYSDWLYTFFA--SKIEHVDRPKLLFsssvmvnsqvvm  442
            I  +  +  LQ + +    SGSRY + + + F   S    + RP+ L      ++   V+
Sbjct  5    INDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSEVL  64

Query  443  nqAGQSGFAGGEAAALGQMGGSIAFNTVLGREQTYYFKEPGYIFDMLTIRP-VYFWTGIR  501
              +  S  +    A +   G S   N    R    YF+E GYI  +++IRP   +  G+ 
Sbjct  65   QTS--STDSTSPQANMAGHGISAGVNHGFTR----YFEEHGYIMGIMSIRPRTGYQQGVP  118

Query  502  PDYLEYRGPDYFNPIYNDIGYQDVPFWRIGYGWK-AASESQSMTVAKEPCYNEFRSSYDE  560
             D+ ++   D++ P +  +G Q++    +      AA+E    T    P Y E++ S +E
Sbjct  119  KDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEG---TFGYTPRYAEYKYSQNE  175

Query  561  VLGSLQSTL  569
            V G  +  +
Sbjct  176  VHGDFRGNM  184



Lambda      K        H        a         alpha
   0.320    0.136    0.425    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 4913515649712
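
Note on the statistics block: the bit scores in this report can be reproduced from the raw scores (the numbers in parentheses on each "Score =" line) and the gapped Lambda and K values above, using the standard Karlin-Altschul conversion S' = (Lambda * S - ln K) / ln 2. The sketch below is illustrative only and is not part of the BLAST output. The function name bit_score and the constant names are hypothetical, and the sketch assumes the gapped parameters printed above apply to every alignment, which composition-based adjustment may not strictly guarantee.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Gapped parameters copied from the statistics block of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410

# (raw score, reported bit score) pairs copied from three "Score =" lines.
reported = [(144, 60.1), (124, 52.4), (99, 42.7)]

for raw, bits in reported:
    computed = bit_score(raw, GAPPED_LAMBDA, GAPPED_K)
    print(f"raw={raw}  reported={bits}  computed={computed:.1f}")
```

The E-values are not recomputed here, since they also depend on length adjustments that this report does not list per alignment.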