bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-46_CDS_annotation_glimmer3.pl_2_2

Length=461
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|649557305|gb|KDS63784.1|  capsid family protein                    87.4    4e-16
gi|492501782|ref|WP_005867318.1|  hypothetical protein                89.7    7e-16
gi|649569140|gb|KDS75238.1|  capsid family protein                    86.7    3e-15
gi|649555287|gb|KDS61824.1|  capsid family protein                    87.0    5e-15
gi|494610271|ref|WP_007368517.1|  capsid protein                      87.0    5e-15
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  85.9    1e-14
gi|647452987|ref|WP_025792807.1|  hypothetical protein                70.5    1e-09
gi|565841287|ref|WP_023924568.1|  hypothetical protein                65.5    4e-08
gi|494308783|ref|WP_007173938.1|  hypothetical protein                61.2    8e-07
gi|599087863|gb|AHN52857.1|  major capsid protein                     57.8    2e-06


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 87.4 bits (215),  Expect = 4e-16, Method: Compositional matrix adjust.
 Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)

Query  239  ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL  296
            + +G+RS+    + P F GG ++ I+  E++  S+T+   P   +AG G++       G 
Sbjct  34   SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF  91

Query  297  KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw  355
                 E   IM + SI PR  Y QG  K + +  NMD F+ P    +G QE+  EE    
Sbjct  92   TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN  150

Query  356  ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN  415
             ++          + G  P + EY    NE +G+F   M  AF  LNR+++E  +     
Sbjct  151  ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN---  200

Query  416  ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL  461
             +T+++    N +FA +  S   +WVQ+  D+ A R+M     P L
Sbjct  201  -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML  245


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 89.7 bits (221),  Expect = 7e-16, Method: Compositional matrix adjust.
 Identities = 75/264 (28%), Positives = 124/264 (47%), Gaps = 17/264 (6%)

Query  201  VDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIFCGGMQ  259
            V+V +  ++++ L     +     R A +   Y     + +G+RS+    + P F GG +
Sbjct  289  VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR  348

Query  260  SEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDY  318
            + I+  E++  SAT+   P   +AG G++       G K    E   I+ + SI PR  Y
Sbjct  349  TPISVSEVLQTSATDSTSPQANMAGHGISA--GVNHGFKRYFEEHGYIIGIMSIRPRTGY  406

Query  319  SQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGKQPSWI  377
             QG  K + +  NMD F+ P    +G QE+  EE     T  + N      + G  P + 
Sbjct  407  QQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEEVYLQQTPASNN-----GTFGYTPRYA  460

Query  378  EYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAESRLSSQ  437
            EY   +NE +G+F   M  AF  LNR++ E+ +      +T+++    N +FA +  S  
Sbjct  461  EYKYSMNEVHGDFRGNM--AFWHLNRIFSESPNLN----TTFVECNPSNRVFATAETSDD  514

Query  438  NFWVQVAFDVTARRVMSAKQIPNL  461
             +W+Q+  DV A R+M     P L
Sbjct  515  KYWIQLYQDVKALRLMPKYGTPML  538


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 86.7 bits (213),  Expect = 3e-15, Method: Compositional matrix adjust.
 Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)

Query  239  ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL  296
            + +G+RS+    + P F GG ++ I+  E++  S+T+   P   +AG G++       G 
Sbjct  179  SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF  236

Query  297  KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw  355
                 E   IM + SI PR  Y QG  K + +  NMD F+ P    +G QE+  EE    
Sbjct  237  TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN  295

Query  356  ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN  415
             ++          + G  P + EY    NE +G+F   M  AF  LNR+++E  +     
Sbjct  296  ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN---  345

Query  416  ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL  461
             +T+++    N +FA +  S   +WVQ+  D+ A R+M     P L
Sbjct  346  -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML  390


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 87.0 bits (214),  Expect = 5e-15, Method: Compositional matrix adjust.
 Identities = 66/226 (29%), Positives = 107/226 (47%), Gaps = 17/226 (8%)

Query  239  ATYGIRSATLP-ESPIFCGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGL  296
            + +G+RS+    + P F GG ++ I+  E++  S+T+   P   +AG G++       G 
Sbjct  330  SHFGVRSSDARLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISA--GVNHGF  387

Query  297  KIKCTEPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaaw  355
                 E   IM + SI PR  Y QG  K + +  NMD F+ P    +G QE+  EE    
Sbjct  388  TRYFEEHGYIMGIMSIRPRTGYQQGVPKDFRKFDNMD-FYFPEFAHLGEQEIKNEELYLN  446

Query  356  ntettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGN  415
             ++          + G  P + EY    NE +G+F   M  AF  LNR+++E  +     
Sbjct  447  ESDAANE-----GTFGYTPRYAEYKYSQNEVHGDFRGNM--AFWHLNRIFKEKPNLN---  496

Query  416  ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDVTARRVMSAKQIPNL  461
             +T+++    N +FA +  S   +WVQ+  D+ A R+M     P L
Sbjct  497  -TTFVECNPSNRVFATAETSDDKYWVQIYQDIKALRLMPKYGTPML  541


>gi|494610271|ref|WP_007368517.1| capsid protein [Prevotella multiformis]
 gi|324988543|gb|EGC20506.1| putative capsid protein (F protein) [Prevotella multiformis DSM 
16608]
Length=531

 Score = 87.0 bits (214),  Expect = 5e-15, Method: Compositional matrix adjust.
 Identities = 83/314 (26%), Positives = 138/314 (44%), Gaps = 55/314 (18%)

Query  188  IDGTTGGINAITAVDVTD--GKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS  245
            + G +  IN ++ + V D      +D ++   +  N L+        Y +  EA +G R 
Sbjct  233  VSGASTFINGVSVLSVNDLRAAFALDKMLEATRRANGLD--------YSSQIEAHFGFR-  283

Query  246  ATLPESPI----FCGGMQSEIAFDEIVSNS-----ATEEEPLGTLAGRGVATMYKSGRGL  296
              +PES      F GG  + +   E+V+ S     A E   LG L G+GV ++  S    
Sbjct  284  --VPESRAGDARFIGGFDNPVVISEVVNQSEFDRGADESPCLGDLGGKGVGSLNSSSIDF  341

Query  297  KIKCTEPSMIMALGSITPRIDYSQGNKW--WTRLQNMDDFHKPTLDAIGFQ-----ELIt  349
             +K  E  +IM + S+ P+ +Y+ G  +  + R    +DF +P    +G+Q     +LI+
Sbjct  342  DVK--EHGIIMCIYSVVPQTEYN-GTYFDPFNRKLRREDFFQPEFADLGYQPVVTSDLIS  398

Query  350  eeaaawntettenYKHIYSS------------LGKQPSWIEYTTDVNETYGEFAAGMPLA  397
                    +  E  K + +             LG Q  + EY T  +  +GEF +G+ L+
Sbjct  399  TYLDNPVPDGPEKQKRLAAGYPLSSIEANNRLLGWQVRYNEYKTSRDLVFGEFESGLSLS  458

Query  398  FMCLNRVYEENTDHTIGN----------ASTYIDPTIYNSIFAESRLSSQNFWVQVAFDV  447
            + C  R Y+   D   G+          A  Y++P+I N+IF  S + + +F V   FDV
Sbjct  459  YWCSPR-YDFGFDGKAGDKKLVNSPWSPAHFYVNPSILNTIFLVSAVKADHFLVNSFFDV  517

Query  448  TARRVMSAKQIPNL  461
             A R MS   +  L
Sbjct  518  KAVRPMSVSGLAGL  531


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 85.9 bits (211),  Expect = 1e-14, Method: Compositional matrix adjust.
 Identities = 73/269 (27%), Positives = 121/269 (45%), Gaps = 17/269 (6%)

Query  196  NAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLP-ESPIF  254
            N    V+V +  + ++ L     +     R A     Y     + +G+RS+    + P F
Sbjct  299  NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQF  358

Query  255  CGGMQSEIAFDEIVSNSATEE-EPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMALGSIT  313
             GG +  I+  E++  S+T+E  P   +AG G++    +G   K    E   I+ + SIT
Sbjct  359  LGGGRMPISVSEVLQTSSTDETSPQANMAGHGISAGINNG--FKHYFEEHGYIIGIMSIT  416

Query  314  PRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIYSSLGK  372
            PR  Y QG  + +T+  NMD F+ P    +  QE+   +           Y +   + G 
Sbjct  417  PRSGYQQGVPRDFTKFDNMD-FYFPEFAHLSEQEI---KNQELFVSEDAAYNN--GTFGY  470

Query  373  QPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNASTYIDPTIYNSIFAES  432
             P + EY    +E +G+F     L+F  LNR++E+  +      +T+++    N +FA S
Sbjct  471  TPRYAEYKYHPSEAHGDFRGN--LSFWHLNRIFEDKPNLN----TTFVECKPSNRVFATS  524

Query  433  RLSSQNFWVQVAFDVTARRVMSAKQIPNL  461
                  FWVQ+  DV A R+M     P L
Sbjct  525  ETEDDKFWVQMYQDVKALRLMPKYGTPML  553


>gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola]
Length=584

 Score = 70.5 bits (171),  Expect = 1e-09, Method: Compositional matrix adjust.
 Identities = 74/270 (27%), Positives = 118/270 (44%), Gaps = 50/270 (19%)

Query  233  YQAWREATYGIRSATLPESPI----FCGGMQSEIAFDEIVS---NSATE--EEPLGTLAG  283
            Y +  EA +G +   +PES      F GG  + I   E+VS   N+A++     +G L G
Sbjct  324  YASQIEAHFGFK---VPESRANDARFLGGFDNSIVVSEVVSTNGNAASDGSHASIGDLGG  380

Query  284  RGVATMYKSGRGLKIKCTEPSMIMALGSITPRIDYSQG-----NKWWTRLQNMDDFHKPT  338
            +G+ +M  S   ++   TE  +IM + S+ P+ +Y+       N+  TR Q    F++P 
Sbjct  381  KGIGSM--SSGTIEFDSTEHGIIMCIYSVAPQSEYNASYLDPFNRKLTREQ----FYQPE  434

Query  339  LDAIGFQELIte---eaaawntettenYKHIYSS---LGKQPSWIEYTTDVNETYGEFAA  392
               +G+Q LI      +     E    +  I  +   LG Q  + EY T  +  +G+F +
Sbjct  435  FADLGYQALIGSDLICSTLGMNEKQAGFSDIELNNNLLGYQVRYNEYKTARDLVFGDFES  494

Query  393  GMPLAFMCLNR-----------VYEENTD----HTIGNAST------YIDPTIYNSIFAE  431
            G  L++ C  R           +  EN         GN S       YI+P + N IF  
Sbjct  495  GKSLSYWCTPRFDFGYGDTEKKIAPENKGGADYRKKGNRSHWSSRNFYINPNLVNPIFLT  554

Query  432  SRLSSQNFWVQVAFDVTARRVMSAKQIPNL  461
            S + + +F V    DV A R MS   + +L
Sbjct  555  SAVQADHFIVNSFLDVKAVRPMSVTGLSSL  584


>gi|565841287|ref|WP_023924568.1| hypothetical protein [Prevotella nigrescens]
 gi|564729907|gb|ETD29851.1| hypothetical protein HMPREF1173_00033 [Prevotella nigrescens 
CC14M]
Length=656

 Score = 65.5 bits (158),  Expect = 4e-08, Method: Compositional matrix adjust.
 Identities = 83/338 (25%), Positives = 138/338 (41%), Gaps = 49/338 (14%)

Query  144  GTIELPNYNRTKVYRSSNAWFSQAGLAVKTYLSDRFNN---WLNTEWIDGTTGGINAITA  200
            G  ELP+Y                G A      D  NN    L  + +D  + G N I+ 
Sbjct  342  GIFELPDYIN-----------GNTGFATTEVKRDVVNNRGSQLEIKSMDAGSLGSNNISY  390

Query  201  VDVTDGKLTMDALILQKKIFNMLNRVAITDGT-YQAWREATYGIRSATLPES----PIFC  255
            +   D    + A+   +K   ML R    +G  Y     A +G +   +PES      F 
Sbjct  391  ISPND----IRAMFALEK---MLERTRAANGLDYSNQIAAHFGFK---VPESRKNCASFI  440

Query  256  GGMQSEIAFDEIVSNS-------ATEEEPLGTLAGRGVATMYKSGRGLKIKCTEPSMIMA  308
            GG  ++I+  E+V+ S       A+    +G + G+G+  M  SG  +     E  +IM 
Sbjct  441  GGFDNQISISEVVTTSNGSVDGTASTGSVVGQVFGKGIGAM-NSGH-ISYDVKEHGLIMC  498

Query  309  LGSITPRIDY-SQGNKWWTRLQNMDDFHKPTLDAIGFQELIteeaaawntettenYKHIY  367
            + SI P++DY ++    + R  + +D+ +P  + +G Q +I  +          +    +
Sbjct  499  IYSIAPQVDYDARELDPFNRKFSREDYFQPEFENLGMQPVIQSDLCLCINSAKSDSSDQH  558

Query  368  SS-LGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNAS---TYIDPT  423
            ++ LG    ++EY T  +  +GEF +G  L+     +    N     G  S     +DP 
Sbjct  559  NNVLGYSARYLEYKTARDIIFGEFMSGGSLSAWATPK---NNYTFEFGKLSLPDLLVDPK  615

Query  424  IYNSIFA---ESRLSSQNFWVQVAFDVTARRVMSAKQI  458
            +   IFA      +S+  F V   FDV A R M    +
Sbjct  616  VLEPIFAVKYNGSMSTDQFLVNSYFDVKAIRPMQVNDM  653


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score = 61.2 bits (147),  Expect = 8e-07, Method: Compositional matrix adjust.
 Identities = 67/253 (26%), Positives = 111/253 (44%), Gaps = 35/253 (14%)

Query  195  INAITAVDVTDGKLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRSATLPESPI-  253
            +N     D ++G  ++ +L     +  +L+       T+Q    A YG+    +P+S   
Sbjct  283  VNFGVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVE---IPDSRDG  339

Query  254  ---FCGGMQSEIAFDEIVSNS---ATEEEP----LGTLAGRGVATMYKSGRG-LKIKCTE  302
               + GG  S++   ++   S   ATE +P    LG +AG+G      SGRG +     E
Sbjct  340  RVNYLGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTG----SGRGRIVFDAKE  395

Query  303  PSMIMALGSITPRIDYSQGNKWWTRLQNMDD------FHKPTLDAIGFQELIteeaaawn  356
              ++M + S+ P+I Y       TRL  M D      +  P  + +G Q L +   +++ 
Sbjct  396  HGVLMCIYSLVPQIQYD-----CTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFC  450

Query  357  tettenYKHIYSSLGKQPSWIEYTTDVNETYGEFAAGMPLAFMCLNRVYEENTDHTIGNA  416
            T   +N       LG QP + EY T ++  +G+FA    L+   ++R     T   +  A
Sbjct  451  TTDPKN-----PVLGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSRFRRWTTFPQLEIA  505

Query  417  STYIDPTIYNSIF  429
               IDP   NSIF
Sbjct  506  DFKIDPGCLNSIF  518


>gi|599087863|gb|AHN52857.1| major capsid protein, partial [uncultured Gokushovirinae]
Length=219

 Score = 57.8 bits (138),  Expect = 2e-06, Method: Compositional matrix adjust.
 Identities = 48/168 (29%), Positives = 81/168 (48%), Gaps = 11/168 (7%)

Query  188  IDGTTGGINAITAVDVTDG-KLTMDALILQKKIFNMLNRVAITDGTYQAWREATYGIRS-  245
            + G T  ++ +   D+T+    T++ L    +I  ML R A     Y    ++ +G+ S 
Sbjct  51   VSGDTSAVSNVMYADLTEATAATINQLRQAFQIQKMLERDARGGTRYTEIIKSHFGVTSP  110

Query  246  -ATLPESPIFCGGMQSEIAFDEIVSNSATEEE---PLGTLAGRGVATMYKSGRGLKIKCT  301
             A L + P + GG  + +  + +   S T+++   P GTLA  G A +   G G     T
Sbjct  111  DARL-QRPEYLGGGSTPVIINPVAQTSGTDQQSDTPQGTLAAIGTAQV--RGHGFTKSFT  167

Query  302  EPSMIMALGSITPRIDYSQG-NKWWTRLQNMDDFHKPTLDAIGFQELI  348
            E  +I+ L S+   + Y QG N+ W R Q   D++ P L  +G QE++
Sbjct  168  EHCIILGLVSVRADLTYQQGLNRMWNR-QTRYDYYFPALSHLGEQEIL  214



Lambda      K        H        a         alpha
   0.316    0.131    0.388    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 3125101418307