bitscore colors: <40, 40-50, 50-80, 80-200, >200




           BLASTP 2.2.30+


Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.


Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.



Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters





Query= Contig-23_CDS_annotation_glimmer3.pl_2_3

Length=658
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value

gi|494308783|ref|WP_007173938.1|  hypothetical protein                71.2    1e-09
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  70.1    3e-09
gi|494306153|ref|WP_007173049.1|  hypothetical protein                68.9    5e-09
gi|492501782|ref|WP_005867318.1|  hypothetical protein                68.9    5e-09
gi|649557305|gb|KDS63784.1|  capsid family protein                    65.5    1e-08
gi|649569140|gb|KDS75238.1|  capsid family protein                    65.9    4e-08
gi|649555287|gb|KDS61824.1|  capsid family protein                    65.9    5e-08
gi|609718276|emb|CDN73650.1|  conserved hypothetical protein          65.1    1e-07
gi|517172762|ref|WP_018361580.1|  hypothetical protein                63.9    2e-07
gi|639237429|ref|WP_024568106.1|  hypothetical protein                60.8    2e-06


>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=553

 Score = 71.2 bits (173),  Expect = 1e-09, Method: Compositional matrix adjust.
 Identities = 57/200 (29%), Positives = 94/200 (47%), Gaps = 25/200 (13%)

Query  426  DTSGGSFTLDTLNLAKKVYTMLNRIAISDGSYNAWIQTVYTSGGLNHIETPI-------Y  478
            D+S G F++ +L  A  V  +L+    +  ++   ++  Y       +E P        Y
Sbjct  290  DSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYG------VEIPDSRDGRVNY  343

Query  479  LGGSSLEIEFQEVVNNSGT---EDQP----LGTLAGRGVATNHKGGNIVFKADEPGYLFC  531
            LGG   +++  +V   SGT   E +P    LG +AG+G  +    G IVF A E G L C
Sbjct  344  LGGFDSDMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMC  401

Query  532  ITSITPRVDYFQGNEWDMYLESLD--DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVG  589
            I S+ P++ Y      D  ++ LD  D   P+ + +G Q     +I++   +   N  +G
Sbjct  402  IYSLVPQIQY-DCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPVLG  460

Query  590  KQPAWIEYMTNVNKTYGNFA  609
             QP + EY T ++  +G FA
Sbjct  461  YQPRYSEYKTALDVNHGQFA  480


>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

 Score = 70.1 bits (170),  Expect = 3e-09, Method: Compositional matrix adjust.
 Identities = 57/185 (31%), Positives = 87/185 (47%), Gaps = 9/185 (5%)

Query  472  HIETPIYLGGSSLEIEFQEVVNNSGT-EDQPLGTLAGRGVATNHKGGNIVFKA--DEPGY  528
             ++ P +LGG  + I   EV+  S T E  P   +AG G++    G N  FK   +E GY
Sbjct  352  RLQRPQFLGGGRMPISVSEVLQTSSTDETSPQANMAGHGISA---GINNGFKHYFEEHGY  408

Query  529  LFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV  588
            +  I SITPR  Y QG   D       D + P+   +  Q+   + +  + D+   N T 
Sbjct  409  IIGIMSITPRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFVSEDAAYNNGTF  468

Query  589  GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDV  647
            G  P + EY  + ++ +G+F    N  +  LNRIF D  N  TT++     N +FA ++ 
Sbjct  469  GYTPRYAEYKYHPSEAHGDFR--GNLSFWHLNRIFEDKPNLNTTFVECKPSNRVFATSET  526

Query  648  TSLKL  652
               K 
Sbjct  527  EDDKF  531


 Score = 50.1 bits (118),  Expect = 0.004, Method: Compositional matrix adjust.
 Identities = 46/177 (26%), Positives = 77/177 (44%), Gaps = 19/177 (11%)

Query  19   RLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGSF  78
            R+    R+  +LS+  + ++  G LVP + M ++ GD F VKT       P V P+    
Sbjct  9    RMKRPRRNAFNLSYESKLTLNMGELVPIMCMPVVSGDKFRVKTESLVRLAPLVAPMMHRV  68

Query  79   KQQNDFFFCPIRL-YNAMLHNNALNIGLDMKKVKLPIVRIIASDLDLTKKMKGSNGTLKK  137
                 +FF P RL +N     + +  G+D +   +P+   I  + D    +  S   +K+
Sbjct  69   NVFTHYFFVPNRLVWNEW--EDFITKGVDGE--DMPMFPKIQINQD--SHLVSSASLIKE  122

Query  138  MIHPSSLVKTLGLSNLEKNNSQQWD------------WNAIPILAYFDIFKNYYANK  182
                SSL   LGL  L    ++ +D             +A+P  AY  I+  YY ++
Sbjct  123  YFGDSSLWDYLGLPTLSACGNKSYDVVNGVKVPSGFQVSALPFRAYQLIYNEYYRDQ  179


>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 
17361]
Length=519

 Score = 68.9 bits (167),  Expect = 5e-09, Method: Compositional matrix adjust.
 Identities = 60/234 (26%), Positives = 104/234 (44%), Gaps = 29/234 (12%)

Query  396  LKTYQSDVNTNWVNTEWLDGDSG----INSITAIDTSGGSFTLDTLNLAKKVYTMLNRIA  451
            +  +  D + N+   ++ D        +N    +D + G F++ +L  A  V  +L+   
Sbjct  222  IPEFSDDEHLNFDRDQYADQSKSNFTQLNFPVDVDNNLGYFSVSSLRSAFAVDKLLSVTM  281

Query  452  ISDGSYNAWIQTVYTSGGLNHIETPI-------YLGGSSLEIEFQEVVNNSGT---EDQP  501
             +  ++   ++  Y       +E P        YLGG   +++  +V   SGT   E +P
Sbjct  282  RAGKTFQDQMRAHYG------VEIPDSRDGRVNYLGGFDSDLQVSDVTQTSGTTATEYKP  335

Query  502  ----LGTLAGRGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMYLESLD--  555
                LG +AG+G  +    G IVF A E G L CI S+ P++ Y      D  ++ LD  
Sbjct  336  EAGYLGRIAGKGTGSGR--GRIVFDAKEHGVLMCIYSLVPQIQY-DCTRLDPMVDKLDRF  392

Query  556  DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFA  609
            D   P+ + +G Q     +I++       N  +G QP + EY T ++  +G FA
Sbjct  393  DFFTPEFENLGMQPLNSSYISSFCTPDPKNPVLGYQPRYSEYKTALDINHGQFA  446


>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis 
CL09T03C24]
Length=538

 Score = 68.9 bits (167),  Expect = 5e-09, Method: Compositional matrix adjust.
 Identities = 55/184 (30%), Positives = 85/184 (46%), Gaps = 9/184 (5%)

Query  472  HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKA--DEPGY  528
             ++ P +LGG    I   EV+  S T+   P   +AG G++    G N  FK   +E GY
Sbjct  337  RLQRPQFLGGGRTPISVSEVLQTSATDSTSPQANMAGHGISA---GVNHGFKRYFEEHGY  393

Query  529  LFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV  588
            +  I SI PR  Y QG   D       D + P+   +G Q+   + +         N T 
Sbjct  394  IIGIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQQTPASNNGTF  453

Query  589  GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDV  647
            G  P + EY  ++N+ +G+F    N  +  LNRIF +  N  TT++  +  N +FA  + 
Sbjct  454  GYTPRYAEYKYSMNEVHGDFR--GNMAFWHLNRIFSESPNLNTTFVECNPSNRVFATAET  511

Query  648  TSLK  651
            +  K
Sbjct  512  SDDK  515


 Score = 39.7 bits (91),  Expect = 8.1, Method: Compositional matrix adjust.
 Identities = 22/74 (30%), Positives = 33/74 (45%), Gaps = 0/74 (0%)

Query  18  TRLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGS  77
            +L    R+  +LS+  + +   G LVP +   ++PGD F V T       P V P+   
Sbjct  8   VKLKRPRRNVFNLSYENKLTANAGELVPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR  67

Query  78  FKQQNDFFFCPIRL  91
                 +FF P RL
Sbjct  68  VDVFTHYFFVPNRL  81


>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=245

 Score = 65.5 bits (158),  Expect = 1e-08, Method: Compositional matrix adjust.
 Identities = 52/181 (29%), Positives = 82/181 (45%), Gaps = 5/181 (3%)

Query  473  IETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLFC  531
            ++ P +LGG    I   EV+  S T+   P   +AG G++     G   +  +E GY+  
Sbjct  45   LQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIMG  103

Query  532  ITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGKQ  591
            I SI PR  Y QG   D       D + P+   +G Q+   + +  N        T G  
Sbjct  104  IMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGYT  163

Query  592  PAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGDI-NTYTTYIFPHLYNNIFADTDVTSL  650
            P + EY  + N+ +G+F    N  +  LNRIF +  N  TT++  +  N +FA  + +  
Sbjct  164  PRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSDD  221

Query  651  K  651
            K
Sbjct  222  K  222


>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 
3999B T(B) 6]
Length=390

 Score = 65.9 bits (159),  Expect = 4e-08, Method: Compositional matrix adjust.
 Identities = 52/182 (29%), Positives = 82/182 (45%), Gaps = 5/182 (3%)

Query  472  HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLF  530
             ++ P +LGG    I   EV+  S T+   P   +AG G++     G   +  +E GY+ 
Sbjct  189  RLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIM  247

Query  531  CITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGK  590
             I SI PR  Y QG   D       D + P+   +G Q+   + +  N        T G 
Sbjct  248  GIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGY  307

Query  591  QPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDVTS  649
             P + EY  + N+ +G+F    N  +  LNRIF +  N  TT++  +  N +FA  + + 
Sbjct  308  TPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSD  365

Query  650  LK  651
             K
Sbjct  366  DK  367


>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B 
T(B) 6]
Length=541

 Score = 65.9 bits (159),  Expect = 5e-08, Method: Compositional matrix adjust.
 Identities = 52/182 (29%), Positives = 82/182 (45%), Gaps = 5/182 (3%)

Query  472  HIETPIYLGGSSLEIEFQEVVNNSGTED-QPLGTLAGRGVATNHKGGNIVFKADEPGYLF  530
             ++ P +LGG    I   EV+  S T+   P   +AG G++     G   +  +E GY+ 
Sbjct  340  RLQRPQFLGGGRTPISVSEVLQTSSTDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIM  398

Query  531  CITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGFQDRLYKHINANTDSTEFNKTVGK  590
             I SI PR  Y QG   D       D + P+   +G Q+   + +  N        T G 
Sbjct  399  GIMSIRPRTGYQQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEELYLNESDAANEGTFGY  458

Query  591  QPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD-INTYTTYIFPHLYNNIFADTDVTS  649
             P + EY  + N+ +G+F    N  +  LNRIF +  N  TT++  +  N +FA  + + 
Sbjct  459  TPRYAEYKYSQNEVHGDFR--GNMAFWHLNRIFKEKPNLNTTFVECNPSNRVFATAETSD  516

Query  650  LK  651
             K
Sbjct  517  DK  518


 Score = 40.0 bits (92),  Expect = 5.7, Method: Compositional matrix adjust.
 Identities = 21/74 (28%), Positives = 34/74 (46%), Gaps = 0/74 (0%)

Query  18  TRLNNYNRSTHDLSFVMRTSMAPGVLVPTLKMLMLPGDTFPVKTRCHTLTHPTVGPLFGS  77
            +L    R+  +LS+  + ++  G L+P +   ++PGD F V T       P V P+   
Sbjct  8   VKLKRPRRNVFNLSYENKLTVNAGELIPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHR  67

Query  78  FKQQNDFFFCPIRL  91
                 +FF P RL
Sbjct  68  VDVFTHYFFVPNRL  81


>gi|609718276|emb|CDN73650.1| conserved hypothetical protein [Elizabethkingia anophelis]
Length=537

 Score = 65.1 bits (157),  Expect = 1e-07, Method: Compositional matrix adjust.
 Identities = 72/254 (28%), Positives = 111/254 (44%), Gaps = 23/254 (9%)

Query  377  EDISGRHIPNCAYPMVGLALKTYQSDVNTNW--VNTEWLDGDSGINSITAIDTSGGSFTL  434
            +D++G   PN          K  +SDVN N   V+ + L  D   N    + +   S T+
Sbjct  243  KDMAGNPAPN----------KDLRSDVNGNLQDVSGQPLSLDPSKNLKLNMASENVS-TV  291

Query  435  DTLNLAKKVYTMLNRIAISDGSYNAWIQTVY---TSGGLNHIETPIYLGGSSLEIEFQEV  491
            + L  A K+   L + A +   Y   I + +   TS G   ++ P +LGG+   I   EV
Sbjct  292  NDLRRAFKLQEWLEKNARAGSRYAESILSFFGVKTSDG--RLQRPEFLGGNKSPIMISEV  349

Query  492  VNNSGTEDQ-PLGTLAGRGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMY  550
            +  S T+   P G +AG G+     GG   F  +E GY+  + S+ P+  Y QG      
Sbjct  350  LQQSATDSTTPQGNMAGHGIGIGKDGGFSRF-FEEHGYVIGLMSVIPKTSYSQGIPRHFS  408

Query  551  LESLDDLHKPQLDGIGFQDRLYKHINA-NTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFA  609
                 D   PQ + IG Q    K I A N D+ +     G  P + EY  + +  +G+F 
Sbjct  409  KSDKFDYFWPQFEHIGEQPVYNKEIFAKNIDAFDSEAVFGYLPRYSEYKFSPSTVHGDFK  468

Query  610  LVENEGWMCLNRIF  623
              ++  +  L RIF
Sbjct  469  --DDLYFWHLGRIF  480


>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

 Score = 63.9 bits (154),  Expect = 2e-07, Method: Compositional matrix adjust.
 Identities = 53/185 (29%), Positives = 83/185 (45%), Gaps = 20/185 (11%)

Query  478  YLGGSSLEIEFQEVVNNSGT-----EDQPLGTLAGR--GVATNHKGGNIVFKADEPGYLF  530
            Y+GG    I+  +V  +SGT     +D   G   GR  G AT    G+I F A E G L 
Sbjct  351  YIGGFDSNIQVGDVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEHGILM  410

Query  531  CITSITPRVDYFQGNEWDMYLESLD--DLHKPQLDGIGFQDRLYKHINANTDSTEFNKTV  588
            CI S+ P V Y      D +++ ++  D   P+ + +G Q    K+I+   ++   N  +
Sbjct  411  CIYSLVPDVQY-DSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANSRI  469

Query  589  ------GKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIFGD----INTYTTYIFPHLY  638
                  G QP + EY T ++  +G F   E   +  + R  G+     N  T  I P   
Sbjct  470  KNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKINPKWL  529

Query  639  NNIFA  643
            +++FA
Sbjct  530  DDVFA  534


>gi|639237429|ref|WP_024568106.1| hypothetical protein [Elizabethkingia anophelis]
Length=546

 Score = 60.8 bits (146),  Expect = 2e-06, Method: Compositional matrix adjust.
 Identities = 67/237 (28%), Positives = 104/237 (44%), Gaps = 16/237 (7%)

Query  392  VGLALKTYQSDVNTNWVNTEWLDGDSGINSITAIDTSGGSFTLDTLNLAKKVYTMLNRIA  451
            +G  +    S  N+N VN      D+  N    + T+ GS T++ L  A K+   L + A
Sbjct  264  IGHLMVETSSTGNSNPVNI-----DNSSNLGVDLKTASGS-TINDLRRAFKLQEWLEKNA  317

Query  452  ISDGSYNAWIQTVY---TSGGLNHIETPIYLGGSSLEIEFQEVVNNSGTEDQ-PLGTLAG  507
             +   Y   I + +   TS G   ++ P +LGG+   I   EV+  S T+   P G +AG
Sbjct  318  RAGSRYAESILSFFGVKTSDG--RLQRPEFLGGNKTPILISEVLQQSSTDSTTPQGNMAG  375

Query  508  RGVATNHKGGNIVFKADEPGYLFCITSITPRVDYFQGNEWDMYLESLDDLHKPQLDGIGF  567
             G++   +GG   F  +E GY+  + S+ P+  Y QG           D   PQ + IG 
Sbjct  376  HGISVGKEGGFSKF-FEEHGYVIGLMSVIPKTSYSQGIPRHFSKFDKFDYFWPQFEHIGE  434

Query  568  QDRLYKHINA-NTDSTEFNKTVGKQPAWIEYMTNVNKTYGNFALVENEGWMCLNRIF  623
            Q    K I A N    +     G  P + EY  + +  +G+F   +   +  L RIF
Sbjct  435  QPVYNKEIFAKNVGDYDSGGVFGYVPRYSEYKYSPSTIHGDFK--DTLYFWHLGRIF  489



Lambda      K        H        a         alpha
   0.317    0.135    0.400    0.792     4.96 

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6 

Effective search space used: 5013181281552