bitscore colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF
excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-7_CDS_annotation_glimmer3.pl_2_4

Length=568

                                                                   Score   E
Sequences producing significant alignments:                        (Bits)  Value

gi|547226430|ref|WP_021963493.1| putative uncharacterized protein  338     2e-104
gi|575094354|emb|CDL65742.1| unnamed protein product               325     4e-99
gi|490418709|ref|WP_004291032.1| hypothetical protein              318     4e-97
gi|496050829|ref|WP_008775336.1| hypothetical protein              297     7e-89
gi|494822885|ref|WP_007558293.1| hypothetical protein              277     5e-81
gi|575094321|emb|CDL65708.1| unnamed protein product               214     2e-57
gi|575094339|emb|CDL65730.1| unnamed protein product               181     6e-46
gi|517172762|ref|WP_018361580.1| hypothetical protein              153     8e-37
gi|647452987|ref|WP_025792807.1| hypothetical protein              150     6e-36
gi|494308783|ref|WP_007173938.1| hypothetical protein              150     1e-35

>gi|547226430|ref|WP_021963493.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
gi|524103382|emb|CCY83994.1| putative uncharacterized protein [Prevotella sp. CAG:1185]
Length=573

 Score = 338 bits (866), Expect = 2e-104, Method: Compositional matrix adjust.
Identities = 205/567 (36%), Positives = 305/567 (54%), Gaps = 76/567 (13%) Query 1 MAGLFSYGDIKNKPRRSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQP 60 M+ + S +KN +R+GFDL KNAFTAKVGELLP+ K PGDKF I + FTRTQP Sbjct 1 MSSVMSLTALKNSVKRNGFDLSFKNAFTAKVGELLPIMCKEVYPGDKFNIRGQAFTRTQP 60 Query 61 VDTSAFTRIREYYEWFFVPLHLMYRNSNEAIMSMENQPNYAASGTQSITFNRKLPWVDLL 120 V+++A++R+REYY+++FVP L++ + +M + P++AA S+ +++ PW Sbjct 61 VNSAAYSRLREYYDFYFVPYRLLWNMAPTFFTNMPD-PHHAADLVSSVNLSQRHPWFTFF 119 Query 121 TLNNAVENVKA-----STYHDNMFGFSRALGFAKLYNYLGVG---QFDPSKTLA---NLR 169 + + N+ + Y N FGFSR KL NYL G ++ K + ++ Sbjct 120 DIMEYLGNLNSLSGAYEKYQKNFFGFSRVELSVKLLNYLNYGFGKDYESVKVPSDSDDIV 179 Query 170 ISVFPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNGEDT---TPVASLKDLFDTNPNDS 226 +S FP AYQKI DY+R+ QW+ P+ YN D+ G+ + P++S + D N + Sbjct 180 LSPFPLLAYQKICEDYFRDDQWQSAAPYRYNLDYLYGKSSGFHIPMSSFTN--DAFKNPT 237 Query 227 VFELRYANWNKDLYMGAMPNAQFGDVAFVPVDSSGKLPVSLPSIEVGGVAPIYNTGAGGV 286 +F+L Y N+ KD + G +P AQ+GDV+ +PI+ Sbjct 238 MFDLNYCNFQKDYFTGMLPRAQYGDVSV--------------------ASPIFG------ 271 Query 287 QPDAQIGLRGAVT--GAPDNGQTVTAYGADKTDAARPYFYAVPDGSVAHLKTNAKTIQVP 344 D IG ++T AP G G V + N+ T Sbjct 272 --DLDIGDSSSLTFASAPQQGANTIQSG------------------VLVVNNNSNT---- 307 Query 345 YEFSSKFDVLQLRAAECLQKWKEIAQANGQNYASQVKAHFGVSPNPMTSHRCQRVCGFDG 404 ++ VL LR AECLQKW+EIAQ+ +Y +Q++ HF VSP+ S C+ + G+ Sbjct 308 ---TAGLSVLALRQAECLQKWREIAQSGKMDYQTQMQKHFNVSPSATLSGHCKYLGGWTS 364 Query 405 SIDISAVENTNLSSD-EAIIRGKGIGGYRVNKPETFKTTEHGVLMCIYHAVPLLDYAPTG 463 ++DIS V NTNL+ D +A I+GKG G NK + F+++EHG++MCIYH +PLLD++ Sbjct 365 NLDISEVVNTNLTGDNQADIQGKGTGTLNGNKVD-FESSEHGIIMCIYHCLPLLDWSINR 423 Query 464 PDLQFMTTVDGDSWPVPELDSVGFEEL-PSYSLLNTSDVQPIKEPRPFGYVPRYISWKTS 522 Q T D + +PE DSVG ++L PS + D+ GYVPRY KTS Sbjct 424 IARQNFKTTFTD-YAIPEFDSVGMQQLYPSEMIFGLEDLPSDPSSINMGYVPRYADLKTS 482 Query 523 VDVVRGAFIDTLKSWTAPIGEDYMKIY 549 +D + G+FIDTL SW +P+ + Y+ Y Sbjct 483 IDEIHGSFIDTLVSWVSPLTDSYISAY 509 >gi|575094354|emb|CDL65742.1| unnamed protein product [uncultured bacterium] 
Length=615 Score = 325 bits (832), Expect = 4e-99, Method: Compositional matrix adjust. Identities = 210/575 (37%), Positives = 310/575 (54%), Gaps = 68/575 (12%) Query 5 FSYGDIKNKPRRSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQPVDTS 64 S DIKN+P R+GFDL K FTAK GELLPV K LPGD F I+ FTRTQP++TS Sbjct 1 MSMADIKNRPSRNGFDLSFKKNFTAKAGELLPVMTKVVLPGDSFNINLRSFTRTQPLNTS 60 Query 65 AFTRIREYYEWFFVPLHLMYRNSNEAIMSMENQPNYAASGT--QSITFNRKLPWVDLLTL 122 AF R+REYY+++FVP M+ + I M +A+ T + + ++P+ + Sbjct 61 AFARMREYYDFYFVPFEQMWNKFDSCITQMNANVQHASGPTLDDNTPLSGRMPYFTSEQI 120 Query 123 NNAVENVKASTYHDNMFGFSRALGFAKLYNYLGVGQFDP--SKT--------LANLRISV 172 + + N +A+ N FGF+R+ KL YLG G ++ S+T L NL +S Sbjct 121 ADYL-NDQATAARKNPFGFNRSTLTCKLLQYLGYGDYNSFDSETNTWSAKPLLYNLELSP 179 Query 173 FPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNGEDTTPVASLKDLFDTNPND--SVFEL 230 FP AYQKIY+D+YR +QWE P T+N D+ G + L+ P+D + F++ Sbjct 180 FPLLAYQKIYSDFYRYTQWEKTNPSTFNLDYIKG-----TSDLQMDLTGLPSDDNNFFDI 234 Query 231 RYANWNKDLYMGAMPNAQFGDVAFVPVDSSGKLPVSLPSIEVGGVAPIYNT--------G 282 RY N+ KD++ G +P AQ+G + VP++ G+L V I G PI+ T G Sbjct 235 RYCNYQKDMFHGVLPVAQYGSASVVPIN--GQLNV----ISNGDSGPIFKTSTPDPGTPG 288 Query 283 AGGVQPDAQIGLRG---AVTGAPDN-GQTV--TAYGADKTDAARPYFYAVPDGSVAHLKT 336 V IG+ V+G+ N G++ + YG + R + P+ + Sbjct 289 TSYVTVGGNIGVDNRSFGVSGSTLNVGKSADPSGYGFPSNASTRSLLWENPN----LIIE 344 Query 337 NAKTIQVPYEFSSKFDVLQLRAAECLQKWKEIAQANGQNYASQVKAHFGVSPNPMTSHRC 396 N + VP +L LR AE LQKWKE++ + ++Y SQ++ H+G+ + SH+ Sbjct 345 NNQGFYVP--------ILALRQAEFLQKWKEVSVSGEEDYKSQIEKHWGIKVSDFLSHQA 396 Query 397 QRVCGFDGSIDISAVENTNLSSDEAI-IRGKGIGGYRVNKPETFKTT-EHGVLMCIYHAV 454 + + G S+DI+ V N N++ D A I GKG + N F++ E+G++MCIYH + Sbjct 397 RYLGGCATSLDINEVINNNITGDNAADIAGKGT--FTGNGSIRFESKGEYGIIMCIYHVL 454 Query 455 PLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRP----- 509 P++DY +G D T VD S+P+PELD +G E +P +N P+KE Sbjct 455 PIVDYVGSGVD-HSCTLVDATSFPIPELDQIGMESVPLVRAMN-----PVKESDTPSADT 508 Query 510 -FGYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGE 543 GY PRYI WKTSVD G F D+L++W P+G+ Sbjct 509 
FLGYAPRYIDWKTSVDRSVGDFADSLRTWCLPVGD 543 >gi|490418709|ref|WP_004291032.1| hypothetical protein [Bacteroides eggerthii] gi|217986636|gb|EEC52970.1| putative capsid protein (F protein) [Bacteroides eggerthii DSM 20697] Length=578 Score = 318 bits (816), Expect = 4e-97, Method: Compositional matrix adjust. Identities = 209/560 (37%), Positives = 285/560 (51%), Gaps = 79/560 (14%) Query 1 MAGLFSYGDIKNKPRRSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQP 60 MA + S I+NKP R+GFDL K FTAK GELLPV K LPGD FKI+ + FTRTQP Sbjct 1 MANIMSLKSIRNKPSRNGFDLSFKKNFTAKAGELLPVMVKEVLPGDTFKINLKAFTRTQP 60 Query 61 VDTSAFTRIREYYEWFFVPLHLMYRNSNEAIMSMENQPNYAAS--GTQSITFNRKLPWVD 118 V+T+AF RIREYY++FFVP L++ +N + M + P +A S T++ + ++P++ Sbjct 61 VNTAAFARIREYYDFFFVPYDLLWNKANTVLTQMYDNPQHAVSIDPTRNFVLSGEMPYMT 120 Query 119 ---LLTLNNAVENVKA-STYHDNMFGFSRALGFAKLYNYLGVGQFDPSKT--------LA 166 + + NA+ A + Y N FG++R+ KL YLG G ++ T +A Sbjct 121 SEAIASYINALSTASALADYKSNYFGYNRSKSSVKLLEYLGYGNYESFLTDDWNTAPLMA 180 Query 167 NLRISVFPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNGEDTTPVASLKDLFDTNPNDS 226 NL ++F AYQKIY+D+YR+SQWE P T+N D+ +G + F N N Sbjct 181 NLNHNIFGLLAYQKIYSDFYRDSQWERVSPSTFNVDYLDGSSMNLDNAYSTEFYQNYN-- 238 Query 227 VFELRYANWNKDLYMGAMPNAQFGDVAFVPV--DSSGKLPVSLPSIEVGGVAPIYNTGAG 284 F+LRY NW KDL+ G +P+ Q+G+ A + D +GKL +S N Sbjct 239 FFDLRYCNWQKDLFHGVLPHQQYGETAVASITPDVTGKLTLS-------------NFSTV 285 Query 285 GVQPDAQIGLRGAVTGAPDNGQTVTAYGADKTDAARPYFYAVPDGSVAHLKTNAKTIQVP 344 G P TA G + P F V D S Sbjct 286 GTSP-------------------TTASGTATKNL--PAFDTVGDLS-------------- 310 Query 345 YEFSSKFDVLQLRAAECLQKWKEIAQANGQNYASQVKAHFGVSPNPMTSHRCQRVCGFDG 404 +L LR AE LQKWKEI Q+ ++Y Q++ H+GVS S C + G Sbjct 311 --------ILVLRQAEFLQKWKEITQSGNKDYKDQLEKHWGVSVGDGFSELCTYLGGVSS 362 Query 405 SIDISAVENTNLS-SDEAIIRGKGIGGYRVNKPETFKTT-EHGVLMCIYHAVPLLDYAPT 462 SIDI+ V NTN++ S A I GKG+G N F + +G++MCIYH +PLLDY Sbjct 363 SIDINEVINTNITGSAAADIAGKGVG--VANGEINFNSNGRYGLIMCIYHCLPLLDYTTD 420 Query 463 GPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRPFGYVPRYISWKTS 522 D F+ V+ 
+ +PE D VG + +P L+N GYVPRYI +KTS Sbjct 421 MLDPAFL-KVNSTDYAIPEFDRVGMQSMPLVQLMNPLRSFANASGLVLGYVPRYIDYKTS 479 Query 523 VDVVRGAFIDTLKSWTAPIG 542 VD G F TL SW G Sbjct 480 VDQSVGGFKRTLNSWVISYG 499 >gi|496050829|ref|WP_008775336.1| hypothetical protein [Bacteroides sp. 2_2_4] gi|229448893|gb|EEO54684.1| putative capsid protein (F protein) [Bacteroides sp. 2_2_4] Length=580 Score = 297 bits (760), Expect = 7e-89, Method: Compositional matrix adjust. Identities = 205/568 (36%), Positives = 284/568 (50%), Gaps = 96/568 (17%) Query 1 MAGLFSYGDIKNKPRRSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQP 60 MA + S ++NK R+GFDL +K FTAK GELLPV LPGDK+ I + FTRTQP Sbjct 1 MANIMSLKSLRNKTSRNGFDLSSKRNFTAKPGELLPVKCWEVLPGDKWSIDLKSFTRTQP 60 Query 61 VDTSAFTRIREYYEWFFVPLHLMYRNSNEAIMSMENQPNYAASGTQSITFNRKLPWV--- 117 ++T+AF R+REYY+++FVP +L++ +N + M + P +A S S N+ L V Sbjct 61 LNTAAFARMREYYDFYFVPYNLLWNKANTVLTQMYDNPQHATSYIPSA--NQALAGVMPN 118 Query 118 -------DLLTLNNAVENVKASTYHDNMFGFSRALGFAKLYNYLGVGQF----------- 159 D L L A + ++Y N FG+SR+LG AKL YLG G F Sbjct 119 VTCKGIADYLNL-VAPDVTTTNSYEKNYFGYSRSLGTAKLLEYLGYGNFYTYATSKNNTW 177 Query 160 DPSKTLANLRISVFPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNG--EDTTPVASLKD 217 S +NL+++++ AYQKIY D+ R+SQWE P +N D+ +G + + S+ Sbjct 178 TKSPLSSNLQLNIYGVLAYQKIYADHIRDSQWEKVSPSCFNVDYLSGTVDSAMTIDSMIT 237 Query 218 LFDTNPNDSVFELRYANWNKDLYMGAMPNAQFGDVAFVPVDSSGKLP----VSLPSIEVG 273 P ++F+LRY NW KDL+ G +P Q+GD A V V+ S L V P + Sbjct 238 GQGFAPFYNMFDLRYCNWQKDLFHGVLPRQQYGDTAAVNVNLSNVLSAQYMVQTPDGDPV 297 Query 274 GVAPIYNTGAGGVQPDAQIGLRGAVTGAPDNGQTVTAYGADKTDAARPYFYAVPDGSVAH 333 G +P +TG N QTV G Sbjct 298 GGSPFSSTGV--------------------NLQTVNGSGT-------------------- 317 Query 334 LKTNAKTIQVPYEFSSKFDVLQLRAAECLQKWKEIAQANGQNYASQVKAHFGVSPNPMTS 393 F VL LR AE LQKWKEI Q+ ++Y Q++ H+ VS S Sbjct 318 -----------------FTVLALRQAEFLQKWKEITQSGNKDYKDQIEKHWNVSVGEAYS 360 Query 394 HRCQRVCGFDGSIDISAVENTNLS-SDEAIIRGKG--IGGYRVNKPETFKTTE-HGVLMC 449 + G S+DI+ V N N++ S+ A I GKG +G R+ +F E +G++MC Sbjct 361 
EMSLYLGGTTASLDINEVVNNNITGSNAADIAGKGVVVGNGRI----SFDAGERYGLIMC 416 Query 450 IYHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRP 509 IYH++PLLDY + F T ++ + +PE D VG E +P SL+N Sbjct 417 IYHSLPLLDYTTDLVNPAF-TKINSTDFAIPEFDRVGMESVPLVSLMNPLQSSYNVGSSI 475 Query 510 FGYVPRYISWKTSVDVVRGAFIDTLKSW 537 GY PRYIS+KT VD GAF TLKSW Sbjct 476 LGYAPRYISYKTDVDSSVGAFKTTLKSW 503 >gi|494822885|ref|WP_007558293.1| hypothetical protein [Bacteroides plebeius] gi|198272099|gb|EDY96368.1| putative capsid protein (F protein) [Bacteroides plebeius DSM 17135] Length=613 Score = 277 bits (709), Expect = 5e-81, Method: Compositional matrix adjust. Identities = 193/581 (33%), Positives = 289/581 (50%), Gaps = 79/581 (14%) Query 1 MAGLFSYGDIKNKPRRSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQP 60 MA + S ++NKP R+G+DL K FTAK G L+PV+W LP D + + F RTQP Sbjct 8 MANIMSMKSVRNKPTRAGYDLTQKINFTAKAGSLIPVWWTPVLPFDDLNATVKSFVRTQP 67 Query 61 VDTSAFTRIREYYEWFFVPLHLMYRNSNEAIMSMENQPNYAASGT--QSITFNRKLPWVD 118 ++T+AF R+R Y++++FVP M+ AI M +A+ ++ + +LP+ Sbjct 68 LNTAAFARMRGYFDFYFVPFRQMWNKFPTAITQMRTNLLHASGPVLADNVPLSDELPY-- 125 Query 119 LLTLNNAVENVKASTYHDNMFGFSRALGFAKLYNYLGVGQFDP---------------SK 163 T + + + N FG+ RA + YLG G F P Sbjct 126 -FTAEQVADYIVSLADSKNQFGYYRAWLVCIILEYLGYGDFYPYIVEAAGGEGATWATRP 184 Query 164 TLANLRISVFPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNGEDTT-----PVASLKDL 218 L NL+ S FP +AYQKIY D+ R +QWE + P T+N D+ +G + V KD Sbjct 185 MLNNLKFSPFPLFAYQKIYADFNRYTQWERSNPSTFNIDYISGSADSLQLDFTVEGFKDS 244 Query 219 FDTNPNDSVFELRYANWNKDLYMGAMPNAQFGDVAFVPVDSSGKLPVSLPSIEVGGVAPI 278 F+ +F++RY+NW +DL G +P AQ+G+ + VPV SG + V +E G P Sbjct 245 FN------LFDMRYSNWQRDLLHGTIPQAQYGEASAVPV--SGSMQV----VE-GPTPPA 291 Query 279 YNTGAGGVQPDAQIGLRGAVTGAPDNG--QTVTAYGADKTDAARPYFYAVPDGSVAHLKT 336 + TG GV L G VT +G Q T+ G + L+ Sbjct 292 FTTGQDGVA-----FLNGNVTIQGSSGYLQAQTSVGESRI-----------------LRF 329 Query 337 NAKTIQVPYEFSSKFDV--LQLRAAECLQKWKEIAQANGQNYASQVKAHFGVSPNPMTSH 394 N + E S F V L LR AE QKWKE+A A+ ++Y SQ++AH+G S N S Sbjct 330 
NNTNSGLIVEGDSSFGVSILALRRAEAAQKWKEVALASEEDYPSQIEAHWGQSVNKAYSD 389 Query 395 RCQRVCGFDGSIDISAVENTNLSSDEAI-IRGKGIGGYRVNKPETFKT-TEHGVLMCIYH 452 CQ + + + I+ V N N++ + A I GKG N F ++G++MC++H Sbjct 390 MCQWLGSINIDLSINEVVNNNITGENAADIAGKGT--MSGNGSINFNVGGQYGIVMCVFH 447 Query 453 AVPLLDYAPTGPDLQFMTTVDGD-SWPVPELDSVGFEELPSYSLLNTSDVQP------IK 505 +P LDY + P F TT+ +P+PE D +G E++P LN V+P + Sbjct 448 VLPQLDYITSAP--HFGTTLTNVLDFPIPEFDKIGMEQVPVIRGLNP--VKPKDGDFKVS 503 Query 506 EPRPFGYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYM 546 FGY P+Y +WKT++D G F +LK+W P ++ + Sbjct 504 PNLYFGYAPQYYNWKTTLDKSMGEFRRSLKTWIIPFDDEAL 544 >gi|575094321|emb|CDL65708.1| unnamed protein product [uncultured bacterium] Length=642 Score = 214 bits (545), Expect = 2e-57, Method: Compositional matrix adjust. Identities = 173/601 (29%), Positives = 267/601 (44%), Gaps = 97/601 (16%) Query 10 IKNKPRRSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQPVDTSAFTRI 69 +KNKP R+ FDL ++N FTAKVGELLP + + PGD K+S +FTRT P+ ++AFTR+ Sbjct 13 LKNKPSRNSFDLSHRNMFTAKVGELLPCFVQELNPGDSVKVSSSYFTRTAPLQSNAFTRL 72 Query 70 REYYEWFFVPLHLMYRNSNEAIMSMENQPN------YAAS--GTQSITFNRKLPWVDLLT 121 RE ++FFVP +++ + +++M N A+S G Q +T ++P V+ T Sbjct 73 RENVQYFFVPYSALWKYFDSQVLNMTKNANGGDISRIASSLVGNQKVT--TQMPCVNYKT 130 Query 122 L--------NNAVENVKASTYHDNMFGFSRALGFAKLYNYLGVGQFDPSKTLANLRI--- 170 L N + S + G R AKL LG G F + AN ++ Sbjct 131 LHAYLLKFINRSTVGSDGSVGPEFNRGCYRHAESAKLLQLLGYGNF--PEQFANFKVNND 188 Query 171 --------------------SVFPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNGEDTT 210 S+F AY KI ND+Y QW+ N D+ +++ Sbjct 189 KHNQSGQNFKDVTYNNSPYLSIFRLLAYHKICNDHYLYRQWQPYNASLCNVDYLT-PNSS 247 Query 211 PVASLKDLFDTNPNDSV-------FELRYANWNKDLYMGAMPNAQFGDVAFVPVDSSGKL 263 + S+ D + P+DS+ ++R++N D + G +P +QFG + V ++ Sbjct 248 SLLSIDDALLSIPDDSIKAEKLNLLDMRFSNLPLDYFTGVLPTSQFGSESVVNLNLG--- 304 Query 264 PVSLPSIEVGGVAPIYNTGAGGVQPDAQIGLRGAVTGAPDNGQTV--TAYGADKTDAARP 321 G A + G D+ G TG + Q V +A G K D + Sbjct 305 -------NASGSAVL----NGTTSKDS--GRWRTTTGEWEMEQRVASSANGNLKLDNSNG 351 Query 322 
YFYAVPDGSVAHLKTNAKTIQVPYEFSSKFDVLQLRAAECLQKWKEIAQANGQNYASQVK 381 F ++H T + + + S ++ LR A QK+KEI AN ++ SQV+ Sbjct 352 TF-------ISHDHTFSGNVAINTSLSGNLSIIALRNALAAQKYKEIQLANDVDFQSQVE 404 Query 382 AHFGVSPNPMTSHRCQRVCGFDGSIDISAVENTNLSSDEAIIRG---KGIGGYRVNKPET 438 AHFG+ P+ + + G I+I+ N NLS D G +G G + Sbjct 405 AHFGIKPDEKNENSL-FIGGSSSMININEQINQNLSGDNKATYGAAPQGNGSASI----K 459 Query 439 FKTTEHGVLMCIYHAVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEEL-------- 490 F +GV++ IY P+LD+A G D T D + +PE+DS+G ++ Sbjct 460 FTAKTYGVVIGIYRCTPVLDFAHLGIDRTLFKT-DASDFVIPEMDSIGMQQTFRCEVAAP 518 Query 491 ----PSYSLLNTSDVQPIKEPRPFGYVPRYISWKTSVDVVRGAFIDTLKSWTAPIGEDYM 546 + D +GY PRY +KTS D GAF +LKSW I D + Sbjct 519 APYNDEFKAFRVGDGSSPDMSETYGYAPRYSEFKTSYDRYNGAFCHSLKSWVTGINFDAI 578 Query 547 K 547 + Sbjct 579 Q 579 >gi|575094339|emb|CDL65730.1| unnamed protein product [uncultured bacterium] Length=588 Score = 181 bits (458), Expect = 6e-46, Method: Compositional matrix adjust. Identities = 153/548 (28%), Positives = 238/548 (43%), Gaps = 87/548 (16%) Query 16 RSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQPVDTSAFTRIREYYEW 75 ++GFD+ ++ FT+ VG+LLPV++ + PGDK +IS FTRTQP+ ++A R+ E+ E+ Sbjct 16 KNGFDMSQRHPFTSSVGQLLPVFYDYLNPGDKIRISANLFTRTQPMKSTAMARLTEHIEY 75 Query 76 FFVPLHLMYRNSNEAIMSMENQPNYAASGTQSITFNRKLPWVDLLTLNNAVE-------- 127 FFVP M+ +++ + + ++T +P+ ++ A+E Sbjct 76 FFVPFEQMFSLFGSVFYGIDDYNSSSLVKHNNLT----MPFFKSDAVSAALEAAYTSFSS 131 Query 128 NVKASTYHDNMFGFSRALGFAKLYNYLGVGQFDPSKTLA---NLRISVFPFYAYQKIYND 184 ++ +M G R G +L LG G S + +SVF F AYQKI+ND Sbjct 132 SINRKVLTPDMMGQPRVYGILRLSEMLGYGSLLLSNDNNLLPHADMSVFLFTAYQKIFND 191 Query 185 YYRNSQWEVNKPWTYNCDFWNGEDTTPVASLKDLFDTNPNDSVFELRYANWNKDLYMGAM 244 +YR + + +YN D+ G+ T ++S+FEL Y W KD + + Sbjct 192 FYRLDDYTSVQHKSYNVDYAQGQPIT-------------DNSMFELHYRPWKKDYFTNVI 238 Query 245 PNAQFGDVAFVPVDSSGKLPVSLPSIEVGGVAPIYNTGAGGVQPDAQIGLRGAVTGAPDN 304 PN F VD+ GAG D +GL Sbjct 239 PNPYFSS-----VDNKSSF-----------------GGAGLF--DRPVGL---------- 264 Query 305 
GQTVTAYGADKTDAARPYFYAVPDGSVAHLKTNAKTIQ-VPYEFSSK----FDVLQLRAA 359 ++T++ D +D F P ++ ++ N Q +P +S V LR Sbjct 265 --SITSFNFDGSD-----FLQAP-SDLSTMENNQPIFQELPVNLTSASSAGLSVSDLRYL 316 Query 360 ECLQKWKEIAQANGQNYASQVKAHFGVSPNPMTSHRCQRVCGFDGSIDISAVENTNLSSD 419 K I Q G++Y +Q AHFG S + G + IS+VE+T + D Sbjct 317 YATDKLLRITQFAGKHYDAQTLAHFGKRVPQGVSGEVYYIGGQSQPLQISSVESTATTFD 376 Query 420 EAIIRGKGIG-----GYRV---NKPETFKTTEHGVLMCIYHAVPLLDYAPTGPDLQFMTT 471 + G +G GY K +F+ HGVLM IY AVP DY D T Sbjct 377 SGDVVGSVLGELAGKGYSQTGNQKDFSFEAPCHGVLMAIYSAVPEADYLDERIDY-LNTL 435 Query 472 VDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRPFGYVPRYISWKTSVDVVRGAFI 531 + + + PE DS+G E P+Y L + + G+ RY K+ D++ GAF Sbjct 436 IQSNDFYKPEFDSLGMEPFPNYEL---DQYRMVGNNSRLGWRYRYSGLKSKPDLISGAFK 492 Query 532 DTLKSWTA 539 TL+ W A Sbjct 493 YTLRDWVA 500 >gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis] Length=568 Score = 153 bits (387), Expect = 8e-37, Method: Compositional matrix adjust. Identities = 150/548 (27%), Positives = 233/548 (43%), Gaps = 83/548 (15%) Query 16 RSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQPVDTSAFTRIREYYEW 75 R+ FD+ ++ FTA G LLPV LP D +I+ F RT P++++AF +R YE+ Sbjct 18 RNAFDISQRHLFTAPAGALLPVLSLDLLPHDHVEINASDFMRTLPMNSAAFMSMRGVYEF 77 Query 76 FFVPLHLMYRNSNEAIMSMENQPN---YAASGT---QSITFN-RKLPWVDLLTLNNAVEN 128 +FVP ++ ++ I M + + YA G ++F+ +KL VD N A ++ Sbjct 78 YFVPYKQLWSGFDQFITGMSDYKSSFMYAFKGKTPPSCVSFDVQKL--VDWCKTNTA-KD 134 Query 129 VKASTYHDNMFGFSRALGFAKLYNYLGVGQFDPSKTLANLRISVFPFYAYQKIYNDYYRN 188 + + ++ LG+ K N GV +P+ T + + F AYQKIYND+YRN Sbjct 135 IHGFDKNKGVYRILDLLGYGKYANSAGVPYTNPTSTTMG-KCTPFRGLAYQKIYNDFYRN 193 Query 189 SQWEVNKPWTYNCDFWNGEDTTPVASLKDLFDTNPND-SVFELRYANWNKDLYMGAMPNA 247 + +E + ++N D + G +K+ P D F LRY N KDL P Sbjct 194 TTYEEYQLESFNVDMFYGS-----GKVKETIPNEPWDYDWFTLRYRNAQKDLLTNVRPT- 247 Query 248 QFGDVAFVPVDSSGKLPVSLPSIEVGGVAPIYNTGAGGVQPDAQIGLRGAVTGAPDNGQT 307 P + P + TG + Sbjct 248 --------------------PLFSIDDFNPQFFTGGSDI--------------------- 266 Query 308 
VTAYGADKTDAARPYFYAVPDGSVAHLKTNAKTIQVPYEFSSKFDVLQLRAAECLQKWKE 367 V G + T Y SV + N K V + + V +R A L+K Sbjct 267 VMEKGPNVTGGTHEY-----RDSVVIVGKNLKENGVDSK-RTMISVADIRNAFALEKLAS 320 Query 368 IAQANGQNYASQVKAHFGVSPNPMTSHRCQRVCGFDGSIDISAVENTNLSSDEAIIRGKG 427 + G+ Y Q++AHFG+S RC + GFD +I + V ++ ++ + Sbjct 321 VTMRAGKTYKEQMEAHFGISVEEGRDGRCTYIGGFDSNIQVGDVTQSSGTTVTG-TKDTS 379 Query 428 IGGY--RVNKPET--------FKTTEHGVLMCIYHAVPLLDYAPTGPDLQFMTTVDGDSW 477 GGY R T F EHG+LMCIY VP + Y D F+ ++ + Sbjct 380 FGGYLGRTTGKATGSGSGHIRFDAKEHGILMCIYSLVPDVQYDSKRVD-PFVQKIERGDF 438 Query 478 PVPELDSVGFEEL----PSYSLLNTSDVQPIKEPRPFGYVPRYISWKTSVDVVRGAFI-- 531 VPE +++G + L SY N + IK FG+ PRY +KT++D+ G F+ Sbjct 439 FVPEFENLGMQPLFAKNISYKYNNNTANSRIKNLGAFGWQPRYSEYKTALDINHGQFVHQ 498 Query 532 DTLKSWTA 539 + L WT Sbjct 499 EPLSYWTV 506 >gi|647452987|ref|WP_025792807.1| hypothetical protein [Prevotella histicola] Length=584 Score = 150 bits (380), Expect = 6e-36, Method: Compositional matrix adjust. Identities = 158/579 (27%), Positives = 241/579 (42%), Gaps = 130/579 (22%) Query 13 KPR--RSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQPVDTSAFTRIR 70 KPR R+GFDL ++ F+AK G+LLP+ P + FK S + RT ++T+++ R++ Sbjct 5 KPRLARNGFDLSSRRIFSAKAGQLLPIGCWEVNPSEHFKFSVQDLVRTTTLNTASYARMK 64 Query 71 EYYEWFFVPLHLMYRNSNEAIMSMENQPNYAASGTQ---SITFNRKLPWVDLLTLNNAVE 127 EYY +FFV +++ ++ I+ N P+ A +G + + +N+ V L + Sbjct 65 EYYHFFFVSYRSLWQWFDQFIVGT-NNPHSALNGVKKNGTTNYNQICSSVPTFDLGKLIT 123 Query 128 NVKASTYHDNMFGFSRALGFAKLYNYLGVGQFDPSK--TLANL----------------- 168 +K S F +S G AKL N L G + K L NL Sbjct 124 RLKTSDMDSQGFNYSE--GAAKLLNMLNYGVTNKGKFMNLENLITSTSYLPSKDDKEPSS 181 Query 169 ----RISVFPFYAYQKIYNDYYRNSQWEVNKPWTYNCDFWNGEDT---TPVASLKDLFDT 221 ++S F AYQKI+ND+YRN W + ++N D + + P +LK Sbjct 182 IYACKVSPFRLLAYQKIFNDFYRNQDWTPSDVRSFNVDDYADDSNLTIEPDVALK----- 236 Query 222 NPNDSVFELRYANWNKDLYMGAMPNAQFGDVAFVPVDSSGKLPVSLPSIEVG-GVAPIYN 280 ++RY + KD P + D F +LP G G + N Sbjct 237 -----FCQMRYRPYAKDWLTSMKPTPNYSDGIF-----------NLPEYVRGNGNVILTN 280 Query 281 
TGAGGVQPDAQIGLRGAVTGAPDNGQTVTAYGADKTDAARPYFYAVPDGSVAHLKTNAKT 340 +G V D+ G+V+ Sbjct 281 NKSGSVSLDS--------------------------------------GTVS-------- 294 Query 341 IQVPYEFSSKFDVLQLRAAECLQKWKEIA-QANGQNYASQVKAHFGVSPNPMTSHRCQRV 399 P FS V LRAA L K E +ANG +YASQ++AHFG ++ + + Sbjct 295 ---PSSFS----VNDLRAAFALDKMLEATRRANGLDYASQIEAHFGFKVPESRANDARFL 347 Query 400 CGFDGSIDISAV--ENTNLSSDEAI-----IRGKGIGGYRVNKPETFKTTEHGVLMCIYH 452 GFD SI +S V N N +SD + + GKGIG E F +TEHG++MCIY Sbjct 348 GGFDNSIVVSEVVSTNGNAASDGSHASIGDLGGKGIGSMSSGTIE-FDSTEHGIIMCIYS 406 Query 453 AVPLLDYAPTGPDLQFMTTVDGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEP----- 507 P +Y + D F + + + PE +G++ L L+ ++ K+ Sbjct 407 VAPQSEYNASYLD-PFNRKLTREQFYQPEFADLGYQALIGSDLICSTLGMNEKQAGFSDI 465 Query 508 ----RPFGYVPRYISWKTSVDVVRGAFID--TLKSWTAP 540 GY RY +KT+ D+V G F +L W P Sbjct 466 ELNNNLLGYQVRYNEYKTARDLVFGDFESGKSLSYWCTP 504 >gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis] gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361] Length=553 Score = 150 bits (378), Expect = 1e-35, Method: Compositional matrix adjust. 
Identities = 151/549 (28%), Positives = 225/549 (41%), Gaps = 100/549 (18%) Query 16 RSGFDLGNKNAFTAKVGELLPVYWKFCLPGDKFKISQEWFTRTQPVDTSAFTRIREYYEW 75 R+ FDL ++ FTA G LLPV +P D +I+ + F RT P++T+AF +R YE+ Sbjct 17 RNAFDLSQRHLFTAHAGMLLPVLNLDLIPHDHVEINAQDFMRTLPMNTAAFASMRGVYEF 76 Query 76 FFVPLHLMYRNSNEAIMSMENQPNYAASGTQSITFNRKLPWVDLLTLNNAVENVKAS--- 132 FFVP H ++ ++ I M + + A Q T ++P+ ++ ++ N++ K S Sbjct 77 FFVPYHQLWAQFDQFITGMNDFHSSANKSIQGGTSPLQVPYFNVDSVFNSLNTGKESGSG 136 Query 133 -------TYHDNMFGFSRALGFAKLYNYLGVGQFDPSKTLAN---LRISVFPFYAYQKIY 182 + F LG+ + ++ G D L N SVF AY KIY Sbjct 137 STDDLQYKFKYGAFRLLDLLGYGRKFDSFGTAYPDNVSGLKNNLDYNCSVFRILAYNKIY 196 Query 183 NDYYRNSQWEVNKPWTYNCDFWNGEDTTPVASLKDLFDTNPNDSVFELRYANWNKDLYMG 242 DYYRNS +E ++N D + G L D +F+LRY N D + Sbjct 197 QDYYRNSNYENFDTDSFNFDKFKG----------GLVDAKVVADLFKLRYRNAQTDYFTN 246 Query 243 AMPNAQFG-DVAFVPVDSSGKLPVSLPSIEVGGVAPIYNTGAGGVQPDAQIGLRGAVTGA 301 + F AF VD+ +AP R V Sbjct 247 LRQSQLFSFTTAFEDVDNI-------------NIAP-----------------RDYVKSD 276 Query 302 PDNGQTVTAYGADKTDAARPYFYAVPDGSVAHLKTNAKTIQVPYEFSSKFDVLQLRAAEC 361 N V +G D TD++ D SV+ L+ K + +RA + Sbjct 277 GSNFTRVN-FGVD-TDSSE------GDFSVSSLRAAFAV--------DKLLSVTMRAGKT 320 Query 362 LQKWKEIAQANGQNYASQVKAHFGVSPNPMTSHRCQRVCGFDGSIDISAVENTNLSSDEA 421 Q Q++AH+GV R + GFD + +S V T+ ++ Sbjct 321 FQ--------------DQMRAHYGVEIPDSRDGRVNYLGGFDSDMQVSDVTQTSGTTATE 366 Query 422 I---------IRGKGIGGYRVNKPETFKTTEHGVLMCIYHAVPLLDYAPTGPDLQFMTTV 472 + GKG G R F EHGVLMCIY VP + Y T D + + Sbjct 367 YKPEAGYLGRVAGKGTGSGR--GRIVFDAKEHGVLMCIYSLVPQIQYDCTRLD-PMVDKL 423 Query 473 DGDSWPVPELDSVGFEELPSYSLLNTSDVQPIKEPRPFGYVPRYISWKTSVDVVRGAFI- 531 D + PE +++G + L S + + P K P GY PRY +KT++DV G F Sbjct 424 DRFDYFTPEFENLGMQPLNSSYISSFCTTDP-KNP-VLGYQPRYSEYKTALDVNHGQFAQ 481 Query 532 -DTLKSWTA 539 D L SW+ Sbjct 482 SDALSSWSV 490 Lambda K H a alpha 0.318 0.136 0.423 0.792 4.96 Gapped Lambda K H a alpha sigma 0.267 0.0410 0.140 1.90 42.6 43.6 Effective search space used: 4146447800358
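
Note on the scores: each "Score = B bits (S)" line pairs a bit score B with a raw score S. The two are related by the standard Karlin-Altschul normalization S' = (Lambda*S - ln K) / ln 2. The sketch below applies that formula with the gapped Lambda and K printed in the statistics trailer of this report and compares the result with the ten reported values. The function and constant names are illustrative and are not part of BLAST. Comparing against the truncated (not rounded) value is an assumption that happens to reproduce these ten hits; it is not a documented guarantee of the output format.

```python
import math

# Gapped Karlin-Altschul parameters as printed in this report's trailer.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Normalize a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# (raw score, reported bits) pairs copied from the ten "Score = ..." lines.
REPORTED = [
    (866, 338), (832, 325), (816, 318), (760, 297), (709, 277),
    (545, 214), (458, 181), (387, 153), (380, 150), (378, 150),
]


def check_report(pairs=REPORTED):
    """Return the pairs whose truncated bit score differs from the report."""
    return [(raw, bits) for raw, bits in pairs if int(bit_score(raw)) != bits]


if __name__ == "__main__":
    for raw, bits in REPORTED:
        print(f"raw={raw:4d}  computed={bit_score(raw):7.2f}  reported={bits}")
    print("mismatches:", check_report())
```

The E-values are deliberately left out of this check: with compositional matrix adjustment they are not expected to follow from the bit score and the printed effective search space by the simple formula alone.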