Bit score colors: <40, 40-50, 50-80, 80-200, >200
BLASTP 2.2.30+

Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for composition-based statistics: Alejandro A. Schaffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           49,011,213 sequences; 17,563,301,199 total letters

Query= Contig-1_CDS_annotation_glimmer3.pl_2_4

Length=591

                                                                      Score    E
Sequences producing significant alignments:                           (Bits)  Value

gi|648626869|ref|WP_026318620.1|  hypothetical protein                233     9e-69
gi|547312923|ref|WP_022044635.1|  putative uncharacterized protein    63.5    1e-07
gi|517172762|ref|WP_018361580.1|  hypothetical protein                62.0    8e-07
gi|547920049|ref|WP_022322420.1|  capsid protein VP1                  57.0    3e-05
gi|492501782|ref|WP_005867318.1|  hypothetical protein                57.0    3e-05
gi|649555287|gb|KDS61824.1|  capsid family protein                    52.8    5e-04
gi|649569140|gb|KDS75238.1|  capsid family protein                    51.6    0.001
gi|649557305|gb|KDS63784.1|  capsid family protein                    50.8    0.001
gi|494308783|ref|WP_007173938.1|  hypothetical protein                50.1    0.004
gi|494306153|ref|WP_007173049.1|  hypothetical protein                49.3    0.007

>gi|648626869|ref|WP_026318620.1| hypothetical protein [Alistipes onderdonkii]
Length=231

Score = 233 bits (595), Expect = 9e-69, Method: Compositional matrix adjust.
Identities = 120/231 (52%), Positives = 149/231 (65%), Gaps = 9/231 (4%)

Query  370  LSGGTRFRRRNYHFNDDGYFMEITSIVPRVYYPSYINPTSRQISLGQQYAPALDNIAMQG  429
            +SGG F+RR +HFN+ GYFMEITS+VP V YP+Y+NPT Q +LGQ+YAPALDNI MQ
Sbjct  1    MSGGDSFKRRTFHFNESGYFMEITSVVPTVMYPNYLNPTLLQTNLGQRYAPALDNIQMQP  60

Query  430  LKASTVFGEVQ-NLGANTVTYANS--------TLSIPGFKLQESNYVGYEPAWSELMTAV  480
            L T+ G N G+ + ++ + T+++ E VGY+PAW+ELMT V
Sbjct  61   LTVPTLLGNAYFNTGSGSYSHVLNHMGTGELRTVAVDKLSAAEGIAVGYQPAWAELMTGV  120

Query  481  SKPHGRLCNDLDYWVLSRDYGRNLASVMDTPAYSDFIKAAGTYVDELSLQRLTAFLKRIY  540
            SKPHGRLCNDLDYW R YG L S D S F++ G VD L ++ A+LK Y
Sbjct  121  SKPHGRLCNDLDYWAFQRRYGTVLYSSNDAQDASVFLEELGNEVDTLDVETFNAWLKNTY  180

Query  541  VSPSSCPYILCGDFNYVFYDQRPTAENFVLDNVADIVVFREKSKVNVATTL  591
            VS PYIL +NYVF D P A+NFVLDN A+I V+REKSKVNV TL
Sbjct  181  VSTDFVPYILPAMYNYVFADTDPNAQNFVLDNSAEISVYREKSKVNVPNTL  231

>gi|547312923|ref|WP_022044635.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
 gi|524208404|emb|CCZ76639.1| putative uncharacterized protein [Alistipes finegoldii CAG:68]
Length=338

Score = 63.5 bits (153), Expect = 1e-07, Method: Compositional matrix adjust.
Identities = 82/326 (25%), Positives = 123/326 (38%), Gaps = 70/326 (21%)

Query  281  AVDVSVS-GNSVSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQDNTC-PAFL  338
            A+D+++S G SV++ + +++Q +MD F GGR D + + + K S P FL
Sbjct  41   ALDLNISTGFSVAVPELRLRTKIQNWMDRLFVSGGRVGDVFRTLWGTKSSAIYVNKPDFL  100

Query  339  GSDSFDMNVNTLY-----QTTGFEDNSSPLGAFSGQLSGGTRFRRRNYHFNDDGYFMEIT  393
            G +N + + +G + N L A + + +Y+ + G FM IT
Sbjct  101  GVWQASINPSNVRAMANGSASGEDANLGQLAACVDRYCDFSGHSGIDYYAKEPGTFMLIT  160

Query  394  SIVPRVYYPSYINPTSRQISLGQQYAPALDNIAMQ-----------------GL--KAST  434
            +VP Y ++P IS G + P L+ I Q GL +AS
Sbjct  161  MLVPEPAYSQGLHPDLASISFGDDFNPELNGIGFQLVPRHRFSMMPRGFNFTGLDQEASP  220

Query  435  VFGEVQNLGANTVTYANSTLSIPGFKLQESNYVGYEPAWSELMTAVSKPHGRLC--NDLD  492
            FG G + N VG E AWS L T S+ HG +
Sbjct  221  WFGHT---GTGVLVDPNMV------------SVGEEVAWSWLRTDYSRLHGDFAQNGNYQ  265

Query  493  YWVLSRDYGRNLASVMDTPAYSDFIKAAGTYVDELSLQRLTAFLKRIYVSPSSCPYILCG  552
            YWVL+R + D + + GTY++ L
Sbjct  266  YWVLTRRFTTYFPD--DGTGFYQDGEYTGTYINPL-------------------------  298

Query  553  DFNYVFYDQRPTAENFVLDNVADIVV  578
            D+ YVF DQ A NF D+ V
Sbjct  299  DWQYVFVDQTLMAGNFAYYGTFDLNV  324

>gi|517172762|ref|WP_018361580.1| hypothetical protein [Prevotella nanceiensis]
Length=568

Score = 62.0 bits (149), Expect = 8e-07, Method: Compositional matrix adjust.
Identities = 64/261 (25%), Positives = 112/261 (43%), Gaps = 40/261 (15%)

Query  291  VSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQ--DNTCPAFLGSDSFDMNVN  348
            +S+ +I A +++ + G + E+ F + + + D C G DS ++ V
Sbjct  304  ISVADIRNAFALEKLASVTMRAGKTYKEQMEAHFGISVEEGRDGRCTYIGGFDS-NIQVG  362

Query  349  TLYQT-----TGFEDNS------SPLGAFSGQLSGGTRFRRRNYHFNDDGYFMEITSIVP  397
            + Q+ TG +D S G +G SG RF + + G M I S+VP
Sbjct  363  DVTQSSGTTVTGTKDTSFGGYLGRTTGKATGSGSGHIRFDAKEH-----GILMCIYSLVP  417

Query  398  RVYYPSY-INPTSRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANSTLSI  456
            V Y S ++P ++I G + P +N+ MQ L A + + N AN+
Sbjct  418  DVQYDSKRVDPFVQKIERGDFFVPEFENLGMQPLFAKNISYKYNNNTANS----------  467

Query  457  PGFKLQESNYVGYEPAWSELMTAVSKPHGRLCND--LDYWVLSRDYGR-----NLASVMD  509
            +++ G++P +SE TA+ HG+ + L YW ++R G N+++
Sbjct  468  ---RIKNLGAFGWQPRYSEYKTALDINHGQFVHQEPLSYWTVARARGESMSNFNISTFKI  524

Query  510  TPAYSDFIKAAGTYVDELSLQ  530
            P + D + A EL+ Q
Sbjct  525  NPKWLDDVFAVNYNGTELTDQ  545

>gi|547920049|ref|WP_022322420.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
 gi|524592961|emb|CDD13573.1| capsid protein VP1 [Parabacteroides merdae CAG:48]
Length=553

Score = 57.0 bits (136), Expect = 3e-05, Method: Compositional matrix adjust.
Identities = 53/222 (24%), Positives = 94/222 (42%), Gaps = 23/222 (10%)

Query  279  DAAVDVSVSGNSVSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQDN-TCPAF  337
            + + V+V +++ ++ ++ +QR+ + GG R + S F V+ S P F
Sbjct  299  NGTLKVNVDEMGININDLRTSNALQRWFERNARGGSRYIEQILSHFGVRSSDARLQRPQF  358

Query  338  LGSDSFDMNVNTLYQTTGFEDNSSPLGAFSGQ-LSGGTRFRRRNYHFNDDGYFMEITSIV  396
            LG ++V+ + QT+ D +SP +G +S G ++Y F + GY + I SI
Sbjct  359  LGGGRMPISVSEVLQTSS-TDETSPQANMAGHGISAGINNGFKHY-FEEHGYIIGIMSIT  416

Query  397  PRVYYPSYINPTSRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANSTLSI  456
            PR Y + + Y P +++ Q +K +F + Y N T
Sbjct  417  PRSGYQQGVPRDFTKFDNMDFYFPEFAHLSEQEIKNQELFV------SEDAAYNNGTF--  468

Query  457  PGFKLQESNYVGYEPAWSELMTAVSKPHGRLCNDLDYWVLSR  498
            GY P ++E S+ HG +L +W L+R
Sbjct  469  -----------GYTPRYAEYKYHPSEAHGDFRGNLSFWHLNR  499

Score = 48.9 bits (115), Expect = 0.008, Method: Compositional matrix adjust.
Identities = 25/71 (35%), Positives = 39/71 (55%), Gaps = 0/71 (0%)

Query  7    KRNKKSRFKLFSGNPTSASWGTLIPTNVTRVVAGDDFSFQPGVGVQALPIVAPFMGSVCV  66
            KR +++ F L + + + G L+P VV+GD F + V+ P+VAP M V V
Sbjct  11   KRPRRNAFNLSYESKLTLNMGELVPIMCMPVVSGDKFRVKTESLVRLAPLVAPMMHRVNV  70

Query  67   KKEYFFIPDRI  77
            YFF+P+R+
Sbjct  71   FTHYFFVPNRL  81

>gi|492501782|ref|WP_005867318.1| hypothetical protein [Parabacteroides distasonis]
 gi|409230408|gb|EKN23272.1| hypothetical protein HMPREF1059_03257 [Parabacteroides distasonis CL09T03C24]
Length=538

Score = 57.0 bits (136), Expect = 3e-05, Method: Compositional matrix adjust.
Identities = 53/217 (24%), Positives = 95/217 (44%), Gaps = 23/217 (11%)

Query  284  VSVSGNSVSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQDN-TCPAFLGSDS  342
            V+V VS+ ++ ++ +QR+ + G R + S F V+ S P FLG
Sbjct  289  VNVDELGVSINDLRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGR  348

Query  343  FDMNVNTLYQTTGFEDNSSPLGAFSGQ-LSGGTRFRRRNYHFNDDGYFMEITSIVPRVYY  401
            ++V+ + QT+ D++SP +G +S G + Y F + GY + I SI PR Y
Sbjct  349  TPISVSEVLQTSA-TDSTSPQANMAGHGISAGVNHGFKRY-FEEHGYIIGIMSIRPRTGY  406

Query  402  PSYINPTSRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANSTLSIPGFKL  461
            + R+ Y P ++ Q +K V+ + Q +N T+
Sbjct  407  QQGVPKDFRKFDNMDFYFPEFAHLGEQEIKNEEVYLQ-QTPASNNGTF------------  453

Query  462  QESNYVGYEPAWSELMTAVSKPHGRLCNDLDYWVLSR  498
            GY P ++E ++++ HG ++ +W L+R
Sbjct  454  ------GYTPRYAEYKYSMNEVHGDFRGNMAFWHLNR  484

Score = 53.1 bits (126), Expect = 4e-04, Method: Compositional matrix adjust.
Identities = 34/90 (38%), Positives = 48/90 (53%), Gaps = 3/90 (3%)

Query  7    KRNKKSRFKLFSGNPTSASWGTLIPTNVTRVVAGDDFSFQPGVGVQALPIVAPFMGSVCV  66
            KR +++ F L N +A+ G L+P VV GD F + V+ P+VAP M V V
Sbjct  11   KRPRRNVFNLSYENKLTANAGELVPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHRVDV  70

Query  67   KKEYFFIPDR-IYNIDRQLNFQGV--TDTP  93
            YFF+P+R ++N +GV TDTP
Sbjct  71   FTHYFFVPNRLLWNQWEDFITKGVDGTDTP  100

>gi|649555287|gb|KDS61824.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649560568|gb|KDS66876.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649561020|gb|KDS67307.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649562724|gb|KDS68908.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=541

Score = 52.8 bits (125), Expect = 5e-04, Method: Compositional matrix adjust.
Identities = 39/118 (33%), Positives = 58/118 (49%), Gaps = 2/118 (2%)

Query  7    KRNKKSRFKLFSGNPTSASWGTLIPTNVTRVVAGDDFSFQPGVGVQALPIVAPFMGSVCV  66
            KR +++ F L N + + G LIP VV GD F + V+ P+VAP M V V
Sbjct  11   KRPRRNVFNLSYENKLTVNAGELIPIMCKPVVPGDKFRVNTEMLVRLAPLVAPMMHRVDV  70

Query  67   KKEYFFIPDR-IYNIDRQLNFQGVTDTPNTVYKPSMAPPIPFDISKSMGDKVDISVSD  123
            YFF+P+R I+N +GV T + V+ P+ + P D + + D S+ D
Sbjct  71   FTHYFFVPNRLIWNKWEDFITKGVDGTDSPVF-PTYSFPSTVDTANAHNSFGDGSLWD  127

Score = 52.4 bits (124), Expect = 8e-04, Method: Compositional matrix adjust.
Identities = 51/210 (24%), Positives = 90/210 (43%), Gaps = 23/210 (11%)

Query  291  VSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQDN-TCPAFLGSDSFDMNVNT  349
            V++ +I ++ +QR+ + G R + S F V+ S P FLG ++V+
Sbjct  299  VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  358

Query  350  LYQTTGFEDNSSPLGAFSGQ-LSGGTRFRRRNYHFNDDGYFMEITSIVPRVYYPSYINPT  408
            + QT+ D++SP +G +S G Y F + GY M I SI PR Y +
Sbjct  359  VLQTSS-TDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQGVPKD  416

Query  409  SRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANSTLSIPGFKLQESNYVG  468
            R+ Y P ++ Q +K ++ ++ AN T+ G
Sbjct  417  FRKFDNMDFYFPEFAHLGEQEIKNEELYLN-ESDAANEGTF------------------G  457

Query  469  YEPAWSELMTAVSKPHGRLCNDLDYWVLSR  498
            Y P ++E + ++ HG ++ +W L+R
Sbjct  458  YTPRYAEYKYSQNEVHGDFRGNMAFWHLNR  487

>gi|649569140|gb|KDS75238.1| capsid family protein, partial [Parabacteroides distasonis str. 3999B T(B) 6]
Length=390

Score = 51.6 bits (122), Expect = 0.001, Method: Compositional matrix adjust.
Identities = 51/210 (24%), Positives = 90/210 (43%), Gaps = 23/210 (11%)

Query  291  VSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQDN-TCPAFLGSDSFDMNVNT  349
            V++ +I ++ +QR+ + G R + S F V+ S P FLG ++V+
Sbjct  148  VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  207

Query  350  LYQTTGFEDNSSPLGAFSGQ-LSGGTRFRRRNYHFNDDGYFMEITSIVPRVYYPSYINPT  408
            + QT+ D++SP +G +S G Y F + GY M I SI PR Y +
Sbjct  208  VLQTSS-TDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQGVPKD  265

Query  409  SRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANSTLSIPGFKLQESNYVG  468
            R+ Y P ++ Q +K ++ ++ AN T+ G
Sbjct  266  FRKFDNMDFYFPEFAHLGEQEIKNEELYLN-ESDAANEGTF------------------G  306

Query  469  YEPAWSELMTAVSKPHGRLCNDLDYWVLSR  498
            Y P ++E + ++ HG ++ +W L+R
Sbjct  307  YTPRYAEYKYSQNEVHGDFRGNMAFWHLNR  336

>gi|649557305|gb|KDS63784.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 4]
 gi|649559156|gb|KDS65543.1| capsid family protein [Parabacteroides distasonis str. 3999B T(B) 6]
Length=245

Score = 50.8 bits (120), Expect = 0.001, Method: Compositional matrix adjust.
Identities = 51/210 (24%), Positives = 90/210 (43%), Gaps = 23/210 (11%)

Query  291  VSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKLSQDN-TCPAFLGSDSFDMNVNT  349
            V++ +I ++ +QR+ + G R + S F V+ S P FLG ++V+
Sbjct  3    VNINDIRTSNALQRWFERNARSGSRYIEQILSHFGVRSSDARLQRPQFLGGGRTPISVSE  62

Query  350  LYQTTGFEDNSSPLGAFSGQ-LSGGTRFRRRNYHFNDDGYFMEITSIVPRVYYPSYINPT  408
            + QT+ D++SP +G +S G Y F + GY M I SI PR Y +
Sbjct  63   VLQTSS-TDSTSPQANMAGHGISAGVNHGFTRY-FEEHGYIMGIMSIRPRTGYQQGVPKD  120

Query  409  SRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANSTLSIPGFKLQESNYVG  468
            R+ Y P ++ Q +K ++ ++ AN T+ G
Sbjct  121  FRKFDNMDFYFPEFAHLGEQEIKNEELYLN-ESDAANEGTF------------------G  161

Query  469  YEPAWSELMTAVSKPHGRLCNDLDYWVLSR  498
            Y P ++E + ++ HG ++ +W L+R
Sbjct  162  YTPRYAEYKYSQNEVHGDFRGNMAFWHLNR  191

>gi|494308783|ref|WP_007173938.1| hypothetical protein [Prevotella bergensis]
 gi|270333035|gb|EFA43821.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=553

Score = 50.1 bits (118), Expect = 0.004, Method: Compositional matrix adjust.
Identities = 54/228 (24%), Positives = 97/228 (43%), Gaps = 31/228 (14%)

Query  281  AVDVSVSGNSVSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKL--SQDNTCPAFL  338
            VD S S+ ++ A + + + + G D + + V++ S+D
Sbjct  286  GVDTDSSEGDFSVSSLRAAFAVDKLLSVTMRAGKTFQDQMRAHYGVEIPDSRDGRVNYLG  345

Query  339  GSDSFDMNVNTLYQTTG-----FEDNSSPLGAFSGQLSGGTRFRRRNYHFNDDGYFMEIT  393
            G DS DM V+ + QT+G ++ + LG +G+ +G R R + + G M I
Sbjct  346  GFDS-DMQVSDVTQTSGTTATEYKPEAGYLGRVAGKGTGSGR-GRIVFDAKEHGVLMCIY  403

Query  394  SIVPRVYYP-SYINPTSRQISLGQQYAPALDNIAMQGLKASTVFGEVQNLGANTVTYANS  452
            S+VP++ Y + ++P ++ + P +N+ MQ L +S + N V
Sbjct  404  SLVPQIQYDCTRLDPMVDKLDRFDYFTPEFENLGMQPLNSSYISSFCTTDPKNPV-----  458

Query  453  TLSIPGFKLQESNYVGYEPAWSELMTAVSKPHGRLC--NDLDYWVLSR  498
            +GY+P +SE TA+ HG+ + L W +SR
Sbjct  459  --------------LGYQPRYSEYKTALDVNHGQFAQSDALSSWSVSR  492

>gi|494306153|ref|WP_007173049.1| hypothetical protein [Prevotella bergensis]
 gi|270333881|gb|EFA44667.1| putative capsid protein (F protein) [Prevotella bergensis DSM 17361]
Length=519

Score = 49.3 bits (116), Expect = 0.007, Method: Compositional matrix adjust.
Identities = 60/238 (25%), Positives = 104/238 (44%), Gaps = 37/238 (16%)

Query  275  SSFTDAAVDVSVSGN----SVSMRNITFASRMQRYMDLAFAGGGRNSDFYESQFDVKL--  328
            S+FT V V N SVS FA + + + + G D + + V++
Sbjct  244  SNFTQLNFPVDVDNNLGYFSVSSLRSAFA--VDKLLSVTMRAGKTFQDQMRAHYGVEIPD  301

Query  329  SQDNTCPAFLGSDSFDMNVNTLYQTTG-----FEDNSSPLGAFSGQLSGGTRFRRRNYHF  383
            S+D G DS D+ V+ + QT+G ++ + LG +G+ +G R R +
Sbjct  302  SRDGRVNYLGGFDS-DLQVSDVTQTSGTTATEYKPEAGYLGRIAGKGTGSGR-GRIVFDA  359

Query  384  NDDGYFMEITSIVPRVYYP-SYINPTSRQISLGQQYAPALDNIAMQGLKASTVFGEVQNL  442
            + G M I S+VP++ Y + ++P ++ + P +N+ MQ L +S +
Sbjct  360  KEHGVLMCIYSLVPQIQYDCTRLDPMVDKLDRFDFFTPEFENLGMQPLNSSYI-------  412

Query  443  GANTVTYANSTLSIPGFKLQESNYVGYEPAWSELMTAVSKPHGRLCND--LDYWVLSR  498
            S+ P K + +GY+P +SE TA+ HG+ + L W +SR
Sbjct  413  ---------SSFCTPDPK---NPVLGYQPRYSEYKTALDINHGQFAQNDALSSWSVSR  458

Lambda      K        H        a         alpha
   0.318    0.133    0.394    0.792     4.96

Gapped
Lambda      K        H        a         alpha    sigma
   0.267   0.0410    0.140     1.90     42.6     43.6

Effective search space used: 4376806011489
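
The bit scores in this report can be related to the raw scores given in parentheses through the standard Karlin-Altschul normalization, S' = (Lambda * S - ln K) / ln 2. The following is a minimal sketch in Python, assuming that the gapped Lambda (0.267) and K (0.0410) printed in the statistics block above apply to every alignment. The function name bit_score is hypothetical and is not part of BLAST. Because the report prints rounded parameters and uses compositional matrix adjustment, small differences from the printed bit scores are to be expected.

```python
import math

# Gapped Karlin-Altschul parameters copied from the statistics block of this report.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Normalize a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores taken from the "Score = ... bits (raw)" lines of the report.
for raw in (595, 153, 149, 136):
    print(f"raw={raw:4d}  bits={bit_score(raw):.1f}")
```

For the raw scores 153, 149 and 136 this reproduces the reported 63.5, 62.0 and 57.0 bits to within rounding, which suggests the assumption about the gapped parameters is reasonable for this search.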