BLASTP 2.2.24 [Aug-08-2010]

Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Eace_2137_orf2
         (156 letters)

Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
           14,777,732 sequences; 5,058,227,080 total letters
                                                                   Score    E
Sequences producing significant alignments:                        (bits) Value

gi|221488780|gb|EEE26994.1| conserved hypothetical protein [Toxo...   46   0.002
gi|237837341|ref|XP_002367968.1| hypothetical protein TGME49_030...   46   0.002
gi|325117770|emb|CBZ53321.1| conserved hypothetical protein [Neo...   44   0.011
gi|145532222|ref|XP_001451872.1| hypothetical protein [Parameciu...   41   0.056
gi|145473743|ref|XP_001462535.1| hypothetical protein [Parameciu...   40   0.099
gi|156088135|ref|XP_001611474.1| transporter, major facilitator ...   39   0.18
gi|145493435|ref|XP_001432713.1| hypothetical protein [Parameciu...   39   0.24
gi|118347505|ref|XP_001007229.1| Major Facilitator Superfamily p...   39   0.26
gi|71663806|ref|XP_818891.1| dispersed gene family protein 1 (DG...   39   0.30
gi|168001733|ref|XP_001753569.1| predicted protein [Physcomitrel...   39   0.34
gi|322825128|gb|EFZ30244.1| dispersed gene family protein 1 (DGF...   37   1.3
gi|71399351|ref|XP_802762.1| dispersed gene family protein 1 (DG...   36   1.8
gi|67621406|ref|XP_667762.1| hypothetical protein [Cryptosporidi...   36   1.8
gi|66363086|ref|XP_628509.1| major facilitator superfamily trans...   36   1.8
gi|322817740|gb|EFZ25375.1| dispersed gene family protein 1 (DGF...   36   1.9
gi|322830662|gb|EFZ33608.1| dispersed gene family protein 1 (DGF...   36   2.1
gi|301094336|ref|XP_002896274.1| Major Facilitator Superfamily (...   36   2.4
gi|338174074|ref|YP_004650884.1| hypothetical protein PUV_00800 ...   35   3.2
gi|282890430|ref|ZP_06298957.1| hypothetical protein pah_c017o01...   35   3.6
gi|145486746|ref|XP_001429379.1| hypothetical protein [Parameciu...   35   3.7
gi|340503314|gb|EGR29914.1| major facilitator superfamily protei...   35   3.7
gi|340505347|gb|EGR31685.1| major facilitator superfamily protei...   34   7.8
gi|209876496|ref|XP_002139690.1| major facilitator superfamily t...   34   8.1

>gi|221488780|gb|EEE26994.1| conserved hypothetical protein [Toxoplasma gondii GT1]
          Length = 1162

 Score = 46.2 bits (108), Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 25/67 (37%), Positives = 35/67 (52%), Gaps = 5/67 (7%)

Query: 92   LVGVVCMHLNIYTLLPCVRVCS-----AVIGSFSSRVYVCVACLWLLLCFGAAMLPPLTG 146
            L G + + TL C+ + +G+ ++ V LW LLCFG A+LPPLTG
Sbjct: 1050 LGGYKTLDAKLKTLQACLAAAAAAVVCGFVGAVTTDALTFVLSLWFLLCFGGALLPPLTG 1109

Query: 147  IQIDAVS 153
            +QI AV
Sbjct: 1110 LQIAAVE 1116

>gi|237837341|ref|XP_002367968.1| hypothetical protein TGME49_030570 [Toxoplasma gondii ME49]
 gi|211965632|gb|EEB00828.1| hypothetical protein TGME49_030570 [Toxoplasma gondii ME49]
 gi|221509270|gb|EEE34839.1| conserved hypothetical protein [Toxoplasma gondii VEG]
          Length = 1154

 Score = 46.2 bits (108), Expect = 0.002, Method: Compositional matrix adjust.
 Identities = 25/67 (37%), Positives = 35/67 (52%), Gaps = 5/67 (7%)

Query: 92   LVGVVCMHLNIYTLLPCVRVCS-----AVIGSFSSRVYVCVACLWLLLCFGAAMLPPLTG 146
            L G + + TL C+ + +G+ ++ V LW LLCFG A+LPPLTG
Sbjct: 1042 LGGYKTLDAKLKTLQACLAAAAAAVVCGFVGAVTTDALTFVLSLWFLLCFGGALLPPLTG 1101

Query: 147  IQIDAVS 153
            +QI AV
Sbjct: 1102 LQIAAVE 1108

>gi|325117770|emb|CBZ53321.1| conserved hypothetical protein [Neospora caninum Liverpool]
          Length = 1153

 Score = 43.5 bits (101), Expect = 0.011, Method: Compositional matrix adjust.
 Identities = 19/29 (65%), Positives = 22/29 (75%)

Query: 125  VCVACLWLLLCFGAAMLPPLTGIQIDAVS 153
            + V LW LLCFG A+LPPLTG+QI AV
Sbjct: 1052 IFVLALWFLLCFGGALLPPLTGLQIAAVE 1080

>gi|145532222|ref|XP_001451872.1| hypothetical protein [Paramecium tetraurelia strain d4-2]
 gi|124419538|emb|CAK84475.1| unnamed protein product [Paramecium tetraurelia]
          Length = 489

 Score = 41.2 bits (95), Expect = 0.056, Method: Composition-based stats.
 Identities = 27/76 (35%), Positives = 36/76 (47%), Gaps = 9/76 (11%)

Query: 90  CSLVGVVCMHLNIY----TLLPCVRVCS-----AVIGSFSSRVYVCVACLWLLLCFGAAM 140
           C G++ L Y +L CV CS A F+ + CLWLLL FG A+
Sbjct: 285 CITGGLIAQKLGGYQRAKSLYVCVLYCSLCCISAAPVPFTETFWFGALCLWLLLFFGGAI 344

Query: 141 LPPLTGIQIDAVSPDL 156
           +PPL GI + +V L
Sbjct: 345 VPPLMGIMLSSVPKHL 360

>gi|145473743|ref|XP_001462535.1| hypothetical protein [Paramecium tetraurelia strain d4-2]
 gi|124430375|emb|CAK95162.1| unnamed protein product [Paramecium tetraurelia]
          Length = 489

 Score = 40.0 bits (92), Expect = 0.099, Method: Composition-based stats.
 Identities = 26/76 (34%), Positives = 35/76 (46%), Gaps = 9/76 (11%)

Query: 90  CSLVGVVCMHLNIY----TLLPCVRVCS-----AVIGSFSSRVYVCVACLWLLLCFGAAM 140
           C G++ L Y +L CV CS A F+ + CLW LL FG A+
Sbjct: 285 CITGGLIAQKLGGYQRTKSLYVCVLYCSLCCISAAPVPFTETFWFGALCLWFLLFFGGAI 344

Query: 141 LPPLTGIQIDAVSPDL 156
           +PPL GI + +V L
Sbjct: 345 VPPLMGIMLSSVPKHL 360

>gi|156088135|ref|XP_001611474.1| transporter, major facilitator family [Babesia bovis]
 gi|154798728|gb|EDO07906.1| transporter, major facilitator family [Babesia bovis]
          Length = 655

 Score = 39.3 bits (90), Expect = 0.18, Method: Compositional matrix adjust.
 Identities = 19/33 (57%), Positives = 21/33 (63%)

Query: 124 YVCVACLWLLLCFGAAMLPPLTGIQIDAVSPDL 156
           Y V C+WL+L FG MLPPLT I I VS L
Sbjct: 547 YNLVGCIWLILFFGGGMLPPLTLITISNVSERL 579

>gi|145493435|ref|XP_001432713.1| hypothetical protein [Paramecium tetraurelia strain d4-2]
 gi|124399827|emb|CAK65316.1| unnamed protein product [Paramecium tetraurelia]
          Length = 485

 Score = 38.9 bits (89), Expect = 0.24, Method: Composition-based stats.
 Identities = 25/76 (32%), Positives = 35/76 (46%), Gaps = 9/76 (11%)

Query: 90  CSLVGVVCMHLNIY----TLLPCVRVC-----SAVIGSFSSRVYVCVACLWLLLCFGAAM 140
           C G++ L Y +L CV C SA F+ + C+W LL FG A+
Sbjct: 285 CITGGLIAQKLGGYERTKSLYICVVYCFICCLSATPVPFTETFWFGALCVWFLLFFGGAI 344

Query: 141 LPPLTGIQIDAVSPDL 156
           +PPL GI + +V L
Sbjct: 345 VPPLMGIMLSSVPKHL 360

>gi|118347505|ref|XP_001007229.1| Major Facilitator Superfamily protein [Tetrahymena thermophila]
 gi|89288996|gb|EAR86984.1| Major Facilitator Superfamily protein [Tetrahymena thermophila SB210]
          Length = 554

 Score = 38.9 bits (89), Expect = 0.26, Method: Composition-based stats.
 Identities = 16/38 (42%), Positives = 23/38 (60%)

Query: 119 FSSRVYVCVACLWLLLCFGAAMLPPLTGIQIDAVSPDL 156
           F Y + +WLLL FG AM+P LTG+ + A+ +L
Sbjct: 428 FIDAFYASASLVWLLLFFGGAMVPALTGMMLSAIQTEL 465

>gi|71663806|ref|XP_818891.1| dispersed gene family protein 1 (DGF-1) [Trypanosoma cruzi strain CL Brener]
 gi|70884167|gb|EAN97040.1| dispersed gene family protein 1 (DGF-1), putative [Trypanosoma cruzi]
          Length = 3483

 Score = 38.5 bits (88), Expect = 0.30, Method: Compositional matrix adjust.
 Identities = 34/112 (30%), Positives = 45/112 (40%), Gaps = 11/112 (9%)

Query: 34   ASFGYGSWRRLRGRGVNLCVLRCTYRPVVCVEGDFCAALLFVYARVHGKSFWRVSFCSLV 93
            ASFG GSW +RG ++ +L P D + L +Y +LV
Sbjct: 2720 ASFGSGSWLSVRGNSISGKILSLPSYP---RSADLVQSTLTLYGNAGSGPVVMDGTVALV 2776

Query: 94   G------VVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLWLLLCFGAA 139
            G V C+ LN TL P + +IG F R C C + CF AA
Sbjct: 2777 GAGRRFVVGCLTLNGQTLRPMEYRSAGIIGEF--RPVACGVCDADVRCFAAA 2826

>gi|168001733|ref|XP_001753569.1| predicted protein [Physcomitrella patens subsp. patens]
 gi|162695448|gb|EDQ81792.1| predicted protein [Physcomitrella patens subsp. patens]
          Length = 399

 Score = 38.5 bits (88), Expect = 0.34, Method: Compositional matrix adjust.
 Identities = 20/72 (27%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 63  CVEGDFCAALLFVYARVHGKSFWRVSFCSL-VGVVCMHLNIYTLLPCVRVCSAVIGSFSS 121
           C+EGD + F +++ HG WRV + + GVV M + + + +C+++IG +
Sbjct: 220 CIEGDHIINI-FSFSKAHGMMGWRVGYIAYPAGVVGMGAQLLKVQDNIPICASIIGQKLA 278

Query: 122 RVYVCVACLWLL 133
           + V W+L
Sbjct: 279 LAALQVGPEWVL 290

>gi|322825128|gb|EFZ30244.1| dispersed gene family protein 1 (DGF-1), putative [Trypanosoma cruzi]
          Length = 442

 Score = 36.6 bits (83), Expect = 1.3, Method: Compositional matrix adjust.
 Identities = 33/113 (29%), Positives = 45/113 (39%), Gaps = 11/113 (9%)

Query: 34  ASFGYGSWRRLRGRGVNLCVLRCTYRPVVCVEGDFCAALLFVYARVHGKSFWRVSFCSLV 93
           ASF GSW +RG ++ +L P D + L ++ S +LV
Sbjct: 95  ASFVSGSWLSVRGNSISGRLLSVPSYPRTA---DLAQSTLTLHGNAGSGSVVMEGTVALV 151

Query: 94  G------VVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLWLLLCFGAAM 140
           G V C+ LN L P + +IG F R C C + CFGAA
Sbjct: 152 GAGRRFVVGCLTLNGQALQPINYRSAGIIGEF--RPVACGVCDADVRCFGAAT 202

>gi|71399351|ref|XP_802762.1| dispersed gene family protein 1 (DGF-1) [Trypanosoma cruzi strain CL Brener]
 gi|70864828|gb|EAN81316.1| dispersed gene family protein 1 (DGF-1), putative [Trypanosoma cruzi]
          Length = 878

 Score = 36.2 bits (82), Expect = 1.8, Method: Compositional matrix adjust.
 Identities = 32/112 (28%), Positives = 46/112 (41%), Gaps = 11/112 (9%)

Query: 34  ASFGYGSWRRLRGRGVNLCVLRCTYRPVVCVEGDFCAALLFVYARVHGKSFWRVSFCSLV 93
           ASF +GSW +RG ++ +L P +F + L ++ S +L
Sbjct: 364 ASFAFGSWLSVRGNSISGRLLSVPSYP---RSAEFAQSTLTLHGNAGSGSVVMDGTVTLG 420

Query: 94  G------VVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLWLLLCFGAA 139
           G V C+ LN L P + +IG F R C AC + CF AA
Sbjct: 421 GAGRSFVVGCLTLNGQALQPMDYRSAGIIGKF--RPVACGACDADVHCFAAA 470

>gi|67621406|ref|XP_667762.1| hypothetical protein [Cryptosporidium hominis TU502]
 gi|54658917|gb|EAL37524.1| hypothetical protein Chro.70384 [Cryptosporidium hominis]
          Length = 635

 Score = 36.2 bits (82), Expect = 1.8, Method: Composition-based stats.
 Identities = 18/57 (31%), Positives = 32/57 (56%), Gaps = 5/57 (8%)

Query: 103 YTLLPCV--RVCSAVIGSFS---SRVYVCVACLWLLLCFGAAMLPPLTGIQIDAVSP 154
           YT+L C+ + + + G+ + V + +W LL FG+ ++PP+TGI + V P
Sbjct: 505 YTMLFCMVSAIFATLFGALALIIDNFTVTIVGVWGLLFFGSFLVPPITGISVGVVEP 561

>gi|66363086|ref|XP_628509.1| major facilitator superfamily transporter [Cryptosporidium parvum Iowa II]
 gi|46229527|gb|EAK90345.1| major facilitator (MFS) superfamily transporter containing 12 transmembrane domains [Cryptosporidium parvum Iowa II]
          Length = 635

 Score = 36.2 bits (82), Expect = 1.8, Method: Composition-based stats.
 Identities = 18/57 (31%), Positives = 32/57 (56%), Gaps = 5/57 (8%)

Query: 103 YTLLPCV--RVCSAVIGSFS---SRVYVCVACLWLLLCFGAAMLPPLTGIQIDAVSP 154
           YT+L C+ + + + G+ + V + +W LL FG+ ++PP+TGI + V P
Sbjct: 505 YTMLFCMVSAIFATLFGALALIVDNFTVTIVGVWGLLFFGSFLVPPITGISVGVVEP 561

>gi|322817740|gb|EFZ25375.1| dispersed gene family protein 1 (DGF-1), putative [Trypanosoma cruzi]
          Length = 459

 Score = 35.8 bits (81), Expect = 1.9, Method: Compositional matrix adjust.
 Identities = 34/112 (30%), Positives = 44/112 (39%), Gaps = 11/112 (9%)

Query: 34  ASFGYGSWRRLRGRGVNLCVLRCTYRPVVCVEGDFCAALLFVYARVHGKSFWRVSFCSLV 93
           ASF GSW +RG V+ +L P D + L +Y S +LV
Sbjct: 216 ASFVSGSWLSVRGNSVSGKILSVPSYP---RSADLVQSTLTLYGNAGSGSVVMDGTVALV 272

Query: 94  G------VVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLWLLLCFGAA 139
           G V C+ LN L P + +IG F R C C + CF AA
Sbjct: 273 GAGRRFVVGCLTLNGQALQPMDYRSAGIIGEF--RPVACGVCDAGVHCFAAA 322

>gi|322830662|gb|EFZ33608.1| dispersed gene family protein 1 (DGF-1), putative [Trypanosoma cruzi]
          Length = 634

 Score = 35.8 bits (81), Expect = 2.1, Method: Compositional matrix adjust.
 Identities = 33/112 (29%), Positives = 45/112 (40%), Gaps = 11/112 (9%)

Query: 34  ASFGYGSWRRLRGRGVNLCVLRCTYRPVVCVEGDFCAALLFVYARVHGKSFWRVSFCSLV 93
           ASF GSW +RG ++ +L P DF + L +Y S +L
Sbjct: 171 ASFVSGSWLSVRGNSISGRLLSVPSYP---RNADFAQSTLTLYGNAGSGSVVMDGTVALG 227

Query: 94  G------VVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLWLLLCFGAA 139
           G V C+ LN L P + + +IG F R C C + CF AA
Sbjct: 228 GAGRKFVVGCLTLNGQALQPMIYRSAGIIGEF--RPVACGVCDADVNCFAAA 277

>gi|301094336|ref|XP_002896274.1| Major Facilitator Superfamily (MFS) [Phytophthora infestans T30-4]
 gi|262109669|gb|EEY67721.1| Major Facilitator Superfamily (MFS) [Phytophthora infestans T30-4]
          Length = 715

 Score = 35.8 bits (81), Expect = 2.4, Method: Compositional matrix adjust.
 Identities = 17/42 (40%), Positives = 25/42 (59%)

Query: 114 AVIGSFSSRVYVCVACLWLLLCFGAAMLPPLTGIQIDAVSPD 155
           + + +F + +Y+ LWLLL FG A+LP TGI I V +
Sbjct: 573 SAVTTFFNDIYITAGFLWLLLFFGGAILPACTGIFISVVPAE 614

>gi|338174074|ref|YP_004650884.1| hypothetical protein PUV_00800 [Parachlamydia acanthamoebae UV7]
 gi|336478432|emb|CCB85030.1| putative uncharacterized protein [Parachlamydia acanthamoebae UV7]
          Length = 490

 Score = 35.0 bits (79), Expect = 3.2, Method: Compositional matrix adjust.
 Identities = 18/52 (34%), Positives = 29/52 (55%), Gaps = 1/52 (1%)

Query: 80  HGKSFWRVSFCSLVGVVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLW 131
           H KS W+VS +L G+VC+ +N + L+P + + +F V V V +W
Sbjct: 354 HKKSIWQVSVETLAGLVCVGMN-FALIPSLGKEGGALATFFGFVAVIVLSIW 404

>gi|282890430|ref|ZP_06298957.1| hypothetical protein pah_c017o013 [Parachlamydia acanthamoebae str. Hall's coccus]
 gi|281499684|gb|EFB41976.1| hypothetical protein pah_c017o013 [Parachlamydia acanthamoebae str. Hall's coccus]
          Length = 499

 Score = 35.0 bits (79), Expect = 3.6, Method: Compositional matrix adjust.
 Identities = 18/52 (34%), Positives = 29/52 (55%), Gaps = 1/52 (1%)

Query: 80  HGKSFWRVSFCSLVGVVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLW 131
           H KS W+VS +L G+VC+ +N + L+P + + +F V V V +W
Sbjct: 363 HKKSIWQVSVETLAGLVCVGMN-FALIPSLGKEGGALATFFGFVAVIVLSIW 413

>gi|145486746|ref|XP_001429379.1| hypothetical protein [Paramecium tetraurelia strain d4-2]
 gi|124396471|emb|CAK61981.1| unnamed protein product [Paramecium tetraurelia]
          Length = 462

 Score = 35.0 bits (79), Expect = 3.7, Method: Compositional matrix adjust.
 Identities = 15/38 (39%), Positives = 22/38 (57%)

Query: 119 FSSRVYVCVACLWLLLCFGAAMLPPLTGIQIDAVSPDL 156
           F+ + C+W LL FG A++PPL GI + +V L
Sbjct: 299 FTDTFWFGALCVWFLLFFGGAIVPPLMGIMLSSVPKHL 336

>gi|340503314|gb|EGR29914.1| major facilitator superfamily protein, putative [Ichthyophthirius multifiliis]
          Length = 486

 Score = 35.0 bits (79), Expect = 3.7, Method: Composition-based stats.
 Identities = 21/70 (30%), Positives = 31/70 (44%), Gaps = 13/70 (18%)

Query: 87  VSFCSLVGVVCMHLNIYTLLPCVRVCSAVIGSFSSRVYVCVACLWLLLCFGAAMLPPLTG 146
           +S C L GV C +C+ I F + V LW L+ F AAM+P + G
Sbjct: 305 LSICMLNGV------------CASMCAIPI-PFQKDFSIVVTLLWFLMFFEAAMIPAIMG 351

Query: 147 IQIDAVSPDL 156
           + + +V L
Sbjct: 352 LMLSSVQKKL 361

>gi|340505347|gb|EGR31685.1| major facilitator superfamily protein, putative [Ichthyophthirius multifiliis]
          Length = 365

 Score = 33.9 bits (76), Expect = 7.8, Method: Compositional matrix adjust.
 Identities = 14/38 (36%), Positives = 22/38 (57%)

Query: 119 FSSRVYVCVACLWLLLCFGAAMLPPLTGIQIDAVSPDL 156
           F Y+ + +W LL FG AM+P LTG+ + +V +
Sbjct: 283 FFDSFYLAASSIWCLLFFGGAMVPGLTGMMLSSVEAEF 320

>gi|209876496|ref|XP_002139690.1| major facilitator superfamily transporter [Cryptosporidium muris RN66]
 gi|209555296|gb|EEA05341.1| major facilitator superfamily protein [Cryptosporidium muris RN66]
          Length = 652

 Score = 33.9 bits (76), Expect = 8.1, Method: Composition-based stats.
 Identities = 18/55 (32%), Positives = 31/55 (56%), Gaps = 5/55 (9%)

Query: 103 YTLLPCV--RVCSAVIGSFS---SRVYVCVACLWLLLCFGAAMLPPLTGIQIDAV 152
           YT+L C+ + + V G + + +A +W LL FG+ ++PP+TGI + V
Sbjct: 519 YTMLFCLGCSITATVFGIIALVIDHFFATIAGIWGLLFFGSFLVPPITGICVGVV 573

  Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
    Posted date:  Jul 22, 2011  4:42 PM
  Number of letters in database: 5,058,227,080
  Number of sequences in database:  14,777,732

Lambda     K      H
   0.335    0.146    0.519

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 14777732
Number of Hits to DB: 1,389,584,683
Number of extensions: 50909562
Number of successful extensions: 155199
Number of sequences better than 10.0: 30
Number of HSP's gapped: 155950
Number of HSP's successfully gapped: 31
Length of query: 156
Length of database: 5,058,227,080
Length adjustment: 118
Effective length of query: 38
Effective length of database: 3,314,454,704
Effective search space: 125949278752
Effective search space used: 125949278752
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 15 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 39 (21.6 bits)
S2: 76 (33.9 bits)
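
The effective lengths and search space in the statistics above can be checked with a few lines of arithmetic. The sketch below is not part of the BLAST output; the function names are hypothetical. It assumes the usual length correction (the length adjustment is subtracted once from the query and once per database sequence) and the standard bit-score relation E = search space x 2^(-bit score). E-values printed in the report are rounded and computed under compositional adjustment, so they may differ slightly from this estimate.

```python
# Hypothetical helpers (not BLAST code): recompute the search-space
# statistics reported in the footer of this report.

def effective_search_space(query_len, db_letters, db_seqs, length_adjustment):
    """Effective query length times effective database length."""
    eff_query = query_len - length_adjustment
    eff_db = db_letters - db_seqs * length_adjustment
    return eff_query * eff_db


def expect_from_bits(bit_score, search_space):
    """Approximate E-value from a bit score: E = search_space * 2**(-bits)."""
    return search_space * 2.0 ** (-bit_score)


# Values taken from the report footer.
space = effective_search_space(
    query_len=156,
    db_letters=5_058_227_080,
    db_seqs=14_777_732,
    length_adjustment=118,
)
print(space)                          # effective search space
print(expect_from_bits(46.2, space))  # estimate for the top hit (46.2 bits)
```

With the footer values, the computed search space agrees with the reported "Effective search space", and the estimate for the 46.2-bit hit rounds to the reported Expect of 0.002.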