V |
MATA1 |
|
0 |
|
|
Need to study the literature more carefully and consult experts, but at first glance none of these motifs seems right. |
V |
YJL206C |
|
0 |
|
|
Seven motifs from ChIP-chip, but none of them corresponds well to ChIP-chip data, and none of them resembles a GAL4 motif. 1169 has a CGG in the middle, but too much flanking information to be credible without further independent support. |
V |
YGR288W |
MAL13 |
0 |
|
|
None of the ChIP-chip motifs correspond well to the data they come from and/or resemble a GAL4 motif. |
V |
YBR297W |
MAL33 |
0 |
|
|
None of the ChIP-chip motifs correspond well to the data they come from and/or resemble a GAL4 motif. |
V |
MBP1-SWI6-dimer |
MBP1-SWI6-dimer |
0 |
|
|
Redundant with MBP1 |
V |
YIR017C |
MET28 |
0 |
|
|
Like MET4, component of a complex. SGD: "Basic leucine zipper (bZIP) transcriptional activator in the Cbf1p-Met4p-Met28p complex" ... "Both Met4p and Met28p bind to DNA only in the presence of Cbf1p, and the presence of Cbf1p and Met4p stimulates the binding of Met28p to DNA (1, 2)." ChIP-chip motif 703 (CTGTGG) is clearly the Met31/32 motif. The other ChIP-chip motif is essentially poly-A, and scores poorly. Hence, neither of these motifs represents the intrinsic sequence specificity of MET28. Need in vitro data for complexes. |
V |
YNL103W |
MET4 |
0 |
|
|
My understanding is that Met4 is a modifier of the specificity of other proteins. SGD states that it "requires different combinations of the auxiliary factors Cbf1p, Met28p, Met31p and Met32p". I believe ChIP-chip motifs 1023 and 1024 are cofactor motifs; they are E-boxes. ChIP-chip motif 689 is different and matches the Met28 and Met32 motifs (CTGTGG core). Met28 is a bZIP protein, and Met32 is a C2H2. The MITOMI motif for Met32 is TGTGG, so this is the Met32 motif. I do not believe that any of the Met4 motifs is correct. Need to obtain motifs for complexes. |
V |
YKR064W |
OAF3 |
0 |
|
|
I do not see how either of these motifs could possibly be a Gal4-class binding motif. And, there is no correspondence to any of the data, even the ChIP-chip data from which it is derived. |
V |
YOR363C |
PIP2 |
0 |
|
|
See Oaf1-Pip2-dimer |
V |
YOR162C |
YRR1 |
2245 |
High |
|
Classic monomeric GAL4-class motif. PBM studies agree and score significantly on Harbison data. No other motifs have spacing/orientation except 11909958, but even the authors of this study note that "Only half a dyad seems to be conserved in this consensus sequence". 2245 scores highest in Harbison data. |
V |
YDR207C |
UME6 |
2239 |
High |
|
All motifs are similar to each other. BEEML-PBM motif 2239 scores highest across the board. |
V |
YPR009W |
SUT2 |
2236 |
High |
|
Highest-scoring motif (PBM) is a classical GAL4-type monomeric motif and is very significant in ChIP-chip |
V |
YDR169C |
STB3 |
2233 |
High |
|
STB3 binds the RRPE element (AAAAATTT) both in vivo and in vitro (PMID 17616518). PBM motifs 810 and 2233 strongly resemble the RRPE element, score significantly in deletion expression data, and nail the GO categories "nucleolus" and "ribosome biogenesis". 2233 gets slightly higher scores. |
V |
YKL038W |
RGT1 |
2227 |
High |
|
PBM motif 2227 is very similar to "traditional" motif and to monomeric GAL4 motifs, and scores highest on ChIP-chip data. All PBM motifs are similar. |
V |
YFR034C |
PHO4 |
2222 |
High |
|
Almost all motifs match classic HLH E-box. PBM motif 2222 has highest match to both ChIP-chip and expression data, without being circular. |
V |
YER088C |
DOT6 |
2221 |
High |
|
PBM motif 812 most closely resembles that of homolog TOD6, which is well-supported; has highest correlation to both ChIP and expression data. |
V |
YPL248C |
GAL4 |
2206 |
High |
|
ChIP-chip motif 1510 resembles literature motif, and PBM motif 875, but scores highly on ChIP and expression data, across the board. Note, however, that the high ChIP-chip scores stem from an experiment with high negative correlation. PBM motif 2206 appears to be a monomeric version, and scores even higher on ChIP-chip and expression. |
V |
YPR104C |
FHL1 |
2203 |
High |
|
ChIP-chip motifs are all Rap1. PBMs identify a different motif which also corresponds to ChIP-chip data. Selected 2203 as it scores highest on ChIP-chip and expression data. |
V |
YML081W |
|
2194 |
High |
|
PBM motifs are classical C2H2 motifs that match each other and have some correspondence to ChIP-chip data. 2194 has highest correspondence to ChIP-chip. |
V |
YKL222C |
|
2192 |
High |
|
Two motifs from PBMs resemble monomeric GAL4-like motif. 2192 agrees best with ChIP-chip data and expression data. |
V |
YGR067C |
|
2191 |
High |
|
PBM motif is a classical C2H2 motif that has good correspondence to ChIP-chip data. 2191 corresponds best and has fewer empty columns in the PWM. |
V |
YML007W |
YAP1 |
2186 |
High |
|
PBM motif 2186 looks like a monomeric bZIP site but it has the highest scores on both ChIP and expression |
V |
YBR150C |
TBS1 |
2179 |
High |
|
Two motifs from PBMs are nearly identical GAL4-class motifs with defined spacing and orientation. Motif 552 has slightly higher scores. Two motifs from BEEML analysis of PBM data give monomeric motif - also give this high confidence. |
V |
YPL128C |
TBF1 |
2178 |
High |
|
All motifs, obtained by three different means, are very similar, although there is no ChIP or expression support for any of them. Went with 2178, which is the BEEML output. |
V |
YHR006W |
STP2 |
2174 |
High |
|
STP1 and 2 have very similar DNA-binding domains. However, they are not similar to those of STP3 and 4. PBM motif for STP2 (2174) correlates highest with ChIP-chip and expression data. ChIP-chip motif for STP1 (660) most strongly resembles motif 800, and scores highly on ChIP-chip data. In addition, these motifs resemble halfmers of literature-derived binding sites. |
V |
YOR380W |
RDR1 |
2158 |
High |
|
All motifs are related except 1851. PBM motif 2158 is monomeric and has highest correspondence to ChIP-chip data. The literature motif 756 consists of two back-to-back and slightly overlapping versions of the monomeric PBM motif. There is no evidence for direct binding in this specific spacing and orientation; however, the results of mutations in reporters indicate that both copies are necessary for induction in the mutant. Retain both motifs. |
V |
YDL106C |
PHO2 |
2154 |
High |
|
Motifs are largely all different from each other. PBM motif 2154 scores highly on ChIP data and resembles classic TAAT homeobox core. Note that PBM motif 794 even more strongly resembles homeobox (TAATTA) but scores slightly less highly. |
V |
YKL043W |
PHD1 |
2153 |
High |
|
High-scoring motifs are all similar, with characteristic APSES GC core and palindromic. PBM motifs score highest on ChIP-seq data, while ChIP-chip motif 393 (which contains flanking G/C residues) scores highest on expression data. Retain both - possibly, the rest of the protein contributes to binding flanking residues. This is the higher-scoring PBM motif (2153). |
V |
YDR043C |
NRG1 |
2148 |
High |
|
PBM, ChIP-chip, and literature motifs all appear very similar, and resemble motif for the related protein NRG2. Choose top PBM motif (2148). There is also a recurring ChIP-chip motif (TGTGCCT) which I believe is actually the MOT3 binding site. |
V |
YER028C |
MIG3 |
2144 |
High |
|
PBM motif 2144 has highest correspondence to ChIP-chip data |
V |
YGL209W |
MIG2 |
2143 |
High |
|
PBM motif 2143 has highest correspondence to ChIP-chip data |
V |
YGL035C |
MIG1 |
2142 |
High |
|
PBM motif 2142 has highest correspondence to ChIP-chip AND AUC for GO category "generation of precursor metabolites and energy". The adjacent A/T stretch, which is also noted in the literature, is found in ChIP-chip motif 654 and others; however, that motif does not sort as well for GO category "generation of precursor metabolites and energy" and also scores lower for both ChIP and expression, so it seems unlikely to represent a key intrinsic activity of the protein itself. |
V |
YDR253C |
MET32 |
2140 |
High |
|
Most motifs look similar. PBM motif 2140 has highest correspondence to both ChIP and expression. |
V |
YDL056W |
MBP1 |
2138 |
High |
|
Almost all motifs look similar to literature binding site. PBM motif 2138 scores at the top on ChIP-chip and expression. And is non-circular. |
V |
YLR451W |
LEU3 |
2135 |
High |
|
Most motifs look similar - dimeric GAL4 motif. Literature motif (781) has high correspondence to ChIP-chip and expression data and is not circular. But, PBM motif 2135, which is a monomeric GAL4 motif, scores highest on both ChIP-chip and expression data. |
V |
YOL089C |
HAL9 |
2134 |
High |
|
PBM motifs 799 and 2134 score highest on ChIP-chip data; classic dimeric and monomeric GAL4 sites, respectively. |
V |
YJL110C |
GZF3 |
2133 |
High |
|
Classic GATA motif 2133 from PBM scores highest on ChIP-chip and expression data |
V |
YLR013W |
GAT3 |
2128 |
High |
|
All PBM motifs look similar, also similar to a subset of other GATAs. 2128 scores quite highly on ChIP-chip (albeit with negative correlation!), and also higher on expression and OE data. |
V |
YLR228C |
ECM22 |
2122 |
High |
|
PBM motif 2122 is a monomeric GAL4 class motif, and scores highest on both ChIP and expression data. 849 is a classic dimeric GAL4 motif with lower but still reasonable scores and is moderately predictive across the board. |
V |
YPL177C |
CUP9 |
2121 |
High |
|
MITOMI and PBM motifs are similar. PBM motif 2121 has slightly lower correspondence to ChIP data, but more significant correspondence to expression data. |
V |
YLR098C |
CHA4 |
2120 |
High |
|
Two PBM motifs agree, and PBM motif 2120 has highest correspondence to ChIP-chip data, even higher than the best ChIP-chip motif. Has a GAL4-like appearance, albeit a variant. Monomeric. (Highest scoring motif - 1607 - is actually a Rap1 motif). |
V |
YDR421W |
ARO80 |
2115 |
High |
|
PBM motif 2115 appears monomeric and has highest correspondence to ChIP-chip data. ChIP motif 1509 appears dimeric and correlates with ChIP data. Literature motif 725 appears trimeric and has experimental support. Retain all three. |
V |
YLR278C |
|
2112 |
High |
|
Only 2112 (from PBMs) stands out; dimeric GAL4 motif with high score on ChIP-chip. |
V |
YMR019W |
STB4 |
2107 |
High |
|
PBM motif 2107 is clearly a dimeric GAL4-class motif, and it blows all the other motifs out of the water. |
V |
YDR423C |
CAD1 |
2098 |
High |
|
Classic YAP motif in most cases. Include examples of both overlapping and adjacent monomeric sites - there are examples of both in PBM data and they both score highly on ChIP data. This one is adjacent. |
V |
YJL056C |
ZAP1 |
2097 |
High |
|
Most motifs are similar but do not exceed confidence thresholds on any data type. PBM motif 2097 has highest score for ChIP and expression, and is not circular |
V |
YBR033W |
EDS1 |
2093 |
High |
|
PBM and ChIP-chip motifs are very similar. PBM motif 2093 scores most significantly on ChIP data. Classic GAL4 class motif. |
V |
YLR256W |
HAP1 |
2078 |
High |
|
Literature binding site is direct CGG repeats with a 6bp spacer (PMID: 7958882). PBM motif 2078 gets this; it scores highest overall, including significant scores on both ChIP-chip and expression. |
V |
YDR423C |
CAD1 |
2073 |
High |
|
Classic YAP motif in most cases. Include examples of both overlapping and adjacent monomeric sites - there are examples of both in PBM data and they both score highly on ChIP data. This one is overlapping. |
V |
YPL075W |
GCR1 |
2071 |
High |
|
Gcr2 is not a DNA-binding protein. SGD: "Gcr1p is a DNA-binding protein interacting with the consensus sequence CTTCC, whereas Gcr2p interacts with Gcr1p". But, ChIP-chip motif 606 is probably the best Gcr1 motif available (even though it came from Gcr2 ChIP). |
V |
YIL101C |
XBP1 |
2039 |
High |
|
PBM and in vitro selection-derived motifs have highest scores across the board. 842 is higher on GO, but only slightly in AUC, and it has a very large number of empty flanking bases. 2039 (in vitro selection) seems a reasonable compromise - it's highest on ChIP and almost the highest on expression. |
V |
YIL131C |
FKH1 |
2002 |
High |
|
Classic Forkhead motif for most of them. 2002 strongly resembles PBM motif but scores higher on both ChIP (which is circular) and expression (which is not). |
V |
YKL112W |
ABF1 |
1993 |
High |
|
Most motifs are similar, and five have pegged the ChIP P-value. Choose 791- it's the highest scoring overall, and is from PBMs |
V |
YFL031W |
HAC1 |
1788 |
High |
|
1788 is the overall winner. But, literature motif 94 also scores well in ChIP-chip, despite being somewhat different. Possible difference in heterodimerization partners, or proteolytic fragment? Retain both, score 94 as medium. |
V |
YOL028C |
YAP7 |
1737 |
High |
|
7-base bZIP core. Obtained in ChIP-chip studies and higher correspondence to stressed ChIP-chip data. Possible heterodimer? Little literature on this protein. 1737 chosen because it is largely symmetric and has highest score for both stressed and unstressed Harbison data, also, higher GO score |
V |
YDL020C |
RPN4 |
1700 |
High |
|
In vitro motifs do not contain the TTT sequence on the end. But they were derived from the DBD only. The rest of the protein may contribute to binding the TTT segment. Motif 1700 has the highest correspondence to ChIP-chip and expression and GO. |
V |
YML065W |
ORC1 |
1549 |
High |
|
Looks like ORC1 motif. Which is not really a TF, but it is a sequence-specific DNA-binding protein. |
V |
YMR021C |
MAC1 |
1540 |
High |
|
Literature motif 1540 most closely corresponds to ChIP-chip data (albeit barely significant). Nothing else to gauge by, but no reason to doubt literature motif. |
V |
YPL248C |
GAL4 |
1510 |
High |
|
ChIP-chip motif 1510 resembles literature motif, and PBM motif 875, but scores highly on ChIP and expression data, across the board. Note, however, that the high ChIP-chip scores stem from an experiment with high negative correlation. PBM motif 2206 appears to be a monomeric version, and scores even higher on ChIP-chip and expression. |
V |
YDR421W |
ARO80 |
1509 |
High |
|
PBM motif 2115 appears monomeric and has highest correspondence to ChIP-chip data. ChIP motif 1509 appears dimeric and correlates with ChIP data. Literature motif 725 appears trimeric and has experimental support. Retain all three. |
V |
YML099C |
ARG81 |
1506 |
High |
|
ChIP motif 1506 correlates well with ChIP and also with expression data. Resembles dimeric GAL4 class motif. |
V |
YHR124W |
NDT80 |
1464 |
High |
|
Motif 1464 matches literature motifs and PBM motif, and nails sporulation on GO. It also has the highest correspondence to ChIP-chip data. |
V |
YBR240C |
THI2 |
1449 |
High |
|
This is a GAL4-class protein. All motifs are ChIP-chip derived, and none resembles another. 1449 is the only one with respectable scores on ChIP and expression, and it also has the appearance of a GAL4 class motif, although the structural prior presumably forces it to have this property. |
V |
YOL028C |
YAP7 |
1414 |
High |
|
8-base bZIP core. Obtained by MITOMI, so this is a homodimer. Higher correspondence to unstressed ChIP-chip data. Little literature on this protein. 1414 chosen for higher ChIP-chip overall scores; plus, it is a palindrome as expected for a bZIP protein. |
V |
YHL009C |
YAP3 |
1411 |
High |
|
MITOMI yields a nearly palindromic 8-mer motif with strong similarity to that of Yap6. PBM motif is similar but appears to be partial. |
V |
YHR178W |
STB5 |
1405 |
High |
|
All motifs have CGG core and most have CGGnG. Most ChIP-derived motifs have no relationship to expression data. MITOMI motif 1405 and PBM motif 514 score decently on both ChIP-chip and expression data, and seem to nail the GO category (oxidative stress response), and look like classic Gal4 halfmers. MITOMI motif scores slightly higher overall. This is presumably the monomeric motif. |
V |
YNL167C |
SKO1 |
1401 |
High |
|
The MITOMI motif 1401 is an offset and asymmetric version of the traditional consensus (TGACGTCA) but has a higher ChIP-chip and expression correspondence than the motifs that are more symmetric. |
V |
YPR065W |
ROX1 |
1396 |
High |
|
About half the motifs have a typical ACAAT Sox core. MITOMI motif 1396 has highest correspondence to both ChIP-chip and deletion expression data. |
V |
YBR066C |
NRG2 |
1383 |
High |
|
MITOMI motif 1383 looks like a classic yeast C2H2 binding site (row of G's). Also resembles motifs obtained by both ChIP and PBMs for related protein Nrg1. |
V |
YMR037C |
MSN2 |
1380 |
High |
|
MITOMI motif 1380 has the highest overall correspondence to ChIP-chip, overexpression, and deletion data. Resembles classic Msn2/4 motif. |
V |
YOL116W |
MSN1 |
1378 |
High |
|
MITOMI motif 1376 has the highest correspondence to ChIP-chip. MITOMI motif 1378 is very close, however, and seems to be a circular permutation. Retain both motifs. |
V |
YOL116W |
MSN1 |
1376 |
High |
|
MITOMI motif 1376 has the highest correspondence to ChIP-chip. MITOMI motif 1378 is very close, however, and seems to be a circular permutation. Retain both motifs. |
V |
YPL038W |
MET31 |
1370 |
High |
|
Most motifs look similar. MITOMI motif 1370 has highest overall correlation to ChIP-chip, OE, and deletion data. |
V |
YCR039C |
MATALPHA2 |
1364 |
High |
|
According to PMID: 9858582, "A comparison of the 2 binding sites in both asg and hsg operators yields the same consensus sequence, 5'-CATGTA-3'"; results in Figure 2 of the same paper support a consensus of CATGTAA. MITOMI yields ACATG, which is the reverse complement of most of the literature consensus. Motif 1364 has highest information content; use this. |
V |
YEL009C |
GCN4 |
1363 |
High |
|
Virtually all motifs look the same. MITOMI motif 1363 is as good as any of the ChIP-chip motifs but not circular; scores high across the board. |
V |
YJR060W |
CBF1 |
1346 |
High |
|
Classic E-box. MITOMI motif 1346 has nearly the highest correspondence to ChIP-chip data and is non-circular; no other supporting data. |
V |
YLR131C |
ACE2 |
1332 |
High |
|
Highest-scoring ChIP-chip motif is Rap1 site. MITOMI motif 1332 is next, and resembles the classic Swi5/Ace2 motif. |
V |
YFL021W |
GAT1 |
962 |
High |
|
ChIP-chip motif 962 scores higher on both ChIP-chip and expression data |
V |
YBR049C |
REB1 |
907 |
High |
|
All motifs are similar. ChIP-chip motif 907 has highest correspondence to both ChIP-chip and expression data, and strongly resembles MITOMI and PBM motifs. |
V |
YPR015C |
|
871 |
High |
|
Only one motif available, from PBMs; resembles motif from CMR3, which is a paralogous gene (and nearly adjacent on the chromosome). It also scores significantly against expression data. |
V |
YDR034C |
LYS14 |
865 |
High |
|
PBM motifs are virtually identical and appear monomeric; literature motif is dimeric. Include both. Choose PBM motif 865 as it appears to have more robust CGG. |
V |
YPR196W |
|
861 |
High |
|
Motifs from PBMs are very similar and are a variant monomeric GAL4-like motif. Chose 861 as it passes the significance threshold against ChIP-chip data. |
V |
YPR013C |
CMR3 |
859 |
High |
|
PBM motifs are very similar. No other supporting data, but it's a clean motif. Chose 859 because it most closely resembles motif from paralog YPR015c. |
V |
YBL054W |
TOD6 |
852 |
High |
|
Two PBM motifs largely agree; 852 has higher correspondence to expression data while 495 has higher correspondence to ChIP-chip. Use 852; score is way higher. Also for GO. |
V |
YLR228C |
ECM22 |
849 |
High |
|
PBM motif 2122 is a monomeric GAL4 class motif, and scores highest on both ChIP and expression data. 849 is a classic dimeric GAL4 motif with lower but still reasonable scores and is moderately predictive across the board. |
V |
YMR043W |
MCM1 |
831 |
High |
|
Most motifs resemble a classic SRF site. PBM motif 831 scores highly across the board, except for expression data where none does well, and its scores are non-circular. |
V |
YNL068C |
FKH2 |
830 |
High |
|
Most motifs are classic Forkhead. PBM motif 830 is one of the highest scoring and is not circular. |
V |
YBR083W |
TEC1 |
815 |
High |
|
All motifs agree, and are significant by several criteria. PBM motif 815 has the second-highest scores overall, and it is non-circular for in vivo binding. Also has highest GO score. |
V |
YOR172W |
YRM1 |
813 |
High |
|
Two PBM studies largely agree on classic GAL4-class monomeric motif. Motif 813 has indications of spacing and orientation of dimeric protein. |
V |
YNR063W |
|
804 |
High |
|
Motifs from PBMs are virtually identical. This is a monomeric GAL4-like motif. 804 agrees more with ChIP-chip data. |
V |
YOL089C |
HAL9 |
799 |
High |
|
PBM motifs 799 and 2134 score highest on ChIP-chip data; classic dimeric and monomeric GAL4 sites, respectively. |
V |
YER148W |
SPT15 |
798 |
High |
|
This is TATA-binding protein. PBM motif 798 chosen because 1326 was derived from the 96-sequence TIRF-PBM array instead of a full 40K PBM |
V |
YLR403W |
SFP1 |
797 |
High |
|
Most ChIP-seq studies identified the Rap1 motif. PBM motif 797 is less significant by ChIP-seq (although still highly significant) but is the winner across the board for all types of expression data. |
V |
YLR451W |
LEU3 |
781 |
High |
|
Most motifs look similar - dimeric GAL4 motif. Literature motif (781) has high correspondence to ChIP-chip and expression data and is not circular. But, PBM motif 2135, which is a monomeric GAL4 motif, scores highest on both ChIP-chip and expression data. |
V |
YIR018W |
YAP5 |
777 |
High |
|
ChIP-chip yields a classic 7-mer Yap motif that scores well on ChIP and significantly on expression. |
V |
YDR421W |
ARO80 |
725 |
High |
|
PBM motif 2115 appears monomeric and has highest correspondence to ChIP-chip data. ChIP motif 1509 appears dimeric and correlates with ChIP data. Literature motif 725 appears trimeric and has experimental support. Retain all three. |
V |
YDR451C |
YHP1 |
716 |
High |
|
ChIP-chip, EMSA, and one-hybrid all arrive at a classic homeodomain TAATTG motif. Microarray enrichment motif (716) scores higher on OE data from another study than ChIP motifs do, and does nearly as well on ChIP data. |
V |
YDR123C |
INO2 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |
V |
YOL108C |
INO4 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |
V |
YDR026C |
|
696 |
High |
|
Three ChIP-chip motifs are virtually identical in appearance; resemble Reb1 motifs; high correspondence to ChIP-chip data |