|
|
|
|
|
|
|
V |
YOR363C |
PIP2 |
0 |
|
|
See Oaf1-Pip2-dimer |
V |
YNL103W |
MET4 |
0 |
|
|
My understanding is that Met4 is a modifier of the specificity of other proteins. SGD states that it "requires different combinations of the auxiliary factors Cbf1p, Met28p, Met31p and Met32p". I believe ChIP-chip motifs 1023 and 1024 are cofactor motifs; they are E-boxes. ChIP-chip motif 689 is different and matches the Met28 and Met32 motifs (CTGTGG core). Met28 is a bZIP protein, and Met32 is a C2H2. The MITOMI motif for Met32 is TGTGG, so this is the Met32 motif. I do not believe that any of the Met4 motifs is correct. Need to obtain motifs for complexes. |
V |
YKR064W |
OAF3 |
0 |
|
|
I do not see how either of these motifs could possibly be a Gal4-class binding motif. Also, neither corresponds to any of the data, even the ChIP-chip data from which they are derived. |
V |
YJL206C |
|
0 |
|
|
Seven motifs from ChIP-chip, but none of them corresponds well to ChIP-chip data, and none of them resembles a GAL4 motif. 1169 has a CGG in the middle, but too much flanking information to be credible without further independent support. |
V |
YIR017C |
MET28 |
0 |
|
|
Like MET4, a component of a complex. SGD: "Basic leucine zipper (bZIP) transcriptional activator in the Cbf1p-Met4p-Met28p complex" ... "Both Met4p and Met28p bind to DNA only in the presence of Cbf1p, and the presence of Cbf1p and Met4p stimulates the binding of Met28p to DNA (1, 2)." ChIP-chip motif 703 (CTGTGG) is clearly the Met31/32 motif. The other ChIP-chip motif is essentially poly-A, and scores poorly. Hence, neither of these motifs represents the intrinsic sequence specificity of MET28. Need in vitro data for complexes. |
V |
YGR288W |
MAL13 |
0 |
|
|
None of the ChIP-chip motifs correspond well to the data they come from or resemble a GAL4 motif. |
V |
YBR297W |
MAL33 |
0 |
|
|
None of the ChIP-chip motifs correspond well to the data they come from or resemble a GAL4 motif. |
V |
MBP1-SWI6-dimer |
MBP1-SWI6-dimer |
0 |
|
|
Redundant with MBP1 |
V |
MATA1 |
|
0 |
|
|
Need to study the literature more carefully and consult experts, but at first glance none of these motifs seems right. |
V |
YER045C |
ACA1 |
8 |
Medium |
|
Literature motif 8 is supported by experimental investigation, and resembles a bZIP site, but has no other support; motif was not obtained objectively. Can bind as heterodimer. The highest-scoring motif (from ChIP, 1457) has low information content - I'm concerned it is learning other features of bound promoters. |
V |
YKL185W |
ASH1 |
28 |
Medium |
|
The literature motif may not represent the full binding activity of the protein. Also, it is not supported by ChIP-chip: the ChIP-chip-derived motifs are Mcm1-like. However, the literature motif does score highly on both ChIP-chip and expression data. The only higher-scoring motif has almost no information content. |
V |
YMR280C |
CAT8 |
33 |
Medium |
|
Near-classic dimeric GAL4 motif. Literature-based. Not clear this is an optimal site but it does bind. Seems to hit the right GO category. |
V |
YGL166W |
CUP2 |
48 |
Medium |
|
Three motifs account for three possible spacings in the literature motif; it is not clear that this is the optimal site, however |
V |
YIR023W |
DAL81 |
53 |
Low |
|
None of the motifs agree with each other. The literature motif characterization was indirect; hence low confidence that this is the true motif. The ChIP-chip motifs score higher on ChIP data but that's circular. |
V |
YGL254W |
FZF1 |
69 |
Low |
|
Literature motif is the only one that appears credible. PBM motif I believe is a known artifact. Literature motif gets low confidence however as it is based on a single known binding sequence. |
V |
YFL031W |
HAC1 |
94 |
Medium |
|
1788 is the overall winner. However, literature motif 94 also scores well in ChIP-chip, despite being somewhat different. Possible difference in heterodimerization partners, or proteolytic fragment? Retain both. |
V |
YDR034C |
LYS14 |
133 |
High |
|
PBM motifs are virtually identical and appear monomeric; literature motif is dimeric. Include both. Choose PBM motif 865 as it appears to have a more robust CGG. |
V |
MAL63 |
|
136 |
Medium |
|
This is an unconventional dimeric GAL4-class motif |
V |
YLR266C |
PDR8 |
244 |
Medium |
|
Both motifs are equally credible but have very limited support. Literature motif is related to that of YRR1 literature motif. PBM motif, however, is a classic GAL4 monomer. This is the literature motif. |
V |
YNL216W |
RAP1 |
254 |
High |
|
Most motifs look similar. ChIP-chip motif 254 has highest correspondence to expression data. |
V |
YGR044C |
RME1 |
273 |
Low |
|
Motif 273 shows similarity to RME response elements (RREs), GTACC(T/A)ACAAAA (in fact it is derived from them). The fact that RME has three C2H2 zinc fingers and also requires an additional C-terminal region for binding in vitro, together with its relatively large footprint, is consistent with such a large binding site. However, I gave this motif a "low" score as there is no systematic analysis in vivo or in vitro indicating that these are really the most preferred sites. It would be valuable to redo the in vitro and in vivo experiments under appropriate conditions. |
V |
YML076C |
WAR1 |
325 |
Low |
|
None of the motifs are convincing, but at least sequences with the literature motif have been experimentally confirmed to bind the protein (even if it is not shown that this is the optimal binding site) |
V |
YLR403W |
SFP1 |
357 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YOR372C |
NDD1 |
366 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YHR206W |
SKN7 |
380 |
High |
|
Motifs are remarkably discordant considering that they all resemble each other in being G+C rich and containing a GGCC core. Possibly reflecting different modes of multimerization? Include the two that score highest on independent data: PBM motif 583, which represents a monomer, and ChIP-chip motif 380, which appears to represent a dimer. |
V |
YDR310C |
SUM1 |
383 |
High |
|
This is the motif for the full-length (FL) SUM1; it scores highest on ChIP-chip data, resembles the canonical literature motif, and also has some relationship to deletion expression data. |
V |
YPL202C |
AFT2 |
389 |
High |
|
All motifs look similar. ChIP-chip motif 389 scores high on ChIP-chip data and also best on expression data. |
V |
YKL043W |
PHD1 |
393 |
High |
|
High-scoring motifs are all similar, with characteristic APSES GC core and palindromic. PBM motifs score highest on ChIP-seq data, while ChIP-chip motif 393 (which contains flanking G/C residues) scores highest on expression data. Retain both - possibly, the rest of the protein contributes to binding flanking residues. This is the ChIP motif that scores highest on expression data. |
V |
YOR344C |
TYE7 |
397 |
High |
|
All studies except one get canonical HLH motif. 795 (PBM) is nearly tied for best ChIP-chip score with the best ChIP-chip motif. Still, ChIP motif 397 scores higher, and looks identical, but with fewer flanking empty positions. |
V |
YHR084W |
STE12 |
400 |
High |
|
All motifs but one resemble the canonical literature site. Motif 400 is derived from ChIP-chip data (on which it scores highest) but also scores highest on expression data. |
V |
YKR099W |
BAS1 |
402 |
High |
|
Virtually all motifs are similar, with GAGTCA core. ChIP motif 402 has highest correspondence to both ChIP-chip and expression data. |
V |
YMR016C |
SOK2 |
404 |
High |
|
ChIP-chip motif 404 has highest correspondence to both ChIP-chip and expression data - and strongly resembles PBM motif |
V |
YPR104C |
FHL1 |
406 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YOR028C |
CIN5 |
409 |
High |
|
Most motifs match the classic YAP motif. This is the best in vivo motif (highest match to ChIP-chip). |
V |
YGL073W |
HSF1 |
411 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the spaced direct repeat dimeric site. From ChIP. |
V |
YPL089C |
RLM1 |
419 |
Medium |
|
Motif 419 has a MADS-like appearance, and scores very highly in ChIP-chip data, despite being derived from the literature. Not much correspondence to expression however, hence Medium confidence. ChIP-chip motif 910 does slightly better on expression but to me is not a credible MADS box binding site. |
V |
YGL073W |
HSF1 |
476 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the monomeric site. From PBM. |
V |
YDR310C |
SUM1 |
478 |
High |
|
This is the motif for the SUM1 AT_hook; scores highest in deletion expression data |
V |
YGL013C |
PDR1 |
485 |
High |
|
PBM motif 485 looks like a traditional literature motif and has highest correspondence to ChIP and expression data. Dimeric GAL4 motif. |
V |
YDL170W |
UGA3 |
486 |
Medium |
|
Appears to be a monomeric GAL4-class motif. Derived from PBM data, scores highly in ChIP-chip data, but not as high as the dimeric site derived from the ChIP-chip data. |
V |
YBR267W |
REI1 |
489 |
High |
|
PBM motif looks like a yeast C2H2 motif (row of C's); highly significant relationship to ChIP-chip data |
V |
YGL096W |
TOS8 |
494 |
High |
|
No corroborating data on this TF, and only one PBM motif known and one ChIP motif. But, it resembles TGTCA, which was also obtained for paralog Cup9 by multiple approaches (GTGNCA), as well as PBM results for the Meis/Mrg/Pknox/Tgif family, which are the closest mammalian homologs. The ChIP motif (1902) does not resemble a homeodomain binding sequence, and scores lower on expression data. |
V |
YLR176C |
RFX1 |
496 |
Medium |
|
Curious case - virtually all motifs are similar in appearance, with a common TGGCAAC core. They range from what appear to be monomers to full dimers, with multiple partial forms. However, none of them scores highly on both ChIP-chip and expression data. Select two representatives: one that scores well on ChIP-chip, and one that scores well on expression. This is the one that scores most highly on expression (close 2nd on deletion and 1st on overexpression). It is the only purely monomeric motif. Give medium confidence, since according to the literature this protein should bind as a dimer. |
V |
YML027W |
YOX1 |
498 |
High |
|
Two PBM studies and Pramila et al. (PMID 12464633) agree on classic homeodomain TAATTA motif. All three correlate with expression change and OE. Motif 453 is not a direct measurement so choose PBM motif that is the same length as the typical homeodomain footprint - 498 also correlates best with OE data; expression scores are skewed low by the large number of cell-cycle measurements. |
V |
YOR113W |
AZF1 |
499 |
High |
|
PBM motif 499 scores as well as the ChIP-chip motifs, but without the circularity. No significant data except ChIP-chip, however. |
V |
YCR106W |
RDS1 |
506 |
High |
|
All motifs look similar. PBM motif 506 has a higher score on ChIP-chip than any of the ChIP-chip derived motifs. |
V |
YPL230W |
USV1 |
509 |
High |
|
Two PBM studies essentially agree on classical C2H2 GGGG-containing motif. Chose 509 because it scores much higher on both ChIP and expression data. |
V |
YER184C |
|
512 |
Medium |
|
One motif from PBMs is a monomeric GAL4-like motif and the other is dimeric. Medium confidence because there is little independent support, and both contain the CCGG core that I believe may be an artifact. However, both score significantly on ChIP-chip data. Only 512 is significant on expression data. |
V |
YNL027W |
CRZ1 |
516 |
High |
|
PBM motif 516 scores highest on ChIP and expression; resembles classic literature motifs |
V |
YKL062W |
MSN4 |
518 |
High |
|
PBM motif 518 resembles both the classical MSN motif and the PBM motif, and scores highest on both expression and ChIP-chip. |
V |
YMR168C |
CEP3 |
524 |
High |
|
Two PBM motifs agree. Went with 524 because it appears neater. No other supporting data for any of them. |
V |
YLL054C |
|
526 |
Medium |
|
Three motifs available, from PBMs; two dimeric GAL4-like motifs but with different spacings and one monomeric. No backup data but looks tidy. Keep all three. |
V |
YLR266C |
PDR8 |
528 |
Medium |
|
Both motifs are equally credible but have very limited support. Literature motif is related to that of YRR1 literature motif. PBM motif, however, is a classic GAL4 monomer. This is the PBM motif. |
V |
YMR182C |
RGM1 |
531 |
High |
|
PBM motif 531 looks like a C2H2 motif (row of G's), and scores well on both ChIP-chip and deletion expression data. |
V |
YER130C |
COM2 |
534 |
High |
|
PBM motif 534 has the highest correspondence to expression data. Not much else supporting any of the motifs, although the two PBM motifs look about the same. Also look like typical yeast C2H2 motifs. |
V |
YER040W |
GLN3 |
539 |
High |
|
Most motifs are classic GATA or GATAAG. PBM motif 539 scores highest on ChIP. |
V |
YDR213W |
UPC2 |
544 |
High |
|
The SRE is bound by UPC2 and the "canonical" sequence is TCGTATA. However, the more degenerate version obtained by PBM (motif 544) scores better in both expression analysis and OE experiments. Newer motif 2109 scores better on ChIP-chip, but lower on expression, and the SRE is well-characterized. I think this one deserves further experimental analysis. |
V |
YER169W |
RPH1 |
547 |
High |
|
About half of the motifs look similar to each other, with GGGG core typical of many yeast C2H2 proteins. PBM motif 547 has meaningful scores on both ChIP-chip and mutant expression data. I'm somewhat concerned that motif 279 lacks two A residues captured by both PBM experiments. |
V |
YBR150C |
TBS1 |
552 |
High |
|
Two motifs from PBMs are nearly identical GAL4-class motifs with defined spacing and orientation. Motif 552 has slightly higher scores. Two motifs from BEEML analysis of PBM data give monomeric motif - also give this high confidence. |
V |
YDR520C |
URC2 |
553 |
High |
|
This is a monomeric GAL4-class motif. Two PBM studies essentially agree, and have some relationship to ChIP-chip data. No other informative data. |
V |
YER068W |
MOT2 |
556 |
Medium |
|
PBM motif 556 has high correspondence to ChIP-chip data. However, also resembles TATA element, and could also be a structural motif. RRMs normally bind single-stranded RNA or DNA. Give medium confidence. |
V |
YCR096C |
HMRA2 |
558 |
Medium |
|
Should be similar to MATALPHA2. The one PBM motif is indeed related to the MITOMI motif for MATALPHA2. |
V |
YDL048C |
STP4 |
559 |
Medium |
|
STP3 and 4 have very similar DNA-binding domains. However, they are not similar to those of STP1 and 2; the next most closely related are SWI5 and ACE2, with major differences in the recognition alpha helices. All of the STP4 motifs are different from each other and none have any supporting data. There is only one motif for STP3 (568) from PBM and it matches the STP4 motif from the same study (559) which is the basis for choosing these two motifs. |
V |
YDR096W |
GIS1 |
562 |
High |
|
All motifs similar; PBM motif 562 has highest correspondence to deletion expression data and overexpression data |
V |
YIR013C |
GAT4 |
565 |
High |
|
Two PBM motifs look similar, also similar to a subset of other GATAs. 565 scores higher on expression and OE data. |
V |
YLR375W |
STP3 |
568 |
Medium |
|
STP3 and 4 have very similar DNA-binding domains. However, they are not similar to those of STP1 and 2; the next most closely related are SWI5 and ACE2, with major differences in the recognition alpha helices. All of the STP4 motifs are different from each other and none have any supporting data. There is only one motif for STP3 (568) from PBM and it matches the STP4 motif from the same study (559) which is the basis for choosing these two motifs. |
V |
YDR146C |
SWI5 |
569 |
High |
|
PBM, ChIP-chip, and conservation all yield similar motifs. The ChIP-chip motif scores highest on ChIP-chip data, but that is circular. Choose PBM motif 569, which is nearly identical. |
V |
YCR065W |
HCM1 |
570 |
High |
|
PBM and SAAB/EMSA motifs both look similar to standard FH motif. PBM motif 570 has stronger correspondence to expression data. |
V |
YJL089W |
SIP4 |
573 |
Medium |
|
PBM motif 573 is a monomeric GAL4-type motif (others appear dimeric) but it has good correspondence to ChIP-chip data. Only a few of the dimeric sites are more significant - the motif from in vivo analysis (PMID: 14685767) does not score as highly as 2067 from ChIP-chip data, but they look very similar. This is 573, the presumed monomeric site |
V |
YJR127C |
RSF2 |
575 |
High |
|
No supporting data, but the PBM motif 575 looks like a typical yeast C2H2 motif (Adr1, which has similar zinc fingers, Mig1, etc). |
V |
YDR216W |
ADR1 |
576 |
High |
|
PBM motif 576 has significant correspondence to both ChIP-chip and highest to expression data. And has a classic yeast C2H2 look. |
V |
YPL021W |
ECM23 |
578 |
High |
|
PBM motif 578 strongly resembles that from other yeast GATA-class TFs |
V |
YDR303C |
RSC3 |
580 |
High |
|
PBM motif 580 has best correspondence to expression data - the only significant independent criterion - considering that the correlations are all in the same orientation (they are not for 2165). All motifs look similar. Propose that longer motifs could be due to multiple binding sites in the same sequence. |
V |
YHR206W |
SKN7 |
583 |
High |
|
Motifs are remarkably discordant considering that they all resemble each other in being G+C rich and containing a GGCC core. Possibly reflecting different modes of multimerization? Include the two that score highest on independent data: PBM motif 583, which represents a monomer, and ChIP-chip motif 380, which appears to represent a dimer. |
V |
YER111C |
SWI4 |
584 |
High |
|
Motif is well-characterized and most published motifs match the expected one. PBM motif (584) scores highly (although not highest) in ChIP-chip data. It is, however, non-circular, and specifically captures "DNA metabolic process" in GO analysis. |
V |
YIL036W |
CST6 |
585 |
High |
|
PBM motif 585 correlates with expression data (deletion and overexpression). ChIP motif 1466 has higher ChIP score but is lower on expression. |
V |
YPR022C |
|
588 |
High |
|
Only one motif available, from PBMs; classical yeast C2H2 motif, and has some relationship to ChIP-chip data. |
V |
YDR259C |
YAP6 |
599 |
High |
|
PBM and ChIP-chip yield basically the same motif, which is a classical YAP motif. They score similarly on all criteria. The ChIP-chip motif (599) has fewer low-information flanking bases. |
V |
YHL027W |
RIM101 |
600 |
High |
|
ChIP-chip motif 600 is almost identical to PBM motif 513, but scores slightly higher on expression data. Three of six motifs are very similar. |
V |
YPR199C |
ARR1 |
603 |
Medium |
|
Only motif 603 has significant scores with ChIP-chip and expression data; looks somewhat like a YAP motif |
V |
YOR140W |
SFL1 |
605 |
Medium |
|
None of the motifs are highly related to each other. But, most share a GAAG core and are otherwise A-rich. The PBM motif 839 in particular is compatible with the putative binding sites that are mutated in PMID 17594096, and it scores well on ChIP-chip. Other motifs may represent different multimerization configurations. ChIP-chip motif 605 also scores well on ChIP-chip data, which is circular, but I will retain it for completeness. |
V |
YGL073W |
HSF1 |
615 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the trimeric site. From ChIP. |
V |
YLR403W |
SFP1 |
621 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YPR104C |
FHL1 |
629 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YKL185W |
ASH1 |
648 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YDL170W |
UGA3 |
651 |
High |
|
Appears to be a dimeric GAL4-class motif. Scores highest in ChIP-chip data, but is derived from the same data. GO seems to match known function! |
V |
YGL071W |
AFT1 |
658 |
High |
|
Most motifs are similar. Also very similar to AFT2 motifs. ChIP-chip motif 658 scores highest on both ChIP-chip and expression data. |
V |
YDR463W |
STP1 |
660 |
High |
|
STP1 and 2 have very similar DNA-binding domains. However, they are not similar to those of STP3 and 4. PBM motif for STP2 (800) correlates with ChIP-chip and expression data. ChIP-chip motif for STP1 (660) most strongly resembles motif 800, and scores highly on ChIP-chip data. In addition, these motifs resemble halfmers of literature-derived binding sites. |
V |
YHL009C |
YAP3 |
672 |
High |
|
ChIP-chip yields a classic 7-mer Yap motif that scores well on ChIP and significantly on expression. Could be a heterodimer. Chose 672 over 1463 because it has a higher score on expression data, which is independent. |
V |
YGL162W |
SUT1 |
673 |
Medium |
|
Four motifs, all derived from ChIP-chip, contain CGG, but are unusual, with degeneracy and a core of CGGGG. Correlate somewhat with both OE and deletion data, however. |
V |
YNL314W |
DAL82 |
690 |
High |
|
PBM and ChIP-chip motifs agree; select the ChIP-chip motif as it scores higher on ChIP-chip data, although the extra A's on the side could be due either to the FL protein or to some other in vivo factor. |
V |
YGL181W |
GTS1 |
694 |
Low |
|
None of the three motifs resembles an AT-hook binding site. Only one correlates with the ChIP-chip data, but that's circular. Low confidence. |
V |
YOR358W |
HAP5 |
695 |
High |
|
Subunit of the heme-activated, glucose-repressed Hap2/3/4/5 CCAAT-binding complex - there should be a single motif for all four proteins, containing CCAAT. ChIP-chip motif 695 resembles CCAATCA, and scores highly on ChIP-chip, OE, and deletion expression data. |
V |
YKL109W |
HAP4 |
695 |
High |
|
Subunit of the heme-activated, glucose-repressed Hap2/3/4/5 CCAAT-binding complex - there should be a single motif for all four proteins, containing CCAAT. ChIP-chip motif 695 resembles CCAATCA, and scores highly on ChIP-chip, OE, and deletion expression data. |
V |
YGL237C |
HAP2 |
695 |
High |
|
Subunit of the heme-activated, glucose-repressed Hap2/3/4/5 CCAAT-binding complex - there should be a single motif for all four proteins, containing CCAAT. ChIP-chip motif 695 resembles CCAATCA, and scores highly on ChIP-chip, OE, and deletion expression data. |
V |
YBL021C |
HAP3 |
695 |
High |
|
Subunit of the heme-activated, glucose-repressed Hap2/3/4/5 CCAAT-binding complex - there should be a single motif for all four proteins, containing CCAAT. ChIP-chip motif 695 resembles CCAATCA, and scores highly on ChIP-chip, OE, and deletion expression data. |
V |
YDR026C |
|
696 |
High |
|
Three ChIP-chip motifs are virtually identical in appearance; resemble Reb1 motifs; high correspondence to ChIP-chip data |
V |
YMR053C |
STB2 |
710 |
Incorrect |
|
Likely represents Reb1 binding site. |
V |
YOL108C |
INO4 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |
V |
YDR123C |
INO2 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |