|
|
|
|
|
|
|
V |
YJL056C |
ZAP1 |
2097 |
High |
|
Most motifs are similar but do not exceed confidence thresholds on any data type. PBM motif 2097 has highest score for ChIP and expression, and is not circular |
V |
YOR162C |
YRR1 |
2245 |
High |
|
Classic monomeric GAL4-class motif. PBM studies agree and score significantly on Harbison data. No other motifs have spacing/orientation except 11909958, but even the authors of this study note that "Only half a dyad seems to be conserved in this consensus sequence". 2245 scores highest in Harbison data. |
V |
YOR172W |
YRM1 |
813 |
High |
|
Two PBM studies largely agree on classic GAL4-class monomeric motif. Motif 813 has indications of spacing and orientation of dimeric protein. |
V |
YML027W |
YOX1 |
498 |
High |
|
Two PBM studies and Pramila et al. (PMID 12464633) agree on classic homeodomain TAATTA motif. All three correlate with expression change and OE. Motif 453 is not a direct measurement, so choose the PBM motif that is the same length as the typical homeodomain footprint (498), which also correlates best with OE data; expression scores are skewed low by the large number of cell-cycle measurements. |
V |
YDR451C |
YHP1 |
716 |
High |
|
ChIP-chip, EMSA, and one-hybrid all arrive at a classic homeodomain TAATTG motif. Microarray enrichment motif (716) scores higher on OE data from another study than ChIP motifs do, and does nearly as well on ChIP data. |
V |
YOL028C |
YAP7 |
1737 |
High |
|
7-base bZIP core. Obtained in ChIP-chip studies, with higher correspondence to stressed ChIP-chip data. Possible heterodimer? Little literature on this protein. 1737 chosen because it is largely symmetric, has the highest score for both stressed and unstressed Harbison data, and has a higher GO score. |
V |
YOL028C |
YAP7 |
1414 |
High |
|
8-base bZIP core. Obtained by Mitomi, so this is a homodimer. Higher correspondence to unstressed ChIP-chip data. Little literature on this protein. 1414 chosen for higher ChIP-chip overall scores; plus, it is a palindrome as expected for a bZIP protein. |
V |
YDR259C |
YAP6 |
599 |
High |
|
PBM and ChIP-chip can derive basically the same motif, which is a classical YAP motif. They score similarly on all criteria. The ChIP-chip motif (599) has fewer low-information flanking bases. |
V |
YIR018W |
YAP5 |
777 |
High |
|
ChIP-chip yields a classic 7-mer Yap motif that scores well on ChIP and significantly on expression. |
V |
YIR018W |
YAP5 |
896 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YHL009C |
YAP3 |
672 |
High |
|
ChIP-chip yields a classic 7-mer Yap motif that scores well on ChIP and significantly on expression. Could be a heterodimer. Chose 672 over 1463 because it has a higher score on expression data, which is independent. |
V |
YHL009C |
YAP3 |
1411 |
High |
|
Mitomi yields a nearly palindromic 8-mer motif with strong similarity to that of Yap6. PBM motif is similar but appears to be partial. |
V |
YML007W |
YAP1 |
2186 |
High |
|
PBM motif 2186 looks like a monomeric bZIP site but it has the highest scores on both ChIP and expression |
V |
YIL101C |
XBP1 |
2039 |
High |
|
PBM and in vitro selection-derived motifs have highest scores across the board. 842 is higher on GO, but only slightly in AUC, and it has a very large number of empty flanking bases. 2039 (in vitro selection) seems a reasonable compromise - it's highest on ChIP and almost the highest on expression. |
V |
YOR229W |
WTM2 |
0 |
|
Dubious |
It is not clear that this is a sequence-specific DNA-binding protein; it contains no DNA-binding domain and has no known in vitro sequence specificity. The motif comes only from ChIP-chip so scoring on ChIP-chip is circular. |
V |
YOR230W |
WTM1 |
1148 |
Low |
Dubious |
It is not clear that this is a sequence-specific DNA-binding protein; it contains no DNA-binding domain and has no known in vitro sequence specificity. The motifs come only from ChIP-chip so scoring on ChIP-chip is circular. |
V |
YML076C |
WAR1 |
325 |
Low |
|
None of the motifs are convincing, but at least sequences with the literature motif have been experimentally confirmed to bind the protein (even if it is not shown that this is the optimal binding site) |
V |
YDR485C |
VPS72 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YIL056W |
VHR1 |
2091 |
Medium |
|
PBM motif has high score on GO because it looks a lot like Gcn4 |
V |
YPL230W |
USV1 |
509 |
High |
|
Two PBM studies essentially agree on classical C2H2 GGGG-containing motif. Chose 509 because it scores much higher on both ChIP and expression data. |
V |
YDR520C |
URC2 |
553 |
High |
|
This is a monomeric GAL4-class motif. Two PBM studies essentially agree, and have some relationship to ChIP-chip data. No other informative data. |
V |
YDR213W |
UPC2 |
544 |
High |
|
The SRE is bound by UPC2 and the "canonical" sequence is TCGTATA. However, the more degenerate version obtained by PBM (motif 544) scores better in both expression analysis and OE experiments. Newer motif 2109 scores better on ChIP-chip, but lower on expression, and the SRE is well-characterized. I think this one deserves further experimental analysis. |
V |
YDR207C |
UME6 |
2239 |
High |
|
All motifs are similar to each other. BEEML-PBM motif 2239 scores highest across the board. |
V |
YPL139C |
UME1 |
1143 |
Low |
Dubious |
It is not clear that this is a sequence-specific DNA-binding protein; it contains no DNA-binding domain and has no known in vitro sequence specificity. The motif comes only from ChIP-chip. But, it has a high P-value, and the motif has low similarity to other motifs, with the possible exception of Yox1. But the function of the protein is very different from that of Yox1. Tough call - leave as Dubious, but give Low confidence to motif 1143. |
V |
YDL170W |
UGA3 |
651 |
High |
|
Appears to be a dimeric GAL4-class motif. Scores highest in ChIP-chip data, but is derived from the same data. GO seems to match known function! |
V |
YDL170W |
UGA3 |
486 |
Medium |
|
Appears to be a monomeric GAL4-class motif. Derived from PBM data, scores highly in ChIP-chip data, but not as high as the dimeric site derived from the ChIP-chip data. |
V |
YOR344C |
TYE7 |
397 |
High |
|
All studies except one get canonical HLH motif. 795 (PBM) is nearly tied for best ChIP-chip score with the best ChIP-chip motif. Still, ChIP motif 397 scores higher, and looks identical, but with fewer flanking empty positions. |
V |
YNL079C |
TPM1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YGL096W |
TOS8 |
494 |
High |
|
No corroborating data on this TF, and only one PBM motif known and one ChIP motif. But, it resembles TGTCA, which was also obtained for paralog Cup9 by multiple approaches (GTGNCA), as well as PBM results for the Meis/Mrg/Pknox/Tgif family, which are the closest mammalian homologs. The ChIP motif (1902) does not resemble a homeodomain binding sequence, and scores lower on expression data. |
V |
YBL054W |
TOD6 |
852 |
High |
|
Two PBM motifs largely agree; 852 has higher correspondence to expression data while 495 has higher correspondence to ChIP-chip. Use 852; score is way higher. Also for GO. |
V |
YNL139C |
THO2 |
786 |
Low |
Dubious |
It is not clear that this is a sequence-specific DNA-binding protein; it contains no DNA-binding domain and has no known in vitro sequence specificity. The motif comes only from ChIP-chip. |
V |
YER063W |
THO1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YBR240C |
THI2 |
1449 |
High |
|
This is a GAL4-class protein. All motifs are ChIP-chip derived, and none resembles another. 1449 is the only one with respectable scores on ChIP and expression, and it also has the appearance of a GAL4-class motif, although the structural prior presumably forces it to have this property. |
V |
YDR362C |
TFC6 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YBR083W |
TEC1 |
815 |
High |
|
All motifs agree, and are significant by several criteria. PBM motif 815 has the second-highest scores overall, and it is non-circular for in vivo binding. Also has highest GO score. |
V |
YOR337W |
TEA1 |
817 |
Medium |
|
Three motifs, all from PBMs. Choose 817 because it has a more robust GAL4 "CGG" core. But there is no convincing corroborating data for either motif and they do not match each other. |
V |
YBR150C |
TBS1 |
552 |
High |
|
Two motifs from PBMs are nearly identical GAL4-class motifs with defined spacing and orientation. Motif 552 has slightly higher scores. Two motifs from BEEML analysis of PBM data give monomeric motif - also give this high confidence. |
V |
YBR150C |
TBS1 |
2179 |
High |
|
Two motifs from PBMs are nearly identical GAL4-class motifs with defined spacing and orientation. Motif 552 has slightly higher scores. Two motifs from BEEML analysis of PBM data give monomeric motif - also give this high confidence. |
V |
TBP-TFIIB |
TBP-TFIIB |
1329 |
Medium |
|
The TIRF-PBM data used to generate the motif included only 96 sequences; hence, medium confidence. |
V |
TBP-TFIIA-TFIIB |
TBP-TFIIA-TFIIB |
1330 |
Medium |
|
The TIRF-PBM data used to generate the motif included only 96 sequences; hence, medium confidence. |
V |
TBP-TFIIA |
TBP-TFIIA |
1328 |
Low |
|
The TIRF-PBM data used to generate the motif included only 96 sequences. Also it is curious that there is no TATA sequence in the logo. |
V |
YPL128C |
TBF1 |
2178 |
High |
|
All motifs, obtained by three different means, are very similar, although there is no ChIP or expression support for any of them. Went with 2178, which is the BEEML output. |
V |
YLR182W |
SWI6 |
0 |
|
Dubious |
Swi6 is a cofactor, not a DNA-binding protein. These motifs are for Mbp1 or Swi4. |
V |
YDR146C |
SWI5 |
569 |
High |
|
PBM, ChIP-chip, and conservation all yield similar motifs. The ChIP-chip motif scores highest on ChIP-chip data, but that is circular. Choose PBM motif 569, which is nearly identical. |
V |
YER111C |
SWI4 |
584 |
High |
|
Motif is well-characterized and most published motifs match the expected one. PBM motif (584) scores highly (although not highest) in ChIP-chip data. It is, however, non-circular, and specifically captures "DNA metabolic process" in GO analysis. |
V |
YJL176C |
SWI3 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YPL016W |
SWI1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YGR002C |
SWC4 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YPR009W |
SUT2 |
2236 |
High |
|
Highest-scoring motif (PBM) is a classical GAL4-type monomeric motif and is very significant in ChIP-chip |
V |
YGL162W |
SUT1 |
673 |
Medium |
|
Four motifs, all derived from ChIP-chip, contain CGG, but are unusual, with degeneracy and a core of CGGGG. Correlate somewhat with both OE and deletion data, however. |
V |
YDR310C |
SUM1 |
383 |
High |
|
This is the motif for the full-length (FL) SUM1; scores highest on ChIP-chip and resembles the canonical literature motif; also has some relationship to deletion expression data. |
V |
YDR310C |
SUM1 |
478 |
High |
|
This is the motif for the SUM1 AT-hook; scores highest in deletion expression data. |
V |
YPR086W |
SUA7 |
1327 |
Low |
Dubious |
This protein is not expected to bind DNA; it is supposed to bind DNA-bound TBP. The TIRF-PBM data used to generate the motif included only 96 sequences. |
V |
YDL048C |
STP4 |
559 |
Medium |
|
STP3 and 4 have very similar DNA-binding domains. However, they are not similar to those of STP1 and 2; the next most closely related are SWI5 and ACE2, with major differences in the recognition alpha helices. All of the STP4 motifs are different from each other and none have any supporting data. There is only one motif for STP3 (568) from PBM and it matches the STP4 motif from the same study (559) which is the basis for choosing these two motifs. |
V |
YLR375W |
STP3 |
568 |
Medium |
|
STP3 and 4 have very similar DNA-binding domains. However, they are not similar to those of STP1 and 2; the next most closely related are SWI5 and ACE2, with major differences in the recognition alpha helices. All of the STP4 motifs are different from each other and none have any supporting data. There is only one motif for STP3 (568) from PBM and it matches the STP4 motif from the same study (559) which is the basis for choosing these two motifs. |
V |
YHR006W |
STP2 |
2174 |
High |
|
STP1 and 2 have very similar DNA-binding domains. However, they are not similar to those of STP3 and 4. PBM motif for STP2 (2174) correlates highest with ChIP-chip and expression data. ChIP-chip motif for STP1 (660) most strongly resembles motif 800, and scores highly on ChIP-chip data. In addition, these motifs resemble halfmers of literature-derived binding sites. |
V |
YDR463W |
STP1 |
660 |
High |
|
STP1 and 2 have very similar DNA-binding domains. However, they are not similar to those of STP3 and 4. PBM motif for STP2 (800) correlates with ChIP-chip and expression data. ChIP-chip motif for STP1 (660) most strongly resembles motif 800, and scores highly on ChIP-chip data. In addition, these motifs resemble halfmers of literature-derived binding sites. |
V |
YHR084W |
STE12 |
400 |
High |
|
All motifs but one resemble the canonical literature site. Motif 400 is derived from ChIP-chip data (on which it scores highest) but also scores highest on expression data. |
V |
YKL072W |
STB6 |
0 |
|
Dubious |
It is not clear that this is a sequence-specific DNA-binding protein; it contains no DNA-binding domain and has no known in vitro sequence specificity. The motif comes only from ChIP-chip so scoring on ChIP-chip is circular. The ChIP-chip motif looks a little like a Rap1 motif. |
V |
YHR178W |
STB5 |
1405 |
High |
|
All motifs have CGG core and most have CGGnG. Most ChIP-derived motifs have no relationship to expression data. MITOMI motif 1405 and PBM motif 514 score decently on both ChIP-chip and expression data, seem to nail the GO category (oxidative stress response), and look like classic Gal4 halfmers. MITOMI motif scores slightly higher overall. This is presumably the monomeric motif. |
V |
YHR178W |
STB5 |
2068 |
Medium |
|
All motifs have CGG core and most have CGGnG. Most ChIP-derived motifs have no relationship to expression data. Motif 2068 scores highest overall; looks a bit unusual for a Gal4 class motif but also does well on expression data. Retain as potential dimer motif, although it may also incorporate extrinsic information. |
V |
YMR019W |
STB4 |
2107 |
High |
|
PBM motif 2107 is clearly a dimeric GAL4-class motif, and it blows all the other motifs out of the water. |
V |
YDR169C |
STB3 |
2233 |
High |
|
STB3 binds RRPE element (AAAAATTT) both in vivo and in vitro (PMID 17616518). PBM motifs 810 and 2233 strongly resemble the RRPE element, score significantly in deletion expression data, and nail the GO categories "nucleolus" and "ribosome biogenesis". 2233 gets slightly higher scores. |
V |
YMR053C |
STB2 |
710 |
Incorrect |
|
Likely represents Reb1 binding site. |
V |
YMR053C |
STB2 |
710 |
Low |
Dubious |
No direct evidence that this is a DNA-binding protein. Three ChIP-derived motifs but none scores highly by any measure. Motif 710 is an arbitrary choice - looks tidy. |
V |
YNL309W |
STB1 |
0 |
|
Dubious |
No direct evidence that this is a DNA-binding protein. It binds Swi6 and the ChIP motifs all resemble Swi4 binding sites. |
V |
YCR018C |
SRD1 |
2232 |
Medium |
|
PBM studies yield nearly identical motifs. 2232 closely resembles motif from related GATA factors and scores highest overall. This is an unusual motif for the GATA class; hence medium confidence level. |
V |
YKL020C |
SPT23 |
670 |
Low |
Dubious |
I could not find any evidence that this protein binds directly to DNA. It has an IPT domain but no REL domain. None of the ChIP-derived motifs scores highly on ChIP data or anything else. Motif 670 bears some relationship to expression data. |
V |
YER161C |
SPT2 |
1114 |
Low |
Dubious |
I could not find any evidence that this protein binds directly to DNA. None of the motifs is significant. All are from ChIP-chip. Motif 1114 chosen simply because it has the highest numbers overall. |
V |
YER148W |
SPT15 |
798 |
High |
|
This is TATA-binding protein. PBM motif 798 chosen because 1326 was derived from the 96-sequence TIRF-PBM array instead of a full 40K PBM |
V |
YJL127C |
SPT10 |
1880 |
Low |
|
This is the protein that binds histone promoters. The sequence specificity is derived from the histone promoters only so the literature motif may be inaccurate. Motif 1880 has higher scores overall but does not resemble the literature motif. Uncertain what to do here - use 1880, but give low confidence. Motif learned in vivo could contain extrinsic information. |
V |
YMR016C |
SOK2 |
404 |
High |
|
ChIP-chip motif 404 has highest correspondence to both ChIP-chip and expression data - and strongly resembles PBM motif |
V |
YOR308C |
SNU66 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YGL131C |
SNT2 |
612 |
Low |
Dubious |
All three motifs are derived from the same ChIP-chip data. However, there is no corroborating data, and not all SANT domains are DNA-binding; those in chromatin proteins may bind non-specifically. So it could be a cofactor motif; in fact it is similar to motifs of Stp3 and Stp4. The protein has other chromatin-related domains (BAH, PHD/RING). Hence the "Low" assessment. |
V |
YCR033W |
SNT1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YOR290C |
SNF2 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR477W |
SNF1 |
1110 |
Medium |
Dubious |
Motif 1110 has a quite strong correspondence to ChIP-chip data (from which it is derived). However, there seems to be no evidence that this is a sequence-specific DNA-binding protein. Aside from a weak relationship to expression data there is no corroborating evidence here (and no DNA-binding domain). |
V |
YBR182C |
SMP1 |
864 |
Medium |
|
PBM motif 864 scores highest on ChIP-chip and expression data. I gave it a medium, however, because it has low information content at most positions, does not closely match the literature motif (although the literature motif does not match ChIP-chip or expression data), and also does not resemble that of RLM1, which according to the literature should be related. |
V |
YPR054W |
SMK1 |
1875 |
Low |
Dubious |
I could not find any evidence that this protein binds directly to DNA. There is only one motif derived from ChIP-chip but it bears little relationship to the data from which it was derived. |
V |
YNL167C |
SKO1 |
1401 |
High |
|
The MITOMI motif 1401 is an offset and asymmetric version of the traditional consensus (TGACGTCA) but has a higher ChIP-chip and expression correspondence than the motifs that are more symmetric. |
V |
YHR206W |
SKN7 |
583 |
High |
|
Motifs are remarkably discordant considering that they all resemble each other in being G+C rich and containing a GGCC core. Possibly reflecting different modes of multimerization? Include the two that score highest on independent data: PBM motif 583, which represents a monomer, and ChIP-chip motif 380, which appears to represent a dimer. |
V |
YHR206W |
SKN7 |
380 |
High |
|
Motifs are remarkably discordant considering that they all resemble each other in being G+C rich and containing a GGCC core. Possibly reflecting different modes of multimerization? Include the two that score highest on independent data: PBM motif 583, which represents a monomer, and ChIP-chip motif 380, which appears to represent a dimer. |
V |
YDR409W |
SIZ1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR227W |
SIR4 |
0 |
|
Dubious |
There is no evidence that the SIR proteins are sequence-specific DNA-binding proteins. Most of the motifs for them are Rap1 sites. |
V |
YLR442C |
SIR3 |
0 |
|
Dubious |
There is no evidence that the SIR proteins are sequence-specific DNA-binding proteins. Most of the motifs for them are Rap1 sites. |
V |
YDL042C |
SIR2 |
0 |
|
Dubious |
There is no evidence that the SIR proteins are sequence-specific DNA-binding proteins. Most of the motifs for them are Rap1 sites. |
V |
YKR101W |
SIR1 |
0 |
|
Dubious |
There is no evidence that the SIR proteins are sequence-specific DNA-binding proteins. Most of the motifs for them are Rap1 sites. |
V |
YJL089W |
SIP4 |
2067 |
Medium |
|
PBM motif 573 is a monomeric GAL4-type motif (others appear dimeric) but it has good correspondence to ChIP-chip data. Only a few of the dimeric sites are more significant - the motif from in vivo analysis (PMID: 14685767) does not score as highly as 2067 from ChIP-chip data, but they look very similar. This is 2067, the presumed dimeric site. |
V |
YJL089W |
SIP4 |
573 |
Medium |
|
PBM motif 573 is a monomeric GAL4-type motif (others appear dimeric) but it has good correspondence to ChIP-chip data. Only a few of the dimeric sites are more significant - the motif from in vivo analysis (PMID: 14685767) does not score as highly as 2067 from ChIP-chip data, but they look very similar. This is 573, the presumed monomeric site |
V |
YNL257C |
SIP3 |
0 |
|
Dubious |
Sip3 is a protein that "activates transcription through interaction with DNA-bound Snf1p" (SGD); no DNA-binding domain and no evidence for direct interaction with DNA or intrinsic sequence specificity. |
V |
YLR403W |
SFP1 |
357 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
621 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
1710 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
1100 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
797 |
High |
|
Most ChIP-seq studies identified the Rap1 motif. PBM motif 797 is less significant by ChIP-seq (although still highly significant) but is the winner across the board for all types of expression data. |
V |
YOR140W |
SFL1 |
839 |
Medium |
|
None of the motifs are highly related to each other. But, most share a GAAG core and are otherwise A-rich. The PBM motif 839 in particular is compatible with the putative binding sites that are mutated in PMID 17594096, and it scores well on ChIP-chip. Other motifs may represent different multimerization configurations. ChIP-chip motif 605 also scores well on ChIP-chip data, which is circular, but I will retain it for completeness. |
V |
YOR140W |
SFL1 |
605 |
Medium |
|
None of the motifs are highly related to each other. But, most share a GAAG core and are otherwise A-rich. The PBM motif 839 in particular is compatible with the putative binding sites that are mutated in PMID 17594096, and it scores well on ChIP-chip. Other motifs may represent different multimerization configurations. ChIP-chip motif 605 also scores well on ChIP-chip data, which is circular, but I will retain it for completeness. |
V |
YBL052C |
SAS3 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YOR077W |
RTS2 |
0 |
|
Dubious |
Homolog of Kin17; not a typical C2H2 zinc finger. Believed to be "chromatin-associated proteins involved in UV response and DNA replication". No evidence for sequence-specific DNA-binding. Single ChIP-chip motif does not have strong correspondence to the data from which it is derived. |
V |
YBL103C |
RTG3 |
870 |
Low |
|
Only the PBM motif is a classic HLH motif. Three different ChIP-chip-derived motifs are all diverse, but all score highly on ChIP-chip data! Are they motifs of other TFs? Check. 602: GCN4; 1095: TEC1; 1096: resembles 602, but is a closer match to CUP9/TOS8. Also hits GCN4. According to the literature (PMID: 9032238) the core binding site for the Rtg1p-Rtg3p heterodimer is 5'-GGTCAC-3'; the only motif that resembles this is 1446. Vague resemblance to 602 and 1096. I am going to retain 1446, which represents the literature site; PBM motif 870, which resembles an E-box; and ChIP-chip motif 1445, which scores highest on ChIP-chip data. But give all low confidence. |