|
|
|
|
|
|
|
V |
YOL067C |
RTG1 |
1493 |
Low |
|
1493 and 1494 are a toss-up and could represent different dimerization partners, conceivably. Similar to 1445 and 1446 above. Retain both but give low confidence. |
V |
YOL067C |
RTG1 |
1494 |
Low |
|
1493 and 1494 are a toss-up and could represent different dimerization partners, conceivably. Similar to 1445 and 1446 above. Retain both but give low confidence. |
V |
YFL031W |
HAC1 |
94 |
Medium |
|
1788 is the overall winner. But, literature motif 94 also scores well in ChIP-chip, despite being somewhat different. Possible difference in heterodimerization partners, or proteolytic fragment? Retain both. |
V |
YFL031W |
HAC1 |
1788 |
High |
|
1788 is the overall winner. But, literature motif 94 also scores well in ChIP-chip, despite being somewhat different. Possible difference in heterodimerization partners, or proteolytic fragment? Retain both, score 94 as medium. |
V |
YOL028C |
YAP7 |
1737 |
High |
|
7-base bZIP core. Obtained in ChIP-chip studies and has higher correspondence to stressed ChIP-chip data. Possible heterodimer? Little literature on this protein. 1737 chosen because it is largely symmetric and has the highest score for both stressed and unstressed Harbison data; it also has a higher GO score. |
V |
YOL028C |
YAP7 |
1414 |
High |
|
8-base bZIP core. Obtained by MITOMI, so this is a homodimer. Higher correspondence to unstressed ChIP-chip data. Little literature on this protein. 1414 chosen for higher ChIP-chip overall scores; plus, it is a palindrome as expected for a bZIP protein. |
V |
YER169W |
RPH1 |
547 |
High |
|
About half of the motifs look similar to each other, with GGGG core typical of many yeast C2H2 proteins. PBM motif 547 has meaningful scores on both ChIP-chip and mutant expression data. I'm somewhat concerned that motif 279 lacks two A residues captured by both PBM experiments. |
V |
YPR065W |
ROX1 |
1396 |
High |
|
About half the motifs have a typical ACAAT Sox core. MITOMI motif 1396 has highest correspondence to both ChIP-chip and deletion expression data. |
V |
YCR040W |
MATALPHA1 |
1418 |
Low |
|
According to PMID: 15118075, binds the "Q site" which has "consensus" ACAATGACAG. Seems all that is in common is the CAAT. I believe further study is required. |
V |
YCR039C |
MATALPHA2 |
1364 |
High |
|
According to PMID: 9858582, "A comparison of the 2 binding sites in both asg and hsg operators yields the same consensus sequence, 5'-CATGTA-3'"; results in Figure 2 of the same paper support a consensus of CATGTAA. MITOMI yields ACATG, which is the reverse complement of most of the literature consensus. Motif 1364 has highest information content; use this. |
V |
YBR083W |
TEC1 |
815 |
High |
|
All motifs agree, and are significant by several criteria. PBM motif 815 has the second-highest scores overall, and it is non-circular for in vivo binding. Also has highest GO score. |
V |
YOR380W |
RDR1 |
2158 |
High |
|
All motifs are related except 1851. PBM motif 2158 is monomeric and has highest correspondence to ChIP-chip data. The literature motif 756 consists of two back-to-back and slightly overlapping versions of the monomeric PBM motif. There is no evidence for direct binding in this specific spacing and orientation; however, the results of mutations in reporters indicate that both copies are necessary for induction in the mutant. Retain both motifs. |
V |
YOR380W |
RDR1 |
756 |
Medium |
|
All motifs are related except 1851. PBM motif 2158 is monomeric and has highest correspondence to ChIP-chip data. The literature motif 756 consists of two back-to-back and slightly overlapping versions of the monomeric PBM motif. There is no evidence for direct binding in this specific spacing and orientation; however, the results of mutations in reporters indicate that both copies are necessary for induction in the mutant. Retain both motifs. |
V |
YDR207C |
UME6 |
2239 |
High |
|
All motifs are similar to each other. BEEML-PBM motif 2239 scores highest across the board. |
V |
YBR049C |
REB1 |
907 |
High |
|
All motifs are similar. ChIP-chip motif 907 has highest correspondence to both ChIP-chip and expression data, and strongly resembles MITOMI and PBM motifs. |
V |
YHR084W |
STE12 |
400 |
High |
|
All motifs but one resemble the canonical literature site. Motif 400 is derived from ChIP-chip data (on which it scores highest) but also scores highest on expression data. |
V |
YPL133C |
RDS2 |
2226 |
Medium |
|
All motifs contain CGG. PBM motif 2226 appears to be a monomeric version of literature motif 757. However, the paper that produced motif 757 did not demonstrate that this is an optimal binding site. Retain both motifs and give them a "medium" confidence. |
V |
YPL133C |
RDS2 |
757 |
Medium |
|
All motifs contain CGG. PBM motif 2226 appears to be a monomeric version of literature motif 757. However, the paper that produced motif 757 did not demonstrate that this is an optimal binding site. Retain both motifs and give them a "medium" confidence. |
V |
YHR178W |
STB5 |
1405 |
High |
|
All motifs have CGG core and most have CGGnG. Most ChIP-derived motifs have no relationship to expression data. MITOMI motif 1405 and PBM motif 514 score decently on both ChIP-chip and expression data, seem to nail the GO category (oxidative stress response), and look like classic Gal4 halfmers. The MITOMI motif scores slightly higher overall. This is presumably the monomeric motif. |
V |
YHR178W |
STB5 |
2068 |
Medium |
|
All motifs have CGG core and most have CGGnG. Most ChIP-derived motifs have no relationship to expression data. Motif 2068 scores highest overall; looks a bit unusual for a Gal4 class motif but also does well on expression data. Retain as potential dimer motif, although it may also incorporate extrinsic information. |
V |
YPL202C |
AFT2 |
389 |
High |
|
All motifs look similar. ChIP-chip motif 389 scores high on ChIP-chip data and also best on expression data. |
V |
YCR106W |
RDS1 |
506 |
High |
|
All motifs look similar. PBM motif 506 has a higher score on ChIP-chip than any of the ChIP-chip derived motifs. |
V |
YDR096W |
GIS1 |
562 |
High |
|
All motifs similar; PBM motif 562 has highest correspondence to deletion expression data and overexpression data |
V |
YPL128C |
TBF1 |
2178 |
High |
|
All motifs, obtained by three different means, are very similar, although there is no ChIP or expression support for any of them. Went with 2178, which is the BEEML output. |
V |
YLR013W |
GAT3 |
2128 |
High |
|
All PBM motifs look similar, also similar to a subset of other GATAs. 2128 scores quite highly on ChIP-chip (albeit with negative correlation!), and also higher on expression and OE data. |
V |
YOR344C |
TYE7 |
397 |
High |
|
All studies except one get canonical HLH motif. 795 (PBM) is nearly tied for best ChIP-chip score with the best ChIP-chip motif. Still, ChIP motif 397 scores higher, and looks identical, but with fewer flanking empty positions. |
V |
YDL056W |
MBP1 |
2138 |
High |
|
Almost all motifs look similar to the literature binding site. PBM motif 2138 scores at the top on ChIP-chip and expression, and is non-circular. |
V |
YFR034C |
PHO4 |
2222 |
High |
|
Almost all motifs match classic HLH E-box. PBM motif 2222 has highest match to both ChIP-chip and expression data, without being circular. |
V |
YDL170W |
UGA3 |
651 |
High |
|
Appears to be a dimeric GAL4-class motif. Scores highest in ChIP-chip data, but is derived from the same data. GO seems to match known function! |
V |
YDL170W |
UGA3 |
486 |
Medium |
|
Appears to be a monomeric GAL4-class motif. Derived from PBM data, scores highly in ChIP-chip data, but not as high as the dimeric site derived from the ChIP-chip data. |
V |
YHR056C |
RSC30 |
2164 |
Medium |
|
Arbitrary choice - all PBM motifs look similar (and resemble motif from homolog Rsc3). I have downgraded this one from high to medium because the best scoring motif actually looks the least like the Rsc3 motif. |
V |
YLR266C |
PDR8 |
244 |
Medium |
|
Both motifs are equally credible but have very limited support. Literature motif is related to that of YRR1 literature motif. PBM motif, however, is a classic GAL4 monomer. This is the literature motif. |
V |
YLR266C |
PDR8 |
528 |
Medium |
|
Both motifs are equally credible but have very limited support. Literature motif is related to that of YRR1 literature motif. PBM motif, however, is a classic GAL4 monomer. This is the PBM motif. |
V |
YML099C |
ARG81 |
1506 |
High |
|
ChIP motif 1506 correlates well with ChIP and also with expression data. Resembles dimeric GAL4 class motif. |
V |
YFL044C |
OTU1 |
1166 |
Low |
|
ChIP-chip motif 1166 has a good relationship to ChIP-chip data, but it is unusual for C2H2 motifs to be A/T rich, and there is no other support for this motif, so it could be a cofactor, nucleosome-excluding, TATA element, etc. In addition, the protein has only a single C2H2 domain, and is known to function as a deubiquitylation enzyme. Low confidence. |
V |
YPL248C |
GAL4 |
1510 |
High |
|
ChIP-chip motif 1510 resembles literature motif, and PBM motif 875, but scores highly on ChIP and expression data, across the board. Note, however, that the high ChIP-chip scores stem from an experiment with high negative correlation. PBM motif 2206 appears to be a monomeric version, and scores even higher on ChIP-chip and expression. |
V |
YPL248C |
GAL4 |
2206 |
High |
|
ChIP-chip motif 1510 resembles literature motif, and PBM motif 875, but scores highly on ChIP and expression data, across the board. Note, however, that the high ChIP-chip scores stem from an experiment with high negative correlation. PBM motif 2206 appears to be a monomeric version, and scores even higher on ChIP-chip and expression. |
V |
YLR014C |
PPR1 |
2064 |
Low |
|
ChIP-chip motif 2064 almost matches the literature site, which has been confirmed by directed experimentation, and scores highest on most measures. But, give it low confidence - it is not at all clear that this is an optimal binding site, and none of the scores for any of the motifs are all that high. |
V |
YMR016C |
SOK2 |
404 |
High |
|
ChIP-chip motif 404 has highest correspondence to both ChIP-chip and expression data - and strongly resembles PBM motif |
V |
YHL027W |
RIM101 |
600 |
High |
|
ChIP-chip motif 600 is almost identical to PBM motif 513, but scores slightly higher on expression data. Three of six motifs are very similar. |
V |
YFL021W |
GAT1 |
962 |
High |
|
ChIP-chip motif 962 scores higher on both ChIP-chip and expression data |
V |
YPR104C |
FHL1 |
2203 |
High |
|
ChIP-chip motifs are all Rap1. PBMs identify a different motif which also corresponds to ChIP-chip data. Selected 2203 as it scores highest on ChIP-chip and expression data. |
V |
YIR018W |
YAP5 |
777 |
High |
|
ChIP-chip yields a classic 7-mer Yap motif that scores well on ChIP and significantly on expression. |
V |
YHL009C |
YAP3 |
672 |
High |
|
ChIP-chip yields a classic 7-mer Yap motif that scores well on ChIP and significantly on expression. Could be a heterodimer. Chose 672 over 1463 because it has a higher score on expression data, which is independent. |
V |
YDR451C |
YHP1 |
716 |
High |
|
ChIP-chip, EMSA, and one-hybrid all arrive at a classic homeodomain TAATTG motif. Microarray enrichment motif (716) scores higher on OE data from another study than ChIP motifs do, and does nearly as well on ChIP data. |
V |
YJR060W |
CBF1 |
1346 |
High |
|
Classic E-box. MITOMI motif 1346 has nearly the highest correspondence to ChIP-chip data and is non-circular; no other supporting data. |
V |
YIL131C |
FKH1 |
2002 |
High |
|
Classic Forkhead motif for most of them. 2002 strongly resembles PBM motif but scores higher on both ChIP (which is circular) and expression (which is not). |
V |
YJL110C |
GZF3 |
2133 |
High |
|
Classic GATA motif 2133 from PBM scores highest on ChIP-chip and expression data |
V |
YOR162C |
YRR1 |
2245 |
High |
|
Classic monomeric GAL4-class motif. PBM studies agree and score significantly on Harbison data. No other motifs have spacing/orientation except 11909958, but even the authors of this study note that "Only half a dyad seems to be conserved in this consensus sequence". 2245 scores highest in Harbison data. |
V |
YDR423C |
CAD1 |
2098 |
High |
|
Classic YAP motif in most cases. Include examples of both overlapping and adjacent monomeric sites - there are examples of both in PBM data and they both score highly on ChIP data. This one is adjacent. |
V |
YDR423C |
CAD1 |
2073 |
High |
|
Classic YAP motif in most cases. Include examples of both overlapping and adjacent monomeric sites - there are examples of both in PBM data and they both score highly on ChIP data. This one is overlapping. |
V |
YLR176C |
RFX1 |
1478 |
Medium |
|
Curious case - virtually all motifs are similar in appearance, with a common TGGCAAC core. They range from what appear to be monomers to full dimers, with multiple partial forms. However, none of them scores highly on both ChIP-chip and expression data. Select two representatives: one that scores well on ChIP-chip, and one that scores well on expression. This is the one that scores most highly on ChIP-chip. It is a dimer motif. Give medium confidence, since it has little relationship to expression data. |
V |
YLR176C |
RFX1 |
496 |
Medium |
|
Curious case - virtually all motifs are similar in appearance, with a common TGGCAAC core. They range from what appear to be monomers to full dimers, with multiple partial forms. However, none of them scores highly on both ChIP-chip and expression data. Select two representatives: one that scores well on ChIP-chip, and one that scores well on expression. This is the one that scores most highly on expression (close 2nd on deletion and 1st on overexpression). It is the only purely monomeric motif. Give medium confidence, since according to the literature this protein should bind as a dimer. |
V |
YGL162W |
SUT1 |
673 |
Medium |
|
Four motifs, all derived from ChIP-chip, contain CGG, but are unusual, with degeneracy and a core of CGGGG. Correlate somewhat with both OE and deletion data, however. |
V |
YGL073W |
HSF1 |
1461 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the dimeric head-to-tail site. From ChIP and prior. |
V |
YGL073W |
HSF1 |
476 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the monomeric site. From PBM. |
V |
YGL073W |
HSF1 |
411 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the spaced direct repeat dimeric site. From ChIP. |
V |
YGL073W |
HSF1 |
615 |
Medium |
|
Four types of motifs contain TTC monomeric core and all score highly on both ChIP and expression. Appear to represent different monomeric/multimeric binding configurations. This is the trimeric site. From ChIP. |
V |
YPL075W |
GCR1 |
2071 |
High |
|
Gcr2 is not a DNA-binding protein. SGD: "Gcr1p is a DNA-binding protein interacting with the consensus sequence CTTCC, whereas Gcr2p interacts with Gcr1p". But, ChIP-chip motif 606 is probably the best Gcr1 motif available (even though it came from Gcr2 ChIP). |
V |
YKL043W |
PHD1 |
393 |
High |
|
High-scoring motifs are all similar, with characteristic APSES GC core and palindromic. PBM motifs score highest on ChIP-seq data, while ChIP-chip motif 393 (which contains flanking G/C residues) scores highest on expression data. Retain both - possibly, the rest of the protein contributes to binding flanking residues. This is the ChIP motif that scores highest on expression data. |
V |
YKL043W |
PHD1 |
2153 |
High |
|
High-scoring motifs are all similar, with characteristic APSES GC core and palindromic. PBM motifs score highest on ChIP-seq data, while ChIP-chip motif 393 (which contains flanking G/C residues) scores highest on expression data. Retain both - possibly, the rest of the protein contributes to binding flanking residues. This is the higher-scoring PBM motif (2153). |
V |
YLR131C |
ACE2 |
1332 |
High |
|
Highest-scoring ChIP-chip motif is Rap1 site. MITOMI motif 1332 is next, and resembles the classic Swi5/Ace2 motif. |
V |
YPR009W |
SUT2 |
2236 |
High |
|
Highest-scoring motif (PBM) is a classical GAL4-type monomeric motif and is very significant in ChIP-chip |
V |
YKR064W |
OAF3 |
0 |
|
|
I do not see how either of these motifs could possibly be a Gal4-class binding motif. And, there is no correspondence to any of the data, even the ChIP-chip data from which it is derived. |
V |
YDL020C |
RPN4 |
1700 |
High |
|
In vitro motifs do not contain the TTT sequence on the end. But they were derived from the DBD only. The rest of the protein may contribute to binding the TTT segment. Motif 1700 has the highest correspondence to ChIP-chip and expression and GO. |
V |
YOL108C |
INO4 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |
V |
YDR123C |
INO2 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |
V |
YIR017C |
MET28 |
0 |
|
|
Like MET4, component of a complex. SGD: "Basic leucine zipper (bZIP) transcriptional activator in the Cbf1p-Met4p-Met28p complex" ... "Both Met4p and Met28p bind to DNA only in the presence of Cbf1p, and the presence of Cbf1p and Met4p stimulates the binding of Met28p to DNA (1, 2)." ChIP-chip motif 703 (CTGTGG) is clearly the Met31/32 motif. The other ChIP-chip motif is essentially poly-A, and scores poorly. Hence, neither of these motifs represents the intrinsic sequence specificity of MET28. Need in vitro data for complexes. |
V |
YDL106C |
PHO2 |
1680 |
Incorrect |
|
Likely represents Abf1 binding site. |
V |
YPL089C |
RLM1 |
1079 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YOR372C |
NDD1 |
366 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YML099C |
ARG81 |
1507 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YKL185W |
ASH1 |
648 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YKL185W |
ASH1 |
932 |
Incorrect |
|
Likely represents Mcm1 binding site. |
V |
YPR104C |
FHL1 |
406 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YPR104C |
FHL1 |
893 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YPR104C |
FHL1 |
1618 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YPR104C |
FHL1 |
629 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YPR104C |
FHL1 |
1196 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YPR104C |
FHL1 |
1504 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
1710 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
1100 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
357 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR403W |
SFP1 |
621 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR131C |
ACE2 |
918 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YLR098C |
CHA4 |
1607 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YIR018W |
YAP5 |
896 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YGL013C |
PDR1 |
899 |
Incorrect |
|
Likely represents Rap1 binding site. |
V |
YMR053C |
STB2 |
710 |
Incorrect |
|
Likely represents Reb1 binding site. |
V |
YDL020C |
RPN4 |
1090 |
Incorrect |
|
Likely represents Reb1 binding site. |
V |
YLR256W |
HAP1 |
2078 |
High |
|
Literature binding site is direct CGG repeats with a 6bp spacer (PMID: 7958882). PBM motif 2078 gets this; it scores highest overall, including significant scores on both ChIP-chip and expression. |
V |
YMR021C |
MAC1 |
1540 |
High |
|
Literature motif 1540 most closely corresponds to ChIP-chip data (albeit barely significant). Nothing else to gauge by, but no reason to doubt literature motif. |
V |
YER045C |
ACA1 |
8 |
Medium |
|
Literature motif 8 is supported by experimental investigation, and resembles a bZIP site, but has no other support; motif was not obtained objectively. Can bind as heterodimer. The highest-scoring motif (from ChIP, 1457) has low information content - I'm concerned it is learning other features of bound promoters. |
V |
YCL058C |
FYV5 |
1417 |
Low |
|
Literature motif is derived from a single site in a single promoter. While the protein seems to have some DNA-binding activity, perhaps in conjunction with other TFs, I find the evidence supporting this precise binding site incomplete. Hence, low confidence in the motif. |
V |
YPR008W |
HAA1 |
1425 |
Medium |
|
Literature motif is not completely determined, but scores highly on ChIP-chip data. Regardless, medium confidence. |
V |
YGL254W |
FZF1 |
69 |
Low |
|
Literature motif is the only one that appears credible. I believe the PBM motif is a known artifact. Literature motif gets low confidence, however, as it is based on a single known binding sequence. |
V |
YML065W |
ORC1 |
1549 |
High |
|
Looks like the ORC1 motif. ORC1 is not really a TF, but it is a sequence-specific DNA-binding protein. |
V |
YPL177C |
CUP9 |
2121 |
High |
|
MITOMI and PBM motifs are similar. PBM motif 2121 has slightly lower correspondence to ChIP data, but more significant correspondence to expression data. |
V |
YKR034W |
DAL80 |
1355 |
Medium |
|
MITOMI motif 1355 has highest correspondence to ChIP-chip. But it is not striking; none of them are, despite the fact that this is a classic GATA site (GATAAG). |
V |
YOL116W |
MSN1 |
1376 |
High |
|
MITOMI motif 1376 has the highest correspondence to ChIP-chip. MITOMI motif 1378 is very close, however, and seems to be a circular permutation. Retain both motifs. |