|
|
|
|
|
|
|
V |
MAL63 |
|
136 |
Medium |
|
This is an unconventional dimeric GAL4-class motif |
V |
MATA1 |
|
0 |
|
|
Need to study the literature more carefully and consult experts, but at first glance none of these motifs seems right. |
V |
MATA1-MATALPHA2-dimer |
a1-alpha2-dimer |
1436 |
Medium |
|
Not clear that motif is optimal. |
V |
MATALPHA1-MCM1-dimer |
alpha1-MCM1-dimer |
1442 |
Medium |
|
Not clear that motif is optimal. |
V |
MBP1-SWI6-dimer |
MBP1-SWI6-dimer |
0 |
|
|
Redundant with MBP1 |
V |
TBP-TFIIA |
TBP-TFIIA |
1328 |
Low |
|
The TIRF-PBM data used to generate the motif included only 96 sequences. Also it is curious that there is no TATA sequence in the logo. |
V |
TBP-TFIIA-TFIIB |
TBP-TFIIA-TFIIB |
1330 |
Medium |
|
The TIRF-PBM data used to generate the motif included only 96 sequences; hence, medium confidence. |
V |
TBP-TFIIB |
TBP-TFIIB |
1329 |
Medium |
|
The TIRF-PBM data used to generate the motif included only 96 sequences; hence, medium confidence. |
V |
YAL051W |
OAF1 |
2060 |
Medium |
|
Motif 2060 has a strong resemblance to the literature motifs for the Oaf1-Pip2 dimer, and scores highly on both ChIP and expression data. There is no in vitro support and the motif looks weak; hence Medium confidence. |
V |
YBL003C |
HTA2 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YBL005W |
PDR3 |
2062 |
Medium |
|
MITOMI yields a simple GAL4 monomeric site that scores well in ChIP-chip data. ChIP-chip yields a dimeric site that resembles the literature site. In vivo, PDR1 and PDR3 may form heterodimers. Retain both. This is the dimeric ChIP-chip motif. |
V |
YBL005W |
PDR3 |
1387 |
Medium |
|
MITOMI yields a simple GAL4 monomeric site that scores well in ChIP-chip data. ChIP-chip yields a dimeric site that resembles the literature site. In vivo, PDR1 and PDR3 may form heterodimers. Retain both. This is the monomeric motif. |
V |
YBL008W |
HIR1 |
0 |
|
Dubious |
Hir1,2,3 are a nucleosome assembly complex, not TFs |
V |
YBL021C |
HAP3 |
695 |
High |
|
Subunit of the heme-activated, glucose-repressed Hap2/3/4/5 CCAAT-binding complex - there should be a single motif for all four proteins, containing CCAAT. ChIP-chip motif 695 resembles CCAATCA, and scores highly on ChIP-chip, OE, and deletion expression data. |
V |
YBL052C |
SAS3 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YBL054W |
TOD6 |
852 |
High |
|
Two PBM motifs largely agree; 852 has higher correspondence to expression data while 495 has higher correspondence to ChIP-chip. Use 852; its score is much higher, and it also scores higher for GO. |
V |
YBL103C |
RTG3 |
870 |
Low |
|
Only the PBM motif is a classic HLH motif. Three different ChIP-chip-derived motifs are all diverse, but all score highly on ChIP-chip data! Are they motifs of other TFs? Check. 602: GCN4; 1095: TEC1; 1096: resembles 602, but is a closer match to CUP9/TOS8. Also hits GCN4. According to the literature (PMID: 9032238) the core binding site for the Rtg1p-Rtg3p heterodimer is 5'-GGTCAC-3'; the only motif that resembles this is 1446. Vague resemblance to 602 and 1096. I am going to retain 1446, which represents the literature site; PBM motif 870, which resembles an E-box; and ChIP-chip motif 1445, which scores highest on ChIP-chip data. But give all low confidence. |
V |
YBL103C |
RTG3 |
1445 |
Low |
|
Only the PBM motif is a classic HLH motif. Three different ChIP-chip-derived motifs are all diverse, but all score highly on ChIP-chip data! Are they motifs of other TFs? Check. 602: GCN4; 1095: TEC1; 1096: resembles 602, but is a closer match to CUP9/TOS8. Also hits GCN4. According to the literature (PMID: 9032238) the core binding site for the Rtg1p-Rtg3p heterodimer is 5'-GGTCAC-3'; the only motif that resembles this is 1446. Vague resemblance to 602 and 1096. I am going to retain 1446, which represents the literature site; PBM motif 870, which resembles an E-box; and ChIP-chip motif 1445, which scores highest on ChIP-chip data. But give all low confidence. |
V |
YBL103C |
RTG3 |
1446 |
Low |
|
Only the PBM motif is a classic HLH motif. Three different ChIP-chip-derived motifs are all diverse, but all score highly on ChIP-chip data! Are they motifs of other TFs? Check. 602: GCN4; 1095: TEC1; 1096: resembles 602, but is a closer match to CUP9/TOS8. Also hits GCN4. According to the literature (PMID: 9032238) the core binding site for the Rtg1p-Rtg3p heterodimer is 5'-GGTCAC-3'; the only motif that resembles this is 1446. Vague resemblance to 602 and 1096. I am going to retain 1446, which represents the literature site; PBM motif 870, which resembles an E-box; and ChIP-chip motif 1445, which scores highest on ChIP-chip data. But give all low confidence. |
V |
YBR033W |
EDS1 |
2093 |
High |
|
PBM and ChIP-chip motifs are very similar. PBM motif 2093 scores most significantly on ChIP data. Classic GAL4 class motif. |
V |
YBR049C |
REB1 |
907 |
High |
|
All motifs are similar. ChIP-chip motif 907 has highest correspondence to both ChIP-chip and expression data, and strongly resembles MITOMI and PBM motifs. |
V |
YBR060C |
ORC2 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YBR066C |
NRG2 |
1383 |
High |
|
MITOMI motif 1383 looks like a classic yeast C2H2 binding site (row of G's). Also resembles motifs obtained by both ChIP and PBMs for related protein Nrg1. |
V |
YBR083W |
TEC1 |
815 |
High |
|
All motifs agree, and are significant by several criteria. PBM motif 815 has the second-highest scores overall, and it is non-circular for in vivo binding. Also has highest GO score. |
V |
YBR089C-A |
NHP6B |
792 |
Medium |
|
NHP6A and NHP6B are similar to the HMGB family, which is thought to lack sequence specificity. However, the proteins do bend the DNA when they bind, and so may have some level of sequence specificity. Essentially similar motifs were obtained for the two different proteins (in the same study) and the PBM motif for Nhp6A has a good correspondence to ChIP-chip data. Give both Medium confidence. |
V |
YBR150C |
TBS1 |
552 |
High |
|
Two motifs from PBMs are nearly identical GAL4-class motifs with defined spacing and orientation. Motif 552 has slightly higher scores. Two motifs from BEEML analysis of PBM data give a monomeric motif; also give this high confidence. |
V |
YBR150C |
TBS1 |
2179 |
High |
|
Two motifs from PBMs are nearly identical GAL4-class motifs with defined spacing and orientation. Motif 552 has slightly higher scores. Two motifs from BEEML analysis of PBM data give a monomeric motif; also give this high confidence. |
V |
YBR182C |
SMP1 |
864 |
Medium |
|
PBM motif 864 scores highest on ChIP-chip and expression data. I gave it a medium, however, because it has low information content at most positions, does not closely match the literature motif (although the literature motif does not match ChIP-chip or expression data), and also does not resemble that of RLM1, which according to the literature should be related. |
V |
YBR239C |
ERT1 |
2188 |
Medium |
|
Three PBM motifs are all classic monomeric GAL4 motifs. Chose 2188 because it has fewer noninformative flanking positions, and higher significance on expression data. Also, 826 has the CCGG core that I suspect may be an artefact of PBMs or the DBD clones used in these studies. The highest-scoring ChIP motif is circular and does not resemble a GAL4 class binding site. |
V |
YBR240C |
THI2 |
1449 |
High |
|
This is a GAL4-class protein. All motifs are ChIP-chip derived, and none resembles any other. 1449 is the only one with respectable scores on ChIP and expression, and it also has the appearance of a GAL4-class motif, although the structural prior presumably forces it to have this property. |
V |
YBR267W |
REI1 |
489 |
High |
|
PBM motif looks like a yeast C2H2 motif (row of C's); highly significant relationship to ChIP-chip data |
V |
YBR297W |
MAL33 |
0 |
|
|
None of the ChIP-chip motifs correspond well to the data they come from and/or resemble a GAL4 motif. |
V |
YCL055W |
KAR4 |
127 |
Low |
Dubious |
Evidence for sequence specific DNA binding seems weak, hence low confidence |
V |
YCL058C |
FYV5 |
1417 |
Low |
|
Literature motif is derived from a single promoter and while the protein seems to have some DNA-binding activity, perhaps in conjunction with other TFs, I find the evidence supporting this precise binding site incomplete, since it is derived from a single site. Hence, low confidence in the motif. |
V |
YCL067C |
HMLALPHA2 |
2102 |
Medium |
|
Protein is similar to PBX/MEIS/TGIF; both PBM motifs have some similarity (central ACA/TGT), as do the sites in the crystal structure and in vivo (e.g. PMID: 1682054), but there is no clear winner between the two. Keep both PBM motifs in curated set (2102 and 2079) but give medium confidence - no supporting ChIP or expression data. |
V |
YCL067C |
HMLALPHA2 |
2079 |
Medium |
|
Protein is similar to PBX/MEIS/TGIF; both PBM motifs have some similarity (central ACA/TGT), as do the sites in the crystal structure and in vivo (e.g. PMID: 1682054), but there is no clear winner between the two. Keep both PBM motifs in curated set (2102 and 2079) but give medium confidence - no supporting ChIP or expression data. |
V |
YCR018C |
SRD1 |
2232 |
Medium |
|
PBM studies yield nearly identical motifs. 2232 closely resembles motif from related GATA factors and scores highest overall. This is an unusual motif for the GATA class; hence medium confidence level. |
V |
YCR033W |
SNT1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YCR039C |
MATALPHA2 |
1364 |
High |
|
According to PMID: 9858582, "A comparison of the 2 binding sites in both asg and hsg operators yields the same consensus sequence, 5'-CATGTA-3'"; results in Figure 2 of the same paper support a consensus of CATGTAA. MITOMI yields ACATG, which is the reverse complement of most of the literature consensus. Motif 1364 has highest information content; use this. |
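The reverse-complement relationship noted for MATALPHA2 can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `reverse_complement` is hypothetical, and the three sequences are the ones quoted in the note above, treated as plain strings rather than as weight matrices.

```python
# Complement table for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

literature_short = "CATGTA"   # consensus quoted from PMID: 9858582
literature_long = "CATGTAA"   # consensus supported by Figure 2 of that paper
mitomi = "ACATG"              # MITOMI-derived core

# The MITOMI core should appear inside the reverse complement
# of either form of the literature consensus.
for consensus in (literature_short, literature_long):
    rc = reverse_complement(consensus)
    print(consensus, rc, mitomi in rc)
```

A substring test like this only compares consensus strings; comparing the underlying matrices would need a proper motif-similarity measure.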
V |
YCR040W |
MATALPHA1 |
1418 |
Low |
|
According to PMID: 15118075, binds the "Q site" which has "consensus" ACAATGACAG. Seems all that is in common is the CAAT. I believe further study is required. |
V |
YCR065W |
HCM1 |
570 |
High |
|
PBM and SAAB/EMSA motifs both look similar to standard FH motif. PBM motif 570 has stronger correspondence to expression data. |
V |
YCR066W |
RAD18 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YCR096C |
HMRA2 |
558 |
Medium |
|
Should be similar to MATALPHA2. The one PBM motif is indeed related to the MITOMI motif for MATALPHA2. |
V |
YCR106W |
RDS1 |
506 |
High |
|
All motifs look similar. PBM motif 506 has a higher score on ChIP-chip than any of the ChIP-chip derived motifs. |
V |
YDL002C |
NHP10 |
502 |
Low |
Dubious |
NHP10 is an HMGB-type protein. Known to prefer DNA ends. There is no independent support for the single PBM motif. |
V |
YDL020C |
RPN4 |
1700 |
High |
|
In vitro motifs do not contain the TTT sequence on the end. But they were derived from the DBD only. The rest of the protein may contribute to binding the TTT segment. Motif 1700 has the highest correspondence to ChIP-chip and expression and GO. |
V |
YDL020C |
RPN4 |
1090 |
Incorrect |
|
Likely represents Reb1 binding site. |
V |
YDL042C |
SIR2 |
0 |
|
Dubious |
There is no evidence that the SIR proteins are sequence-specific DNA-binding proteins. Most of the motifs for them are Rap1 sites. |
V |
YDL048C |
STP4 |
559 |
Medium |
|
STP3 and 4 have very similar DNA-binding domains. However, they are not similar to those of STP1 and 2; the next most closely related are SWI5 and ACE2, with major differences in the recognition alpha helices. All of the STP4 motifs are different from each other and none have any supporting data. There is only one motif for STP3 (568) from PBM and it matches the STP4 motif from the same study (559) which is the basis for choosing these two motifs. |
V |
YDL056W |
MBP1 |
2138 |
High |
|
Almost all motifs look similar to the literature binding site. PBM motif 2138 scores at the top on ChIP-chip and expression, and is non-circular. |
V |
YDL074C |
BRE1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDL106C |
PHO2 |
1680 |
Incorrect |
|
Likely represents Abf1 binding site. |
V |
YDL106C |
PHO2 |
2154 |
High |
|
Motifs are largely all different from each other. PBM motif 2154 scores highly on ChIP data and resembles classic TAAT homeobox core. Note that PBM motif 794 even more strongly resembles homeobox (TAATTA) but scores slightly less highly. |
V |
YDL166C |
FAP7 |
0 |
|
Dubious |
This is supposed to be a ribosome biogenesis factor. I found no evidence that it is a sequence-specific DNA-binding protein. |
V |
YDL170W |
UGA3 |
651 |
High |
|
Appears to be a dimeric GAL4-class motif. Scores highest in ChIP-chip data, but is derived from the same data. GO seems to match known function! |
V |
YDL170W |
UGA3 |
486 |
Medium |
|
Appears to be a monomeric GAL4-class motif. Derived from PBM data, scores highly in ChIP-chip data, but not as high as the dimeric site derived from the ChIP-chip data. |
V |
YDR009W |
GAL3 |
0 |
|
Dubious |
Gal3 is not a sequence-specific DNA-binding protein |
V |
YDR026C |
|
696 |
High |
|
Three ChIP-chip motifs are virtually identical in appearance; resemble Reb1 motifs; high correspondence to ChIP-chip data |
V |
YDR034C |
LYS14 |
865 |
High |
|
PBM motifs are virtually identical and appear monomeric; literature motif is dimeric. Include both. Choose PBM motif 865 as it appears to have more robust CGG. |
V |
YDR034C |
LYS14 |
133 |
High |
|
PBM motifs are virtually identical and appear monomeric; literature motif is dimeric. Include both. Choose PBM motif 865 as it appears to have more robust CGG. |
V |
YDR043C |
NRG1 |
2148 |
High |
|
PBM, ChIP-chip, and literature motifs all appear very similar, and resemble motif for the related protein NRG2. Choose top PBM motif (2148). There is also a recurring ChIP-chip motif (TGTGCCT) which I believe is actually the MOT3 binding site. |
V |
YDR049W |
|
0 |
|
Dubious |
No evidence this is a TF, aside from a poorly-scoring C2H2 zinc finger |
V |
YDR081C |
PDC2 |
1050 |
Low |
|
Motifs do not correlate with the ChIP-chip data from which they were derived. I found no other experimental evidence that this is a sequence-specific DNA-binding protein. However, it does have HTH and transposase motifs. Retain motif 1050 but give low confidence. |
V |
YDR096W |
GIS1 |
562 |
High |
|
All motifs similar; PBM motif 562 has highest correspondence to deletion expression data and overexpression data |
V |
YDR123C |
INO2 |
713 |
High |
|
Ino2/4 binds as a heterodimer, so there should just be one motif for the two proteins. All motifs appear similar but none of them is derived from in vitro data. Nonetheless most motifs match a classic E-box with some preference for flanking bases. Motif 713 is derived from ChIP-chip; it is not the highest-scoring ChIP-chip motif but it is highest for OE and deletion expression. |
V |
YDR146C |
SWI5 |
569 |
High |
|
PBM, ChIP-chip, and conservation all yield similar motifs. The ChIP-chip motif scores highest on ChIP-chip data, but that is circular. Choose PBM motif 569, which is nearly identical. |
V |
YDR169C |
STB3 |
2233 |
High |
|
STB3 binds the RRPE element (AAAAATTT) both in vivo and in vitro (PMID 17616518). PBM motifs 810 and 2233 strongly resemble the RRPE element, score significantly in deletion expression data, and nail the GO categories "nucleolus" and "ribosome biogenesis". 2233 gets slightly higher scores. |
V |
YDR174W |
HMO1 |
2249 |
Low |
|
This motif is uncharacteristic for a Sox protein and HMG proteins typically do not bind DNA in a sequence specific manner. Since it is from ChIP data it could be a cofactor motif. Low confidence. |
V |
YDR207C |
UME6 |
2239 |
High |
|
All motifs are similar to each other. BEEML-PBM motif 2239 scores highest across the board. |
V |
YDR213W |
UPC2 |
544 |
High |
|
The SRE is bound by UPC2 and the "canonical" sequence is TCGTATA. However, the more degenerate version obtained by PBM (motif 544) scores better in both expression analysis and OE experiments. Newer motif 2109 scores better on ChIP-chip, but lower on expression, and the SRE is well-characterized. I think this one deserves further experimental analysis. |
V |
YDR216W |
ADR1 |
576 |
High |
|
PBM motif 576 has significant correspondence to ChIP-chip data and the highest correspondence to expression data, and has a classic yeast C2H2 look. |
V |
YDR225W |
HTA1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR227W |
SIR4 |
0 |
|
Dubious |
There is no evidence that the SIR proteins are sequence-specific DNA-binding proteins. Most of the motifs for them are Rap1 sites. |
V |
YDR253C |
MET32 |
2140 |
High |
|
Most motifs look similar. PBM motif 2140 has highest correspondence to both ChIP and expression. |
V |
YDR259C |
YAP6 |
599 |
High |
|
PBM and ChIP-chip can derive basically the same motif, which is a classical YAP motif. They score similarly on all criteria. The ChIP-chip motif (599) has fewer low-information flanking bases. |
V |
YDR266C |
|
1161 |
Low |
|
Motifs from ChIP-chip do not correspond to ChIP-chip, and there is no other supporting data. Chose 1161 only because it looks more reasonable. Low confidence. |
V |
YDR277C |
MTH1 |
0 |
|
Dubious |
SGD: "interacts with Rgt1p and the Snf3p and Rgt2p glucose sensors". There is no evidence that this is a sequence-specific transcription factor. |
V |
YDR303C |
RSC3 |
580 |
High |
|
PBM motif 580 has the best correspondence to expression data (the only significant independent criterion), considering that its correlations are all in the same orientation (those for 2165 are not). All motifs look similar. The longer motifs could be due to multiple binding sites in the same sequence. |
V |
YDR310C |
SUM1 |
383 |
High |
|
This is the motif for the full-length (FL) SUM1; scores highest on ChIP-chip and resembles the canonical literature motif; also has some relationship to deletion expression data |
V |
YDR310C |
SUM1 |
478 |
High |
|
This is the motif for the SUM1 AT-hook; scores highest in deletion expression data |
V |
YDR323C |
PEP7 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR362C |
TFC6 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR409W |
SIZ1 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR421W |
ARO80 |
2115 |
High |
|
PBM motif 2115 appears monomeric and has highest correspondence to ChIP-chip data. ChIP motif 1509 appears dimeric and correlates with ChIP data. Literature motif 725 appears trimeric and has experimental support. Retain all three. |
V |
YDR421W |
ARO80 |
1509 |
High |
|
PBM motif 2115 appears monomeric and has highest correspondence to ChIP-chip data. ChIP motif 1509 appears dimeric and correlates with ChIP data. Literature motif 725 appears trimeric and has experimental support. Retain all three. |
V |
YDR421W |
ARO80 |
725 |
High |
|
PBM motif 2115 appears monomeric and has highest correspondence to ChIP-chip data. ChIP motif 1509 appears dimeric and correlates with ChIP data. Literature motif 725 appears trimeric and has experimental support. Retain all three. |
V |
YDR423C |
CAD1 |
2098 |
High |
|
Classic YAP motif in most cases. Include examples of both overlapping and adjacent monomeric sites - there are examples of both in PBM data and they both score highly on ChIP data. This one is adjacent. |
V |
YDR423C |
CAD1 |
2073 |
High |
|
Classic YAP motif in most cases. Include examples of both overlapping and adjacent monomeric sites - there are examples of both in PBM data and they both score highly on ChIP data. This one is overlapping. |
V |
YDR448W |
ADA2 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR451C |
YHP1 |
716 |
High |
|
ChIP-chip, EMSA, and one-hybrid all arrive at a classic homeodomain TAATTG motif. Microarray enrichment motif (716) scores higher on OE data from another study than ChIP motifs do, and does nearly as well on ChIP data. |
V |
YDR463W |
STP1 |
660 |
High |
|
STP1 and 2 have very similar DNA-binding domains. However, they are not similar to those of STP3 and 4. PBM motif for STP2 (800) correlates with ChIP-chip and expression data. ChIP-chip motif for STP1 (660) most strongly resembles motif 800, and scores highly on ChIP-chip data. In addition, these motifs resemble halfmers of literature-derived binding sites. |
V |
YDR477W |
SNF1 |
1110 |
Medium |
Dubious |
Motif 1110 has a quite strong correspondence to ChIP-chip data (from which it is derived). However, there seems to be no evidence that this is a sequence-specific DNA-binding protein. Aside from a weak relationship to expression data there is no corroborating evidence here (and no DNA-binding domain). |
V |
YDR485C |
VPS72 |
0 |
|
Dubious |
Unlikely to be true TF. |
V |
YDR520C |
URC2 |
553 |
High |
|
This is a monomeric GAL4-class motif. Two PBM studies essentially agree, and have some relationship to ChIP-chip data. No other informative data. |
V |
YEL009C |
GCN4 |
1363 |
High |
|
Virtually all motifs look the same. MITOMI motif 1363 is as good as any of the ChIP-chip motifs but not circular; scores high across the board. |
V |
YER028C |
MIG3 |
2144 |
High |
|
PBM motif 2144 has highest correspondence to ChIP-chip data |
V |
YER040W |
GLN3 |
539 |
High |
|
Most motifs are classic GATA or GATAAG. PBM motif 539 scores highest on ChIP. |
V |
YER045C |
ACA1 |
8 |
Medium |
|
Literature motif 8 is supported by experimental investigation, and resembles a bZIP site, but has no other support; motif was not obtained objectively. Can bind as heterodimer. The highest-scoring motif (from ChIP, 1457) has low information content - I'm concerned it is learning other features of bound promoters. |
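Several notes in this table weigh a motif's information content. One common way to quantify it, sketched below, is per-position information in bits against a uniform background. The function name `position_information` and the example matrix are hypothetical; the matrix is not any motif from this table, and no small-sample correction is applied.

```python
import math

def position_information(column):
    """Information content (bits) of one motif column.

    `column` holds non-negative weights for A, C, G, T.
    Assumes a uniform background, so the maximum is 2 bits.
    """
    total = sum(column)
    probs = [w / total for w in column]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 - entropy

# Hypothetical 5-position matrix (rows are positions; columns are A, C, G, T).
example_matrix = [
    [0.25, 0.25, 0.25, 0.25],  # uninformative flank
    [0.00, 1.00, 0.00, 0.00],  # invariant C
    [0.00, 0.00, 1.00, 0.00],  # invariant G
    [0.00, 0.00, 1.00, 0.00],  # invariant G
    [0.40, 0.10, 0.10, 0.40],  # weakly informative flank
]

per_position = [position_information(col) for col in example_matrix]
print([round(bits, 2) for bits in per_position])
print("total bits:", round(sum(per_position), 2))
```

Under this measure, a motif with low information at most positions carries few bits overall, which is one reason such a motif can score well by picking up general features of bound promoters rather than a specific site.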
V |
YER051W |
JHD1 |
662 |
Low |
Dubious |
This is a histone demethylase. No evidence for direct DNA binding. Motif 662 is significant. Include, but give low confidence - could be a cofactor. |
V |
YER063W |
THO1 |
0 |
|
Dubious |
Unlikely to be true TF. |