public class PWM extends java.lang.Object implements Binnable, java.lang.Comparable<PWM>, ProbCREModel
BASE_KEY tells which base is represented by each row in the pwm. You can also assign a sig score to each pwm.
Modifier and Type | Class and Description |
---|---|
private class |
PWM.StringDoublePair
Wrapper for a string and a double; used by expand().
|
Modifier and Type | Field and Description |
---|---|
static char[] |
BASE_KEY
Tells which base is represented by each row in the pwm.
|
static double |
DEFAULT_STRINGENCY
Default upper bound on the sum of -log(base_probability) over the bases in a sequence for that sequence to be output by
expandAsStrings() and expandAsMotifList(). |
static int |
LAPLACE
Used in relative entropy calculations.
|
static int |
MAX_EXPANSION |
static double |
PSEUDO
The pseudo count is added to every count when calculating the probabilities.
|
private int[][] |
pwm |
private boolean |
revComp |
private double |
score |
private double[][] |
scoringTable |
private double |
stringency |
Fields inherited from interface CREModel: DEFAULT_RC
Constructor and Description |
---|
PWM()
Creates an empty PWM; most methods will cause unchecked exceptions to be thrown.
|
PWM(Motif[] m,
MotifFinder mf)
Converts an array of unambiguous motifs of the same length and same reverse complement status into a position-weight matrix.
|
PWM(MotifList ml)
Same as PWM(String[]) except that it weights the counts by the score of each motif.
|
PWM(Motif m,
MotifFinder mf)
Converts a consensus motif into a position-weight matrix.
|
PWM(java.lang.String[] s)
Constructs a PWM from this array of strings.
|
PWM(java.lang.String[] s,
boolean rc)
Constructs a PWM from this array of strings.
|
Modifier and Type | Method and Description |
---|---|
int |
compareTo(PWM m)
Allows Arrays.sort() to sort PWMs by score in descending order; not consistent with
equals(Object) . |
double |
computeEntropy()
Returns the average entropy for each position in the pwm.
|
double |
computeEntropy(int position)
Returns the entropy at the given position.
|
double |
computeRelativeEntropy(int[] referenceWeights)
Returns the average relative entropy for each position in the pwm; the array of reference weights should correspond to the base
ordering defined in
BASE_KEY . |
double |
computeRelativeEntropy(int position,
int[] referenceWeights)
Returns the relative entropy at the given position; the array of reference weights should correspond to the base ordering defined in
BASE_KEY . |
double |
computeRelativeEntropy(int position,
PWM other)
Returns the relative entropy at the given position.
|
double |
computeRelativeEntropy(PWM other)
Returns the average relative entropy for each position in the pwm.
|
boolean |
equals(java.lang.Object o)
Not implemented yet.
|
private java.util.Vector<PWM.StringDoublePair> |
expand(double[][] probabilities,
int length,
MotifFinder mf)
Returns a vector of the strings and -log probabilities for the strings that pass the stringency test in the first length positions.
|
MotifList |
expandAsMotifList()
Returns a MotifList containing the unambiguous Motifs that this pwm represents; each motif's score is set to its -log (average base
probability) (its stringency).
|
MotifList |
expandAsMotifList(MotifFinder mf)
Returns a MotifList containing the unambiguous Motifs that this pwm represents; each motif's score is set to its -log (average base
probability) (its stringency).
|
java.lang.String[] |
expandAsStrings()
Returns an array of the strings of primary bases that this pwm represents.
|
java.lang.String[] |
expandAsStrings(MotifFinder mf)
Returns an array of the strings of primary bases that this pwm represents.
|
boolean |
generatesString(java.lang.String s)
Returns true if this string would be generated by this PWM at the current stringency and for the current scoringTable.
|
java.util.Comparator<Motif> |
getComparator()
Comparator sorts arrays from greatest to least by the
scoreString(String) method. |
int |
getCount()
Returns the number of motifs that went into making this pwm.
|
int |
getCount(char base,
int position) |
int |
getCount(int base,
int position) |
int[] |
getCounts(int position)
Returns the base counts in this position.
|
int[][] |
getFrequencyTable() |
MotifList |
getHits(java.lang.String seq)
Returns those subsequences of seq that match this pwm according to
generatesString(String) as a MotifList. |
Motif |
getMaxLikelihoodMotif()
Returns the most probable instantiation of this pwm.
|
java.lang.String |
getName()
Returns a name that is unique to this type of model.
|
double[][] |
getProbabilities()
Makes a matrix of probabilities instead of counts.
|
double[][] |
getProbabilities(double pseudo)
Makes a matrix of probabilities instead of counts.
|
double |
getScore()
Returns the score assigned to this pwm, as set by setScore(double).
|
double[][] |
getScoringTable() |
java.lang.String |
getSequence(Alphabet alphabet) |
double |
getStringency() |
double |
getValue()
Same as getScore().
|
int |
length() |
static PWM |
makeNeighborhoodPWM(Motif m,
MotifFinder mf)
Makes a PWM of the 1 hamming distance neighborhood of m in the motif finder mf.
|
private char |
maxBase(int idx)
Returns the most probable base at the given index.
|
CREModel |
newInstanceOf(MotifList ml)
Creates a new PWM as specified by the CREModel interface.
|
static PWM |
parsePWM(java.lang.String[] rows)
Converts a whitespace-delimited table into a position-weight matrix with the reverse complement flag set to
CREModel.DEFAULT_RC ,
and with stringency set to DEFAULT_STRINGENCY . |
static PWM |
parsePWM(java.lang.String[] rows,
boolean rc)
Converts a whitespace-delimited table into a position-weight matrix with the given reverse complement flag, and with stringency set to
DEFAULT_STRINGENCY . |
double |
probabilityOf(java.lang.String s)
Uses getProbabilities and revComp.
|
static double |
probabilityOf(java.lang.String s,
double[][] probabilities,
boolean useRevComp)
Computes the log_2 probability of s coming from probabilities.
|
static int |
rowIndexOf(char b)
Returns the index of the row in the pwm that corresponds to b.
|
Motif.ScoreData |
scoreString(java.lang.String s)
Returns the score of this string as computed from the scoringTable.
|
void |
setScore(double s) |
void |
setScoringTable()
Sets the scoring table to be the log of the probability of a given base at a given position.
|
void |
setScoringTable(double[] refProbs)
Sets the scoring table to be the log ratio of the pwm probabilities to the probabilities derived from the given frequencies.
|
void |
setStringency(double s)
Sets the stringency of this model.
|
java.lang.String |
toHTMLString(java.lang.String tableAttributes,
java.lang.String tdAttributes)
Returns an HTML table representing the pwm.
|
java.lang.String |
toString()
Returns a tab-delimited table representing the pwm.
|
boolean |
useRevComp()
Returns true if this pwm uses the reverse complement of the sequence.
|
public static final char[] BASE_KEY
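public static final double DEFAULT_STRINGENCY
Default upper bound on the sum of -log(base_probability) over the bases in a sequence for that sequence to be output by expandAsStrings() and expandAsMotifList().
public static final int LAPLACE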
public static final int MAX_EXPANSION
public static final double PSEUDO
private int[][] pwm
private boolean revComp
private double score
private double[][] scoringTable
private double stringency
PWM()
public PWM(Motif m, MotifFinder mf)
public PWM(Motif[] m, MotifFinder mf)
public PWM(MotifList ml)
public PWM(java.lang.String[] s)
public PWM(java.lang.String[] s, boolean rc)
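The String[] constructors turn equal-length sequences into per-position base counts. The following is a minimal standalone sketch of that idea, not the library's implementation. The class name `CountMatrixSketch`, the helper `countBases`, and the a,t,g,c row order (borrowed from the parsePWM note) are assumptions made for illustration.

```java
// Illustrative sketch only; this is not the library's PWM class.
final class CountMatrixSketch {
    // Assumed row order, borrowed from the parsePWM note "a,t,g,c".
    static final char[] BASES = {'a', 't', 'g', 'c'};

    // Maps a base to its row; case-insensitive, rejects anything else.
    static int rowIndexOf(char b) {
        char lower = Character.toLowerCase(b);
        for (int i = 0; i < BASES.length; i++) {
            if (BASES[i] == lower) {
                return i;
            }
        }
        throw new IllegalArgumentException("Not an unambiguous base: " + b);
    }

    // counts[row][position] = number of sequences with that base at that position.
    static int[][] countBases(String[] sites) {
        if (sites.length == 0) {
            throw new IllegalArgumentException("No sequences given");
        }
        int length = sites[0].length();
        int[][] counts = new int[BASES.length][length];
        for (String site : sites) {
            if (site.length() != length) {
                throw new IllegalArgumentException("All sequences must have the same length");
            }
            for (int pos = 0; pos < length; pos++) {
                counts[rowIndexOf(site.charAt(pos))][pos]++;
            }
        }
        return counts;
    }
}
```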
public static PWM makeNeighborhoodPWM(Motif m, MotifFinder mf)
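A Hamming-distance-1 neighborhood is the set of strings that differ from a seed at exactly one position. The sketch below enumerates such a neighborhood for a lowercase DNA string. Whether the library includes the seed itself, and how it filters candidates through the MotifFinder, is not stated in this page; the class `NeighborhoodSketch` and the decision to include the seed are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the library's makeNeighborhoodPWM.
final class NeighborhoodSketch {
    static final char[] BASES = {'a', 't', 'g', 'c'};

    // Returns the seed followed by every string that differs from it at one position.
    // Assumes the seed is lowercase and contains only a, t, g, c.
    static List<String> hammingOneNeighborhood(String seed) {
        List<String> out = new ArrayList<>();
        out.add(seed); // assumption: the seed belongs to its own neighborhood
        char[] chars = seed.toCharArray();
        for (int pos = 0; pos < chars.length; pos++) {
            char original = chars[pos];
            for (char b : BASES) {
                if (b == original) {
                    continue;
                }
                chars[pos] = b;
                out.add(new String(chars));
            }
            chars[pos] = original; // restore before moving to the next position
        }
        return out;
    }
}
```

A seed of length n yields 3n + 1 strings, which could then be counted column by column to form a matrix.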
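public static PWM parsePWM(java.lang.String[] rows)
Converts a whitespace-delimited table into a position-weight matrix with the reverse complement flag set to CREModel.DEFAULT_RC, and with stringency set to DEFAULT_STRINGENCY. The order is a,t,g,c.
public static PWM parsePWM(java.lang.String[] rows, boolean rc)
Converts a whitespace-delimited table into a position-weight matrix with the given reverse complement flag, and with stringency set to DEFAULT_STRINGENCY. The order is a,t,g,c.
public static double probabilityOf(java.lang.String s, double[][] probabilities, boolean useRevComp)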
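The static probabilityOf method is summarized as computing the log_2 probability of a string under a probability matrix. A minimal forward-strand sketch follows. How the library combines the forward and reverse-complement strands is not described here, so that part is left out; `LogProbabilitySketch` and the a,t,g,c row order are assumptions.

```java
// Illustrative sketch only; forward strand, no reverse-complement handling.
final class LogProbabilitySketch {
    static final String BASES = "atgc"; // assumed row order

    // Sum over positions of log2 p(base observed at that position).
    static double log2ProbabilityOf(String s, double[][] probabilities) {
        if (s.length() != probabilities[0].length) {
            throw new IllegalArgumentException("String length must equal matrix length");
        }
        double sum = 0.0;
        for (int pos = 0; pos < s.length(); pos++) {
            int row = BASES.indexOf(Character.toLowerCase(s.charAt(pos)));
            if (row < 0) {
                throw new IllegalArgumentException("Not an unambiguous base: " + s.charAt(pos));
            }
            // Change of base: log2(x) = ln(x) / ln(2).
            sum += Math.log(probabilities[row][pos]) / Math.log(2.0);
        }
        return sum;
    }
}
```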
public static int rowIndexOf(char b)
public int compareTo(PWM m)
Allows Arrays.sort() to sort PWMs by score in descending order; not consistent with equals(Object).
Specified by: compareTo in interface java.lang.Comparable<PWM>
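A descending natural order can be obtained by reversing the arguments of the score comparison. The class below is a hypothetical stand-in with only a name and a score, written to show that pattern with Arrays.sort(); it is not the library's PWM class.

```java
import java.util.Arrays;

// Hypothetical stand-in for a scored model; not the library's PWM class.
final class ScoredModelSketch implements Comparable<ScoredModelSketch> {
    final String name;
    final double score;

    ScoredModelSketch(String name, double score) {
        this.name = name;
        this.score = score;
    }

    // Reversed argument order makes larger scores sort first.
    @Override
    public int compareTo(ScoredModelSketch other) {
        return Double.compare(other.score, this.score);
    }

    // Returns a sorted copy, leaving the input array untouched.
    static ScoredModelSketch[] sortedByScore(ScoredModelSketch[] models) {
        ScoredModelSketch[] copy = models.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```

Because the ordering looks only at the score, two distinct models with equal scores compare as 0, which is why such an ordering is not consistent with equals.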
public double computeEntropy()
public double computeEntropy(int position)
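Per-position entropy can be illustrated with the usual Shannon formula, -sum(p * log p), averaged over columns for the whole matrix. The sketch below works on a probability matrix and reports bits (log base 2). The log base the library uses is not stated in this page, so base 2 and the class name `EntropySketch` are assumptions.

```java
// Illustrative sketch only; assumes log base 2 and rows indexed by base.
final class EntropySketch {
    // Shannon entropy of one column, treating 0 * log(0) as 0.
    static double entropy(double[][] probabilities, int position) {
        double h = 0.0;
        for (double[] row : probabilities) {
            double p = row[position];
            if (p > 0.0) {
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    // Mean of the per-column entropies.
    static double averageEntropy(double[][] probabilities) {
        int length = probabilities[0].length;
        double total = 0.0;
        for (int pos = 0; pos < length; pos++) {
            total += entropy(probabilities, pos);
        }
        return total / length;
    }
}
```

A uniform column over four bases gives 2 bits, and a column with a single certain base gives 0 bits.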
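public double computeRelativeEntropy(int position, int[] referenceWeights)
Returns the relative entropy at the given position; the array of reference weights should correspond to the base ordering defined in BASE_KEY.
public double computeRelativeEntropy(int position, PWM other)
public double computeRelativeEntropy(int[] referenceWeights)
Returns the average relative entropy for each position in the pwm; the array of reference weights should correspond to the base ordering defined in BASE_KEY.
public double computeRelativeEntropy(PWM other)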
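Relative entropy of a column against a background is conventionally the Kullback-Leibler divergence, sum(p * log(p / q)), where q is the background distribution. The sketch below normalizes integer reference weights into q and reports bits. The page says LAPLACE is used in relative entropy calculations but not how, so this sketch does not use it; `RelativeEntropySketch` and log base 2 are assumptions.

```java
// Illustrative sketch only; the library's use of LAPLACE is not modeled here.
final class RelativeEntropySketch {
    // column[i] is the probability of base i; referenceWeights[i] is its background weight.
    static double relativeEntropy(double[] column, int[] referenceWeights) {
        long total = 0;
        for (int w : referenceWeights) {
            total += w;
        }
        double d = 0.0;
        for (int i = 0; i < column.length; i++) {
            double p = column[i];
            if (p == 0.0) {
                continue; // 0 * log(0 / q) is treated as 0
            }
            double q = (double) referenceWeights[i] / total;
            // If q is 0 while p is positive, the result is positive infinity.
            d += p * (Math.log(p / q) / Math.log(2.0));
        }
        return d;
    }
}
```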
public boolean equals(java.lang.Object o)
Overrides: equals in class java.lang.Object
private java.util.Vector<PWM.StringDoublePair> expand(double[][] probabilities, int length, MotifFinder mf) throws ExpansionTooLargeException
Throws: ExpansionTooLargeException
public MotifList expandAsMotifList() throws ExpansionTooLargeException
Specified by: expandAsMotifList in interface CREModel
Throws: ExpansionTooLargeException
public MotifList expandAsMotifList(MotifFinder mf) throws ExpansionTooLargeException
Throws: ExpansionTooLargeException
public java.lang.String[] expandAsStrings() throws ExpansionTooLargeException
Specified by: expandAsStrings in interface CREModel
Throws: ExpansionTooLargeException
public java.lang.String[] expandAsStrings(MotifFinder mf) throws ExpansionTooLargeException
Throws: ExpansionTooLargeException
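Expansion under a stringency bound can be pictured as a depth-first enumeration that drops a prefix as soon as its accumulated -log(probability) exceeds the bound. The sketch below follows the DEFAULT_STRINGENCY description (a bound on the sum of -log base probabilities) and caps the output size. The natural log, the a,t,g,c row order, the class `ExpansionSketch`, and the use of IllegalStateException in place of ExpansionTooLargeException are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the library's expand() method.
final class ExpansionSketch {
    static final char[] BASES = {'a', 't', 'g', 'c'}; // assumed row order

    // Lists every string whose summed -ln(p) stays at or below the stringency bound.
    static List<String> expand(double[][] probabilities, double stringency, int maxExpansion) {
        List<String> out = new ArrayList<>();
        extend(probabilities, stringency, maxExpansion, new StringBuilder(), 0.0, out);
        return out;
    }

    private static void extend(double[][] p, double stringency, int maxExpansion,
                               StringBuilder prefix, double cost, List<String> out) {
        int pos = prefix.length();
        if (pos == p[0].length) {
            if (out.size() >= maxExpansion) {
                // Stand-in for the library's ExpansionTooLargeException.
                throw new IllegalStateException("Expansion exceeds " + maxExpansion + " strings");
            }
            out.add(prefix.toString());
            return;
        }
        for (int row = 0; row < BASES.length; row++) {
            // -ln(p) is never negative, so the cost only grows and pruning is safe.
            double next = cost - Math.log(p[row][pos]);
            if (next > stringency) {
                continue; // also skips p == 0, where the cost is infinite
            }
            prefix.append(BASES[row]);
            extend(p, stringency, maxExpansion, prefix, next, out);
            prefix.setLength(pos); // backtrack
        }
    }
}
```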
public boolean generatesString(java.lang.String s)
Specified by the CREModel interface.
Specified by: generatesString in interface CREModel
public java.util.Comparator<Motif> getComparator()
Comparator sorts arrays from greatest to least by the scoreString(String) method.
Specified by: getComparator in interface CREModel
public int getCount()
public int getCount(char base, int position)
public int getCount(int base, int position)
public int[] getCounts(int position)
public int[][] getFrequencyTable()
public MotifList getHits(java.lang.String seq)
Returns those subsequences of seq that match this pwm according to generatesString(String) as a MotifList. The returned MotifList is a set, meaning there are guaranteed to be no duplicates. The order is unspecified. The score of each Motif in the MotifList is the score assigned to it by this PWM.
Specified by the CREModel interface.
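Collecting hits amounts to sliding a window of the model's length across the sequence and keeping each distinct window that the model accepts. In the sketch below a Predicate stands in for generatesString(String), and a Set provides the no-duplicates guarantee. The class `HitScanSketch` is hypothetical, and whether the library also scans the reverse complement is not modeled.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Illustrative sketch only; the predicate is a stand-in for generatesString(String).
final class HitScanSketch {
    // Returns the distinct windows of the given width that the predicate accepts.
    static Set<String> hits(String seq, int width, Predicate<String> generates) {
        // LinkedHashSet keeps first-seen order for readability; callers should not rely on it.
        Set<String> out = new LinkedHashSet<>();
        for (int start = 0; start + width <= seq.length(); start++) {
            String window = seq.substring(start, start + width);
            if (generates.test(window)) {
                out.add(window); // a repeated window is stored once
            }
        }
        return out;
    }
}
```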
public Motif getMaxLikelihoodMotif()
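The most probable instantiation of a count matrix is the string made of the highest-count base in each column. A minimal sketch follows. The class `MaxLikelihoodSketch`, the a,t,g,c row order, and the tie-breaking rule (the first row wins) are assumptions, not documented behavior.

```java
// Illustrative sketch only; returns a plain String rather than a Motif.
final class MaxLikelihoodSketch {
    static final char[] BASES = {'a', 't', 'g', 'c'}; // assumed row order

    // Picks the base with the largest count in each column; ties go to the earlier row.
    static String maxLikelihoodString(int[][] counts) {
        StringBuilder sb = new StringBuilder();
        for (int pos = 0; pos < counts[0].length; pos++) {
            int best = 0;
            for (int row = 1; row < BASES.length; row++) {
                if (counts[row][pos] > counts[best][pos]) {
                    best = row;
                }
            }
            sb.append(BASES[best]);
        }
        return sb.toString();
    }
}
```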
public java.lang.String getName()
Description copied from interface: CREModel
Returns a name that is unique to this type of model.
public double[][] getProbabilities()
public double[][] getProbabilities(double pseudo)
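Since the pseudo count is added to every count before normalizing, each column's probabilities take the form (count + pseudo) / (column total + number of rows * pseudo). The sketch below shows that arithmetic on a plain count matrix; `PseudocountSketch` is a hypothetical name and this is not the library's code.

```java
// Illustrative sketch only; converts counts to probabilities column by column.
final class PseudocountSketch {
    static double[][] toProbabilities(int[][] counts, double pseudo) {
        int rows = counts.length;
        int length = counts[0].length;
        double[][] probs = new double[rows][length];
        for (int pos = 0; pos < length; pos++) {
            // Column total after the pseudo count has been added to every cell.
            double total = 0.0;
            for (int row = 0; row < rows; row++) {
                total += counts[row][pos] + pseudo;
            }
            // A column with no counts and pseudo == 0 would divide by zero (NaN).
            for (int row = 0; row < rows; row++) {
                probs[row][pos] = (counts[row][pos] + pseudo) / total;
            }
        }
        return probs;
    }
}
```

With counts 3, 1, 0, 0 and a pseudo count of 1, the column becomes 4/8, 2/8, 1/8, 1/8, so no base has probability zero.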
public double getScore()
public double[][] getScoringTable()
public java.lang.String getSequence(Alphabet alphabet)
public double getStringency()
public int length()
private char maxBase(int idx)
public CREModel newInstanceOf(MotifList ml)
Specified by: newInstanceOf in interface CREModel
public double probabilityOf(java.lang.String s)
public Motif.ScoreData scoreString(java.lang.String s)
Returns the score of this string as computed from the scoringTable. Use setScoringTable(double[]) to update the reference weights on the scoring table.
Specified by: scoreString in interface CREModel
Throws: java.lang.IllegalArgumentException - if s is of a different length than this pwm.
public void setScore(double s)
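public void setScoringTable()
Sets the scoring table to be the log of the probability of a given base at a given position. The score returned by scoreString(String) will be the average log(p), where p is the probability of seeing the observed base in a given position.
public void setScoringTable(double[] refProbs)
Specified by: setScoringTable in interface ProbCREModel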
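A log-ratio scoring table stores log(p / q) for each base and position, where q is the background probability of that base, and a string's score is then the average of the looked-up entries. The sketch below normalizes the reference values so they sum to 1 and uses the natural log. Both choices, along with the class `ScoringTableSketch` and the a,t,g,c row order, are assumptions; the library returns a Motif.ScoreData rather than a bare double.

```java
// Illustrative sketch only; returns a plain double instead of Motif.ScoreData.
final class ScoringTableSketch {
    static final String BASES = "atgc"; // assumed row order

    // table[row][pos] = ln(probability of base at pos / background probability of base).
    static double[][] logRatioTable(double[][] probabilities, double[] refProbs) {
        double refTotal = 0.0;
        for (double r : refProbs) {
            refTotal += r;
        }
        double[][] table = new double[probabilities.length][probabilities[0].length];
        for (int row = 0; row < table.length; row++) {
            double q = refProbs[row] / refTotal; // normalize the reference frequencies
            for (int pos = 0; pos < table[row].length; pos++) {
                table[row][pos] = Math.log(probabilities[row][pos] / q);
            }
        }
        return table;
    }

    // Average table entry over the bases observed in s.
    static double score(String s, double[][] table) {
        if (s.length() != table[0].length) {
            throw new IllegalArgumentException("String length must equal matrix length");
        }
        double sum = 0.0;
        for (int pos = 0; pos < s.length(); pos++) {
            int row = BASES.indexOf(Character.toLowerCase(s.charAt(pos)));
            if (row < 0) {
                throw new IllegalArgumentException("Not an unambiguous base: " + s.charAt(pos));
            }
            sum += table[row][pos];
        }
        return sum / s.length();
    }
}
```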
public void setStringency(double s)
Description copied from interface: CREModel
Sets the stringency of this model.
Specified by: setStringency in interface CREModel
public java.lang.String toHTMLString(java.lang.String tableAttributes, java.lang.String tdAttributes)
Parameters:
tableAttributes - A string of attributes to be placed inside the table tag
tdAttributes - A string of attributes to be placed inside each td tag. Several attributes are automatically placed in the td tags: The first column, which contains the base key, is centered. All other columns are right-aligned. All column widths are set to 20 pixels.
public java.lang.String toString()
public boolean useRevComp()