Title: | Identify Mutually Exclusive Mutations |
---|---|
Description: | An optimized method for identifying mutually exclusive genomic events. Its main contribution is a statistical analysis based on the Poisson-Binomial distribution that takes into account that some samples are more mutated than others. See [Canisius, Sander, John WM Martens, and Lodewyk FA Wessels. (2016) "A novel independence test for somatic alterations in cancer shows that biology drives mutual exclusivity but chance explains most co-occurrence." Genome biology 17.1 : 1-17. <doi:10.1186/s13059-016-1114-x>]. The mutation matrices are sparse, and the method exploits this sparsity to save time and computing resources. |
Authors: | Juan A. Ferrer-Bonsoms, Laura Jareno, and Angel Rubio |
Maintainer: | Juan A. Ferrer-Bonsoms <[email protected]> |
License: | Artistic-2.0 |
Version: | 0.3.2 |
Built: | 2024-11-04 04:34:42 UTC |
Source: | https://github.com/cran/Rediscover |
A binary matrix of class matrix, used as a toy example in getPM, getMutex, getMutexAB, and getMutexGroup.
data("A_example")
The format is: num [1:1000, 1:500] 0 0 0 0 0 1 0 0 0 1 ...
data(A_example)
A binary dgCMatrix matrix, used as a toy example in getPM, getMutex, getMutexAB, and getMutexGroup.
data("A_Matrix")
The format is:
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:249838] 5 9 10 11 13 14 18 20 23 24 ...
..@ p : int [1:501] 0 503 1010 1506 1995 2497 2981 3488 4002 4474 ...
..@ Dim : int [1:2] 1000 500
..@ Dimnames:List of 2
.. ..$ : NULL
.. ..$ : NULL
..@ x : num [1:249838] 1 1 1 1 1 1 1 1 1 1 ...
..@ factors : list()
data(A_Matrix)
A binary matrix with information about amplifications in Colon Adenocarcinoma, created by applying GDCquery and used as a real example in getMutexAB.
data("AMP_COAD")
The format is:
num [1:1000, 1:391] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:1000] "ENSG00000212993.4" "ENSG00000279524.1" "ENSG00000136997.13" "ENSG00000101294.15" ...
..$ : chr [1:391] "TCGA-CA-6718" "TCGA-D5-6931" "TCGA-AZ-6601" "TCGA-G4-6320" ...
data(AMP_COAD)
A binary matrix of class matrix, used as a toy example in getPM, getMutex, getMutexAB, and getMutexGroup.
data("B_example")
The format is:
int [1:1000, 1:500] 0 1 1 0 1 0 0 0 1 1 ...
data(B_example)
A binary dgCMatrix matrix, used as a toy example in getPM, getMutex, getMutexAB, and getMutexGroup.
data("B_Matrix")
The format is:
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:249526] 1 2 4 8 9 11 13 15 18 20 ...
..@ p : int [1:501] 0 498 1014 1527 2048 2558 3036 3511 4035 4537 ...
..@ Dim : int [1:2] 1000 500
..@ Dimnames:List of 2
.. ..$ : NULL
.. ..$ : NULL
..@ x : num [1:249526] 1 1 1 1 1 1 1 1 1 1 ...
..@ factors : list()
data(B_Matrix)
Function adapted from maftools: given a .maf file, it plots the somatic interactions between a group of genes, i.e., it combines gene expression and mutation data to detect mutually exclusive or co-occurring events.
discoversomaticInteractions( maf, top = 25, genes = NULL, pvalue = c(0.05, 0.01), getMutexMethod = "ShiftedBinomial", getMutexMixed = TRUE, returnAll = TRUE, geneOrder = NULL, fontSize = 0.8, showSigSymbols = TRUE, showCounts = FALSE, countStats = "all", countType = "all", countsFontSize = 0.8, countsFontColor = "black", colPal = "BrBG", showSum = TRUE, colNC = 9, nShiftSymbols = 5, sigSymbolsSize = 2, sigSymbolsFontSize = 0.9, pvSymbols = c(46, 42), limitColorBreaks = TRUE )
maf |
maf object generated by read.maf |
top |
Check for interactions among the top 'n' genes. Defaults to the top 25. |
genes |
List of genes among which interactions should be tested. If not provided, test will be performed between top 25 genes. |
pvalue |
Default c(0.05, 0.01) p-value threshold. You can provide two values for upper and lower threshold. |
getMutexMethod |
Method for the 'getMutex' function (by default "ShiftedBinomial") |
getMutexMixed |
Mixed parameter for the 'getMutex' function (by default TRUE) |
returnAll |
If TRUE (the default in the usage above), returns test statistics for all pairs of tested genes. If FALSE, returns only the pairs below the pvalue threshold. |
geneOrder |
Plot the results in given order. Default NULL. |
fontSize |
cex for gene names. Default 0.8 |
showSigSymbols |
Default TRUE. Highlight significant pairs |
showCounts |
Default FALSE. Include number of events in the plot |
countStats |
Default 'all'. Can be 'all' or 'sig' |
countType |
Default 'all'. Can be 'all', 'cooccur', 'mutexcl' |
countsFontSize |
Default 0.8 |
countsFontColor |
Default 'black' |
colPal |
RColorBrewer palette name. Default 'BrBG'. See RColorBrewer::display.brewer.all() for details |
showSum |
show [sum] with gene names in plot, Default TRUE |
colNC |
Number of different colors in the palette, minimum 3, default 9 |
nShiftSymbols |
If positive, shift SigSymbols by n to the left. Default 5 |
sigSymbolsSize |
size of symbols in the matrix and in legend. Default 2 |
sigSymbolsFontSize |
size of font in legends. Default 0.9 |
pvSymbols |
vector of pch numbers for symbols of p-value for upper and lower thresholds c(upper, lower). Default c(46, 42) |
limitColorBreaks |
limit color to extreme values. Default TRUE |
Internally, this function runs the getMutex function. The 'getMutexMethod' parameter sets the 'method' parameter of getMutex. For more details, run '?getMutex'.
A list of data.tables. The function also prints a heatmap with the results.
Mayakonda A, Lin DC, Assenov Y, Plass C, Koeffler HP. 2018. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Research. http://dx.doi.org/10.1101/gr.239244.118
## Not run:
# An example of how to use the function,
# using data from TCGA (Colon Adenocarcinoma in this case).
# coad.maf <- GDCquery_Maf("COAD", pipelines = "muse") %>% read.maf
coad.maf <- read.maf(GDCquery_Maf("COAD", pipelines = "muse"))
discoversomaticInteractions(maf = coad.maf, top = 35, pvalue = c(1e-2, 2e-3))
## End(Not run)
Given a binary matrix and its corresponding probability matrix pij, apply the Poisson-Binomial test to identify mutually exclusive events.
getMutex( A = NULL, PM = getPM(A), lower.tail = TRUE, method = "ShiftedBinomial", mixed = TRUE, th = 0.05, verbose = FALSE, parallel = FALSE, no_cores = NULL )
A |
The binary matrix |
PM |
The corresponding probability matrix of A. It can be computed using function getPM. By default equal to getPM(A) |
lower.tail |
TRUE for the mutual exclusivity test, FALSE for the co-occurrence test. Default TRUE. |
method |
one of the following: "ShiftedBinomial" (default),"Exact", "Binomial", and "RefinedNormal". |
mixed |
option to compute lower p-values with an exact method. By default TRUE |
th |
upper threshold of p.value to apply the exact method. |
verbose |
The verbosity of the output |
parallel |
If the exact method is executed with a parallel process. |
no_cores |
Number of cores. If not stated, the number of CPU cores minus 1 is used. |
We implemented four methods to evaluate the Poisson-Binomial distribution function, three approximations and one exact computation:
"ShiftedBinomial" (the default), which corresponds to a shifted Binomial with three parameters (Peköz, Shwartz, Christiansen, & Berlowitz, 2010).
"Exact", which uses the exact formula as implemented in the 'PoissonBinomial' R package, based on the work of (Biscarri, Zhao, & Brunner, 2018).
"Binomial", with two parameters (Le Cam, 1960).
"RefinedNormal", which is based on the work of (Volkova, 1996).
If the 'mixed' option is selected (TRUE by default, as in the usage above), the "Exact" method is additionally computed for p-values lower than a threshold (the 'th' parameter, 0.05 by default). When the exact method is computed, the process can be parallelized by selecting the 'parallel' option (FALSE by default) and setting the number of cores (the 'no_cores' parameter).
A symmetric matrix with the p-values of the corresponding test.
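As a rough illustration of the exact test for a single pair of genes, the base-R sketch below treats the number of samples in which both genes are altered as a Poisson-Binomial variable whose per-sample success probability is the product of the two genes' estimated probabilities, and takes the lower tail as the mutual exclusivity p-value. The helper names `pb_pmf` and `pb_cdf` and the toy probabilities are assumptions made for this example; this is not the package's implementation, which according to the text above relies on the 'PoissonBinomial' package for the exact method.

```r
# Exact Poisson-Binomial probability mass function, built by adding
# one Bernoulli trial at a time (plain convolution).
pb_pmf <- function(p) {
  pmf <- 1  # before any trial, P(X = 0) = 1
  for (pj in p) {
    # either the new trial fails (count unchanged) or succeeds (count + 1)
    pmf <- c(pmf * (1 - pj), 0) + c(0, pmf * pj)
  }
  pmf  # element k + 1 holds P(X = k)
}

# Lower tail P(X <= k)
pb_cdf <- function(k, p) sum(pb_pmf(p)[seq_len(k + 1)])

# Hypothetical per-sample mutation probabilities for two genes (6 samples)
p_gene1 <- c(0.10, 0.40, 0.05, 0.30, 0.20, 0.60)
p_gene2 <- c(0.20, 0.50, 0.10, 0.25, 0.15, 0.70)

# Hypothetical observed alterations of the two genes in the same samples
a1 <- c(0, 1, 0, 0, 0, 1)
a2 <- c(0, 0, 0, 1, 0, 0)

# Under independence, sample j carries both alterations with probability
# p_gene1[j] * p_gene2[j]
p_both   <- p_gene1 * p_gene2
observed <- sum(a1 * a2)              # number of samples with both alterations
p_value  <- pb_cdf(observed, p_both)  # small values suggest mutual exclusivity
print(p_value)
```

With all probabilities equal, `pb_cdf(k, rep(q, n))` reduces to `pbinom(k, n, q)`, which is a convenient sanity check for the convolution.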
# This first example is a basic example of how to use getMutex.
data("A_example")
PMA <- getPM(A_example)
mismutex <- getMutex(A = A_example, PM = PMA)

# The next example is the same as the first one, but uses a matrix of class Matrix.
data("A_Matrix")
A_Matrix <- A_Matrix[1:100, 1:50] # small for the example
PMA_Matrix <- getPM(A_Matrix)
mismutex <- getMutex(A = A_Matrix, PM = PMA_Matrix)

## Not run:
# Finally, the last example shows a real example of how to use this function
# with data from TCGA (Colon Adenocarcinoma in this case).
data("TCGA_COAD")
data("PM_COAD")
# Store the p-values under a new name so that PM_COAD is not overwritten
COAD_mutex <- getMutex(TCGA_COAD, PM_COAD)
## End(Not run)
Given two binary matrices and their corresponding probability matrices PAij and PBij, apply the Poisson-Binomial test to identify mutually exclusive events between A and B.
getMutexAB( A, PMA = getPM(A), B, PMB = getPM(B), lower.tail = TRUE, method = "ShiftedBinomial", mixed = TRUE, th = 0.05, verbose = FALSE, parallel = FALSE, no_cores = NULL )
A |
The binary matrix of events A |
PMA |
The corresponding probability matrix of A. It can be computed using function getPM. By default equal to getPM(A) |
B |
The binary matrix of events B |
PMB |
The corresponding probability matrix of B. It can be computed using function getPM. By default equal to getPM(B) |
lower.tail |
TRUE for the mutual exclusivity test, FALSE for the co-occurrence test. Default TRUE. |
method |
one of the following: "ShiftedBinomial" (default),"Exact", "RefinedNormal", and "Binomial". |
mixed |
option to compute lower p-values with an exact method. By default TRUE |
th |
upper threshold of p-value to apply the exact method. |
verbose |
The verbosity of the output |
parallel |
If the exact method is executed with a parallel process. |
no_cores |
Number of cores. If not stated, the number of CPU cores minus 1 is used. |
We implemented four methods to evaluate the Poisson-Binomial distribution function, three approximations and one exact computation:
"ShiftedBinomial" (the default), which corresponds to a shifted Binomial with three parameters (Peköz, Shwartz, Christiansen, & Berlowitz, 2010).
"Exact", which uses the exact formula as implemented in the 'PoissonBinomial' R package, based on the work of (Biscarri, Zhao, & Brunner, 2018).
"Binomial", with two parameters (Le Cam, 1960).
"RefinedNormal", which is based on the work of (Volkova, 1996).
If the 'mixed' option is selected (TRUE by default, as in the usage above), the "Exact" method is additionally computed for p-values lower than a threshold (the 'th' parameter, 0.05 by default). When the exact method is computed, the process can be parallelized by selecting the 'parallel' option (FALSE by default) and setting the number of cores (the 'no_cores' parameter).
A matrix with the p-values of the corresponding test.
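To give a feel for the quantities such a two-matrix test compares, the base-R sketch below counts, for every row of A and every row of B, the samples in which both events occur, and sets that against the count expected under independence. It assumes rows index events (genes) and columns index samples, as in the getMutexGroup example, and it uses constant per-row probabilities as a stand-in for the output of getPM; the toy matrices and all object names are hypothetical.

```r
# Toy data: rows are events (genes), columns are the same six samples
A <- rbind(geneA1 = c(1, 0, 1, 0, 0, 1),
           geneA2 = c(0, 1, 0, 0, 1, 0))
B <- rbind(ampB1 = c(0, 1, 0, 1, 1, 0),
           ampB2 = c(1, 0, 1, 0, 0, 0),
           ampB3 = c(0, 0, 0, 0, 1, 1))

# Observed co-occurrences: entry [i, k] counts the samples in which
# event i of A and event k of B are both present (A %*% t(B))
overlap <- tcrossprod(A, B)

# Stand-in probability matrices: each event gets its overall frequency
# in every sample (a real analysis would use sample-specific estimates)
PA <- matrix(rowMeans(A), nrow(A), ncol(A))
PB <- matrix(rowMeans(B), nrow(B), ncol(B))

# Expected co-occurrences under independence
expected <- tcrossprod(PA, PB)

print(overlap)
print(expected)
```

The result has one row per event in A and one column per event in B, so unlike the single-matrix case it is generally neither square nor symmetric.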
# This example uses matrices of class Matrix.
data("A_Matrix")
data("B_Matrix")
PMA <- getPM(A_Matrix)
PMB <- getPM(B_Matrix)
mismutex <- getMutexAB(A = A_Matrix, PMA = PMA, B = B_Matrix, PMB = PMB)

## Not run:
# Finally, the last example shows a real example of how to use this function
# with data from TCGA (Colon Adenocarcinoma in this case).
data("TCGA_COAD_AMP")
data("AMP_COAD")
data("PM_TCGA_COAD_AMP")
data("PM_AMP_COAD")
mismutex <- getMutexAB(A = TCGA_COAD_AMP, PMA = PM_TCGA_COAD_AMP, B = AMP_COAD, PMB = PM_AMP_COAD)
## End(Not run)
Given a binary matrix and its corresponding probability matrix pij, apply the Poisson-Binomial test to identify mutually exclusive events.
getMutexGroup(A = NULL, PM = NULL, type = "Impurity", lower.tail = TRUE)
A |
The binary matrix |
PM |
The corresponding probability matrix of A. It can be computed using function getPM. By default equal to getPM(A) |
type |
one of Coverage, Exclusivity or Impurity. By default is Impurity |
lower.tail |
TRUE for the mutual exclusivity test, FALSE for the co-occurrence test. Default TRUE. |
A symmetric matrix with the p.value of the corresponding test.
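The three values of 'type' can be illustrated with simple counts. The base-R sketch below assumes the definitions used by Canisius et al. (2016) for group tests: coverage is the number of samples with at least one alteration in the group, exclusivity the number with exactly one, and impurity the number with more than one. Whether the package computes its statistics in exactly this way is an assumption; the variable names are chosen for this example only.

```r
# Three genes (rows) in 30 samples (columns), perfectly mutually exclusive:
# each gene is altered in its own block of ten samples
A <- matrix(0, nrow = 3, ncol = 30)
A[1, 1:10]  <- 1
A[2, 11:20] <- 1
A[3, 21:30] <- 1

# Number of altered genes of the group in each sample
hits <- colSums(A)

coverage    <- sum(hits >= 1)  # samples with at least one alteration
exclusivity <- sum(hits == 1)  # samples with exactly one alteration
impurity    <- sum(hits > 1)   # samples with more than one alteration

# Break the pattern: gene 2 is now also altered in four samples of gene 1
A_mixed <- A
A_mixed[2, 1:4] <- 1
hits_mixed <- colSums(A_mixed)

coverage_mixed    <- sum(hits_mixed >= 1)
exclusivity_mixed <- sum(hits_mixed == 1)
impurity_mixed    <- sum(hits_mixed > 1)

print(c(coverage, exclusivity, impurity))
print(c(coverage_mixed, exclusivity_mixed, impurity_mixed))
```

A group test then asks how extreme the chosen count is relative to what the per-sample probabilities in PM would produce by chance.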
# This first example is a basic example of how to use getMutexGroup.
data("A_example")
A2 <- A_example[, 1:30]
A2[1, 1:10] <- 1
A2[2, 1:10] <- 0
A2[3, 1:10] <- 0
A2[1, 11:20] <- 0
A2[2, 11:20] <- 1
A2[3, 11:20] <- 0
A2[1, 21:30] <- 0
A2[2, 21:30] <- 0
A2[3, 21:30] <- 1
PM2 <- getPM(A2)
A <- A2[1:3, ]
PM <- PM2[1:3, ]
getMutexGroup(A, PM, "Impurity")
getMutexGroup(A, PM, "Coverage")
getMutexGroup(A, PM, "Exclusivity")
Given a binary matrix, estimates the corresponding probability matrix pij.
getPM(A)
A |
The binary matrix |
A 'PMatrix' object with the corresponding probability estimates. The 'PMatrix' object stores the coefficients of the fitted logistic regression. With these coefficients it is possible to build the complete matrix of probabilities.
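One way to picture this is a logistic regression with one effect per row and one per column. The base-R sketch below fits such a model with `glm` on a small random matrix and shows that the fitted odds factor into a row term times a column term, so one number per row and one per column are enough to rebuild every probability. This is an illustrative assumption about the model, not the package's estimation code, and `row_exps` and `col_exps` are hypothetical names that only loosely mirror the slots of the 'PMatrix' class.

```r
set.seed(1)
n_rows <- 20
n_cols <- 30
A <- matrix(rbinom(n_rows * n_cols, 1, 0.4), n_rows, n_cols)

# Long format: one observation per matrix cell (column-major order)
long <- data.frame(
  y = as.vector(A),
  r = factor(as.vector(row(A))),
  k = factor(as.vector(col(A)))
)

# Additive logistic model: logit(p_ij) = row effect + column effect
fit <- glm(y ~ r + k, family = binomial(), data = long)

# Fitted probability matrix, same shape as A
P <- matrix(fitted(fit), n_rows, n_cols)

# At the maximum likelihood estimate the expected row and column totals
# match the observed ones, so heavily mutated rows get higher probabilities
print(max(abs(rowSums(P) - rowSums(A))))
print(max(abs(colSums(P) - colSums(A))))

# The odds matrix has rank one: odds_ij = row_exps[i] * col_exps[j]
odds     <- P / (1 - P)
row_exps <- odds[, 1]
col_exps <- odds[1, ] / odds[1, 1]
P_rebuilt <- outer(row_exps, col_exps) / (1 + outer(row_exps, col_exps))
print(max(abs(P_rebuilt - P)))
```

Storing two vectors instead of the full matrix is what keeps the memory footprint small when the input has thousands of rows and columns.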
# This first example is a basic example of how to use getPM:
data("A_example")
PMA <- getPM(A_example)

# The next example is the same as the first one, but uses a matrix of class Matrix:
data("A_Matrix")
PMA_Matrix <- getPM(A_Matrix)

## Not run:
# Finally, the last example shows a real example of how to use this function
# with data from TCGA (Colon Adenocarcinoma in this case):
data("TCGA_COAD")
PM_COAD <- getPM(TCGA_COAD)
## End(Not run)
Probability matrix with information about genes being amplified in Colon Adenocarcinoma samples, created from AMP_COAD.rda by applying getPM and used as a real example in getMutexAB.
data("PM_AMP_COAD")
The format is:
num [1:1000, 1:391] 0.118 0.118 0.118 0.118 0.114 ...
data(PM_AMP_COAD)
Probability matrix with information about genes being mutated in Colon Adenocarcinoma samples, created from TCGA_COAD.rda by applying getPM and used as a real example in getMutex, getMutexAB, and getMutexGroup.
data("PM_COAD")
The format is:
Formal class 'PMatrix' [package "Rediscover"] with 2 slots
..@ rowExps: num [1:399] 13.1 1.02 7.43 3.26 0.4 ...
..@ colExps: num [1:17616] 2.54 1.78 1.76 1.35 0.6 ...
data(PM_COAD)
An S4 class to store the probabilities of gene i being mutated in sample j
rowExps
Sample-dependent coefficients estimated by the logistic regression
colExps
Gene-dependent coefficients estimated by the logistic regression
A binary matrix with information about gene mutations in Colon Adenocarcinoma, created by applying maftools to a .maf file and used as a real example in getPM, getMutex, getMutexAB, and getMutexGroup.
data("TCGA_COAD")
The format is:
num [1:399, 1:17616] 1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:399] "TCGA-CA-6718" "TCGA-D5-6931" "TCGA-AZ-6601" "TCGA-G4-6320" ...
..$ : chr [1:17616] "APC" "TP53" "TTN" "KRAS" ...
data(TCGA_COAD)