Title: | Estimation of Intrinsic and Extrinsic Noise from Single-Cell Data |
---|---|
Description: | Functions to calculate estimates of intrinsic and extrinsic noise from the two-reporter single-cell experiment, as in Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain (2002) Stochastic gene expression in a single cell. Science, 297, 1183-1186. Functions implement multiple estimators developed for unbiasedness or min Mean Squared Error (MSE) in Fu, A. Q. and Pachter, L. (2016). Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. Statistical Applications in Genetics and Molecular Biology, 15(6), 447-471. |
Authors: | Audrey Qiuyan Fu and Lior Pachter |
Maintainer: | Audrey Q. Fu <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.2 |
Built: | 2024-12-26 02:40:13 UTC |
Source: | https://github.com/cran/noise |
This function computes several estimates of the extrinsic noise (unscaled by the mean). The estimators, described in Fu and Pachter (2016), include the original estimator developed in Elowitz et al. (2002), the unbiased estimator, a min-MSE estimator, and an asymptotic estimator for large sample sizes.
computeExtrinsicNoise(reporter1, reporter2)
reporter1 | A vector of continuous values. |
reporter2 | A vector of continuous values. |
Four (unscaled) estimates of extrinsic noise: the original estimator developed in Elowitz et al. (2002), the unbiased estimator, a min-MSE estimator, and an asymptotic estimator for large sample sizes.
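As a rough illustration of what the first two estimates measure, the sketch below computes an unscaled extrinsic noise in base R in two ways: the mean of products minus the product of means, following the formula in Elowitz et al. (2002), and the sample covariance, which is unbiased for the covariance between the two reporters. Whether these match the package's "original" and "unbiased" estimates exactly is an assumption; the helper names `extrinsic_elss` and `extrinsic_unbiased` are hypothetical and are not part of the package.

```r
# Hypothetical helpers for illustration; not part of the noise package.

# Unscaled extrinsic noise in the style of Elowitz et al. (2002):
# mean of the products minus the product of the means (divides by n).
extrinsic_elss <- function(reporter1, reporter2) {
  stopifnot(length(reporter1) == length(reporter2))
  mean(reporter1 * reporter2) - mean(reporter1) * mean(reporter2)
}

# Sample covariance (divides by n - 1), unbiased for Cov(reporter1, reporter2).
extrinsic_unbiased <- function(reporter1, reporter2) {
  stopifnot(length(reporter1) == length(reporter2))
  cov(reporter1, reporter2)
}

# Toy data: five cells, two reporters.
c1 <- c(1.2, 0.8, 1.5, 0.9, 1.1)
c2 <- c(1.0, 0.7, 1.6, 1.0, 1.2)
n <- length(c1)

est_elss <- extrinsic_elss(c1, c2)
est_unb <- extrinsic_unbiased(c1, c2)

# The two differ only by the factor n / (n - 1).
all.equal(est_elss * n / (n - 1), est_unb)
```

The two values converge as the number of cells grows, so the choice matters mainly for small samples.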
Audrey Q. Fu
Fu, A. Q. and Pachter, L. (2016). Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. arXiv:1601.03334. Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain (2002) Stochastic gene expression in a single cell. Science, 297, 1183-1186.
computeIntrinsicNoise, simulateSC. See estimates for data elowitz_data and yang_nl10.
This function is similar to computeExtrinsicNoise and computes several estimates of the extrinsic noise (unscaled by the mean). The estimators, described in Fu and Pachter (2016), include the original estimator developed in Elowitz et al. (2002), the unbiased estimator, a min-MSE estimator, and an asymptotic estimator for large sample sizes. The only difference is that this function calculates the min-MSE estimate using a given correlation.
computeExtrinsicNoiseKnownCor(reporter1, reporter2, true.cor)
reporter1 | A vector of continuous values. |
reporter2 | A vector of continuous values. |
true.cor | A scalar. The correlation between the two reporters to use in the min-MSE estimate. |
Four (unscaled) estimates of extrinsic noise: the original estimator developed in Elowitz et al. (2002), the unbiased estimator, a min-MSE estimator (using the given correlation), and an asymptotic estimator for large sample sizes.
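The simulation example for simulateSC sets `true.cor <- ext.true / (ext.true + int.true)`. A short derivation shows where that ratio comes from, assuming a hierarchical model in which the two reporters in a cell share a cell-level mean M and vary independently around it with the same within-cell variance. The symbols below (C and Y for the reporters, sigma_int^2 and sigma_ext^2 for the unscaled intrinsic and extrinsic noise) are notation chosen for this sketch and may differ from the notation in Fu and Pachter (2016).

```latex
\begin{aligned}
\operatorname{Cov}(C, Y) &= \operatorname{Var}(M) = \sigma_{\mathrm{ext}}^{2}, \\
\operatorname{Var}(C) = \operatorname{Var}(Y)
  &= \mathbb{E}\left[\operatorname{Var}(C \mid M)\right] + \operatorname{Var}(M)
   = \sigma_{\mathrm{int}}^{2} + \sigma_{\mathrm{ext}}^{2}, \\
\rho = \frac{\operatorname{Cov}(C, Y)}{\sqrt{\operatorname{Var}(C)\,\operatorname{Var}(Y)}}
  &= \frac{\sigma_{\mathrm{ext}}^{2}}{\sigma_{\mathrm{int}}^{2} + \sigma_{\mathrm{ext}}^{2}}.
\end{aligned}
```

With intrinsic noise 0.7 and extrinsic noise 0.8, this gives a correlation of 0.8 / 1.5, or about 0.53.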
Audrey Q. Fu
Fu, A. Q. and Pachter, L. (2016). Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. arXiv:1601.03334. Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain (2002) Stochastic gene expression in a single cell. Science, 297, 1183-1186.
computeExtrinsicNoise, simulateSC.
This function computes several estimates of the intrinsic noise (unscaled by the mean). The estimators, described in Fu and Pachter (2016), include the original estimators developed in Elowitz et al. (2002), unbiased estimators with and without assuming equal mean of the two reporters, min-MSE estimators with and without assuming equal mean, and asymptotic estimators for large sample sizes with and without assuming equal mean.
computeIntrinsicNoise(reporter1, reporter2)
reporter1 | A vector of continuous values. |
reporter2 | A vector of continuous values. |
Seven (unscaled) estimates of intrinsic noise: the original estimator developed in Elowitz et al. (2002), unbiased estimators with and without assuming equal mean of the two reporters, min-MSE estimators with and without assuming equal mean, and asymptotic estimators for large sample sizes with and without assuming equal mean.
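To make the "with and without assuming equal mean" distinction concrete, the following base R sketch contrasts two simple intrinsic noise calculations. The first is half the mean squared difference between reporters, in the style of Elowitz et al. (2002), which implicitly assumes both reporters have the same mean. The second centers each reporter first, so a constant offset between reporters does not inflate the estimate. The helper names are hypothetical, and the n - 1 divisor in the second helper is an assumption of this sketch rather than a statement about the package's formulas.

```r
# Hypothetical helpers for illustration; not part of the noise package.

# Equal-mean form: half the mean squared difference between reporters.
intrinsic_equal_mean <- function(reporter1, reporter2) {
  stopifnot(length(reporter1) == length(reporter2))
  mean((reporter1 - reporter2)^2) / 2
}

# General form: center each reporter, then use half the sample variance
# of the difference (divides by n - 1; an assumption of this sketch).
intrinsic_general <- function(reporter1, reporter2) {
  stopifnot(length(reporter1) == length(reporter2))
  d <- (reporter1 - mean(reporter1)) - (reporter2 - mean(reporter2))
  sum(d^2) / (2 * (length(d) - 1))
}

# Toy data: five cells, two reporters.
c1 <- c(1.2, 0.8, 1.5, 0.9, 1.1)
c2 <- c(1.0, 0.7, 1.6, 1.0, 1.2)

intrinsic_equal_mean(c1, c2)
intrinsic_general(c1, c2)

# Shifting one reporter by a constant changes the equal-mean form
# but leaves the general form unchanged.
intrinsic_equal_mean(c1, c2 + 5)
intrinsic_general(c1, c2 + 5)
```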
Audrey Q. Fu
Fu, A. Q. and Pachter, L. (2016). Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. arXiv:1601.03334. Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain (2002) Stochastic gene expression in a single cell. Science, 297, 1183-1186.
computeExtrinsicNoise, simulateSC. See estimates for data elowitz_data and yang_nl10.
This function repeatedly selects a random subset of cells (rows) from the data set, computes multiple estimates of intrinsic and extrinsic noise for each subset, and reports the mean and standard deviation of each estimate across iterations.
computeNoiseForSubset(data, sample.size, n.iter)
data | A numeric matrix of two columns. Each row is a cell, and each column is the expression of a reporter gene. |
sample.size | An integer that specifies the number of cells in the subset. |
n.iter | An integer that specifies the number of iterations (for calculation of mean and standard deviation). |
A list that consists of the following components:
intrinsic | A numeric matrix of estimated intrinsic noise, with 7 rows and n.iter columns. |
extrinsic | A numeric matrix of estimated extrinsic noise, with 4 rows and n.iter columns. |
intrinsic.mean | A numeric vector of length 7 that contains the mean estimates of intrinsic noise. |
intrinsic.sd | A numeric vector of length 7 that contains the standard deviation of the estimates of intrinsic noise. |
extrinsic.mean | A numeric vector of length 4 that contains the mean estimates of extrinsic noise. |
extrinsic.sd | A numeric vector of length 4 that contains the standard deviation of the estimates of extrinsic noise. |
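The subsampling idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `subsample_estimate` and its `estimator` argument are hypothetical, it handles a single estimator rather than the full set, and it assumes cells are drawn without replacement within each iteration.

```r
# Hypothetical sketch; not the implementation of computeNoiseForSubset().
# Repeatedly draw `sample.size` rows and summarise one estimator across draws.
subsample_estimate <- function(data, sample.size, n.iter, estimator) {
  stopifnot(ncol(data) == 2, sample.size <= nrow(data))
  est <- vapply(seq_len(n.iter), function(i) {
    rows <- sample(nrow(data), sample.size)  # without replacement (assumed)
    estimator(data[rows, 1], data[rows, 2])
  }, numeric(1))
  list(estimates = est, mean = mean(est), sd = sd(est))
}

# Toy data: 300 cells, two independent reporters with the same mean.
set.seed(42)
toy <- cbind(rnorm(300, mean = 10), rnorm(300, mean = 10))

# Use half the mean squared difference as a stand-in estimator.
res <- subsample_estimate(toy, sample.size = 50, n.iter = 100,
                          estimator = function(a, b) mean((a - b)^2) / 2)
str(res)
```

The standard deviation across iterations gives a rough sense of how variable an estimator is at the chosen subset size.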
Audrey Q. Fu
Fu, A. Q. and Pachter, L. (2016). Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. arXiv:1601.03334.
computeIntrinsicNoise, computeExtrinsicNoise, elowitz_data, yang_nl10.
data(yang_nl10)

# quantile normalization on log2 transformed data
# install bioconductor package for quantile normalization
# source('http://bioconductor.org/biocLite.R')
# biocLite('preprocessCore')
library(preprocessCore)

# ignore a few values that are negative
yang_nl10.pos <- yang_nl10[-which (yang_nl10[,1]<0),]
yang_nl10.pos.log2.quant <- normalize.quantiles (as.matrix (log2 (yang_nl10.pos[,c(1,3)])))

# subset the data and compute noise estimates
yang.50 <- computeNoiseForSubset (yang_nl10.pos.log2.quant, sample.size=50, n.iter=1000)
summary (yang.50)
Expression of reporter genes CFP and YFP in over 200 E. coli cells of each of two strains: D22 and M22. These values are displayed in a scatterplot in Elowitz et al. (2002), Fig. 3a.
data("elowitz_data")
The format is:
List of 2
 $ D22:'data.frame': 284 obs. of 2 variables:
  ..$ CFP: num [1:284] 3080 3082 2893 3053 2891 ...
  ..$ YFP: num [1:284] 2309 2394 2145 2340 2245 ...
 $ M22:'data.frame': 250 obs. of 2 variables:
  ..$ CFP: num [1:250] 2438 2316 2521 2646 2830 ...
  ..$ YFP: num [1:250] 1409 1391 1511 1460 1638 ...
Elowitz, M. B., A. J. Levine, E. D. Siggia, and P. S. Swain (2002) Stochastic gene expression in a single cell. Science, 297, 1183-1186.
data(elowitz_data)

# Normalize data such that they are
# comparable to Fig 3a in Elowitz et al. (2002).
# Normalized data have mean 1.
D22.cfp.norm <- (elowitz_data$D22[,1]-mean (elowitz_data$D22[,1]))/sd(elowitz_data$D22[,1])/8+1
D22.yfp.norm <- (elowitz_data$D22[,2]-mean (elowitz_data$D22[,2]))/sd(elowitz_data$D22[,2])/8+1
M22.cfp.norm <- (elowitz_data$M22[,1]-mean (elowitz_data$M22[,1]))/sd(elowitz_data$M22[,1])/12+1
M22.yfp.norm <- (elowitz_data$M22[,2]-mean (elowitz_data$M22[,2]))/sd(elowitz_data$M22[,2])/12+1

# Compute noise estimates.
# Since the mean is 1, estimates with and without
# the scaling are the same.
unlist (computeIntrinsicNoise (D22.cfp.norm, D22.yfp.norm))
unlist (computeExtrinsicNoise (D22.cfp.norm, D22.yfp.norm))
unlist (computeIntrinsicNoise (M22.cfp.norm, M22.yfp.norm))
unlist (computeExtrinsicNoise (M22.cfp.norm, M22.yfp.norm))
This function simulates expression levels of two reporters across single cells, mimicking the two-reporter assay. The hierarchical model described in Fu and Pachter (2016) is used for simulation. We further make the simplifying assumption that intrinsic noise is the same across cells.
simulateSC(n = 1000, intrinsic = 0.7, extrinsic = 0.8, mean = 1)
n | Number of single cells (sample size). |
intrinsic | Scalar. The (unscaled) intrinsic noise (or within-cell variability). |
extrinsic | Scalar. The (unscaled) extrinsic noise (or between-cell variability). |
mean | Scalar. The overall mean of expression level. |
A data frame of two columns and n rows. Each column contains the expression levels of a reporter. Each row is a single cell.
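A hierarchical simulation of this kind can be sketched in a few lines of base R. The version below is only an illustration under stated assumptions: it draws a cell-level mean for each cell and then two reporter values around that mean, using Gaussian distributions at both levels. The distributions used by simulateSC itself are not stated here, and the function name `simulate_two_reporters` and the column names are hypothetical.

```r
# Hypothetical sketch; not the package's simulateSC().
# Assumes Gaussian variation between cells and within cells.
simulate_two_reporters <- function(n = 1000, intrinsic = 0.7,
                                   extrinsic = 0.8, mean = 1) {
  # Between-cell level: one shared mean per cell, variance = extrinsic.
  cell_mean <- rnorm(n, mean = mean, sd = sqrt(extrinsic))
  # Within-cell level: two independent draws per cell, variance = intrinsic.
  data.frame(
    reporter1 = rnorm(n, mean = cell_mean, sd = sqrt(intrinsic)),
    reporter2 = rnorm(n, mean = cell_mean, sd = sqrt(intrinsic))
  )
}

set.seed(2016)
simu <- simulate_two_reporters(n = 5000)
dim(simu)

# Sanity checks against the inputs.
cov(simu$reporter1, simu$reporter2)       # should be close to extrinsic = 0.8
var(simu$reporter1 - simu$reporter2) / 2  # should be close to intrinsic = 0.7
```

Because the intrinsic variance is the same for every cell, the sketch also reflects the simplifying assumption mentioned in the description.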
Audrey Q. Fu
Fu, A. Q. and Pachter, L. (2016). Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. arXiv:1601.03334.
computeIntrinsicNoise, computeExtrinsicNoise.
# simulation #1
# simulate 500 data sets
n.simu <- 500
# true intrinsic and extrinsic noise
int.true <- 0.7
ext.true <- 0.8
# create matrices to hold estimated intrinsic and extrinsic noise
# using different estimators
int.simu.mtx <- matrix (0, nrow=n.simu, ncol=8)
ext.simu.mtx <- matrix (0, nrow=n.simu, ncol=4)
for (i in 1:n.simu) {
  n <- 1000
  simu <- simulateSC (n=n, intrinsic=int.true, extrinsic=ext.true, mean=1)
  int.simu.mtx[i,] <- c(unlist (computeIntrinsicNoise (simu[,1], simu[,2])), cor (simu[,1], simu[,2]))
  ext.simu.mtx[i,] <- unlist (computeExtrinsicNoise (simu[,1], simu[,2]))
}
# add column names to simulation estimates
colnames (int.simu.mtx) <- c("ELSS", "unbiasedGeneral", "unbiasedEqualMean", "minMSEGeneral", "minMSEEqualMean", "asymptoticGeneral", "asymptoticEqualMean", "cor")
colnames (ext.simu.mtx) <- c("ELSS", "unbiased", "minMSE", "asymptotic")

# simulation #2
# simulate 500 data sets
n.simu <- 500
# true intrinsic and extrinsic noise
int.true <- 0.7
ext.true <- 0.8
# use true correlation for the min-MSE estimates of extrinsic noise
true.cor <- ext.true / (ext.true + int.true)
# create matrices to hold estimated intrinsic and extrinsic noise
# using different estimators
int.simu.mtx <- matrix (0, nrow=n.simu, ncol=8)
ext.simu.mtx <- matrix (0, nrow=n.simu, ncol=4)
ext.simu.mtx.2 <- matrix (0, nrow=n.simu, ncol=4)
for (i in 1:n.simu) {
  n <- 50
  simu <- simulateSC (n=n, intrinsic=int.true, extrinsic=ext.true, mean=1)
  int.simu.mtx[i,] <- c(unlist (computeIntrinsicNoise (simu[,1], simu[,2])), cor (simu[,1], simu[,2]))
  ext.simu.mtx[i,] <- unlist (computeExtrinsicNoise (simu[,1], simu[,2]))
  ext.simu.mtx.2[i,] <- c(unlist (computeExtrinsicNoiseKnownCor (simu[,1], simu[,2], true.cor)))
}
# add column names to simulation estimates
colnames (int.simu.mtx) <- c("ELSS", "unbiasedGeneral", "unbiasedEqualMean", "minMSEGeneral", "minMSEEqualMean", "asymptoticGeneral", "asymptoticEqualMean", "cor")
colnames (ext.simu.mtx) <- c("ELSS", "unbiased", "minMSE", "asymptotic")
colnames (ext.simu.mtx.2) <- c("ELSS", "unbiased", "minMSE", "asymptotic")

# compute the MSE of estimates
computeMSE <- function (a, t) {return (mean((a-t)^2))}
apply (int.simu.mtx[,1:7], 2, computeMSE, t=int.true)
apply (ext.simu.mtx, 2, computeMSE, t=ext.true)
apply (ext.simu.mtx.2, 2, computeMSE, t=ext.true)
Expression of reporter genes CFP and mCherry in over 40,000 E. coli cells. A subset of these values is displayed in a scatterplot in Yang et al. (2014), Fig. 3a, rightmost panel.
data("yang_nl10")
A data frame with 40683 observations on the following 3 variables.
CFP
a numeric vector
Venus
a numeric vector
mCherry
a numeric vector
Yang, S., S. Kim, Y. R. Lim, C. Kim, H. J. An, J.-H. Kim, J. Sung, and N. K. Lee (2014) Contribution of RNA polymerase concentration variation to protein expression noise. Nature Communications, 5, 4761.
data(yang_nl10)

# compute the noise estimates
# no normalization
# unscaled by mean
unlist (computeIntrinsicNoise (yang_nl10[,1], yang_nl10[,3]))
unlist (computeExtrinsicNoise (yang_nl10[,1], yang_nl10[,3]))
# scaled by mean
unlist (computeIntrinsicNoise (yang_nl10[,1], yang_nl10[,3])) / mean (yang_nl10[,1]) / mean(yang_nl10[,3])
unlist (computeExtrinsicNoise (yang_nl10[,1], yang_nl10[,3])) / mean (yang_nl10[,1]) / mean(yang_nl10[,3])

# quantile normalization on log2 transformed data
# install bioconductor package for quantile normalization
# source('http://bioconductor.org/biocLite.R')
# biocLite('preprocessCore')
library(preprocessCore)

# ignore a few values that are negative
yang_nl10.pos <- yang_nl10[-which (yang_nl10[,1]<0),]
yang_nl10.pos.log2.quant <- normalize.quantiles (as.matrix (log2 (yang_nl10.pos[,c(1,3)])))

# unscaled by mean
unlist (computeIntrinsicNoise (yang_nl10.pos.log2.quant[,1], yang_nl10.pos.log2.quant[,2]))
unlist (computeExtrinsicNoise (yang_nl10.pos.log2.quant[,1], yang_nl10.pos.log2.quant[,2]))
# scaled by mean
unlist (computeIntrinsicNoise (yang_nl10.pos.log2.quant[,1], yang_nl10.pos.log2.quant[,2])) / mean (yang_nl10.pos.log2.quant[,1]) / mean(yang_nl10.pos.log2.quant[,2])
unlist (computeExtrinsicNoise (yang_nl10.pos.log2.quant[,1], yang_nl10.pos.log2.quant[,2])) / mean (yang_nl10.pos.log2.quant[,1]) / mean(yang_nl10.pos.log2.quant[,2])
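The "scaled by mean" lines in the example divide each unscaled estimate by the mean of the first reporter and then by the mean of the second, which is the same as dividing by the product of the two means. A small base R sketch of that step is shown below; the helper name `scale_noise` is hypothetical and is not part of the package.

```r
# Hypothetical helper for illustration; not part of the noise package.
# Convert unscaled noise estimates to the scaled form by dividing by the
# product of the two reporter means.
scale_noise <- function(unscaled, reporter1, reporter2) {
  unscaled / (mean(reporter1) * mean(reporter2))
}

# Toy data: reporter means are 4 and 2, so the divisor is 8.
r1 <- c(2, 4, 6)
r2 <- c(1, 2, 3)
scaled <- scale_noise(1.6, r1, r2)
scaled

# When both reporters have mean 1, scaling leaves the estimate unchanged.
unchanged <- scale_noise(1.6, r1 / mean(r1), r2 / mean(r2))
unchanged
```

The second call mirrors the note in the elowitz_data example that estimates with and without scaling coincide once the data are normalized to mean 1.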