factR-exp.Rd
A set of functions to incorporate expression data into a factR object and test the regulatory potential of alternative splicing events.
Correlates exon inclusion levels with gene expression levels
# S4 method for factR
addTxCounts(object, countData, sampleData = NULL, psi = NULL, verbose = TRUE)
# S4 method for factR
testGeneCorr(object, vst = TRUE, min_n = 3, ...)
`object`: factR object
`countData`: Can be one of the following:
Path to local count matrix file in .tsv or .csv format
Matrix object containing transcript-level expression counts data
`sampleData`: (Optional) Data frame containing sample information. Rows can be named to match the sample names in `countData`. If `sampleData` has no row names, the function will attempt to pick the column containing sample names and assign it as the row names.
`psi`: (Optional) Exon-level splicing inclusion data. Can be one of the following:
Path to local file in .tsv or .csv format
Matrix object
Row names of the matrix should be in chr:start-end format
`verbose`: Logical value indicating whether messages should be printed (Default: TRUE)
`vst`: Whether to apply variance stabilization to splicing and expression levels (Default: TRUE)
`min_n`: Minimum number of non-NA samples required for correlation testing (Default: 3)
`...`: Additional arguments passed to the cor.test function
`addTxCounts`: factRObject with updated counts data and samples metadata.
`testGeneCorr`: factRObject with updated ASE metadata.
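The roles of `min_n` and `...` in `testGeneCorr` can be illustrated with a minimal base-R sketch. This is not the factR2 implementation: `test_corr`, `psi_values` and `gene_expr` are hypothetical names introduced for illustration, and variance stabilization (`vst`) is not shown. The sketch assumes that a test is run only when at least `min_n` samples have usable values, and that extra arguments are forwarded unchanged to `stats::cor.test`.

```r
# Hypothetical per-sample values for one splicing event and its gene.
psi_values <- c(0.10, 0.25, NA, 0.40, 0.55, 0.70)
gene_expr  <- c(12.0, 10.5, 9.8, 8.1, 6.4, 5.0)

# Hypothetical helper (not part of factR2): run cor.test only when at least
# `min_n` samples have finite (non-NA, non-NaN) values for both variables.
test_corr <- function(psi, expr, min_n = 3, ...) {
  ok <- is.finite(psi) & is.finite(expr)
  if (sum(ok) < min_n) {
    # Too few usable samples: report NA instead of testing.
    return(c(estimate = NA_real_, p.value = NA_real_))
  }
  res <- stats::cor.test(psi[ok], expr[ok], ...)
  c(estimate = unname(res$estimate), p.value = res$p.value)
}

test_corr(psi_values, gene_expr)                       # Pearson, the cor.test default
test_corr(psi_values, gene_expr, method = "spearman")  # `method` forwarded through `...`
test_corr(c(0.1, NA, NA), c(1, 2, 3))                  # fewer than min_n samples
```

In this sketch the third sample is dropped because its inclusion level is NA, so the tests run on the remaining five samples.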
`addTxCounts` converts transcript-level expression counts into gene-level expression counts. If `psi` is not provided, the function determines the inclusion level (psi) of each alternative splicing event in each sample by calculating the proportion of transcripts containing the splicing event over all transcripts that span that event. If `psi` is provided, the function checks for common exon coordinates, and events detected by factR2 that do not overlap with the supplied data are set to NA. The `sampleData` data frame may lack row names, in which case the function searches for the column corresponding to sample names. If that search fails, or if the row names of `sampleData` do not match the column names of `countData`, an error is raised.
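The two calculations described above can be sketched in base R on toy data. This is an illustration under stated assumptions, not the factR2 code: the matrix `tx_counts`, the map `tx2gene` and the vectors `spanning` and `including` are hypothetical, and the sketch assumes that the "proportion of transcripts" is weighted by transcript counts.

```r
# Toy transcript-level counts: rows are transcripts, columns are samples.
tx_counts <- matrix(
  c(10,  0, 0,
     5,  5, 0,
     0, 20, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("tx1", "tx2", "tx3"), c("S1", "S2", "S3"))
)

# Hypothetical transcript-to-gene map: all three transcripts belong to geneA.
tx2gene <- c(tx1 = "geneA", tx2 = "geneA", tx3 = "geneA")

# Gene-level counts: sum the transcript counts within each gene.
gene_counts <- rowsum(tx_counts, group = tx2gene[rownames(tx_counts)])

# Hypothetical event annotation: transcripts that span the event, and the
# subset of those that contain (include) the alternative exon.
spanning  <- c("tx1", "tx2", "tx3")
including <- c("tx1", "tx2")

# PSI per sample: counts of including transcripts over counts of spanning ones.
psi <- colSums(tx_counts[including, , drop = FALSE]) /
  colSums(tx_counts[spanning, , drop = FALSE])

gene_counts
psi
```

Here `psi` is 1 for S1 (15 of 15 counts include the exon) and 0.2 for S2 (5 of 25). S3 has no counts for any spanning transcript, so the ratio is 0/0 and R returns NaN for that sample.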
factRObject-class
factR-exp-meta
cor.test
factR-exp
## get path to sample GTF and expression data
gtf <- system.file("extdata", "pb_custom.gtf.gz", package = "factR2")
counts <- system.file("extdata", "pb_expression.tsv.gz", package = "factR2")
## create factRObject with expression counts
factRObject <- createfactRObject(gtf, "vM33", countData = counts)
#> 🡆 Checking inputs
#> 🡆 Checking factRObject
#> 🡆 Adding custom transcriptome
#> ℹ Importing from local directory
#> 🡆 Adding annotation
#> ℹ Importing from URL
#> 🡆 Adding genome sequence
#> ℹ Using BSgenome object
#> Warning: 4 seqlevel(s) in ``your custom GTF`` are not found in ``reference annotation``
#> 🡆 Matching chromosome names
#> 🡆 Matching gene information
#> Number of mismatched gene_ids found: 83520
#> --> Attempting to match ensembl gene_ids...
#> --> No ensembl gene ids found in query
#> ---> Attempting to match gene_ids by finding overlapping coordinates...
#> ---> 65203 gene_id matched
#> Total gene_ids corrected: 65203
#> Remaining number of mismatched gene_ids: 18317
#> 🡆 Creating factRset objects
#> ℹ Adding gene information
#> ℹ Adding transcript information
#> ℹ Adding alternative splicing information
#> 🡆 Annotating novel transcripts
#> 🡆 Annotating novel AS events
#> 🡆 Adding expression data
#> ℹ Importing from local file
#> 🡆 Creating samples metadata
#> 🡆 Processing expression data
#> ℹ Adding gene counts
#> ℹ Adding spliced-event counts
#> ℹ Normalizing counts
#> ℹ factRobject created!
### This can also be done post-creation of a factR object
if (FALSE) {
  factRObject <- createfactRObject(gtf, "vM33")
  factRObject <- addTxCounts(factRObject, counts)
}
## access counts data
counts(factRObject) # returns normalised expression data from current Set
#> P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2
#> AS01129 0.123000 0.170000 0.155000 1.12e-01 5.48e-02 0.132000 0.124000 0.169000
#> AS01642 0.166000 0.230000 0.140000 2.63e-01 0.00e+00 0.843000 0.177000 0.000000
#> AS01647 0.000000 0.000000 0.000000 7.52e-01 4.19e-01 0.000000 0.287000 0.502000
#> AS01649 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.301000 0.518000
#> AS01650 0.223000 0.000000 0.000000 3.41e-01 2.70e-01 0.000000 0.453000 0.608000
#> AS01651 0.223000 0.000000 0.000000 3.41e-01 2.70e-01 0.000000 0.453000 0.608000
#> AS01652 0.543000 0.372000 0.533000 4.08e-01 2.87e-01 0.240000 0.566000 0.388000
#> AS01657 0.241000 0.241000 0.346000 1.66e-01 1.26e-01 0.284000 0.126000 0.117000
#> AS01658 0.561000 0.390000 0.118000 4.73e-01 5.92e-01 0.285000 0.326000 0.250000
#> AS01659 0.561000 0.390000 0.118000 4.73e-01 5.92e-01 0.285000 0.326000 0.250000
#> AS01662 0.217000 0.217000 0.188000 2.06e-01 2.74e-01 0.510000 0.077500 0.054600
#> AS01663 0.217000 0.217000 0.188000 2.06e-01 2.74e-01 0.510000 0.077500 0.054600
#> AS01665 0.000000 0.000000 0.103000 0.00e+00 1.11e-01 0.000000 0.200000 0.186000
#> AS01667 0.000000 0.595000 0.000000 4.07e-01 1.43e-01 0.478000 0.100000 0.000000
#> AS02707 0.059900 0.000000 0.000000 3.68e-01 0.00e+00 0.000000 0.032400 0.113000
#> AS02708 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.095200 0.060200
#> AS02709 0.258000 0.000000 0.629000 5.60e-01 8.86e-01 0.741000 0.187000 0.113000
#> AS02710 0.000000 0.165000 0.000000 0.00e+00 0.00e+00 0.000000 0.341000 0.257000
#> AS02711 0.000000 0.000000 0.000000 1.01e-01 0.00e+00 0.000000 0.100000 0.186000
#> AS02712 0.000000 0.000000 0.000000 1.01e-01 0.00e+00 0.000000 0.100000 0.186000
#> AS02714 0.000000 0.000000 0.000000 0.00e+00 3.34e-01 0.271000 0.000000 0.000000
#> AS02738 0.321000 0.341000 0.326000 2.93e-01 2.93e-01 0.293000 0.181000 0.167000
#> AS02739 0.334000 0.324000 0.332000 3.48e-01 3.48e-01 0.348000 0.353000 0.453000
#> AS02749 0.835000 1.000000 0.590000 2.92e-01 3.51e-01 0.302000 1.000000 0.419000
#> AS02750 0.000000 0.000000 0.187000 5.38e-02 1.54e-01 0.000000 0.000000 0.000000
#> AS02751 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.092100 0.201000 0.174000
#> AS02756 0.000000 0.000000 0.000000 2.10e-01 0.00e+00 0.000000 0.000000 0.000000
#> AS02777 1.000000 1.000000 0.700000 5.99e-01 1.00e+00 0.000000 0.803000 1.000000
#> AS02784 0.360000 0.568000 0.252000 0.00e+00 0.00e+00 0.000000 0.198000 0.000000
#> AS02838 0.056800 0.135000 0.173000 2.16e-01 2.97e-01 0.202000 0.120000 0.140000
#> AS02847 0.598000 0.710000 0.609000 7.86e-01 8.72e-01 0.797000 0.620000 0.609000
#> AS02850 0.099500 0.181000 0.000000 0.00e+00 3.56e-01 0.000000 0.356000 0.172000
#> AS02854 0.137000 0.000000 0.106000 4.88e-01 7.04e-01 0.544000 0.443000 0.000000
#> AS02856 0.372000 0.624000 0.458000 7.57e-01 8.80e-01 0.830000 0.692000 0.769000
#> AS02857 0.000000 0.000000 0.027100 1.24e-01 4.96e-02 0.000000 0.058500 0.000000
#> AS02858 0.000000 0.000000 0.000000 0.00e+00 3.42e-01 0.000000 0.000000 0.000000
#> AS02859 0.000000 0.000000 0.000000 0.00e+00 3.42e-01 0.000000 0.000000 0.000000
#> AS02860 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.124000 0.443000
#> AS02866 0.000000 0.345000 0.000000 0.00e+00 1.67e-01 0.000000 0.412000 0.319000
#> AS03103 0.000000 0.085700 0.000000 0.00e+00 6.69e-02 0.000000 0.000000 0.000000
#> AS03108 0.532000 0.238000 0.303000 3.56e-01 1.06e-01 0.378000 0.263000 0.465000
#> AS03110 0.000000 0.000000 0.000000 3.64e-01 2.77e-01 NaN 0.000000 0.364000
#> AS03112 0.163000 0.000000 0.000000 3.05e-01 0.00e+00 NaN 0.841000 0.815000
#> AS03117 0.000000 0.000000 0.407000 0.00e+00 6.47e-01 1.000000 0.000000 0.000000
#> AS03128 0.000000 0.000000 0.000000 4.71e-01 0.00e+00 1.000000 0.000000 0.000000
#> AS03135 0.165000 0.000000 0.308000 3.08e-01 3.72e-01 1.000000 0.000000 0.000000
#> AS03226 0.297000 0.458000 0.628000 1.00e+00 8.08e-01 0.772000 0.628000 0.628000
#> AS03237 0.000000 0.000000 0.301000 NaN 5.64e-01 0.000000 0.000000 0.721000
#> AS03271 1.000000 1.000000 1.000000 NaN 1.00e+00 1.000000 1.000000 1.000000
#> AS03653 0.000000 NaN 0.000000 0.00e+00 0.00e+00 NaN 0.754000 0.000000
#> AS03656 0.000000 NaN 0.000000 0.00e+00 0.00e+00 NaN 0.623000 0.000000
#> AS03664 0.000000 NaN NaN NaN 7.60e-01 NaN NaN NaN
#> AS04209 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.000000 0.367000
#> AS04210 0.892000 1.000000 1.000000 9.06e-01 9.23e-01 0.828000 0.391000 0.559000
#> AS04213 0.000000 0.000000 0.000000 9.39e-02 3.99e-02 0.172000 0.609000 0.000000
#> AS04221 1.000000 NaN NaN NaN 1.00e+00 NaN NaN 0.000000
#> AS04223 0.000000 NaN NaN NaN 0.00e+00 NaN NaN 1.000000
#> AS04225 1.000000 NaN NaN NaN 1.00e+00 NaN NaN 0.000000
#> AS04364 NaN NaN NaN 0.00e+00 0.00e+00 0.000000 NaN NaN
#> AS04433 0.454000 0.268000 0.573000 4.39e-01 5.59e-01 0.443000 0.234000 0.215000
#> AS04437 0.377000 0.139000 0.315000 1.47e-01 2.81e-01 0.247000 0.396000 0.280000
#> AS04439 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.061400 0.000000
#> AS04440 0.513000 0.726000 0.587000 5.20e-01 7.03e-01 0.558000 0.537000 0.713000
#> AS04519 0.000000 0.000000 0.000000 1.00e+00 0.00e+00 0.000000 0.412000 0.000000
#> AS04763 0.238000 0.278000 0.096200 2.38e-01 1.93e-01 0.227000 0.163000 0.104000
#> AS04974 1.000000 NaN 1.000000 0.00e+00 NaN NaN NaN NaN
#> AS04977 0.000000 NaN 0.000000 0.00e+00 NaN NaN NaN NaN
#> AS05912 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.000000 0.365000
#> AS05917 0.000000 0.000000 0.000000 0.00e+00 3.23e-01 0.000000 0.417000 0.000000
#> AS05919 0.000000 0.556000 0.000000 0.00e+00 0.00e+00 0.000000 0.297000 0.297000
#> AS05920 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.466000 0.466000
#> AS05922 0.000000 0.000000 0.000000 5.52e-01 0.00e+00 0.000000 0.000000 0.000000
#> AS05925 0.000000 0.000000 0.000000 7.26e-01 7.26e-01 0.000000 0.570000 0.000000
#> AS05950 1.000000 0.692000 1.000000 6.63e-01 1.00e+00 1.000000 1.000000 NaN
#> AS06103 0.000000 0.000000 0.000000 3.88e-03 0.00e+00 0.000000 0.010400 0.004470
#> AS06105 0.195000 0.121000 0.164000 1.25e-01 1.71e-01 0.127000 0.156000 0.168000
#> AS07063 0.000000 0.000000 0.050200 3.71e-02 0.00e+00 0.000000 0.062000 0.000000
#> AS07065 0.078400 0.000000 0.087300 0.00e+00 0.00e+00 0.000000 0.000000 0.000000
#> AS07081 0.724000 1.000000 0.778000 1.00e+00 8.27e-01 0.833000 0.833000 1.000000
#> AS07082 0.039700 0.000000 0.044700 0.00e+00 8.67e-02 0.055800 0.055800 0.000000
#> AS07083 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.000000 0.000000
#> AS07272 NaN 0.000000 NaN 0.00e+00 0.00e+00 0.000000 1.000000 0.000000
#> AS10530 0.000000 0.000000 0.000000 7.42e-02 8.38e-02 0.000000 0.000000 0.000000
#> P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3
#> AS01129 0.178000 0.017200 0.048900 0.039800
#> AS01642 0.183000 0.230000 0.000000 0.000000
#> AS01647 0.387000 0.457000 0.000000 0.528000
#> AS01649 0.692000 0.000000 0.000000 0.000000
#> AS01650 0.393000 0.000000 0.000000 0.000000
#> AS01651 0.393000 0.000000 0.000000 0.000000
#> AS01652 0.469000 0.110000 0.122000 0.204000
#> AS01657 0.075300 0.000000 0.185000 0.000000
#> AS01658 0.223000 0.000000 0.407000 0.313000
#> AS01659 0.223000 0.000000 0.407000 0.313000
#> AS01662 0.000000 0.316000 0.373000 0.165000
#> AS01663 0.000000 0.316000 0.373000 0.165000
#> AS01665 0.124000 0.186000 0.164000 0.164000
#> AS01667 0.044900 0.000000 0.000000 0.000000
#> AS02707 0.202000 0.460000 0.386000 0.237000
#> AS02708 0.000000 0.000000 0.000000 0.000000
#> AS02709 0.280000 0.504000 0.696000 0.556000
#> AS02710 0.438000 0.343000 0.000000 0.000000
#> AS02711 0.424000 0.000000 0.000000 0.000000
#> AS02712 0.424000 0.000000 0.000000 0.000000
#> AS02714 0.000000 0.000000 0.000000 0.000000
#> AS02738 0.166000 0.293000 0.293000 0.293000
#> AS02739 0.460000 0.348000 0.348000 0.348000
#> AS02749 1.000000 0.000000 1.000000 NaN
#> AS02750 0.000000 0.000000 0.000000 NaN
#> AS02751 0.000000 0.000000 0.000000 NaN
#> AS02756 0.000000 0.000000 0.000000 NaN
#> AS02777 0.472000 1.000000 0.461000 1.000000
#> AS02784 0.000000 0.000000 0.000000 0.000000
#> AS02838 0.189000 0.130000 0.206000 0.112000
#> AS02847 0.597000 0.791000 0.848000 0.845000
#> AS02850 0.216000 0.293000 0.624000 NaN
#> AS02854 0.166000 0.000000 0.000000 NaN
#> AS02856 0.620000 0.640000 0.897000 0.530000
#> AS02857 0.000000 0.000000 0.053100 0.470000
#> AS02858 0.000000 0.281000 0.886000 1.000000
#> AS02859 0.000000 0.281000 0.886000 1.000000
#> AS02860 0.346000 0.000000 0.718000 1.000000
#> AS02866 0.529000 0.412000 0.219000 0.260000
#> AS03103 0.344000 0.112000 0.177000 0.000000
#> AS03108 0.000000 0.000000 0.263000 0.000000
#> AS03110 0.000000 0.000000 0.000000 0.000000
#> AS03112 0.925000 0.815000 0.000000 0.925000
#> AS03117 0.000000 0.000000 0.733000 0.733000
#> AS03128 0.000000 0.640000 0.842000 0.640000
#> AS03135 0.000000 0.000000 0.000000 0.000000
#> AS03226 0.000000 0.000000 0.000000 0.000000
#> AS03237 0.177000 0.000000 0.000000 0.000000
#> AS03271 0.831000 0.164000 0.197000 0.141000
#> AS03653 NaN NaN 1.000000 NaN
#> AS03656 1.000000 NaN NaN NaN
#> AS03664 NaN NaN NaN NaN
#> AS04209 0.605000 0.000000 0.000000 NaN
#> AS04210 0.154000 1.000000 1.000000 NaN
#> AS04213 0.404000 0.000000 0.000000 NaN
#> AS04221 0.000000 NaN NaN NaN
#> AS04223 1.000000 NaN NaN NaN
#> AS04225 0.000000 NaN NaN NaN
#> AS04364 1.000000 1.000000 NaN NaN
#> AS04433 0.130000 0.288000 0.276000 0.205000
#> AS04437 0.371000 0.368000 0.350000 0.474000
#> AS04439 0.145000 0.000000 0.000000 0.000000
#> AS04440 0.786000 0.850000 0.606000 0.877000
#> AS04519 0.000000 0.000000 0.000000 0.000000
#> AS04763 0.039800 0.073700 0.126000 0.090700
#> AS04974 NaN 0.000000 NaN NaN
#> AS04977 NaN 0.000000 1.000000 1.000000
#> AS05912 0.000000 0.000000 0.697000 0.000000
#> AS05917 0.000000 0.417000 0.000000 0.000000
#> AS05919 0.000000 0.000000 0.000000 0.000000
#> AS05920 0.724000 0.000000 0.000000 0.000000
#> AS05922 0.552000 0.000000 0.000000 0.787000
#> AS05925 0.469000 0.000000 0.000000 0.000000
#> AS05950 0.000000 0.692000 0.628000 1.000000
#> AS06103 0.000000 0.005450 0.011100 0.000000
#> AS06105 0.180000 0.141000 0.101000 0.186000
#> AS07063 0.000000 0.000000 0.000000 0.103000
#> AS07065 0.000000 0.000000 0.000000 0.000000
#> AS07081 1.000000 1.000000 0.770000 0.807000
#> AS07082 0.000000 0.000000 0.000000 0.000000
#> AS07083 0.000000 0.425000 0.000000 0.000000
#> AS07272 0.000000 NaN NaN 0.517000
#> AS10530 0.000000 0.000000 0.000000 0.000000
#> [ reached getOption("max.print") -- omitted 48114 rows ]
counts(factRObject, "Ptbp1") # same as above, but only for specific genes
#> P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2
#> AS27166 0.596 0.392 0.26 0.492 0.492 0.492 0.392 0.446
#> AS27168 0.000 0.000 0.00 0.000 0.480 0.480 0.000 0.334
#> P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3
#> AS27166 0.659 0.340 0.693 0.301
#> AS27168 0.382 0.324 0.527 0.000
counts(factRObject, "Ptbp1", set = "transcript") # get transcript-level expression
#> P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2
#> transcript37580.chr10.nnic 11.410569 11.39594 5.6328695 1.840952 1.9739042
#> transcript37593.chr10.nnic 4.992124 11.39594 10.3269274 1.227302 1.3159361
#> transcript37653.chr10.nnic 0.000000 0.00000 0.0000000 0.000000 0.6579681
#> transcript37657.chr10.nnic 0.000000 0.00000 0.9388116 0.000000 0.0000000
#> P14_M6_3 P28_M1_1 P28_M1_2 P28_M1_3 P28_M2_1
#> transcript37580.chr10.nnic 4.477255 4.637786 4.7871969 5.6232337 5.331361
#> transcript37593.chr10.nnic 2.984836 4.637786 3.8297575 1.8744112 6.664201
#> transcript37653.chr10.nnic 1.492418 0.000000 0.9574394 0.9372056 1.332840
#> transcript37657.chr10.nnic 0.000000 0.000000 1.9148788 0.0000000 0.000000
#> P28_M2_2 P28_M2_3
#> transcript37580.chr10.nnic 8.624391 2.591131
#> transcript37593.chr10.nnic 2.464112 3.886697
#> transcript37653.chr10.nnic 2.464112 0.000000
#> transcript37657.chr10.nnic 0.000000 0.000000
counts(factRObject, "Ptbp1", set = "gene") # get gene-level expression
#> P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2
#> 16.654551 23.042081 17.085144 3.101008 3.970570 9.086108 9.366016 11.628876
#> P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3
#> 8.447887 13.365096 13.503386 6.459794
counts(factRObject, "Ptbp1", set = "gene", slot = "counts") # get gene-level count
#> P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2
#> 23 20 18 5 6 6 12 12
#> P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3
#> 9 10 11 5
## access samples metadata
samples(factRObject)
#> proj.ident
#> P14_M5_1 factRProject
#> P14_M5_2 factRProject
#> P14_M5_3 factRProject
#> P14_M6_1 factRProject
#> P14_M6_2 factRProject
#> P14_M6_3 factRProject
#> P28_M1_1 factRProject
#> P28_M1_2 factRProject
#> P28_M1_3 factRProject
#> P28_M2_1 factRProject
#> P28_M2_2 factRProject
#> P28_M2_3 factRProject
ident(factRObject) # prints out the current identity
#> Error in c_character(...): Character input expected
## run correlation between gene expression and exon inclusion
factRObject <- testGeneCorr(factRObject)
#> Error in dplyr::select(., AS_id, gene_id, ASNMDtype): Can't subset columns that don't exist.
#> ✖ Column `ASNMDtype` doesn't exist.
ase(factRObject)
#> ℹ Set `show_more to TRUE to show more info`
#> # A tibble: 48,197 × 8
#> AS_id gene_id gene_name coord AStype strand width novel
#> <chr> <chr> <chr> <chr> <fct> <fct> <int> <chr>
#> 1 AS01129 ENSMUSG00000033813.16 Tcea1 chr1:49570… AD + 231 yes
#> 2 AS01642 ENSMUSG00000025907.15 Rb1cc1 chr1:62851… AD + 1104 yes
#> 3 AS01647 ENSMUSG00000025907.15 Rb1cc1 chr1:62929… AF + 499 yes
#> 4 AS01649 ENSMUSG00000025907.15 Rb1cc1 chr1:63043… RI + 141 yes
#> 5 AS01650 ENSMUSG00000025907.15 Rb1cc1 chr1:63152… RI + 178 yes
#> 6 AS01651 ENSMUSG00000025907.15 Rb1cc1 chr1:63155… RI + 80 yes
#> 7 AS01652 ENSMUSG00000025907.15 Rb1cc1 chr1:63198… RI + 87 yes
#> 8 AS01657 ENSMUSG00000025907.15 Rb1cc1 chr1:63313… AD + 749 yes
#> 9 AS01658 ENSMUSG00000025907.15 Rb1cc1 chr1:63331… RI + 81 yes
#> 10 AS01659 ENSMUSG00000025907.15 Rb1cc1 chr1:63334… AD + 173 yes
#> # ℹ 48,187 more rows