A set of functions to incorporate expression data into a factR object and to test the regulatory potential of alternative splicing events.

Correlates exon inclusion levels with gene expression levels

# S4 method for factR
addTxCounts(object, countData, sampleData = NULL, psi = NULL, verbose = TRUE)

# S4 method for factR
testGeneCorr(object, vst = TRUE, min_n = 3, ...)

Arguments

object

factR object

countData

Can be one of the following:

  • Path to local count matrix file in .tsv or .csv format

  • Matrix object containing transcript-level expression counts data.

sampleData

(Optional) Data frame containing sample information. Its row names can be set to match the sample names in `countData`. If `sampleData` has no row names, the function will attempt to find the column containing the sample names and assign it as row names.
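The row-name matching described for `sampleData` can be pictured with a small base R sketch. This is not the package's internal code; the sample names, the column names `sample` and `condition`, and the matching rule (first column whose values equal the column names of `countData`) are assumptions made for illustration.

```r
# Hypothetical toy inputs; sample and transcript names are invented.
countData <- matrix(
  c(10, 0, 5, 3, 8, 2),
  nrow = 2,
  dimnames = list(c("tx1", "tx2"), c("S1", "S2", "S3"))
)

# sampleData without meaningful row names: sample names live in a column.
sampleData <- data.frame(
  sample = c("S1", "S2", "S3"),
  condition = c("ctrl", "ctrl", "treated")
)

# Flag columns whose values match the column names of countData.
is_name_col <- vapply(
  sampleData,
  function(col) setequal(as.character(col), colnames(countData)),
  logical(1)
)
stopifnot(any(is_name_col))

# Promote the first matching column to row names, ordered like countData.
name_col <- names(sampleData)[which(is_name_col)[1]]
rownames(sampleData) <- as.character(sampleData[[name_col]])
sampleData <- sampleData[colnames(countData), , drop = FALSE]
```

Setting the row names yourself before calling `addTxCounts` avoids relying on any automatic detection.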

psi

(Optional) Exon-level splicing inclusion data. Can be one of the following:

  • Path to local file in .tsv or .csv format

  • Matrix object

Rownames of matrix should be in chr:start-end format
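As an illustration of the expected shape of a `psi` matrix, the sketch below builds one by hand with row names in chr:start-end format. The coordinates, sample names and values are hypothetical, and the regular expression is only one possible way to check the format.

```r
# Hypothetical exon coordinates and sample names, for illustration only.
psi <- matrix(
  c(0.80, 0.75, 0.10,
    0.20, NA,   0.95),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("chr1:1000-1200", "chr2:5000-5150"),
    c("S1", "S2", "S3")
  )
)

# Check that every row name follows the chr:start-end pattern.
coord_ok <- grepl("^[^:]+:[0-9]+-[0-9]+$", rownames(psi))
stopifnot(all(coord_ok))
```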

verbose

Boolean value as to whether messages should be printed (Default: TRUE)

vst

Boolean value as to whether variance stabilization should be applied to splicing and expression levels (Default: TRUE)

min_n

Minimum number of non-NA samples required for correlation testing (Default: 3)

...

Additional arguments passed to the cor.test function
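To make the roles of `min_n` and `...` concrete, here is a minimal base R sketch of a correlation test that is skipped when too few samples have non-NA values. The vectors are hypothetical, and the filtering shown is an assumption about how such a threshold is typically applied, not a copy of the `testGeneCorr` implementation. `method` is a standard `cor.test` argument and stands in for what could be supplied through `...`.

```r
# Hypothetical values for one splicing event and its gene across six samples.
psi_values  <- c(0.10, 0.25, NA, 0.60, 0.80, NA)
gene_values <- c(12.0, 10.5, 9.8, 6.1, 3.9, 4.4)

min_n <- 3

# Keep only samples where both measurements are available.
keep <- !is.na(psi_values) & !is.na(gene_values)
n_ok <- sum(keep)

if (n_ok >= min_n) {
  # Extra arguments such as `method` are forwarded to cor.test().
  res <- cor.test(psi_values[keep], gene_values[keep], method = "spearman")
  print(res$estimate)
} else {
  # Too few paired observations: no test is run.
  res <- NULL
}
```

With the real function, the equivalent call would be along the lines of `testGeneCorr(factRObject, method = "spearman")`.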

Value

`addTxCounts`: factRObject with updated counts data and samples metadata.

`testGeneCorr`: factRObject with updated ASE metadata.

Details

`addTxCounts` converts transcript-level expression counts into gene-level expression counts. If `psi` is not provided, the function determines the inclusion level (psi) of each alternative splicing event in each sample by calculating the proportion of transcripts containing the splicing event over all transcripts that span that event. If `psi` is provided, the function checks for exon coordinates shared with the events detected by factR2; events detected by factR2 that have no match in the supplied data are set to NA.

The `sampleData` data frame can be supplied without row names, in which case the function searches for a column containing the sample names. If that search fails, or if the row names of `sampleData` do not match the column names of `countData`, an error is returned.
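The two calculations described above can be sketched in base R on a toy gene. Everything here is hypothetical: the transcript names, the transcript-to-gene map, and which transcripts contain or span the event. The sketch also assumes that "proportion of transcripts" means an expression-weighted proportion, which may differ from what the package does internally.

```r
# Hypothetical transcript-level expression for one gene (3 transcripts, 2 samples).
tx_expr <- matrix(
  c(6, 2,
    3, 0,
    1, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("txA", "txB", "txC"), c("S1", "S2"))
)
tx2gene <- c(txA = "gene1", txB = "gene1", txC = "gene1")

# Gene-level expression: sum transcript values within each gene.
gene_expr <- rowsum(tx_expr, group = tx2gene[rownames(tx_expr)])

# Assume txA and txB contain the exon, and all three transcripts span it.
containing <- c("txA", "txB")
spanning   <- c("txA", "txB", "txC")

# psi per sample = expression of containing / expression of spanning transcripts.
psi <- colSums(tx_expr[containing, , drop = FALSE]) /
  colSums(tx_expr[spanning, , drop = FALSE])
```

Under this formulation, a sample in which none of the spanning transcripts is expressed gives 0/0, which R evaluates to NaN.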

Examples

## get path to sample GTF and expression data
gtf <- system.file("extdata", "pb_custom.gtf.gz", package = "factR2")
counts <- system.file("extdata", "pb_expression.tsv.gz", package = "factR2")

## create factRObject with expression counts
factRObject <- createfactRObject(gtf, "vM33", countData = counts)
#> 🡆  Checking inputs
#> 🡆  Checking factRObject
#> 🡆  Adding custom transcriptome
#> Importing from local directory
#> 🡆  Adding annotation
#> Importing from URL
#> 🡆  Adding genome sequence
#> Using BSgenome object
#> Warning: 4 seqlevel(s) in ``your custom GTF`` are not found in ``reference annotation``
#> 🡆  Matching chromosome names
#> 🡆  Matching gene information
#>     Number of mismatched gene_ids found: 83520
#>     --> Attempting to match ensembl gene_ids...
#>     --> No ensembl gene ids found in query
#>     ---> Attempting to match gene_ids by finding overlapping coordinates...
#>     ---> 65203 gene_id matched
#>     Total gene_ids corrected: 65203
#>     Remaining number of mismatched gene_ids: 18317
#> 🡆  Creating factRset objects
#> Adding gene information
#> Adding transcript information
#> Adding alternative splicing information
#> 🡆  Annotating novel transcripts
#> 🡆  Annotating novel AS events
#> 🡆  Adding expression data
#> Importing from local file
#> 🡆  Creating samples metadata
#> 🡆  Processing expression data
#> Adding gene counts
#> Adding spliced-event counts
#> Normalizing counts
#> factRobject created!

### This can also be done post-creation of a factR object
if (FALSE) {
  factRObject <- createfactRObject(gtf, "vM33")
  factRObject <- addTxCounts(factRObject, counts)
}

## access counts data
counts(factRObject)  # returns normalised expression data from current Set
#>         P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2
#> AS01129 0.123000 0.170000 0.155000 1.12e-01 5.48e-02 0.132000 0.124000 0.169000
#> AS01642 0.166000 0.230000 0.140000 2.63e-01 0.00e+00 0.843000 0.177000 0.000000
#> AS01647 0.000000 0.000000 0.000000 7.52e-01 4.19e-01 0.000000 0.287000 0.502000
#> AS01649 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.301000 0.518000
#> AS01650 0.223000 0.000000 0.000000 3.41e-01 2.70e-01 0.000000 0.453000 0.608000
#> AS01651 0.223000 0.000000 0.000000 3.41e-01 2.70e-01 0.000000 0.453000 0.608000
#> AS01652 0.543000 0.372000 0.533000 4.08e-01 2.87e-01 0.240000 0.566000 0.388000
#> AS01657 0.241000 0.241000 0.346000 1.66e-01 1.26e-01 0.284000 0.126000 0.117000
#> AS01658 0.561000 0.390000 0.118000 4.73e-01 5.92e-01 0.285000 0.326000 0.250000
#> AS01659 0.561000 0.390000 0.118000 4.73e-01 5.92e-01 0.285000 0.326000 0.250000
#> AS01662 0.217000 0.217000 0.188000 2.06e-01 2.74e-01 0.510000 0.077500 0.054600
#> AS01663 0.217000 0.217000 0.188000 2.06e-01 2.74e-01 0.510000 0.077500 0.054600
#> AS01665 0.000000 0.000000 0.103000 0.00e+00 1.11e-01 0.000000 0.200000 0.186000
#> AS01667 0.000000 0.595000 0.000000 4.07e-01 1.43e-01 0.478000 0.100000 0.000000
#> AS02707 0.059900 0.000000 0.000000 3.68e-01 0.00e+00 0.000000 0.032400 0.113000
#> AS02708 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.095200 0.060200
#> AS02709 0.258000 0.000000 0.629000 5.60e-01 8.86e-01 0.741000 0.187000 0.113000
#> AS02710 0.000000 0.165000 0.000000 0.00e+00 0.00e+00 0.000000 0.341000 0.257000
#> AS02711 0.000000 0.000000 0.000000 1.01e-01 0.00e+00 0.000000 0.100000 0.186000
#> AS02712 0.000000 0.000000 0.000000 1.01e-01 0.00e+00 0.000000 0.100000 0.186000
#> AS02714 0.000000 0.000000 0.000000 0.00e+00 3.34e-01 0.271000 0.000000 0.000000
#> AS02738 0.321000 0.341000 0.326000 2.93e-01 2.93e-01 0.293000 0.181000 0.167000
#> AS02739 0.334000 0.324000 0.332000 3.48e-01 3.48e-01 0.348000 0.353000 0.453000
#> AS02749 0.835000 1.000000 0.590000 2.92e-01 3.51e-01 0.302000 1.000000 0.419000
#> AS02750 0.000000 0.000000 0.187000 5.38e-02 1.54e-01 0.000000 0.000000 0.000000
#> AS02751 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.092100 0.201000 0.174000
#> AS02756 0.000000 0.000000 0.000000 2.10e-01 0.00e+00 0.000000 0.000000 0.000000
#> AS02777 1.000000 1.000000 0.700000 5.99e-01 1.00e+00 0.000000 0.803000 1.000000
#> AS02784 0.360000 0.568000 0.252000 0.00e+00 0.00e+00 0.000000 0.198000 0.000000
#> AS02838 0.056800 0.135000 0.173000 2.16e-01 2.97e-01 0.202000 0.120000 0.140000
#> AS02847 0.598000 0.710000 0.609000 7.86e-01 8.72e-01 0.797000 0.620000 0.609000
#> AS02850 0.099500 0.181000 0.000000 0.00e+00 3.56e-01 0.000000 0.356000 0.172000
#> AS02854 0.137000 0.000000 0.106000 4.88e-01 7.04e-01 0.544000 0.443000 0.000000
#> AS02856 0.372000 0.624000 0.458000 7.57e-01 8.80e-01 0.830000 0.692000 0.769000
#> AS02857 0.000000 0.000000 0.027100 1.24e-01 4.96e-02 0.000000 0.058500 0.000000
#> AS02858 0.000000 0.000000 0.000000 0.00e+00 3.42e-01 0.000000 0.000000 0.000000
#> AS02859 0.000000 0.000000 0.000000 0.00e+00 3.42e-01 0.000000 0.000000 0.000000
#> AS02860 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.124000 0.443000
#> AS02866 0.000000 0.345000 0.000000 0.00e+00 1.67e-01 0.000000 0.412000 0.319000
#> AS03103 0.000000 0.085700 0.000000 0.00e+00 6.69e-02 0.000000 0.000000 0.000000
#> AS03108 0.532000 0.238000 0.303000 3.56e-01 1.06e-01 0.378000 0.263000 0.465000
#> AS03110 0.000000 0.000000 0.000000 3.64e-01 2.77e-01      NaN 0.000000 0.364000
#> AS03112 0.163000 0.000000 0.000000 3.05e-01 0.00e+00      NaN 0.841000 0.815000
#> AS03117 0.000000 0.000000 0.407000 0.00e+00 6.47e-01 1.000000 0.000000 0.000000
#> AS03128 0.000000 0.000000 0.000000 4.71e-01 0.00e+00 1.000000 0.000000 0.000000
#> AS03135 0.165000 0.000000 0.308000 3.08e-01 3.72e-01 1.000000 0.000000 0.000000
#> AS03226 0.297000 0.458000 0.628000 1.00e+00 8.08e-01 0.772000 0.628000 0.628000
#> AS03237 0.000000 0.000000 0.301000      NaN 5.64e-01 0.000000 0.000000 0.721000
#> AS03271 1.000000 1.000000 1.000000      NaN 1.00e+00 1.000000 1.000000 1.000000
#> AS03653 0.000000      NaN 0.000000 0.00e+00 0.00e+00      NaN 0.754000 0.000000
#> AS03656 0.000000      NaN 0.000000 0.00e+00 0.00e+00      NaN 0.623000 0.000000
#> AS03664 0.000000      NaN      NaN      NaN 7.60e-01      NaN      NaN      NaN
#> AS04209 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.000000 0.367000
#> AS04210 0.892000 1.000000 1.000000 9.06e-01 9.23e-01 0.828000 0.391000 0.559000
#> AS04213 0.000000 0.000000 0.000000 9.39e-02 3.99e-02 0.172000 0.609000 0.000000
#> AS04221 1.000000      NaN      NaN      NaN 1.00e+00      NaN      NaN 0.000000
#> AS04223 0.000000      NaN      NaN      NaN 0.00e+00      NaN      NaN 1.000000
#> AS04225 1.000000      NaN      NaN      NaN 1.00e+00      NaN      NaN 0.000000
#> AS04364      NaN      NaN      NaN 0.00e+00 0.00e+00 0.000000      NaN      NaN
#> AS04433 0.454000 0.268000 0.573000 4.39e-01 5.59e-01 0.443000 0.234000 0.215000
#> AS04437 0.377000 0.139000 0.315000 1.47e-01 2.81e-01 0.247000 0.396000 0.280000
#> AS04439 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.061400 0.000000
#> AS04440 0.513000 0.726000 0.587000 5.20e-01 7.03e-01 0.558000 0.537000 0.713000
#> AS04519 0.000000 0.000000 0.000000 1.00e+00 0.00e+00 0.000000 0.412000 0.000000
#> AS04763 0.238000 0.278000 0.096200 2.38e-01 1.93e-01 0.227000 0.163000 0.104000
#> AS04974 1.000000      NaN 1.000000 0.00e+00      NaN      NaN      NaN      NaN
#> AS04977 0.000000      NaN 0.000000 0.00e+00      NaN      NaN      NaN      NaN
#> AS05912 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.000000 0.365000
#> AS05917 0.000000 0.000000 0.000000 0.00e+00 3.23e-01 0.000000 0.417000 0.000000
#> AS05919 0.000000 0.556000 0.000000 0.00e+00 0.00e+00 0.000000 0.297000 0.297000
#> AS05920 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.466000 0.466000
#> AS05922 0.000000 0.000000 0.000000 5.52e-01 0.00e+00 0.000000 0.000000 0.000000
#> AS05925 0.000000 0.000000 0.000000 7.26e-01 7.26e-01 0.000000 0.570000 0.000000
#> AS05950 1.000000 0.692000 1.000000 6.63e-01 1.00e+00 1.000000 1.000000      NaN
#> AS06103 0.000000 0.000000 0.000000 3.88e-03 0.00e+00 0.000000 0.010400 0.004470
#> AS06105 0.195000 0.121000 0.164000 1.25e-01 1.71e-01 0.127000 0.156000 0.168000
#> AS07063 0.000000 0.000000 0.050200 3.71e-02 0.00e+00 0.000000 0.062000 0.000000
#> AS07065 0.078400 0.000000 0.087300 0.00e+00 0.00e+00 0.000000 0.000000 0.000000
#> AS07081 0.724000 1.000000 0.778000 1.00e+00 8.27e-01 0.833000 0.833000 1.000000
#> AS07082 0.039700 0.000000 0.044700 0.00e+00 8.67e-02 0.055800 0.055800 0.000000
#> AS07083 0.000000 0.000000 0.000000 0.00e+00 0.00e+00 0.000000 0.000000 0.000000
#> AS07272      NaN 0.000000      NaN 0.00e+00 0.00e+00 0.000000 1.000000 0.000000
#> AS10530 0.000000 0.000000 0.000000 7.42e-02 8.38e-02 0.000000 0.000000 0.000000
#>         P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3
#> AS01129 0.178000 0.017200 0.048900 0.039800
#> AS01642 0.183000 0.230000 0.000000 0.000000
#> AS01647 0.387000 0.457000 0.000000 0.528000
#> AS01649 0.692000 0.000000 0.000000 0.000000
#> AS01650 0.393000 0.000000 0.000000 0.000000
#> AS01651 0.393000 0.000000 0.000000 0.000000
#> AS01652 0.469000 0.110000 0.122000 0.204000
#> AS01657 0.075300 0.000000 0.185000 0.000000
#> AS01658 0.223000 0.000000 0.407000 0.313000
#> AS01659 0.223000 0.000000 0.407000 0.313000
#> AS01662 0.000000 0.316000 0.373000 0.165000
#> AS01663 0.000000 0.316000 0.373000 0.165000
#> AS01665 0.124000 0.186000 0.164000 0.164000
#> AS01667 0.044900 0.000000 0.000000 0.000000
#> AS02707 0.202000 0.460000 0.386000 0.237000
#> AS02708 0.000000 0.000000 0.000000 0.000000
#> AS02709 0.280000 0.504000 0.696000 0.556000
#> AS02710 0.438000 0.343000 0.000000 0.000000
#> AS02711 0.424000 0.000000 0.000000 0.000000
#> AS02712 0.424000 0.000000 0.000000 0.000000
#> AS02714 0.000000 0.000000 0.000000 0.000000
#> AS02738 0.166000 0.293000 0.293000 0.293000
#> AS02739 0.460000 0.348000 0.348000 0.348000
#> AS02749 1.000000 0.000000 1.000000      NaN
#> AS02750 0.000000 0.000000 0.000000      NaN
#> AS02751 0.000000 0.000000 0.000000      NaN
#> AS02756 0.000000 0.000000 0.000000      NaN
#> AS02777 0.472000 1.000000 0.461000 1.000000
#> AS02784 0.000000 0.000000 0.000000 0.000000
#> AS02838 0.189000 0.130000 0.206000 0.112000
#> AS02847 0.597000 0.791000 0.848000 0.845000
#> AS02850 0.216000 0.293000 0.624000      NaN
#> AS02854 0.166000 0.000000 0.000000      NaN
#> AS02856 0.620000 0.640000 0.897000 0.530000
#> AS02857 0.000000 0.000000 0.053100 0.470000
#> AS02858 0.000000 0.281000 0.886000 1.000000
#> AS02859 0.000000 0.281000 0.886000 1.000000
#> AS02860 0.346000 0.000000 0.718000 1.000000
#> AS02866 0.529000 0.412000 0.219000 0.260000
#> AS03103 0.344000 0.112000 0.177000 0.000000
#> AS03108 0.000000 0.000000 0.263000 0.000000
#> AS03110 0.000000 0.000000 0.000000 0.000000
#> AS03112 0.925000 0.815000 0.000000 0.925000
#> AS03117 0.000000 0.000000 0.733000 0.733000
#> AS03128 0.000000 0.640000 0.842000 0.640000
#> AS03135 0.000000 0.000000 0.000000 0.000000
#> AS03226 0.000000 0.000000 0.000000 0.000000
#> AS03237 0.177000 0.000000 0.000000 0.000000
#> AS03271 0.831000 0.164000 0.197000 0.141000
#> AS03653      NaN      NaN 1.000000      NaN
#> AS03656 1.000000      NaN      NaN      NaN
#> AS03664      NaN      NaN      NaN      NaN
#> AS04209 0.605000 0.000000 0.000000      NaN
#> AS04210 0.154000 1.000000 1.000000      NaN
#> AS04213 0.404000 0.000000 0.000000      NaN
#> AS04221 0.000000      NaN      NaN      NaN
#> AS04223 1.000000      NaN      NaN      NaN
#> AS04225 0.000000      NaN      NaN      NaN
#> AS04364 1.000000 1.000000      NaN      NaN
#> AS04433 0.130000 0.288000 0.276000 0.205000
#> AS04437 0.371000 0.368000 0.350000 0.474000
#> AS04439 0.145000 0.000000 0.000000 0.000000
#> AS04440 0.786000 0.850000 0.606000 0.877000
#> AS04519 0.000000 0.000000 0.000000 0.000000
#> AS04763 0.039800 0.073700 0.126000 0.090700
#> AS04974      NaN 0.000000      NaN      NaN
#> AS04977      NaN 0.000000 1.000000 1.000000
#> AS05912 0.000000 0.000000 0.697000 0.000000
#> AS05917 0.000000 0.417000 0.000000 0.000000
#> AS05919 0.000000 0.000000 0.000000 0.000000
#> AS05920 0.724000 0.000000 0.000000 0.000000
#> AS05922 0.552000 0.000000 0.000000 0.787000
#> AS05925 0.469000 0.000000 0.000000 0.000000
#> AS05950 0.000000 0.692000 0.628000 1.000000
#> AS06103 0.000000 0.005450 0.011100 0.000000
#> AS06105 0.180000 0.141000 0.101000 0.186000
#> AS07063 0.000000 0.000000 0.000000 0.103000
#> AS07065 0.000000 0.000000 0.000000 0.000000
#> AS07081 1.000000 1.000000 0.770000 0.807000
#> AS07082 0.000000 0.000000 0.000000 0.000000
#> AS07083 0.000000 0.425000 0.000000 0.000000
#> AS07272 0.000000      NaN      NaN 0.517000
#> AS10530 0.000000 0.000000 0.000000 0.000000
#>  [ reached getOption("max.print") -- omitted 48114 rows ]
counts(factRObject, "Ptbp1")  # same as above, but only for specific genes
#>         P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2
#> AS27166    0.596    0.392     0.26    0.492    0.492    0.492    0.392    0.446
#> AS27168    0.000    0.000     0.00    0.000    0.480    0.480    0.000    0.334
#>         P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3
#> AS27166    0.659    0.340    0.693    0.301
#> AS27168    0.382    0.324    0.527    0.000
counts(factRObject, "Ptbp1", set = "transcript")  # get transcript-level expression
#>                             P14_M5_1 P14_M5_2   P14_M5_3 P14_M6_1  P14_M6_2
#> transcript37580.chr10.nnic 11.410569 11.39594  5.6328695 1.840952 1.9739042
#> transcript37593.chr10.nnic  4.992124 11.39594 10.3269274 1.227302 1.3159361
#> transcript37653.chr10.nnic  0.000000  0.00000  0.0000000 0.000000 0.6579681
#> transcript37657.chr10.nnic  0.000000  0.00000  0.9388116 0.000000 0.0000000
#>                            P14_M6_3 P28_M1_1  P28_M1_2  P28_M1_3 P28_M2_1
#> transcript37580.chr10.nnic 4.477255 4.637786 4.7871969 5.6232337 5.331361
#> transcript37593.chr10.nnic 2.984836 4.637786 3.8297575 1.8744112 6.664201
#> transcript37653.chr10.nnic 1.492418 0.000000 0.9574394 0.9372056 1.332840
#> transcript37657.chr10.nnic 0.000000 0.000000 1.9148788 0.0000000 0.000000
#>                            P28_M2_2 P28_M2_3
#> transcript37580.chr10.nnic 8.624391 2.591131
#> transcript37593.chr10.nnic 2.464112 3.886697
#> transcript37653.chr10.nnic 2.464112 0.000000
#> transcript37657.chr10.nnic 0.000000 0.000000
counts(factRObject, "Ptbp1", set = "gene") # get gene-level expression
#>  P14_M5_1  P14_M5_2  P14_M5_3  P14_M6_1  P14_M6_2  P14_M6_3  P28_M1_1  P28_M1_2 
#> 16.654551 23.042081 17.085144  3.101008  3.970570  9.086108  9.366016 11.628876 
#>  P28_M1_3  P28_M2_1  P28_M2_2  P28_M2_3 
#>  8.447887 13.365096 13.503386  6.459794 
counts(factRObject, "Ptbp1", set = "gene", slot = "counts") # get gene-level count
#> P14_M5_1 P14_M5_2 P14_M5_3 P14_M6_1 P14_M6_2 P14_M6_3 P28_M1_1 P28_M1_2 
#>       23       20       18        5        6        6       12       12 
#> P28_M1_3 P28_M2_1 P28_M2_2 P28_M2_3 
#>        9       10       11        5 

## access samples metadata
samples(factRObject)
#>            proj.ident
#> P14_M5_1 factRProject
#> P14_M5_2 factRProject
#> P14_M5_3 factRProject
#> P14_M6_1 factRProject
#> P14_M6_2 factRProject
#> P14_M6_3 factRProject
#> P28_M1_1 factRProject
#> P28_M1_2 factRProject
#> P28_M1_3 factRProject
#> P28_M2_1 factRProject
#> P28_M2_2 factRProject
#> P28_M2_3 factRProject
ident(factRObject)   # prints out the current identity
#> Error in c_character(...): Character input expected

## run correlation between gene expression and exon inclusion
factRObject <- testGeneCorr(factRObject)
#> Error in dplyr::select(., AS_id, gene_id, ASNMDtype): Can't subset columns that don't exist.
#>  Column `ASNMDtype` doesn't exist.
ase(factRObject)
#> Set `show_more to TRUE to show more info`
#> # A tibble: 48,197 × 8
#>    AS_id   gene_id               gene_name coord       AStype strand width novel
#>    <chr>   <chr>                 <chr>     <chr>       <fct>  <fct>  <int> <chr>
#>  1 AS01129 ENSMUSG00000033813.16 Tcea1     chr1:49570… AD     +        231 yes  
#>  2 AS01642 ENSMUSG00000025907.15 Rb1cc1    chr1:62851… AD     +       1104 yes  
#>  3 AS01647 ENSMUSG00000025907.15 Rb1cc1    chr1:62929… AF     +        499 yes  
#>  4 AS01649 ENSMUSG00000025907.15 Rb1cc1    chr1:63043… RI     +        141 yes  
#>  5 AS01650 ENSMUSG00000025907.15 Rb1cc1    chr1:63152… RI     +        178 yes  
#>  6 AS01651 ENSMUSG00000025907.15 Rb1cc1    chr1:63155… RI     +         80 yes  
#>  7 AS01652 ENSMUSG00000025907.15 Rb1cc1    chr1:63198… RI     +         87 yes  
#>  8 AS01657 ENSMUSG00000025907.15 Rb1cc1    chr1:63313… AD     +        749 yes  
#>  9 AS01658 ENSMUSG00000025907.15 Rb1cc1    chr1:63331… RI     +         81 yes  
#> 10 AS01659 ENSMUSG00000025907.15 Rb1cc1    chr1:63334… AD     +        173 yes  
#> # ℹ 48,187 more rows