Create a factR object from custom GTF transcriptomes

createfactRObject(
  gtf,
  reference,
  use_own_annotation = NULL,
  use_own_genome = NULL,
  project_name = "factRProject",
  genome_build = "auto",
  match_genes = TRUE,
  countData = NULL,
  sampleData = NULL,
  psi = NULL,
  verbose = TRUE
)

Arguments

gtf

Either a path to a custom transcriptome file (GTF) or a GenomicRanges object containing GTF transcriptome data
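As a minimal sketch of the second input form, the GTF can be read into a GenomicRanges object first and that object passed as gtf. This assumes the rtracklayer package is installed and that the package providing createfactRObject() is already attached; the name gtf_gr is illustrative only.

```r
# Path to the example GTF used in the Examples section ("" if factR is not installed).
gtf_path <- system.file("extdata/sc_merged_sample.gtf.gz", package = "factR")

# Assumption: rtracklayer is available and createfactRObject() is on the search path.
if (nzchar(gtf_path) &&
    requireNamespace("rtracklayer", quietly = TRUE) &&
    exists("createfactRObject")) {
  # import() reads the GTF records into a GRanges object.
  gtf_gr <- rtracklayer::import(gtf_path)

  # Pass the GRanges object in place of the file path.
  factR.object <- createfactRObject(gtf_gr, "vM25")
}
```

Passing an in-memory object can be convenient when the transcriptome has already been filtered or edited in the R session.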

reference

Character value giving the ID of the genome used as reference. Run listSupportedGenomes() to print a list of supported genomes. The input can also be the name of a species (for example, Homo sapiens, Hsapiens or Human).

use_own_annotation

Can be one of the following:

  • Path to local transcriptome file

  • GenomicRanges object containing GTF transcriptome data

  • AnnotationHub data ID [AHxxxxx]

  • URL to GTF file

use_own_genome

Can be one of the following:

  • Path to local FASTA file

  • Biostrings object containing full genome sequence

  • AnnotationHub data ID [AHxxxxx]

  • URL to FASTA file

project_name

Character value giving the name of the project

genome_build

Character value of the genome build. Will be determined automatically by default.

match_genes

Boolean value indicating whether genes in the custom transcriptome are to be matched to the reference (Default: TRUE)

countData

(Optional) Matrix of transcript-level counts data

sampleData

(Optional) Data frame containing sample metadata

psi

(Optional) Matrix of exon inclusion data
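The optional count and sample inputs can be sketched with base R as below. The layout shown here is an assumption rather than a documented requirement: rows of countData are named by transcript ID, columns by sample, and the row names of sampleData match those column names. All IDs, sample names and values are made up for illustration.

```r
# Hypothetical transcript IDs and sample names, for illustration only.
tx_ids  <- c("transcript1", "transcript2", "transcript3")
samples <- c("ctrl_1", "ctrl_2", "treat_1", "treat_2")

# Transcript-level counts: one row per transcript, one column per sample.
# matrix() fills column by column, so each line below is one sample.
countData <- matrix(
  c(10L, 0L, 5L,
    12L, 1L, 7L,
    30L, 4L, 2L,
    28L, 6L, 3L),
  nrow = length(tx_ids),
  dimnames = list(tx_ids, samples)
)

# Sample metadata: one row per sample, named to match the count columns.
sampleData <- data.frame(
  condition = c("control", "control", "treatment", "treatment"),
  row.names = samples
)

# Assumed convention: sample names line up between the two objects.
stopifnot(identical(colnames(countData), rownames(sampleData)))

# With the package attached, the objects would be supplied as:
# factR.object <- createfactRObject(gtf, "vM25",
#                                   countData = countData,
#                                   sampleData = sampleData)
```

Checking that the sample names agree before constructing the object helps catch mislabelled columns early.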

verbose

Boolean value as to whether messages should be printed (Default: TRUE)

Value

An object of class factRObject.

Examples

gtf <- system.file("extdata/sc_merged_sample.gtf.gz", package = "factR")
factR.object <- createfactRObject(gtf, "vM25")
#> 🡆  Checking inputs
#> 🡆  Checking factRObject
#> 🡆  Adding custom transcriptome
#> Importing from local directory
#> 🡆  Adding annotation
#> Importing from URL
#> 🡆  Adding genome sequence
#> Using BSgenome object
#> 🡆  Matching chromosome names
#> 🡆  Matching gene information
#>     Number of mismatched gene_ids found: 500
#>     -> Attempting to correct gene ids by replacing gene_id with ref_gene_id...
#>     -> 212 gene_ids matched
#>     --> Attempting to match ensembl gene_ids...
#>     --> All ensembl gene ids have been matched
#>     ---> Attempting to match gene_ids by finding overlapping coordinates...
#>     ---> 174 gene_id matched
#>     Total gene_ids corrected: 386
#>     Remaining number of mismatched gene_ids: 114
#> 🡆  Creating factRset objects
#> Adding gene information
#> Adding transcript information
#> Adding alternative splicing information
#> 🡆  Annotating novel transcripts
#> 🡆  Annotating novel AS events
#> factRobject created!