factR-proteins.Rd
Once CDS information has been built for custom transcripts, one can retrieve the translated sequences of protein-coding transcripts and predict the protein domains and motifs they encode.
`predictDomains` queries the HMM database (https://www.ebi.ac.uk/Tools/hmmer/search/hmmscan) for known protein domain families using either the "superfamily" or "pfam" database. The prediction can be run globally on all protein-coding transcripts or, as recommended, on specific transcript families.
# S4 method for factR
getAAsequence(object, verbose = FALSE)
# S4 method for factR
predictDomains(object, ..., database = "superfamily", ncores = 4)
object: factRObject
...: One or more features to run the prediction on. Can be any of the following:
gene_id: ID of gene to query
gene_name: Name of gene to query
transcript_id: ID of transcript to query
database: HMM database to query. Can be "superfamily" or "pfam".
ncores: Number of cores to run prediction on
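The feature arguments accept gene IDs, gene names, or transcript IDs. The sketch below illustrates one way such a selection could be resolved against a transcript table. It is not factR's internal code: the table contents, the gene names other than "Osmr", and the `select_features` helper are all hypothetical.

```r
# Hypothetical annotation table; the rows are made up for illustration
transcripts <- data.frame(
  gene_id = c("G1", "G1", "G2", "G3"),
  gene_name = c("Osmr", "Osmr", "GeneB", "GeneC"),
  transcript_id = c("tx1", "tx2", "tx3", "tx4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where any identifier column matches a feature
select_features <- function(tbl, ...) {
  features <- c(...)
  # With no features supplied, every transcript is kept
  if (length(features) == 0) {
    return(tbl)
  }
  hit <- tbl$gene_id %in% features |
    tbl$gene_name %in% features |
    tbl$transcript_id %in% features
  tbl[hit, , drop = FALSE]
}

# A gene name and a transcript ID can be mixed in one call
selected <- select_features(transcripts, "Osmr", "tx3")
print(selected$transcript_id)
```

Under these assumptions, passing a gene name selects every transcript of that gene, while a transcript ID selects a single row.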
Updated factRObject. `getAAsequence` stores an AAStringSet object in the factRObject.
`predictDomains` stores a data frame of predicted protein domains in the factRObject.
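To make the stored peptide sequences concrete, here is a minimal base-R sketch of how an in-frame coding sequence is translated into amino acids with the standard genetic code. It is an illustration only and not how `getAAsequence` is implemented; the `translate_cds` helper and the input sequence are hypothetical.

```r
# Standard genetic code, with codons enumerated in TCAG order
bases <- c("T", "C", "A", "G")
grid <- expand.grid(b3 = bases, b2 = bases, b1 = bases, stringsAsFactors = FALSE)
codons <- paste0(grid$b1, grid$b2, grid$b3)
amino_acids <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
codon_table <- setNames(amino_acids, codons)

# Hypothetical helper: translate an in-frame CDS string into a peptide string
translate_cds <- function(cds) {
  stopifnot(nchar(cds) >= 3)
  # Split into consecutive triplets; trailing incomplete bases are ignored
  starts <- seq(1, nchar(cds) - 2, by = 3)
  triplets <- substring(toupper(cds), starts, starts + 2)
  # Codons not found in the table (e.g. containing N) become "X"
  aa <- unname(codon_table[triplets])
  aa[is.na(aa)] <- "X"
  paste(aa, collapse = "")
}

peptide <- translate_cds("ATGGCCAAATGGGAATAA")
print(peptide)  # "MAKWE*", where "*" marks the stop codon
```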
## Load sample factRObject and build CDS
data(factRsample)
factRsample <- buildCDS(factRsample)
## Get peptide sequences
factRsample <- getAAsequence(factRsample)
## Predict domains of gene families
factRsample <- predictDomains(factRsample, "Osmr")
#> ℹ Set `show_more to TRUE to show more info`
#> Warning: Skipped 1 non-coding transcripts
## Predict domains of entire coding transcriptome
### This takes some time. Increase `ncores` where necessary
factRsample <- predictDomains(factRsample)
#> ℹ Set `show_more to TRUE to show more info`
#> Warning: Skipped 70 non-coding transcripts
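The examples above use the default settings. The usage signature also lists `database` and `ncores` arguments, so a call that switches to the "pfam" database on fewer cores might look like the sketch below. It assumes the factR package is installed and that the HMM web service is reachable; output is not shown because it depends on the live query.

```r
library(factR)

## Load sample factRObject, build CDS and get peptide sequences
data(factRsample)
factRsample <- buildCDS(factRsample)
factRsample <- getAAsequence(factRsample)

## Predict domains of a gene family against the "pfam" database on 2 cores
factRsample <- predictDomains(factRsample, "Osmr", database = "pfam", ncores = 2)
```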