lrgasp-submissions

Reference genomes and transcript annotations

Reference genomes and annotations are stored on Synapse. To download files directly to a server, we recommend using the Synapse command-line client.
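As a rough illustration of the command-line route, the sketch below builds and optionally runs one `synapse get` call per Synapse ID. It assumes the Synapse command-line client is installed and already logged in. The helper names `synapse_get_command` and `download_all` are hypothetical and not part of any LRGASP tooling. The two IDs are the spike-in files listed in this document.

```python
import shutil
import subprocess

# Synapse IDs taken from this document (spike-in FASTA and GTF).
SPIKE_IN_IDS = ["syn25683367", "syn25683630"]


def synapse_get_command(synapse_id, download_dir="."):
    """Build the argument list for one `synapse get` call (hypothetical helper)."""
    return ["synapse", "get", synapse_id, "--downloadLocation", download_dir]


def download_all(synapse_ids, download_dir="."):
    """Run `synapse get` for each ID.

    Assumes the Synapse client is on PATH and credentials are configured.
    """
    if shutil.which("synapse") is None:
        raise RuntimeError("synapse command-line client not found on PATH")
    for synapse_id in synapse_ids:
        subprocess.run(synapse_get_command(synapse_id, download_dir), check=True)


# Dry run: print the commands that would be executed.
for synapse_id in SPIKE_IN_IDS:
    print(" ".join(synapse_get_command(synapse_id)))
```

Calling `download_all(SPIKE_IN_IDS)` on the server performs the actual downloads. Printing the commands first makes it easy to confirm the IDs and target directory before any data is transferred.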

Genome References

Spike-in sequences

Spike-ins are from the Lexogen SIRV Set 4, which consists of both ERCCs and SIRVs. Poly(A) tails are removed from the sequences to form genomic sequences, which are included in the reference genomes as individual sequences. For multi-exon SIRVs, a single sequence is included that covers all isoforms of the gene, as well as the introns.

The SIRVs GTF has been edited to modify gene IDs so that there are no genes with transcripts on non-overlapping strands or genes not joined by any transcript.
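One of the properties described above can be spot-checked with a short script. The sketch below reports any gene ID whose records appear on more than one strand. It reads the strand constraint as "all transcripts of a gene lie on the same strand", which is an assumption about the intended meaning, and it does not check whether the transcripts of a gene are joined by overlap. The function names are hypothetical.

```python
import gzip
import re
from collections import defaultdict

# Matches the gene_id attribute in column 9 of a GTF record.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def genes_with_mixed_strands(gtf_lines):
    """Return {gene_id: strands} for genes annotated on more than one strand."""
    strands = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = GENE_ID_RE.search(fields[8])
        if match:
            strands[match.group(1)].add(fields[6])  # column 7 is the strand
    return {gene: s for gene, s in strands.items() if len(s) > 1}


def check_gtf(path):
    """Run the strand check on a gzip-compressed GTF file."""
    with gzip.open(path, "rt") as handle:
        return genes_with_mixed_strands(handle)
```

Under that reading, `check_gtf("lrgasp_sirvs4.gtf.gz")` would be expected to return an empty dictionary for the edited annotation.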

The spike-in genomic sequences are available in lrgasp_sirvs4.fasta.gz (syn25683367), with the annotations in GTF format in lrgasp_sirvs4.gtf.gz (syn25683630).
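To see which spike-in sequences a downloaded FASTA contains, a minimal standard-library sketch such as the following can list sequence names and lengths. The function names are hypothetical, and the file is assumed to be in the current directory.

```python
import gzip


def fasta_lengths(lines):
    """Return {sequence_name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def fasta_lengths_from_gz(path):
    """Read a gzip-compressed FASTA file and return its sequence lengths."""
    with gzip.open(path, "rt") as handle:
        return fasta_lengths(handle)
```

For example, `fasta_lengths_from_gz("lrgasp_sirvs4.fasta.gz")` returns one entry per spike-in sequence, which can be compared against the sequence names used in the GTF.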

GRCh38-based human reference genome

GRCm39-based mouse reference genome

De novo ONT-based manatee genome

Transcriptome References

GENCODE V38-based human annotation set

GENCODE VM27-based mouse annotation set