NARWHAL

NARWHAL — Neoantigens Recognition Website and HLA Genotyping Tool

Tutorial

Click the "Guide" button to see the step-by-step guide for analyzing your data on NARWHAL.

Click the "Manual" button to download the NARWHAL instruction manual, which covers the general introduction, usage, and data interpretation in detail.

Manual

Demo video for analyzing data on NARWHAL.

Demonstration Reports

Function	Example Link
Neoantigen Identification with DNA-seq (mTSA)	View
Neoantigen Identification with RNA-seq (mTSA and aeTSA)	View
Neoantigen Identification with DNA-seq (mTSA) and RNA-seq (aeTSA)	View
HLA Genotyping	View
Shared Neoantigens in Different Samples	View

Output File Columns

mTSA_tsv.tsv

This file is a summary report for mTSA identification. It also annotates variants that fall in genes frequently mutated in tumors according to the COSMIC database.

Column Name	Description
Transcript_info	A unique ID for the variant, which includes some variant and transcript information
HLA Alleles (multiple)	For each HLA allele in the run, the number of this variant’s epitopes that bound well to the HLA allele (with median/lowest mutant binding affinity lower than the binding affinity threshold)
Gene Symbol	The Ensembl gene name of the affected gene
AA Change	The amino acid change for the mutation
Num Passing Transcripts	The number of transcripts with at least one well-binding peptide (median mutant binding affinity < 500 [default]) for this mutation
Best Peptide	The best-binding mutant epitope sequence
Pos	The starting position of the mutation within the epitope sequence, counted from 1 (0 if the mutation precedes the epitope)
Num Passing Peptides	The number of unique well-binding peptides for this mutation
IC50 MT	Median or lowest ic50 binding affinity of the best-binding mutant epitope across all prediction algorithms used
IC50 WT	Median or lowest ic50 binding affinity of the corresponding wildtype epitope across all prediction algorithms used
%ile MT	Median or lowest binding affinity percentile rank of the best-binding mutant epitope across all prediction algorithms used (NetMHC or NetMHCpan)
%ile WT	Median or lowest binding affinity percentile rank of the best-binding corresponding wildtype epitope across all prediction algorithms used (NetMHC or NetMHCpan)
Chromosome	The chromosome of this variant
Start	The start position of this variant in the zero-based, half-open coordinate system
Stop	The stop position of this variant in the zero-based, half-open coordinate system
Reference	The reference allele
Variant	The alt allele
Transcript	The Ensembl ID of the affected transcript
Transcript Support Level	The transcript support level (TSL) of the affected transcript
Ensembl Gene ID	The Ensembl ID of the affected gene
Variant Type	The type of variant: *missense* for missense mutations, *inframe_ins* for inframe insertions, *inframe_del* for inframe deletions, and *FS* for frameshift variants
Mutation	The amino acid change of this mutation
Protein Position	The protein position of the mutation
Gene Name	The Ensembl gene name of the affected gene
HGVSc	The HGVS coding sequence variant name
HLA Allele	The HLA allele for this prediction
Peptide Length	The peptide length of the epitope
Sub-peptide Position	The one-based position of the epitope within the protein sequence used to make the prediction
Mutation Position	The one-based positional range (inclusive) of the mutation within the epitope sequence
MT Epitope Seq	The mutant epitope sequence
WT Epitope Seq	The epitope sequence of the wildtype (reference) at the corresponding position in the complete protein sequence. It's "NA" if there's no wildtype sequence at this position or if over half of the amino acids in the mutant epitope have changed.
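Best MT IC50 Score Method	Prediction algorithm with the lowest mutant ic50 binding affinity for this epitope (NetMHC or NetMHCpan)
Best MT IC50 Score	Lowest ic50 binding affinity of all prediction algorithms used
Corresponding WT IC50 Score	ic50 binding affinity of the wildtype epitope. NA if there is no WT Epitope Seq.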
Corresponding Fold Change	Corresponding WT IC50 Score / Best MT IC50 Score. NA if there is no WT Epitope Seq
Best MT Percentile Method	Prediction algorithm with the lowest binding affinity percentile rank for this epitope (NetMHC or NetMHCpan)
Best MT Percentile	Lowest percentile rank of this epitope’s ic50 binding affinity of all prediction algorithms used
Corresponding WT Percentile	Binding affinity percentile rank of the wildtype epitope. NA if there is no WT Epitope Seq.
Tumor DNA Depth	Tumor DNA depth at this position. NA if VCF entry does not contain tumor DNA read count annotation.
Tumor DNA VAF	Tumor DNA variant allele frequency (VAF) at this position. NA if VCF entry does not contain tumor DNA read count annotation.
Median MT Score	Median ic50 binding affinity of the mutant epitope across all prediction algorithms used
Median WT Score	Median ic50 binding affinity of the wildtype epitope across all prediction algorithms used. NA if there is no WT Epitope Seq.
Median Fold Change	Median WT Score / Median MT Score. NA if there is no WT Epitope Seq.
Median MT Percentile	Median binding affinity percentile rank of the mutant epitope across all prediction algorithms
Median WT Percentile	Median binding affinity percentile rank of the wildtype epitope across all prediction algorithms used (those that provide percentile output). NA if there is no WT Epitope Seq.
NetMHC WT Score	ic50 binding affinity of the WT Epitope Seq predicted by the NetMHC model
NetMHC MT Score	ic50 binding affinity of the MT Epitope Seq predicted by the NetMHC model
NetMHC WT Percentile	Binding affinity percentile rank of the wildtype epitope predicted by the NetMHC model
NetMHC MT Percentile	Binding affinity percentile rank of the mutant epitope predicted by the NetMHC model
NetMHCpan WT Score	ic50 binding affinity of the WT Epitope Seq predicted by the NetMHCpan model
NetMHCpan MT Score	ic50 binding affinity of the MT Epitope Seq predicted by the NetMHCpan model
NetMHCpan WT Percentile	Binding affinity percentile rank of the wildtype epitope predicted by the NetMHCpan model
NetMHCpan MT Percentile	Binding affinity percentile rank of the mutant epitope predicted by the NetMHCpan model
Predicted Stability	Stability of the peptide-MHC-I complex
Half Life	Half-life of the peptide-MHC-I complex
Stability Rank	The % rank stability of the peptide-MHC-I complex
NetMHCstab allele	Nearest neighbor to the HLA Allele. Used for NetMHCstab prediction
Binding affinity level	The binding affinity filter categorizes peptides as “strong binding” if their ic50 values are below 50 nM, “intermediate binding” if between 50 nM and 250 nM, and “weak binding” if between 250 nM and 500 nM. Users can change these default thresholds.
Name	Name of the gene, if it is listed in the COSMIC database as commonly mutated in tumors
Somatic	Marked “yes” if the gene commonly carries somatic mutations
Germline	Marked “yes” if the gene commonly carries germline mutations
Tumour Types(Somatic)	Tumor types that commonly carry somatic mutations in this gene
Tumour Types(Germline)	Tumor types that commonly carry germline mutations in this gene
Role in Cancer	The role of the gene in cancer
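
The derived values described in this table can be recomputed from the raw columns. The following is a minimal Python sketch, not NARWHAL's own code: the function names are hypothetical, the thresholds are the default values stated above, and the handling of values that fall exactly on a threshold is an assumption, since the description does not say which side is inclusive.

```python
from typing import Optional, Tuple


def binding_level(ic50_nm: float,
                  strong: float = 50.0,
                  intermediate: float = 250.0,
                  weak: float = 500.0) -> Optional[str]:
    """Categorize a peptide by its ic50 value (nM) using the default cutoffs.

    Assumption: each cutoff is treated as an exclusive upper bound.
    """
    if ic50_nm < strong:
        return "strong binding"
    if ic50_nm < intermediate:
        return "intermediate binding"
    if ic50_nm < weak:
        return "weak binding"
    return None  # at or above the weakest cutoff


def fold_change(wt_ic50: str, mt_ic50: str) -> Optional[float]:
    """Corresponding WT IC50 Score / Best MT IC50 Score.

    Values are passed as the raw strings read from the TSV, so a
    missing wildtype epitope ("NA") yields None.
    """
    if wt_ic50 == "NA" or mt_ic50 == "NA":
        return None
    return float(wt_ic50) / float(mt_ic50)


def to_one_based(start: int, stop: int) -> Tuple[int, int]:
    """Convert zero-based, half-open Start/Stop to one-based, inclusive."""
    return start + 1, stop
```

For example, `fold_change("400", "20")` returns `20.0`, and a single-base variant with Start 99 and Stop 100 corresponds to the one-based position 100.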

mTSA_and_aeTSA.tsv

This file is a summary report for mTSA, aeTSA, and TAA identification.

Columns marked with *** are omitted from the report when the input files lack the corresponding information.

Column Name	Description
Transcript_info	A unique ID for the variant, which includes some variant and transcript information
HLA Alleles (multiple)	For each HLA allele in the run, the number of this variant’s epitopes that bound well to the HLA allele (with median/lowest mutant binding affinity lower than the binding affinity threshold)
tpm_tumor	Transcripts Per Million in Tumor
tpm_normal	Transcripts Per Million in Normal
Best Peptide	The best-binding epitope sequence
Num Passing Peptides	The number of unique well-binding peptides for this mutation
IC50 MT	Median or lowest ic50 binding affinity of the best-binding mutant epitope across all prediction algorithms used
***IC50 WT	Median or lowest ic50 binding affinity of the best-binding wildtype epitope across all prediction algorithms used
%ile MT	Median or lowest binding affinity percentile rank of the best-binding mutant peptide across all prediction algorithms used (NetMHC or NetMHCpan)
***%ile WT	Median or lowest binding affinity percentile rank of the best-binding wildtype peptide across all prediction algorithms used (NetMHC or NetMHCpan)
***Transcript	The Ensembl ID of the affected transcript
***Ensembl Gene ID	The Ensembl ID of the affected gene
***Predicted Stability	Stability of the peptide-MHC-I complex
***Half Life	Half-life of the peptide-MHC-I complex
***Stability Rank	The % rank stability of the peptide-MHC-I complex
Binding affinity level	The binding affinity filter categorizes peptides as “strong binding” if their ic50 values are below 50 nM, “intermediate binding” if between 50 nM and 250 nM, and “weak binding” if between 250 nM and 500 nM. Users can change these default thresholds.
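
Because the columns marked with *** may be absent, a script that reads this report should not assume they exist. The sketch below is one way to handle that with Python's standard `csv` module. It assumes the header in the actual file carries the column names without the `***` prefix, which is not confirmed here; `read_report` is a hypothetical helper, and the sample row is made up for illustration.

```python
import csv
import io

# Columns that the table above marks with *** (may be absent from a report).
OPTIONAL_COLUMNS = [
    "IC50 WT", "%ile WT", "Transcript", "Ensembl Gene ID",
    "Predicted Stability", "Half Life", "Stability Rank",
]


def read_report(handle):
    """Read a tab-separated report and fill absent optional columns with "NA".

    Returns (rows, missing), where missing lists the optional columns
    that were not found in the header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in OPTIONAL_COLUMNS if name not in header]
    rows = [{**dict.fromkeys(missing, "NA"), **row} for row in reader]
    return rows, missing


# Made-up two-column excerpt, for illustration only.
sample = io.StringIO("Best Peptide\tIC50 MT\nAAAAAAAAA\t42.0\n")
rows, missing = read_report(sample)
```

With a real report, pass an open file instead, for example `open("mTSA_and_aeTSA.tsv", newline="")`.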

aeTSA.tsv

This file is a summary report for aeTSA and TAA identification.

Column Name	Description
Transcript_info	A unique ID for the variant, which includes some variant and transcript information
HLA Alleles (multiple)	For each HLA allele in the run, the number of this variant’s epitopes that bound well to the HLA allele (with median/lowest mutant binding affinity lower than the binding affinity threshold)
Best Peptide	The best-binding epitope sequence
Num Passing Peptides	The number of unique well-binding peptides for this mutation
IC50 MT	Median or lowest ic50 binding affinity of the best-binding mutant epitope across all prediction algorithms used
%ile MT	Median or lowest binding affinity percentile rank of the best-binding mutant peptide across all prediction algorithms used (NetMHC or NetMHCpan)
Binding affinity level	The binding affinity filter categorizes peptides as “strong binding” if their ic50 values are below 50 nM, “intermediate binding” if between 50 nM and 250 nM, and “weak binding” if between 250 nM and 500 nM. Users can change these default thresholds.
Peptide	The epitope sequence aligned to the gene reference
cDNA sequence	The DNA sequence that has been reverse translated from a specific peptide
cDNA location	The specific position of the cDNA sequence
Tumor read count	The number of tumor reads aligned with the peptide in a specific location
Total tumor read count	The sum of tumor read counts from all possible locations aligned with the peptide
Tumor average read depth	The average read depth of the cDNA sequence in the tumor sample (= total bases on the cDNA / assembled read length)
Normal read count	The number of normal reads aligned with the peptide in a specific location
Total normal read count	The sum of normal read counts from all possible locations aligned with the peptide
Normal average read depth	The average read depth of the cDNA sequence in the normal sample (= total bases on the cDNA / assembled read length)
Translated tumor peptide	The full peptide translated from the cDNA sequence
Average depth ratio	The ratio of tumor average read depth to normal average read depth
Sum of tumor and normal read count	The sum of tumor read counts and normal read counts in a specific location
Sum of total tumor and normal read count	The sum of tumor and normal read counts from all possible locations aligned with the peptide
Gene ID and gene name	The Ensembl gene ID and gene name
Gene element	Gene element of the region
Sum of expected read count	The sum of tumor total read counts and normal total read counts that are expected to align to a specific location
Element read proportion	The proportion of read counts in a specific element among all read counts (= the sum of read counts in an element / the sum of total tumor and normal read counts)
Putative neoantigen type	Possible neoantigen types, including aeTSA, TAA, and none based on different criteria
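
The two ratio columns in this table follow directly from the formulas given in their descriptions. A minimal Python sketch is shown below. The function names are hypothetical, and returning `None` for a zero denominator is an assumption, since the descriptions do not say how NARWHAL reports that case.

```python
from typing import Optional


def average_depth_ratio(tumor_depth: float, normal_depth: float) -> Optional[float]:
    """Tumor average read depth / normal average read depth."""
    if normal_depth == 0:
        return None  # assumption: ratio is undefined without normal coverage
    return tumor_depth / normal_depth


def element_read_proportion(element_read_count: int,
                            total_read_count: int) -> Optional[float]:
    """Sum of read counts in an element / sum of total tumor and normal read counts."""
    if total_read_count == 0:
        return None  # assumption: proportion is undefined with no reads
    return element_read_count / total_read_count
```

For example, a tumor average read depth of 30.0 against a normal average read depth of 6.0 gives an average depth ratio of 5.0.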