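DIA-NN features:
1) A fast, easy-to-use tool for processing data-independent acquisition (DIA) proteomics data.
2) Uses deep neural networks to improve precursor identification.
3) Supports library-free search and spectral library generation.
Download: https://github.com/vdemichev/DiaNN/releases/download/1.7.2/DIA-NN-Setup.msi
Command-line example: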
diann.exe --threads 4 --f run1 --f run2 --lib yeast.tsv --prefix C:\Data\ --ext .mzML --out run1_2.tsv
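In the command above, the names given to --f are short (run1, run2) because --prefix and --ext are added to them. A minimal Python sketch of that expansion, assuming it is plain string concatenation (the helper name `resolve_inputs` is hypothetical and not part of DIA-NN):

```python
def resolve_inputs(names, prefix="", ext=""):
    """Return the paths that the --f names would expand to.

    Assumption: --prefix is prepended and --ext is appended to each
    name as-is, with no path normalisation.
    """
    return [f"{prefix}{name}{ext}" for name in names]


# Values taken from the example command line.
paths = resolve_inputs(["run1", "run2"], prefix="C:\\Data\\", ext=".mzML")
for path in paths:
    print(path)
```

Under that assumption, the example reads C:\Data\run1.mzML and C:\Data\run2.mzML.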
Parameters:
# Main parameters
--f <data file>
--lib <spectral library file>
--threads <thread number>
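The three main parameters are enough to assemble a basic library-based run from a script. The sketch below only builds the argument list and does not launch anything; the function name `main_args` and the file names are illustrative assumptions.

```python
def main_args(data_files, library, threads, exe="diann.exe"):
    """Build an argument list from the three main parameters."""
    args = [exe, "--threads", str(threads)]
    for path in data_files:
        # --f is repeated once per data file, as in the example command.
        args += ["--f", path]
    args += ["--lib", library]
    return args


cmd = main_args(["run1.mzML", "run2.mzML"], "yeast.tsv", threads=4)
print(" ".join(cmd))
# To actually start the analysis, the list could be passed to
# subprocess.run(cmd, check=True) on a machine where diann.exe is on PATH.
```

Passing a list rather than a single string avoids quoting problems when file paths contain spaces.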
# Additional parameters
--clear-mods : Modification names specified in the spectral library will be used for annotating output. Only suitable for library-based analysis.
--threads : Thread number.
--fasta-search : Library-free search enabled.
--pg-level : Determines which peptides are considered 'proteotypic' and thus affects protein FDR calculation.
--no-batch-mode : Batch mode disabled.
--verbose : Verbosity level of the log output.
--export-windows : Export window information, default true.
--export-library : Export library information, default true.
--export-decoys : Export decoy information, default true.
--prosit : prosit, default true.
--vis : XICs for precursors corresponding peptides will be saved.
--cal-info : Save calibration info, default true.
--compact-report : Extended report, default false.
--no-isotopes : Isotopologue chromatograms will not be used.
--no-ms2-range : MS2 range inference will not be performed.
--min-peak : Minimum peak height
--no-cal-filter : Peptides with modifications that can cause interferences with isotopologues will not be filtered out for mass calibration.
--no-nn-filter : Peptides with modifications that can cause interferences with isotopologues will be used for neural network training.
--nn-cross-val : Neural network cross-validation will be used to tackle potential overfitting.
--guide-classifier : A separate classifier for the guide library will be used.
--int-removal : Number of interference removal iterations.
--int-margin : Interference correlation margin.
--strict-int-removal : Potentially interfering peptides with close (but not the same) elution times will also be discarded.
--reverse-decoys : Decoys will be generated using the pseudo-reverse method.
--force-frag-rec : Decoys will be generated only for precursors with all library fragments recognised.
--max-rec-charge : Fragment recognition module will consider charges up to the specified value.
--max-rec-loss : Fragment recognition module will consider losses with the index.
--gen-spec-lib : A spectral library will be generated.
--lib-gen-direct-q : When generating a spectral library, run-specific q-values will be used instead of profile q-values.
--save-original-lib : All entries from the library provided will be saved to the newly generated library.
--fast-wiff : Custom fast centroiding will be used when processing .wiff files (WARNING: experimental).
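--dir : Directory containing the data files to process.
--lib : Spectral library file.
--fasta : FASTA file.
--fasta-filter : File used to filter the FASTA database.
--ref : Reference peptide library.
--out : Output report file.
--out-gene : Gene-level output file.
--qvalue : Q-value threshold for filtering the output.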
--protein-qvalue : Output will be filtered at protein-level FDR.
--no-prot-inf : Protein inference will not be performed.
--no-swissprot : SwissProt proteins will not be prioritised for protein inference.
--force-swissprot : Only SwissProt proteins will be considered in library-free search.
--species-genes : Species suffix will be added to gene annotation; this affects proteotypicity definition.
--duplicate-proteins : Duplicate proteins in FASTA files will not be skipped.
--out-lib : Output file for the generated spectral library.
--learn-lib : Training library file.
--out-measured-rt : When generating a spectral library without a guide library but with a training library, iRT values (Tr_recalibrated) will correspond to the measured retention times.
--library-headers : Column header names used in the spectral library.
--output-headers : Column header names used in the output.
--mod : Modification declaration.
--fixed-mod : The specified modification will be considered as fixed.
--var-mod : The specified modification will be considered as variable.
--ref-cal : Reference peptides will be used for calibration.
--gen-ref : A library of reference peptides will be generated.
--window : Scan window.
--cut-after : In silico digest will include cuts after the specified amino acids.
--no-cut-before : In silico digest will not include cuts before the specified amino acids.
--min-pep-len : Min peptide length.
--max-pep-len : Max peptide length.
--min-fr-corr : Minimum fragment profile correlation for the inclusion into the spectral library.
--min-gen-fr : Minimum number of fragments for library generation.
--min-pr-mz : Min precursor m/z.
--max-pr-mz : Max precursor m/z.
--min-fr-mz : Min fragment m/z.
--max-fr-mz : Max fragment m/z.
--max-fr : Maximum number of fragments.
--min-fr : Minimum number of fragments for library export.
--min-search-fr : Minimum number of fragments required for a precursor to be searched.
--missed-cleavages : Maximum number of missed cleavages.
--unimod4 : Cysteine carbamidomethylation enabled as a fixed modification.
--unimod35 : Methionine oxidation enabled as a variable modification.
--var-mods : Maximum number of variable modifications.
--met-excision : N-terminal methionine excision enabled.
--no-rt-window : Full range of retention times will be considered.
--disable-rt : All RT-related scores disabled.
--min-rt-win : Minimum acceptable RT window scale.
--no-window-inference : Scan window inference disabled.
--individual-windows : Scan windows will be inferred separately for different runs.
--individual-mass-acc : Mass accuracy will be determined separately for different runs.
--individual-reports : Reports will be generated separately for different runs (in the respective folders).
--no-stats : Run statistics information, default false.
--convert : MS data files will be converted to .dia format.
--out-dir : output directory.
--remove-quant : .quant files will be removed when the analysis is finished.
--no-quant-files : .quant files will not be saved to the disk.
--use-rt : Existing .quant files will be used for RT profiling.
--use-quant : Existing .quant files will be used.
--quant-only : Quantification will be performed anew using existing identification info.
--report-only : Report will be generated using .quant files.
--iter : Number of iterations.
--profiling-qvalue : RT profiling q-value threshold.
--quant-qvalue : Q-value threshold for cross-run quantification.
--protein-quant-qvalue : Precursor Q-value threshold for protein quantification.
--top : The top N precursors (N as specified) will be used for protein quantification in each run.
--out-lib-qvalue : Q-value threshold for spectral library generation.
--rt-profiling : RT profiling enabled.
--prefix : prefix added to input file names.
--ext : extension added to input file names.
--lc-all-scores : All scores will be used by the linear classifier (not recommended).
--peak-center : Fixed-width center of each elution peak will be used for quantification.
--peak-boundary : Peak boundary intensity factor.
--standardisation-scale : Standardisation scale.
--no-ifs-removal : Interference removal from fragment elution curves disabled.
--no-fr-selection : Cross-run selection of fragments for quantification disabled (not recommended).
--no-fr-exclusion : Exclusion of fragments shared between heavy and light labelled peptides from quantification disabled.
--peak-translation : Translation of retention times between peptides within the same elution group enabled.
--no-standardisation : Scores will not be standardised for neural network training.
--no-nn : Neural network classifier disabled.
--nn-iter : Neural network classifier will be used starting from the specified iteration number.
--nn-bagging : Neural network bagging.
--nn-epochs : Neural network epochs number.
--nn-learning-rate : Neural network learning rate.
--nn-reg : Neural network regularisation.
--nn-hidden : Number of hidden layers.
--mass-acc-cal : Calibration mass accuracy.
--fix-mass-acc : Force mass accuracy, default true.
--mass-acc : Global mass accuracy.
--mass-acc-ms1 : Global MS1 accuracy.
--gen-acc : Fragmentation spectrum generator accuracy.
--min-corr : Only peaks with correlation sum exceeding the specified minimum will be considered.
--corr-diff : Peaks with correlation sum below the maximum by more than the specified difference will not be considered.
--peak-apex : Peak apex height.
--all-peaks : The number of putative elution peaks considered in library-free mode will not be reduced to decrease RAM usage.
--norm-qvalue : Q-value threshold for cross-run normalisation.
--norm-fraction : Global normalisation peptides fraction.
--norm-radius : Local normalisation radius.
--global-norm : Median-based local normalisation disabled.
--q1-cal : Q1 calibration enabled.
--no-calibration : Mass calibration disabled.
--mass-cal-bins : Maximum number of mass calibration bins.
--min-cal : Minimum number of precursors identified at 10% FDR used for calibration.
--min-class : Minimum number of precursors identified at 10% FDR used for linear classifier training.
--scanning-swath : All runs will be analysed as Scanning SWATH runs.
--regular-swath : All runs will be analysed as regular SWATH runs.
--no-q1 : Q1 scores disabled.
--use-q1 : Q1 scores will be used for regular SWATH runs.
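With this many options, it can be convenient to keep a run configuration as a dictionary and convert it to command-line arguments. The following is a sketch under stated assumptions, not DIA-NN's own interface: the helper `to_args` is hypothetical, value-taking flags are assumed to accept their value as the next token (as `--threads 4` does in the example command), and the file names are placeholders. It combines flags from the list above into a library-free search that also generates a spectral library.

```python
def to_args(options):
    """Convert {flag: value} pairs into a flat argument list.

    True        -> bare switch, e.g. --fasta-search
    False/None  -> flag omitted
    list/tuple  -> flag repeated once per item, e.g. --f a --f b
    other       -> flag followed by str(value)
    """
    args = []
    for flag, value in options.items():
        if value is False or value is None:
            continue
        if value is True:
            args.append(f"--{flag}")
        elif isinstance(value, (list, tuple)):
            for item in value:
                args += [f"--{flag}", str(item)]
        else:
            args += [f"--{flag}", str(value)]
    return args


library_free = {
    "f": ["run1.mzML", "run2.mzML"],  # placeholder data files
    "fasta": "yeast.fasta",           # placeholder FASTA file
    "fasta-search": True,             # library-free search enabled
    "gen-spec-lib": True,             # generate a spectral library
    "out-lib": "yeast_lib.tsv",       # placeholder output library
    "unimod4": True,                  # fixed Cys carbamidomethylation
    "missed-cleavages": 1,            # value chosen only for illustration
    "no-nn": False,                   # left out: keep the neural network on
}

args = to_args(library_free)
print(" ".join(["diann.exe"] + args))
```

Switches set to False are simply omitted, so the same dictionary can be reused across runs by toggling individual entries.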
References:
1. https://github.com/vdemichev/DiaNN