
data processing

Projects with this topic

  • Semantic labelling for column headers in CSV and Excel files.

  • Functions for the analysis of mass spectrometry-based proteomics data, providing global information on protein abundance and on the abundances of post-translational modifications (PTMs). Parts of the package can also be used to analyse other omics data, e.g. metabolomics and transcriptomics. The package handles fold changes (e.g. from TMT or SILAC experiments) and intensities (e.g. from LFQ experiments). Processing includes reproducibility checks (PCAs, correlation bubble plots, sample-to-sample distance plots), identification of outliers (based on Mahalanobis distance), log2 transformation, median normalization, filtering for reliably identified proteins/sites, variance stabilization and imputation (both based on the 'DEP' package), and calculation of average log2(fold changes), p-values, and adjusted p-values. These basic results are visualized as volcano plots, stacked bar charts of significant changes, and numbers of identified proteins. Enrichment analyses (using 'clusterProfiler') are run on the gene sets stored in the MSigDB (accessed via 'msigdbr') and then visualized. Enrichment results obtained with the 'Ingenuity Pathway Analysis' tool (Qiagen) can be visualized as well. Weighted Gene Correlation Network Analysis ('WGCNA') is supported, including module formation, module-trait correlation, identification of hub genes/key drivers, and visualization of results. For PTM data, site intensities are extracted from Proteome Discoverer's Peptide Isoform table. In addition, kinase enrichment for phosphoproteomics data can be performed using 'KinSwingR'.

  • Tools to pre- and post-process data for and from mHM.

  • CORNish PASDy -- COsmic-Ray Neutron flavored PASDy. PASDy -- Processing and Analysis of Sensor Data in pYthon.

  • A consistent, extensible, easy-to-use tool/framework for reproducible quality control of time series data.
