Whole tissue proteomic study of N. minuta
Data source
UNRST
Materials and methods
Protein cluster analysis
First, the quantitative data for the target protein set were normalized to the interval (-1, 1). Then the ComplexHeatmap R package (R version 3.4) was used to cluster both the sample and the protein expression dimensions (distance algorithm: Euclidean; linkage method: average linkage) and to generate hierarchical clustering heat maps.
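The two steps above can be sketched in plain Python. This is only an illustration, not the ComplexHeatmap workflow used in the study: the scaling scheme (row-wise min-max scaling to [-1, 1]) is an assumption, since the text does not state how the normalization was done, and the protein names and abundance values are hypothetical.

```python
import math

def scale_to_unit_range(values):
    """Min-max scale one protein's values to [-1, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant row carries no pattern; map it to the midpoint.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(vectors):
    """Agglomerative clustering with average linkage.

    Returns the merges in order as (members_a, members_b, distance),
    where members are indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                pairs = [euclidean(vectors[p], vectors[q])
                         for p in clusters[i] for q in clusters[j]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical abundance table: rows are proteins, columns are samples.
abundance = {
    "protein_A": [10.0, 12.0, 30.0, 31.0],
    "protein_B": [5.0, 6.0, 15.0, 16.0],
    "protein_C": [40.0, 38.0, 9.0, 10.0],
}
scaled = {name: scale_to_unit_range(row) for name, row in abundance.items()}
names = list(scaled)

# Cluster the protein dimension (rows), then the sample dimension (columns).
protein_merges = average_linkage([scaled[n] for n in names])
sample_columns = [list(col) for col in zip(*(scaled[n] for n in names))]
sample_merges = average_linkage(sample_columns)
```

The merge order corresponds to the dendrograms drawn along the two axes of a clustered heat map: rows and columns that merge early sit next to each other in the plot.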
Subcellular localization analysis
CELLO (http://cello.life.nctu.edu.tw/) was used to predict subcellular localization. This method uses a multi-class SVM (support vector machine) classifier, trained on protein sequences with known subcellular localization from public databases, to predict the subcellular localization of the query proteins.
Protein domain analysis
Protein domain analysis used the Pfam database, a collection of protein families, each of which is represented by multiple sequence alignments and hidden Markov models. This database contains domain information. For the analysis itself, the InterProScan package (version number: InterproScan-5.25-64.0) was used to run the scanning algorithms of the InterPro database for functional characterization of sequences, yielding the Pfam domain annotations of the target protein sequences.
GO function annotation
Blast2GO (version number: BLASTP 2.8.0+) was used to annotate the target protein set with GO terms. The process can be summarized in four steps: sequence alignment (BLAST), GO term extraction (Mapping), GO annotation (Annotation), and InterProScan-based annotation augmentation (Annotation Augmentation).
KEGG pathway annotation
KOBAS (version: KOBAS 3.0) software was used to annotate the KEGG pathways of the target protein set.
PCC (Pearson's Correlation Coefficient)
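As a reminder of what this coefficient measures, the sketch below computes Pearson's correlation coefficient between two abundance profiles using the standard definition (covariance divided by the product of the standard deviations). The sample names and values are hypothetical and are not data from this study.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical protein abundance profiles for three samples.
leaf_rep1 = [1.0, 2.0, 3.0, 4.0]
leaf_rep2 = [2.0, 4.0, 6.0, 8.0]
root_rep1 = [4.0, 3.0, 2.0, 1.0]

r_replicates = pearson(leaf_rep1, leaf_rep2)  # perfectly linear, same direction
r_tissues = pearson(leaf_rep1, root_rep1)     # perfectly linear, opposite direction
```

Values close to 1 between replicates of the same tissue indicate good quantitative reproducibility.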
Protein abundance analysis
Statistics of differentially expressed proteins and volcano plots
flower vs leaf
fruit vs leaf
root vs leaf
seed vs leaf
All
Hierarchical cluster analysis of differential protein expression
flower vs leaf
fruit vs leaf
root vs leaf
seed vs leaf
Multi-group protein expression pattern clustering