In this paper, we address the problem of protein classification starting from multi-view 2D snapshots of proteins. Using JMol, a well-known protein visualization tool, we render a set of multi-view 2D representations comprising 13 different types of protein visualization. Each of the 13 visualization types emphasizes specific properties of the protein structure (e.g. a backbone visualization that displays the backbone of the protein as a trace of the Cα atoms), while different points of view in 3D space are used to capture the protein shape. Given this set of 2D snapshots for each protein, deep learning is used to classify the protein directly from the 2D images. Each type of representation is used to train a different Convolutional Neural Network (CNN), and the fusion of these CNNs is shown to exploit the diversity of the representation types to improve classification performance. The multi-view projections, obtained by uniformly rotating the protein structure around its central X, Y, and Z viewing axes, serve as a form of data augmentation during both the training and testing phases. The resulting approach, named iProStruct2D, differs from most existing methods in the literature, which are based on protein alignment or on measuring the distance between 3D representations of proteins. Experimental evaluation on two datasets demonstrates the strength of iProStruct2D with respect to other state-of-the-art approaches. The MATLAB code used in this paper is available at https://github.com/LorisNanni.
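The multi-view and fusion steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function names, the 90-degree rotation step, the averaging over views, and the sum-rule fusion are all assumptions made for illustration, since the abstract does not state the rotation step or the fusion rule. Rendering with JMol and training the CNNs are out of scope here, so the CNN outputs are represented by a plain array of class scores.

```python
import numpy as np


def rotation_matrix(axis: str, angle_deg: float) -> np.ndarray:
    """Return the 3x3 matrix that rotates by angle_deg about the X, Y, or Z axis."""
    t = np.radians(angle_deg)
    c, s = np.cos(t), np.sin(t)
    if axis == "x":
        return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    if axis == "y":
        return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    if axis == "z":
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    raise ValueError(f"unknown axis: {axis!r}")


def multi_view_coordinates(coords: np.ndarray, step_deg: float = 90.0):
    """Rotate atom coordinates (n_atoms, 3) about their centroid in uniform steps.

    step_deg is a hypothetical parameter; the abstract does not give the step.
    Returns a list of (axis, angle, rotated_coords) tuples, one per view.
    The 0-degree view is produced once per axis, so it appears three times.
    """
    center = coords.mean(axis=0)
    centered = coords - center
    views = []
    for axis in "xyz":
        for angle in np.arange(0.0, 360.0, step_deg):
            rotated = centered @ rotation_matrix(axis, angle).T + center
            views.append((axis, float(angle), rotated))
    return views


def fuse_scores(scores: np.ndarray) -> int:
    """Fuse class scores of shape (n_representations, n_views, n_classes).

    Assumed scheme: average over views (test-time augmentation), then add the
    per-representation scores (sum rule) and return the index of the top class.
    """
    per_representation = scores.mean(axis=1)
    fused = per_representation.sum(axis=0)
    return int(np.argmax(fused))
```

In a full pipeline, each rotated structure would be rendered in each of the 13 visualization types, and `scores` would hold the softmax outputs of the 13 CNNs over all views of one protein.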
Nanni Loris, Lumini Alessandra, Pasquali Federica, Brahnam Sheryl (2020). iProStruct2D: Identifying protein structural classes by deep learning via 2D representations. EXPERT SYSTEMS WITH APPLICATIONS, 142, 1-8 [10.1016/j.eswa.2019.113019].
iProStruct2D: Identifying protein structural classes by deep learning via 2D representations
Lumini Alessandra
2020
File | Size | Format
---|---|---
iProStructvRevision05.pdf (open access; type: Postprint; license: Open Access License, Creative Commons Attribution-NonCommercial-NoDerivatives, CC BY-NC-ND) | 748.92 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.