iProStruct2D: Identifying protein structural classes by deep learning via 2D representations

Lumini Alessandra;
2020

Abstract

In this paper, we address the problem of protein classification starting from multi-view 2D snapshots of proteins. Using JMol, a well-known protein visualization tool, we render a set of multi-view 2D representations comprising 13 different types of protein visualization. Each of the 13 visualization types emphasizes specific properties of the protein structure (e.g. a backbone visualization that displays the backbone of the protein as a trace of the Cα atoms), while different points of view in 3D space are used to capture the protein shape. Given this set of 2D snapshots for each protein, deep learning is used to classify the protein from the 2D images. Each type of representation is used to train a separate Convolutional Neural Network (CNN), and the fusion of these CNNs is shown to exploit the diversity of the representation types to improve classification performance. The multi-view projections, obtained by uniformly rotating the protein structure around its central X, Y, and Z viewing axes, serve as a form of data augmentation during both the training and testing phases. The resulting approach, named iProStruct2D, differs from most existing methods in the literature, which are based on protein alignment or on measuring the distance between 3D representations of proteins. Experimental evaluation of the proposed approach on two datasets demonstrates the strength of iProStruct2D with respect to other state-of-the-art approaches. The MATLAB code used in this paper is available at https://github.com/LorisNanni.
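The combination of multi-view snapshots and representation-specific CNNs described above can be illustrated with a minimal sketch. The abstract does not state which fusion rule is used, so the version below assumes a simple scheme: class scores are averaged over the rotated views of a protein (test-time augmentation) and then summed across the representation-specific networks. The function names, the "cartoon" representation label, and the toy scores are hypothetical and serve only as illustration; this is not the authors' MATLAB implementation.

```python
from typing import Dict, List

Scores = List[float]  # one score per structural class


def average_views(view_scores: List[Scores]) -> Scores:
    """Average the class scores over all rotated views of one protein."""
    n_views = len(view_scores)
    n_classes = len(view_scores[0])
    return [sum(v[c] for v in view_scores) / n_views for c in range(n_classes)]


def fuse_representations(per_type_scores: Dict[str, List[Scores]]) -> Scores:
    """Sum the view-averaged scores of each representation-specific model.

    Assumption: sum-rule fusion; the abstract does not name the rule.
    """
    averaged = [average_views(views) for views in per_type_scores.values()]
    n_classes = len(averaged[0])
    return [sum(a[c] for a in averaged) for c in range(n_classes)]


def predict(per_type_scores: Dict[str, List[Scores]]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_representations(per_type_scores)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy input: 2 representation types, 3 views each, 2 classes.
# Each inner list stands in for the softmax output of one CNN on one view.
toy_scores = {
    "backbone": [[0.6, 0.4], [0.7, 0.3], [0.5, 0.5]],
    "cartoon": [[0.2, 0.8], [0.4, 0.6], [0.3, 0.7]],  # hypothetical type name
}

fused_scores = fuse_representations(toy_scores)
predicted_class = predict(toy_scores)
```

In this toy case the two networks disagree, and the fused score decides the outcome. This is the kind of diversity between representation types that the fusion step is meant to exploit.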
Nanni Loris, Lumini Alessandra, Pasquali Federica, Brahnam Sheryl (2020). iProStruct2D: Identifying protein structural classes by deep learning via 2D representations. EXPERT SYSTEMS WITH APPLICATIONS, 142, 1-8 [10.1016/j.eswa.2019.113019].
Nanni Loris; Lumini Alessandra; Pasquali Federica; Brahnam Sheryl
Files in this item:
File: iProStructvRevision05.pdf
Access: open access
Type: Postprint
License: Open Access License. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)
Size: 748.92 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/759032