TY - JOUR
T1 - Solving the Protein Secondary Structure Prediction problem with the Hessian Free Optimization algorithm
AU - Charalampous, Konstantinos
AU - Agathocleous, Michalis
AU - Christodoulou, Chris
AU - Promponas, Vasilis
N1 - Publisher Copyright: Author
PY - 2022
Y1 - 2022
N2 - Extracting features from complex sequential data for classification and prediction is a difficult task. It is even more challenging when both upstream and downstream information in a sequence is needed to process the element at a specific time-step. One typical problem in this category is Protein Secondary Structure Prediction (PSSP). Recurrent Neural Networks (RNNs) have been successful in handling sequential data, but they are demanding in terms of time and space requirements. Simple Feed-Forward Neural Networks (FFNNs), on the other hand, can be trained very quickly with the Backpropagation algorithm, but in practice they give poor results on this category of problems. The Hessian Free Optimization (HFO) algorithm is one of the latest developments in Artificial Neural Network (ANN) training algorithms and can converge faster and more accurately. In this paper, we present an implementation of simple FFNNs trained with the HFO second-order learning algorithm for the PSSP problem. In our approach, a single FFNN trained with HFO achieves approximately 79.6% per-residue (Q3) accuracy on the PISCES dataset. Despite the simplicity of our method, the results are comparable to some of the state-of-the-art methods designed for this problem. Applying a majority-voting ensemble and filtering with Support Vector Machines increases the per-residue (Q3) accuracy to 80.4%. Finally, our method was tested on the independent CASP13 dataset, where it achieves 78.14% per-residue (Q3) accuracy. Moreover, HFO does not require tuning of any parameters, which makes training much faster than with other state-of-the-art methods; this is an important feature for big datasets and facilitates fast training of FFNN ensembles.
AB - Extracting features from complex sequential data for classification and prediction is a difficult task. It is even more challenging when both upstream and downstream information in a sequence is needed to process the element at a specific time-step. One typical problem in this category is Protein Secondary Structure Prediction (PSSP). Recurrent Neural Networks (RNNs) have been successful in handling sequential data, but they are demanding in terms of time and space requirements. Simple Feed-Forward Neural Networks (FFNNs), on the other hand, can be trained very quickly with the Backpropagation algorithm, but in practice they give poor results on this category of problems. The Hessian Free Optimization (HFO) algorithm is one of the latest developments in Artificial Neural Network (ANN) training algorithms and can converge faster and more accurately. In this paper, we present an implementation of simple FFNNs trained with the HFO second-order learning algorithm for the PSSP problem. In our approach, a single FFNN trained with HFO achieves approximately 79.6% per-residue (Q3) accuracy on the PISCES dataset. Despite the simplicity of our method, the results are comparable to some of the state-of-the-art methods designed for this problem. Applying a majority-voting ensemble and filtering with Support Vector Machines increases the per-residue (Q3) accuracy to 80.4%. Finally, our method was tested on the independent CASP13 dataset, where it achieves 78.14% per-residue (Q3) accuracy. Moreover, HFO does not require tuning of any parameters, which makes training much faster than with other state-of-the-art methods; this is an important feature for big datasets and facilitates fast training of FFNN ensembles.
KW - Approximation algorithms
KW - Hafnium oxide
KW - Hessian Free Optimization
KW - Neural Networks
KW - Optimization
KW - Predictive models
KW - Protein Secondary Structure Prediction
KW - Proteins
KW - Second Order Learning Algorithms
KW - Three-dimensional displays
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85126331774&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126331774&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3156888
DO - 10.1109/ACCESS.2022.3156888
M3 - Article
AN - SCOPUS:85126331774
JO - IEEE Access
JF - IEEE Access
SN - 2169-3536
ER -