Abstract:
Evolutionary tree construction is classified as an NP-hard problem since the size of problem will grow exponentially with the number of species and it cannot be solved exactly in polynomial time. Therefore, divide and conquer strategies are introduced in this work in order to reduce the complexity of this problem to polynomial time. One method that can be applied for dividing the size of this problem is clustering analysis. In this work, clustering analysis is used to classify all species into k subclusters. Tree configurations of each subcluster are calculated before regrouping these subclusters into one group. Finally an evolutionary tree of all species is obtained. The proposed algorithm for solving evolutionary tree is developed using MATLAB software. There are 2 main parts of the proposed algorithm. The first is the clustering algorithm. Pairwise distance, which is obtained from CLUSTAL X software, initial number of cluster (K), and maximum species of subcluster are input data of this algorithm. The clustering results, k subclusters, are then used to calculate branch length and grouped into closer groups in the regrouping step. In the case of group(s) containing too many subclusters, a recursive method is introduced in order to subdivide the large group to small subgroups. In addition, the order for regrouping each subgroup is obtained from the recursive method. The proposed algorithm is tested with 9 sets of sequences. Comparing the results of proposed algorithm with the results of Neighbor Joining method shows that the proposed algorithm provides longer evolutionary trees than the Neighbor Joining method by about 4 -12 percent. Even though an optimum solution is not obtained, the proposed algorithm can solve evolutionary tree problems with many species (for instance; 40 species) in polynomial time. Also, the complexity of this algorithm is O (n2).Therefore, it can be concluded that an optimum solution is traded off with the computation requirment for evolutionary tree construction.