M. Idrees, M. U. G. Khan


In bioinformatics, the key challenges are to manage, store, and retrieve biological data efficiently. Such data can be classified into structured, unstructured, and semi-structured content. Typically, semi-structured biological data consists of biological sequences. Complex biological sequences produce huge volumes of data, which in turn create further problems for management, storage, and retrieval. This paper proposes seven metrics to address these problems in biological sequence data: symmetry measure, molecular weight measure, similarity or diversity measure, size base measure, size gap measure, complexity measure, and size complexity diversity measure. These metrics quantify a sequence's complexity, molecular weight, length with and without gaps, symmetry, and similarity through mathematical formulations. The metrics are demonstrated and validated using a proposed hybrid technique that combines empirical evidence with theoretical formulation. This research opens new avenues for efficient management by measuring the functionality and quality of metadata for single and multiple biological sequences.




