This dataset is derived from UniProtKB/Swiss-Prot release 2018_02 and is the main dataset used to train DeepMito. It contains 424 non-redundant protein sequences with experimentally determined sub-mitochondrial localization. Specifically, the SM424-18 dataset comprises 74 outer membrane, 190 inner membrane, 25 intermembrane space, and 135 matrix proteins.
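As a quick sanity check on the composition of SM424-18, the following minimal sketch sums the four class counts given in the description and prints each class's share of the dataset. The variable names are illustrative only and are not part of DeepMito or its distribution files.

```python
# Class counts for SM424-18, copied from the dataset description.
# The dictionary and variable names are hypothetical, chosen for illustration.
sm424_18_counts = {
    "outer membrane": 74,
    "inner membrane": 190,
    "intermembrane space": 25,
    "matrix": 135,
}

# The four compartments should account for all 424 sequences.
total = sum(sm424_18_counts.values())
assert total == 424

# Fraction of the dataset that falls in each compartment.
fractions = {name: count / total for name, count in sm424_18_counts.items()}

for name, count in sm424_18_counts.items():
    print(f"{name:<20} {count:>4} {fractions[name]:6.1%}")
```

The classes are clearly imbalanced: inner membrane proteins make up roughly 45% of the sequences, while intermembrane space proteins account for about 6%.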
This dataset comprises 570 protein sequences. It was generated by the SubMitoPred authors and is described in the following publication:
Kumar et al. (2018) Proteome-wide prediction and annotation of mitochondrial and sub-mitochondrial proteins by incorporating domain information. Mitochondrion, 42, 11-22.
This dataset comprises 1050 human mitochondrial proteins extracted from the Human Cell Atlas database.