**Hierarchical Active Learning algorithm:** This was the original work in which a hierarchical model for applying Gathercole's 'age' and 'difficulty' exemplar selection heuristics was demonstrated on a classification problem with half a million training exemplars. Training times were on the order of fifteen minutes on a laptop computer, and classification performance was competitive with that of the original KDD competition winners.

- Song D., Heywood M.I., Zincir-Heywood A.N. (2005) Training Genetic Programming on Half a Million Patterns: An Example from Anomaly Detection. IEEE Transactions on Evolutionary Computation, 9(3): 225-239, ISSN 1089-778X.
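The age/difficulty selection idea can be sketched as follows. This is a minimal illustration of Gathercole-style dynamic subset selection, not the paper's exact formulation: each exemplar carries an 'age' (generations since it was last selected) and a 'difficulty' (misclassification count), and subset membership is sampled with probability biased toward old or difficult exemplars. The function names, the weighting scheme, and `d_weight` are illustrative assumptions.

```python
import random

def select_subset(ages, difficulties, subset_size, d_weight=0.7, rng=random):
    """Roulette-wheel selection of exemplar indexes, biased toward
    exemplars that are old (rarely seen) or difficult (often wrong).
    The 70/30 difficulty/age weighting is an illustrative choice."""
    n = len(ages)
    weights = [d_weight * difficulties[i] + (1 - d_weight) * ages[i] + 1e-9
               for i in range(n)]
    total = sum(weights)
    chosen = set()
    while len(chosen) < min(subset_size, n):
        r = rng.uniform(0, total)
        acc = 0.0
        for i, w in enumerate(weights):
            acc += w
            if acc >= r:
                chosen.add(i)
                break
    return sorted(chosen)

def update_stats(ages, difficulties, chosen, errors):
    """After evaluating the population on the subset: reset age and
    record fresh difficulty for chosen exemplars, age the rest."""
    chosen = set(chosen)
    for i in range(len(ages)):
        if i in chosen:
            ages[i] = 0
            difficulties[i] = errors[i]
        else:
            ages[i] += 1
```

Because only the sampled subset is evaluated each generation, fitness evaluation cost becomes a function of the subset size rather than the full half-million exemplars.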

**Balanced Block algorithm:** The original hierarchical active learning model above made no specific attempt to address the problem of unbalanced datasets, typically resulting in degenerate solutions in such cases. The work was extended to address this problem explicitly, without compromising the computational efficiency of the approach.

- Curry R., Lichodzijewski P., Heywood M.I. (2007) Scaling Genetic Programming to Large Datasets using Hierarchical Dynamic Subset Selection. IEEE Transactions on Systems, Man and Cybernetics - Part B: Cybernetics, 37(4): 1065-1073.
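One way to realize the balanced-block idea is sketched below: the dataset is partitioned into level-0 blocks so that each block carries a roughly equal number of exemplars per class, with minority-class exemplars re-sampled (wrapped around) so every block stays balanced. The block-sizing and tie-breaking choices here are illustrative assumptions, not the paper's exact scheme.

```python
from collections import defaultdict

def balanced_blocks(labels, block_size):
    """Partition exemplar indexes into blocks of ~block_size, each
    containing an equal count per class; minority classes wrap around
    so they are represented in every block (illustrative scheme)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)
    per_class = max(1, block_size // len(classes))
    # Enough blocks to cover the largest class once.
    n_blocks = max(-(-len(by_class[c]) // per_class) for c in classes)
    blocks, cursors = [], {c: 0 for c in classes}
    for _ in range(n_blocks):
        block = []
        for c in classes:
            pool = by_class[c]
            for k in range(per_class):
                block.append(pool[(cursors[c] + k) % len(pool)])
            cursors[c] += per_class
        blocks.append(block)
    return blocks
```

With blocks built this way, whichever block the hierarchical selection picks, the minority class is guaranteed to be present, which removes the main source of degenerate single-class solutions.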

**Pareto Competitive Coevolution:** An alternative approach for establishing the subsets on which active learning is based is to utilize a competitive model of coevolution, thus taking the perspective of a game (or host-parasite model). The gaming model of interest to this work was first developed under the auspices of Genetic Algorithms and typically utilizes a pairwise competition, possibly between individuals from two different populations (e.g., de Jong and Pollack). In the latter case, the basic goal is to encourage the two populations to engage in a mutually beneficial competition resulting in the improvement of both. For the case of GP, one population would represent a subset of indexes to the dataset and the other the GP classifiers. Needless to say, there are several technical challenges to carrying these concepts over to the paradigm of GP classifiers successfully, and multiple approaches to realizing practical competitive coevolutionary models. Porting the GA Pareto competitive model of coevolution into a GP classification setting requires that a scheme be adopted for defining how individuals from the Pareto front of GP classifiers are combined post training, i.e., a voting policy. In addition, we adopt pruning heuristics designed to complement the post-training voting policy, and make use of a balanced point population heuristic, motivated by recent work on sampling algorithms for semi-supervised learning models (Weiss and Provost). The resulting model provides a significant speedup relative to GP without coevolution, while improving classification accuracy, particularly on unbalanced datasets.

- Lemczyk M., Heywood M.I. (2007) Training Binary GP Classifiers Efficiently: a Pareto-coevolutionary Approach. 10th European Conference on Genetic Programming (EuroGP), M. Ebner et al. (eds), Lecture Notes in Computer Science 4445: 229-240, Springer-Verlag.
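The core mechanics described above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: learners are compared by Pareto dominance over their outcome vectors against the point population, the non-dominated set forms the front, and the front classifies post training by majority vote. The dominance test and the tie-breaking in the voting policy are assumptions made for the sketch.

```python
def dominates(a, b):
    """True if outcome vector a Pareto-dominates b: at least as good
    on every point, strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b)) and
            any(x > y for x, y in zip(a, b)))

def pareto_front(outcomes):
    """outcomes[i][j] = 1 if learner i classifies point j correctly.
    Returns indexes of learners not dominated by any other learner."""
    return [i for i, oi in enumerate(outcomes)
            if not any(dominates(oj, oi)
                       for j, oj in enumerate(outcomes) if j != i)]

def vote(front, predictions):
    """Majority vote of the front's binary predictions on one exemplar;
    ties resolve to class 1 (an illustrative tie-breaking choice)."""
    votes = [predictions[i] for i in front]
    return int(sum(votes) * 2 >= len(votes))
```

Note that dominance is judged only against the evolving point population, so learners are rewarded for covering points their peers fail on; this is what drives the mutually beneficial competition between the two populations.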