.. _using own strategy:

Creating your own active learning strategies with PyRelationAL
==============================================================

PyRelationAL implements several standard active learning strategies, but the collection is not exhaustive.
Users can define their own strategies by subclassing
:py:class:`pyrelational.strategies.abstract_strategy.Strategy` and overriding
:py:meth:`pyrelational.strategies.abstract_strategy.Strategy.__call__`.

Let's look at some examples.

Implementing a mixed strategy
_____________________________

In the first example, we show how to implement a mixed strategy for a regression task that uses the
least confidence scorer for informativeness in combination with representative sampling. The query
step is decomposed into two stages: 1) identify a subset of the queryable set based on the least
confidence score (make sure that this subset is sufficiently larger than the number of annotations
we want to query), and 2) select representative samples from this subset based on the Euclidean
distance between input features.

.. code-block:: python

    import torch

    from pyrelational.informativeness import regression_least_confidence
    from pyrelational.informativeness.task_agnostic import representative_sampling
    from pyrelational.strategies.abstract_strategy import Strategy


    class MixedStrategy(Strategy):
        """
        Implements a strategy that combines the least confidence scorer with representative sampling.
        To this end, 10 times more samples than requested are selected based on the least confidence
        score; this list is then reduced with representative sampling.
        """

        def __init__(self):
            super(MixedStrategy, self).__init__()

        def __call__(self, num_annotate, data_manager, model_manager):
            # Train the model and run inference on the queryable set
            output = self.train_and_infer(data_manager=data_manager, model_manager=model_manager)
            scores = regression_least_confidence(x=output)
            # Keep the 10 * num_annotate highest-scoring samples as candidates
            ixs = torch.argsort(scores, descending=True).tolist()
            ixs = [data_manager.u_indices[i] for i in ixs[: 10 * num_annotate]]
            # Reduce the candidates to representative samples based on their feature vectors
            subquery = torch.stack(data_manager.get_sample_feature_vectors(ixs))
            new_ixs = representative_sampling(subquery)
            return [ixs[i] for i in new_ixs]

Implementing an :math:`\epsilon`-greedy strategy
________________________________________________

In the second example, we implement an :math:`\epsilon`-greedy strategy: for :math:`N` queries,
:math:`(1-\epsilon)N` are selected greedily based on the model prediction and :math:`\epsilon N`
are selected uniformly at random from the remaining queryable set.

.. code-block:: python

    import numpy as np
    import torch

    from pyrelational.informativeness import regression_mean_prediction
    from pyrelational.strategies.abstract_strategy import Strategy


    class EpsilonGreedyStrategy(Strategy):
        """
        Implements an epsilon-greedy strategy, whereby a fraction eps of the samples to annotate
        is selected randomly while the remaining samples are selected greedily.
        """

        def __init__(self):
            super(EpsilonGreedyStrategy, self).__init__()

        def __call__(self, num_annotate, data_manager, model_manager, eps=0.05):
            assert 0 <= eps <= 1, "epsilon should be a float between 0 and 1"
            # Train the model and run inference on the queryable set
            output = self.train_and_infer(data_manager=data_manager, model_manager=model_manager)
            scores = regression_mean_prediction(x=output)
            # Greedy part: the samples with the highest mean prediction
            ixs = torch.argsort(scores, descending=True).tolist()
            greedy_annotate = int((1 - eps) * num_annotate)
            ixs = [data_manager.u_indices[i] for i in ixs[:greedy_annotate]]
            # Random part: drawn without replacement from the samples not already selected
            remaining_u_indices = list(set(data_manager.u_indices) - set(ixs))
            random_annotate = np.random.choice(
                remaining_u_indices, num_annotate - greedy_annotate, replace=False
            )
            return ixs + random_annotate.tolist()

See the examples folder in the source repository for more examples.
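
To experiment with the index bookkeeping of the :math:`\epsilon`-greedy example without training a
model, the selection step can be isolated in a plain NumPy function. The sketch below is only an
illustration under stated assumptions: ``epsilon_greedy_select``, its ``seed`` argument, and the toy
scores are hypothetical and not part of PyRelationAL.

.. code-block:: python

    import numpy as np


    def epsilon_greedy_select(scores, u_indices, num_annotate, eps=0.05, seed=None):
        """Hypothetical helper: pick ``num_annotate`` entries of ``u_indices``.

        ``scores[i]`` is the score of the sample whose dataset index is ``u_indices[i]``.
        """
        if not 0 <= eps <= 1:
            raise ValueError("eps should be a float between 0 and 1")
        if num_annotate > len(u_indices):
            raise ValueError("cannot query more samples than are available")

        rng = np.random.default_rng(seed)
        # Positions sorted from highest to lowest score
        order = np.argsort(-np.asarray(scores))
        # int() truncates, so the greedy share is rounded down
        n_greedy = int((1 - eps) * num_annotate)
        greedy = [u_indices[i] for i in order[:n_greedy]]

        # Draw the rest uniformly, without replacement, from what is left
        n_random = num_annotate - n_greedy
        if n_random == 0:
            return greedy
        remaining = sorted(set(u_indices) - set(greedy))
        return greedy + rng.choice(remaining, size=n_random, replace=False).tolist()


    scores = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8]
    u_indices = [10, 11, 12, 13, 14, 15]
    picked = epsilon_greedy_select(scores, u_indices, num_annotate=4, eps=0.5, seed=0)
    # picked[:2] == [11, 15]; the last two entries are random draws from {10, 12, 13, 14}

Because ``int()`` truncates, the greedy share is rounded down and the random share is rounded up, so
at least :math:`\epsilon N` of the :math:`N` queries are drawn at random.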