konnektor.network_planner#

konnektor.network_planners.generators#

class konnektor.network_planners.generators.maximal_network_generator.MaximalNetworkGenerator(mappers: AtomMapper | list[AtomMapper], scorer: Callable | None, progress: bool = False, n_processes: int = 1)#

Bases: NetworkGenerator

The MaximalNetworkGenerator attempts to build a fully connected graph (every node connected to every other node) for a given set of Components.

The edges of the graph are Transformations, which contain AtomMappings between pairs of Components. If not all mappings can be created, the mapping failures are ignored and a nearly fully connected graph is returned.

If multiple AtomMapper/s are provided, but no scorer, the first valid AtomMapper provided will be used.

Note: This approach is not recommended for free energy calculations in production use, as it is very computationally expensive. It is nevertheless very important, because the other approaches use the Maximal Network as an initial solution and then remove edges to achieve the desired design.

This class is recommended as an initial_edge_lister for other network generators. The MaximalNetworkGenerator is parallelized and the number of CPUs can be chosen with the n_processes argument.

Parameters:
  • mappers (Union[AtomMapper, list[AtomMapper]]) – AtomMapper to use to define the relationship between two ligands.

  • scorer (Callable, optional) – Scoring function that takes in an atom mapping and returns a score in [0,1].

  • progress (bool, optional) – If True, a progress bar will be displayed. (default: False)

  • n_processes (int) – Number of processes to use for network generation. (default: 1)

generate_ligand_network(components: Iterable[Component]) LigandNetwork#

Create a network with all possible proposed mappings.

This will attempt to create (and optionally score) all possible mappings (up to $N(N-1)/2$ for each mapper given). There may be fewer actual mappings than this, because when a mapper cannot return a mapping for a given pair, there is simply no suggested mapping for that pair. This network is typically used as the starting point for other network generators (which then optimize based on the scores) or to debug atom mappers (to see which mappings the mapper fails to generate).

Parameters:

components (Iterable[SmallMoleculeComponent]) – SmallMoleculeComponent/s to include as nodes in the LigandNetwork.

Returns:

LigandNetwork containing all possible mappings, ideally a fully connected graph.

Return type:

LigandNetwork
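The pair enumeration described above can be sketched in plain Python. This is a minimal illustration, not the Konnektor implementation: `toy_mapper`, `toy_scorer`, and `all_pairs_network` are hypothetical names, and components are represented by strings. It shows why the number of edges can be lower than $N(N-1)/2$ when a mapper fails for some pairs.

```python
from itertools import combinations


def toy_mapper(a, b):
    """Hypothetical mapper: returns a mapping, or None when no mapping is found."""
    # Assumption for illustration only: pairs whose name lengths differ
    # by more than 3 characters cannot be mapped.
    if abs(len(a) - len(b)) > 3:
        return None
    return (a, b)


def toy_scorer(mapping):
    """Hypothetical scorer returning a value in [0, 1]."""
    a, b = mapping
    return 1.0 - abs(len(a) - len(b)) / 10.0


def all_pairs_network(components, mapper, scorer=None):
    """Try every unordered pair once; skip pairs the mapper cannot handle."""
    edges = {}
    for a, b in combinations(components, 2):
        mapping = mapper(a, b)
        if mapping is None:
            continue  # mapping failure: no edge is proposed for this pair
        edges[(a, b)] = scorer(mapping) if scorer is not None else None
    return edges


components = ["benzene", "toluene", "phenol", "naphthalene"]
edges = all_pairs_network(components, toy_mapper, toy_scorer)
max_edges = len(components) * (len(components) - 1) // 2
print(f"{len(edges)} of at most {max_edges} edges were created")
```

In this toy setup, every pair involving "naphthalene" fails, so that node ends up without edges. Inspecting the missing pairs in this way mirrors the debugging use case mentioned above.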

class konnektor.network_planners.generators.heuristic_maximal_network_generator.HeuristicMaximalNetworkGenerator(mappers: AtomMapper | list[AtomMapper], scorer, n_samples: int = 100, progress: bool = False, n_processes: int = 1)#

Bases: NetworkGenerator

For a given set of Components, the HeuristicMaximalNetworkGenerator builds a network with n_samples Transformations per Component, under the assumption that each Component can be connected to any other. The Transformations of this network are realized as AtomMappings between pairs of Components. If not all mappings can be created, the mapping failures are ignored and a nearly fully connected graph is returned.

This class can be used as initial_edge_lister for other approaches and is recommended for this purpose when there is a large set of Components (check the network connectivity!).

Note: the HeuristicMaximalNetworkGenerator is parallelized and the number of CPUs can be set with n_processes. All other approaches in Konnektor benefit from this parallelization, which can be enabled with the n_processes keyword during class construction.

Parameters:
  • mappers (AtomMapper | list[AtomMapper]) – The atom mapper(s) used to define the connection between two ligands.

  • scorer (AtomMappingScorer) – Scoring function that evaluates an atom mapping and returns a score in [0,1].

  • n_samples (int) – Number of random edges per node. (default: 100)

  • progress (bool, optional) – If True, a progress bar will be displayed. (default: False)

  • n_processes (int) – Number of processes that can be used for the network generation. (default: 1)

generate_ligand_network(components: Iterable[Component]) LigandNetwork#

Create a network with n_samples randomly selected edges per node, drawn from the possible proposed mappings.

Parameters:

components (Iterable[Component]) – the ligands to include in the LigandNetwork

Returns:

A heuristic maximal network.

Return type:

LigandNetwork
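The idea of sampling a fixed number of random edges per node, and the reason the connectivity should be checked afterwards, can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the Konnektor implementation: `sample_edges` and `is_connected` are hypothetical helpers, components are plain strings, and no mapper or scorer is involved.

```python
import random


def sample_edges(components, n_samples, seed=0):
    """Pick up to n_samples random partners for every node."""
    rng = random.Random(seed)
    edges = set()
    for i, node in enumerate(components):
        others = components[:i] + components[i + 1:]
        k = min(n_samples, len(others))
        for partner in rng.sample(others, k):
            edges.add(frozenset((node, partner)))  # undirected edge
    return edges


def is_connected(components, edges):
    """Depth-first search to verify that all nodes are reachable."""
    adjacency = {c: set() for c in components}
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = set(), [components[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    return len(seen) == len(components)


components = [f"lig_{i}" for i in range(20)]
edges = sample_edges(components, n_samples=3)
print(len(edges), "edges sampled; connected:", is_connected(components, edges))
```

With 20 components the full network would have 190 edges, while the sampled one has at most 60. Because the edges are chosen randomly, connectivity is not guaranteed, which is why an explicit check is useful.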

class konnektor.network_planners.generators.explicit_network_generator.ExplicitNetworkGenerator(mappers: AtomMapper | list[AtomMapper], scorer, n_processes: int = 1, progress: bool = False)#

Bases: NetworkGenerator

Parameters:
  • mappers (AtomMapper | list[AtomMapper]) – Defines the connection between two ligands.

  • scorer (AtomMappingScorer) – Scoring function that evaluates an atom mapping and returns a score in [0,1].

  • n_processes (int) – Number of processes to use to build the ligand network. (default: 1)

  • progress (bool, optional) – If True, a progress bar will be displayed. (default: False)

generate_ligand_network(edges: Iterable[tuple[Component, Component]], nodes: Iterable[Component] | None = None) LigandNetwork#

Create a network with explicitly-defined edges and nodes. The network can be defined by specifying only edges, in which case the nodes are implicitly added.

Parameters:
  • edges (Iterable[Tuple[Component, Component]]) – Planned edges that will be connected with mappings and scores. Each Tuple represents one edge.

  • nodes (Iterable[Component] | None) – A list of nodes to be included in the network. Optional, since the network can be defined by specifying only edges. This is useful for adding isolated (unconnected) nodes.

Return type:

LigandNetwork

Warns:

Warning – Issued if the network is not connected as a single network.
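The behavior described for explicit edges, implicit nodes, isolated nodes, and the connectivity warning can be sketched in plain Python. This is a hedged illustration only: `build_explicit_network` is a hypothetical helper working on strings, and the warning text is an assumption rather than the message Konnektor emits.

```python
import warnings


def build_explicit_network(edges, nodes=None):
    """Collect nodes and edges; warn if the result is not one connected network."""
    edges = list(edges)
    all_nodes = set(nodes or [])
    for a, b in edges:
        all_nodes.update((a, b))  # nodes are added implicitly from the edges

    adjacency = {n: set() for n in all_nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Depth-first search from an arbitrary node to test connectivity.
    seen = set()
    stack = [next(iter(all_nodes))] if all_nodes else []
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency[node] - seen)
    if seen != all_nodes:
        warnings.warn("network is not connected as a single network")
    return all_nodes, edges


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    nodes, edges = build_explicit_network(
        edges=[("benzene", "toluene"), ("toluene", "phenol")],
        nodes=["aniline"],  # isolated node, only present because of `nodes`
    )
print(sorted(nodes), "warnings:", len(caught))
```

Here "aniline" is included only through the nodes argument, so the resulting network is disconnected and a warning is recorded.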

generate_network_from_indices(components: list[Component], indices: list[tuple[int, int]]) LigandNetwork#

Generate a LigandNetwork by specifying edges as tuples of indices.

Parameters:
  • components (list[Component]) – Component/s to place into the network.

  • indices (list[tuple[int, int]]) – Edges to form between the Components, represented as tuples of indices into the list of Component/s. E.g. [(3, 4), …] will create an edge between the molecules at indices 3 and 4; because Python uses 0-based indexing, these are the 4th and 5th entries of the list.

Return type:

LigandNetwork

Raises:

IndexError – If an index specified in indices is not present in components.
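The index convention can be illustrated with a short plain-Python sketch. `edges_from_indices` is a hypothetical helper operating on strings, not the Konnektor method itself; it only shows how 0-based index pairs translate to pairs of components.

```python
def edges_from_indices(components, indices):
    """Translate (i, j) index pairs into pairs of components."""
    # An out-of-range index raises IndexError from the list lookup.
    return [(components[i], components[j]) for i, j in indices]


components = ["lig_0", "lig_1", "lig_2", "lig_3", "lig_4"]
edges = edges_from_indices(components, [(3, 4), (0, 1)])
print(edges)
```

The pair (3, 4) selects "lig_3" and "lig_4", which are the 4th and 5th entries of the list.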

generate_network_from_names(components: list[Component], names: list[tuple[str, str]]) LigandNetwork#

Generate a LigandNetwork by specifying edges as tuples of names.

Parameters:
  • components (list[Component]) – Component/s to place into the network.

  • mapper (AtomMapper) – the atom mapper to use to construct edges

  • names (list[tuple[str, str]]) – The edges to form, where the values refer to the names of the small molecules, e.g. [('benzene', 'toluene'), ...] will create an edge between the molecules named ‘benzene’ and ‘toluene’.

Return type:

LigandNetwork

Raises:
  • KeyError – If a name in names is not present in components.

  • ValueError – If multiple molecules have the same name (molecule names must be unique)
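The name lookup and the two error conditions can be sketched as follows. This is an assumption-laden illustration: `Molecule` is a hypothetical stand-in for a named component, and `edges_from_names` is a hypothetical helper rather than the library method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical stand-in for a named component."""
    name: str


def edges_from_names(components, names):
    """Resolve (name, name) pairs to pairs of components."""
    counts = Counter(c.name for c in components)
    duplicates = sorted(n for n, k in counts.items() if k > 1)
    if duplicates:
        # Names must be unique, otherwise the lookup would be ambiguous.
        raise ValueError(f"duplicate molecule names: {duplicates}")
    by_name = {c.name: c for c in components}
    # A name that is not present raises KeyError from the dictionary lookup.
    return [(by_name[a], by_name[b]) for a, b in names]


components = [Molecule("benzene"), Molecule("toluene"), Molecule("phenol")]
edges = edges_from_names(components, [("benzene", "toluene")])
print(edges)
```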

konnektor.network_planners.generators.star_network_generator.RadialNetworkGenerator#

alias of StarNetworkGenerator

class konnektor.network_planners.generators.star_network_generator.StarNetworkGenerator(mappers: AtomMapper | list[AtomMapper], scorer, n_processes: int = 1, progress: bool = False, _initial_edge_lister: NetworkGenerator = None)#

Bases: NetworkGenerator

The Star Network is one of the most edge-efficient layouts: it places all Transformations around one central Component.

In a first step, the algorithm constructs all possible Transformations. Next, in the default variant, it selects as the central Component the one with the best average transformation score. Finally, all Components are connected to the central Component with a Transformation.

The Star Network is the most edge-efficient layout, but not the most graph-score-efficient one, as the central Component is usually a compromise across all Components. From a robustness point of view, the Star Network is immediately disconnected if one Transformation fails. However, the loss of Components is very limited, as only one ligand is lost per Transformation failure.

Parameters:
  • mappers (AtomMapper or list of AtomMappers) – the atom mapper is required, to define the connection between two ligands.

  • scorer (AtomMappingScorer) – Callable which returns a float between [0,1] for any LigandAtomMapping. Used to assign scores to potential mappings; higher scores indicate better mappings.

  • n_processes (int, optional) – number of processes that can be used for the network generation. (default: 1)

  • progress (bool, optional) – if true a progress bar will be displayed. (default: False)

  • _initial_edge_lister (NetworkGenerator, optional) – The NetworkGenerator used to produce the initial set of edges. For standard usage, the MaximalNetworkGenerator is used. For large-scale applications, the HeuristicMaximalNetworkGenerator may be a useful alternative. (default: MaximalNetworkGenerator)

generate_ligand_network(components: Iterable[Component], central_component: Component = None) LigandNetwork#

Generate a star network for the given compounds. If a central component is provided, the planning stage is shortcut and the ligands are only connected to that central component.

Parameters:
  • components (Iterable[Component]) – the components to be used for the LigandNetwork

  • central_component (Component, optional) – the central component can be given, in order to shortcut the calculations and enforce the central ligand.

Returns:

A star-like network.

Return type:

LigandNetwork
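The selection of the central Component by best average score, and the shortcut taken when a central component is given, can be sketched in plain Python. This is a simplified illustration, not the Konnektor implementation: `sizes`, `toy_score`, and `star_network` are hypothetical, and the score is an invented function of a made-up size property.

```python
# Hypothetical per-ligand property used only to produce toy scores.
sizes = {"lig_a": 1, "lig_b": 2, "lig_c": 3, "lig_d": 4, "lig_e": 9}


def toy_score(a, b):
    """Hypothetical score in [0, 1]; higher means a better transformation."""
    return 1.0 - abs(sizes[a] - sizes[b]) / 10.0


def star_network(components, score, central_component=None):
    """Connect every component to one center, chosen by best mean score."""
    components = list(components)
    if central_component is None:
        def mean_score(candidate):
            others = [c for c in components if c != candidate]
            return sum(score(candidate, c) for c in others) / len(others)

        central_component = max(components, key=mean_score)
    edges = [(central_component, c) for c in components if c != central_component]
    return central_component, edges


central, edges = star_network(sizes, toy_score)
print(central, len(edges))
```

The result has N-1 edges for N components. Passing central_component skips the scoring of candidate centers entirely, which corresponds to the shortcut described above.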

class konnektor.network_planners.generators.minimal_spanning_tree_network_generator.MinimalSpanningTreeNetworkGenerator(mappers: AtomMapper | list[AtomMapper], scorer, n_processes: int = 1, progress: bool = False, _initial_edge_lister: NetworkGenerator = None)#

Bases: NetworkGenerator

The MinimalSpanningTreeNetworkGenerator builds a minimal spanning tree (MST) network for a given set of Components. The Transformations of the network are represented by AtomMappings, which are scored by an AtomMappingScorer.

The MST is computed with Kruskal's algorithm.

The MST algorithm gives the optimal graph score possible with the minimal required set of Transformations. This makes the MST Network very efficient. However, the MST is not very robust: if a single Transformation fails, the network is immediately disconnected. This disconnection translates to a loss of Components in the final FE Network.

Parameters:
  • mappers (Union[AtomMapper, list[AtomMapper]]) – AtomMapper or list of AtomMapper/s to use to define the relationship between two ligands.

  • scorer (AtomMappingScorer) – The scoring function to use for evaluating an atom mapping. Should give a score in [0,1].

  • n_processes (int, optional) – Number of processes to be used for parallelization. (default: 1)

  • progress (bool, optional) – If True, a progress bar will be displayed. (default: False)

  • _initial_edge_lister (NetworkGenerator, optional) – NetworkGenerator to be used to generate the initial set of edges. For standard usage, the MaximalNetworkGenerator is often appropriate. For very large networks, the HeuristicMaximalNetworkGenerator might be a useful alternative. (default: MaximalNetworkGenerator)

generate_ligand_network(components: Iterable[Component]) LigandNetwork#

Generate a MST network from the given Component/s.

Parameters:

components (Iterable[Component]) – Components to be used as nodes in the LigandNetwork.

Returns:

LigandNetwork generated following the MST rules.

Return type:

LigandNetwork
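Kruskal's algorithm on scored edges can be sketched as follows. This is a minimal sketch and not the Konnektor code: `kruskal_best_tree` is a hypothetical helper, and it assumes that higher scores are better, so edges are processed from the highest score downwards (equivalent to a minimal spanning tree on weights 1 - score).

```python
def kruskal_best_tree(nodes, scored_edges):
    """Kruskal's algorithm with a union-find structure over scored edges."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    # Assumption: higher score = better edge, so sort in descending order.
    for a, b, score in sorted(scored_edges, key=lambda e: e[2], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate trees: keep it
            parent[root_a] = root_b
            tree.append((a, b, score))
    return tree


nodes = ["A", "B", "C", "D"]
scored_edges = [
    ("A", "B", 0.9), ("A", "C", 0.4), ("B", "C", 0.8),
    ("C", "D", 0.7), ("A", "D", 0.2), ("B", "D", 0.3),
]
tree = kruskal_best_tree(nodes, scored_edges)
print(tree)
```

The tree contains N-1 edges. Removing any single one of them splits the nodes into two groups, which illustrates the robustness issue described above.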

class konnektor.network_planners.generators.cyclic_network_generator.CyclicNetworkGenerator(mappers: AtomMapper | list[AtomMapper], scorer, node_present_in_cycles: int = 2, cycle_sizes: int | list[int] = 3, n_processes: int = 1, progress: bool = False, _initial_edge_lister: NetworkGenerator = None)#

Bases: NetworkGenerator

A NetworkGenerator that generates a network based on many network cycles. This is of interest for analyzing the uncertainty of FE Estimates along thermodynamic cycles and possibly for correcting the estimates with cycle closure analysis.

The greedy algorithm builds the network up node by node. For each node, the algorithm generates all cycles of size cycle_sizes and scores each cycle as the sum of its sub-scores. Next, for each node it selects the node_present_in_cycles cycles that score best and increase node diversity (see below). The set of selected Transformations forms the graph. The node diversity criterion is an additional bias that spreads the cycles evenly across all Components.

The number of cycles around each Component can be defined by node_present_in_cycles and the allowed cycle size can be tweaked with cycle_sizes.

This layout has well-distributed connectivity between all Components, which increases robustness considerably, while still allowing a better graph score than the Twin Star Network, as the connectivity distribution is biased rather than enforced. The large number of cycles might be very useful for statistical analysis. The trade-off is an increased number of Transformations.

Parameters:
  • mappers (Union[AtomMapper, list[AtomMapper]]) – AtomMapper(s) to use to propose mappings. At least 1 required, but many can be given, in which case all will be tried to find the lowest score edges.

  • scorer (AtomMappingScorer) – Any callable which takes an AtomMapping and returns a float.

  • node_present_in_cycles (int) – The number of cycles the node should be present in.

  • cycle_sizes (Union[int, List[int]]) – The cycle size to be used for designing the graph. When providing a list[int], a range of sizes is allowed (e.g. [3,4]). (default: 3)

  • n_processes (int, optional) – Number of processes that can be used for the network generation. (default: 1)

  • progress (bool, optional) – If True, displays a progress bar. (default: False)

  • _initial_edge_lister (NetworkGenerator, optional) – The NetworkGenerator used to produce the initial set of edges. For standard usage, the MaximalNetworkGenerator is used. (default: MaximalNetworkGenerator)

generate_ligand_network(components: Iterable[Component]) LigandNetwork#

Generate a cyclic network for the given compounds.

Parameters:

components (Iterable[Component]) – the ligands to include in the LigandNetwork

Returns:

a complex network.

Return type:

LigandNetwork
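The per-node cycle selection can be sketched in a strongly simplified form. This sketch is not the Konnektor algorithm: it only handles cycles of size 3, it omits the node diversity criterion entirely, and `sizes`, `toy_score`, and `greedy_cyclic_edges` are hypothetical names with an invented score.

```python
from itertools import combinations

# Hypothetical per-ligand property used only to produce toy scores.
sizes = {"lig_a": 1, "lig_b": 2, "lig_c": 3, "lig_d": 4, "lig_e": 9}


def toy_score(a, b):
    """Hypothetical symmetric score in [0, 1]; higher is better."""
    return 1.0 - abs(sizes[a] - sizes[b]) / 10.0


def greedy_cyclic_edges(components, score, node_present_in_cycles=2):
    """For each node, keep the edges of its best-scoring 3-cycles."""
    components = list(components)
    edges = set()
    for node in components:
        others = [c for c in components if c != node]
        cycles = []
        for a, b in combinations(others, 2):
            # Cycle score = sum of the scores of its three edges.
            total = score(node, a) + score(a, b) + score(b, node)
            cycle_edges = [frozenset(p) for p in ((node, a), (a, b), (b, node))]
            cycles.append((total, cycle_edges))
        cycles.sort(key=lambda c: c[0], reverse=True)
        for _, cycle_edges in cycles[:node_present_in_cycles]:
            edges.update(cycle_edges)
    return edges


edges = greedy_cyclic_edges(sizes, toy_score, node_present_in_cycles=2)
print(len(edges), "edges selected")
```

Because every node contributes at least one closed triangle, each node ends up with at least two edges, so a single failing Transformation does not isolate a node.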

class konnektor.network_planners.generators.clustered_network_generator.ClusteredNetworkGenerator(sub_network_planners: Iterable[NetworkGenerator] = (<class 'konnektor.network_planners.generators.cyclic_network_generator.CyclicNetworkGenerator'>, ), concatenator: NetworkConcatenator = <class 'konnektor.network_planners.concatenators.mst_concatenator.MstConcatenator'>, clusterer: ComponentsDiversityClusterer = <konnektor.network_tools.clustering.component_diversity_clustering.ComponentsDiversityClusterer object>, mappers: AtomMapper | list[AtomMapper] = None, scorer=None, n_processes: int = 1, progress: bool = False)#

Bases: NetworkGenerator

Implements the general concept of nd-space clustered networks and provides the logic.

The algorithm works as follows:

1. Cluster the Components with the clusterer object.
2. Build sub-networks in the clusters using the sub_network_planners.
3. Concatenate all sub-networks using the concatenator to build the final network.

Parameters:
  • sub_network_planners (Iterable[NetworkGenerator]) – NetworkGenerator(s) used to translate clusters to sub-networks.

  • concatenator (NetworkConcatenator) – A NetworkConcatenator used to connect sub-networks.

  • clusterer (ComponentsDiversityClusterer) – Separates the Components along the first dimension.

  • mappers (Union[AtomMapper, list[AtomMapper]]) – Defines the connection between two ligands if NetworkConcatenator or NetworkGenerator classes are provided. Otherwise, (?) (default: None)

  • scorer (AtomMappingScorer) – Scoring function that evaluates an AtomMapping and returns a score in [0,1]; used if only NetworkConcatenator or NetworkGenerator classes are passed.

  • progress (bool, optional) – if True a progress bar will be displayed. (default: False)

  • n_processes (int) – number of processes that can be used for the network generation. (default: 1)

generate_ligand_network(components: Iterable[Component]) LigandNetwork#

Create a clustered network: cluster the components, build a sub-network in each cluster, and concatenate the sub-networks into the final network.

Parameters:

components (Iterable[Component]) – the ligands to include in the LigandNetwork

Returns:

A clustered network.

Return type:

LigandNetwork
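The three steps (cluster, build sub-networks, concatenate) can be sketched with plain callables. This is an illustration under simplifying assumptions, not the Konnektor implementation: `clustered_network` and all arguments passed to it are hypothetical, components are strings, and clusters are joined pairwise by their best-scoring cross edges.

```python
from itertools import combinations, product


def clustered_network(components, cluster_key, sub_network, score,
                      n_connecting_edges=1):
    """Cluster, build one sub-network per cluster, then join the clusters."""
    # 1. Cluster the components.
    clusters = {}
    for component in components:
        clusters.setdefault(cluster_key(component), []).append(component)

    # 2. Build a sub-network inside each cluster.
    edges = []
    for members in clusters.values():
        edges.extend(sub_network(members))

    # 3. Concatenate: join each pair of clusters by its best cross edges.
    for left, right in combinations(clusters.values(), 2):
        cross = sorted(product(left, right), key=lambda p: score(*p), reverse=True)
        edges.extend(cross[:n_connecting_edges])
    return clusters, edges


components = ["a1", "a2", "b1", "b2", "b3"]
clusters, edges = clustered_network(
    components,
    cluster_key=lambda c: c[0],                          # toy clusterer: first letter
    sub_network=lambda m: list(combinations(m, 2)),      # toy planner: all pairs
    score=lambda x, y: 1.0 - abs(int(x[1]) - int(y[1])) / 10.0,  # toy scorer
)
print(len(clusters), "clusters,", len(edges), "edges")
```

Swapping the sub_network callable (for example for a star layout) changes the shape inside each cluster without touching the clustering or the concatenation step, which is the design idea behind separating the three components.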

class konnektor.network_planners.generators.clustered_network_generator.StarrySkyNetworkGenerator(mappers: ~gufe.mapping.atom_mapper.AtomMapper | list[~gufe.mapping.atom_mapper.AtomMapper], scorer, clusterer: ~konnektor.network_tools.clustering._abstract_clusterer._AbstractClusterer = <konnektor.network_tools.clustering.component_diversity_clustering.ComponentsDiversityClusterer object>, n_processes: int = 1, progress: bool = False)#

Bases: ClusteredNetworkGenerator

The StarrySkyNetworkGenerator is an advanced network algorithm that clusters the provided Components and builds a network from the clusters.

The approach follows these steps:

1. Component clustering:
   a. Translate the molecules into Morgan fingerprints. (default)
   b. Cluster the Morgan fingerprints with HDBSCAN. (default)
2. Build Sub-Star Networks in each cluster using the StarNetworkGenerator.
3. Concatenate the Sub-Star Networks into the final Starry Sky Network, with 3 Transformations per cluster pair, using the MstConcatenator.

Compared to the Star Network, this approach builds a network with multiple centers, which improves the graph score. It adds a limited number of Transformations, which increases the computational cost, but fewer Transformations than the Twin Star Network would generate. The Starry Sky Network is therefore a compromise between graph score optimization and the number of Transformations.

Parameters:
  • mappers (Union[AtomMapper, list[AtomMapper]]) – The atom mapper(s) used to define the connection between two Components.

  • scorer (AtomMappingScorer) – Scoring function that evaluates an AtomMapping and returns a score in [0,1].

  • clusterer (ComponentsDiversityClusterer) – Separates the Components along the first dimension.

  • progress (bool, optional) – if True a progress bar will be displayed. (default: False)

  • n_processes (int) – number of processes that can be used for the network generation. (default: 1)

konnektor.network_planners.concatenators#

class konnektor.network_planners.concatenators.mst_concatenator.MstConcatenator(mappers: AtomMapper | list[AtomMapper], scorer, n_connecting_edges: int = 2, n_processes: int = 1, _initial_edge_lister: NetworkConcatenator = None)#

Bases: NetworkConcatenator

A NetworkConcatenator that connects two Networks with a Kruskal-like approach, up to the number of connecting edges.

Parameters:
  • mappers (AtomMapper | list[AtomMapper]) – The atom mapper(s) used to define the connection between two ligands.

  • scorer (AtomMappingScorer) – Scoring function that evaluates an atom mapping and returns a score in [0,1].

  • n_connecting_edges (int, optional) – Maximum number of connecting edges. (default: 2)

  • n_processes (int) – Number of processes that can be used for the network generation. (default: 1)

concatenate_networks(ligand_networks: Iterable[LigandNetwork]) LigandNetwork#

Concatenate the given networks.

Parameters:

ligand_networks (Iterable[LigandNetwork]) – An iterable of ligand networks that shall be connected.

Returns:

A concatenated LigandNetwork object containing all networks.

Return type:

LigandNetwork
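Connecting two networks by their best-scoring cross edges can be sketched as follows. This is a simplified reading of the Kruskal-like idea and not the Konnektor implementation: `concatenate_two` is a hypothetical helper, a network is represented as a (nodes, edges) pair of plain Python collections, nodes are integers, and the score is invented for illustration.

```python
from itertools import product


def concatenate_two(network_a, network_b, score, n_connecting_edges=2):
    """Join two networks with up to n_connecting_edges best cross edges."""
    nodes_a, edges_a = network_a
    nodes_b, edges_b = network_b
    # Rank all cross-network candidate edges from best to worst score.
    candidates = sorted(
        product(sorted(nodes_a), sorted(nodes_b)),
        key=lambda pair: score(*pair),
        reverse=True,
    )
    connecting = candidates[:n_connecting_edges]
    merged_nodes = set(nodes_a) | set(nodes_b)
    merged_edges = list(edges_a) + list(edges_b) + connecting
    return merged_nodes, merged_edges


def toy_score(a, b):
    """Hypothetical score in [0, 1]; closer numbers score higher."""
    return 1.0 - abs(a - b) / 10.0


network_a = ({1, 2, 3}, [(1, 2), (2, 3)])
network_b = ({6, 9}, [(6, 9)])
nodes, edges = concatenate_two(network_a, network_b, toy_score)
print(edges)
```

Using more than one connecting edge means the merged network stays connected even if one of the connecting Transformations fails.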