API

FromMol(mol, useGroups=True, stringent=True, randomize=False)

Convert RDKit Mol to AMSR

Parameters:
  • mol (Mol) – RDKit Mol

  • useGroups (Optional[bool]) – use group symbols/abbreviations

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

  • randomize (Optional[bool]) – randomize order of graph traversal

Return type:

str

Returns:

AMSR

FromMolToTokens(mol, useGroups=True, stringent=True, randomize=False)

Convert RDKit Mol to list of AMSR tokens

Parameters:
  • mol (Mol) – RDKit Mol

  • useGroups (Optional[bool]) – use group symbols/abbreviations

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

  • randomize (Optional[bool]) – randomize order of graph traversal

Return type:

List[str]

Returns:

list of AMSR tokens

FromSmiles(s, useGroups=True, stringent=True, randomize=False)

Convert SMILES to AMSR

Parameters:
  • s (str) – SMILES

  • useGroups (Optional[bool]) – use group symbols/abbreviations

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

  • randomize (Optional[bool]) – randomize order of graph traversal

Return type:

str

Returns:

AMSR
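
A minimal usage sketch for FromSmiles, assuming the package is importable as `amsr` (the import name is an assumption; this page does not state it). The helper `encode_variants` is hypothetical and only illustrates the `randomize` option described above.

```python
try:
    import amsr  # assumed import name for the package documented here
except ImportError:
    amsr = None  # lets the sketch load even when the package is absent


def encode_variants(smiles, n_random=3):
    """Return the default AMSR string plus a few randomized-traversal variants."""
    if amsr is None:
        return []
    variants = [amsr.FromSmiles(smiles)]
    for _ in range(n_random):
        # randomize=True randomizes the order of graph traversal, so the
        # same molecule may be written as a different AMSR string per call
        variants.append(amsr.FromSmiles(smiles, randomize=True))
    return variants


if __name__ == "__main__":
    for s in encode_variants("CC(=O)Oc1ccccc1C(=O)O"):  # aspirin
        print(s)
```

Passing `useGroups=False` in the same calls would presumably spell out groups as individual atom/bond tokens instead of abbreviations.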

FromSmilesToTokens(s, useGroups=True, stringent=True, randomize=False)

Convert SMILES to list of AMSR tokens

Parameters:
  • s (str) – SMILES

  • useGroups (Optional[bool]) – use group symbols/abbreviations

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

  • randomize (Optional[bool]) – randomize order of graph traversal

Return type:

List[str]

Returns:

list of AMSR tokens

ToMol(s, stringent=True, dihedral=None)

Convert AMSR to an RDKit Mol

Parameters:
  • s (str) – AMSR

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

  • dihedral (Optional[Dict[Tuple[int, int, int, int], int]]) – return dictionary of dihedral angles, where keys are indices and values are angles in degrees

Return type:

Mol

Returns:

RDKit Mol

ToSmiles(s, stringent=True)

Convert AMSR to SMILES

Parameters:
  • s (str) – AMSR

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

Return type:

str

Returns:

SMILES
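
A short sketch of a SMILES to AMSR to SMILES round trip using FromSmiles and ToSmiles. The import name `amsr` and the helper `round_trip` are assumptions made for illustration.

```python
try:
    import amsr  # assumed import name for the package documented here
except ImportError:
    amsr = None  # lets the sketch load even when the package is absent


def round_trip(smiles, stringent=True):
    """Encode SMILES as AMSR, then decode the AMSR back to SMILES."""
    if amsr is None:
        return None
    encoded = amsr.FromSmiles(smiles, stringent=stringent)
    return amsr.ToSmiles(encoded, stringent=stringent)


if __name__ == "__main__":
    print(round_trip("c1ccccc1O"))  # phenol
```

The decoded SMILES may not be character-for-character identical to the input even when the molecule is the same; CheckSmiles compares InChI strings rather than SMILES text.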

CheckAMSR(s, stringent=True)

Decode AMSR and check for valid molecule

Parameters:
  • s (str) – AMSR

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

Return type:

bool

Returns:

True if the AMSR decodes to a valid molecule

CheckMol(m1, stringent=True)

Do round trip from RDKit Mol m1 to AMSR and back. Compare InChI strings (with -FixedH) before and after round trip.

Parameters:
  • m1 (Mol) – RDKit Mol

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

Return type:

bool

Returns:

True if the InChI strings match

CheckSmiles(s, stringent=True)

Convert SMILES s to RDKit Mol, then do round trip to AMSR. Compare InChI strings (with -FixedH) before and after round trip.

Parameters:
  • s (str) – SMILES

  • stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules

Return type:

bool

Returns:

True if the InChI strings match
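
One possible use of CheckSmiles is screening a data set for molecules that do not survive the round trip. The sketch below assumes the package imports as `amsr`; the helper `failed_round_trips` is hypothetical.

```python
try:
    import amsr  # assumed import name for the package documented here
except ImportError:
    amsr = None  # lets the sketch load even when the package is absent


def failed_round_trips(smiles_list, stringent=True):
    """Return the SMILES whose InChI changes after a round trip through AMSR."""
    if amsr is None:
        return []
    return [
        s for s in smiles_list
        if not amsr.CheckSmiles(s, stringent=stringent)
    ]


if __name__ == "__main__":
    print(failed_round_trips(["CCO", "c1ccccc1", "CC(=O)O"]))
```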

Groups()

Keys are functional group abbreviations, values are lists of one or more AMSR strings consisting only of atom/bond tokens. May be modified, but InitializeGroups() must be called after modification.

Return type:

Dict[str, List[str]]

Returns:

Groups dictionary

InitializeGroups()

Initialize tree and compile regular expression for converting between group abbreviations and tokens. Must be called after modification of Groups() dictionary.

Return type:

None
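
The modify-then-reinitialize pattern for Groups() and InitializeGroups() might look like the sketch below. It assumes the package imports as `amsr` and that Groups() returns the live dictionary, as the description above suggests; the helper `encode_without_group` is hypothetical. It removes an existing entry rather than adding one, so no AMSR group syntax has to be guessed.

```python
try:
    import amsr  # assumed import name for the package documented here
except ImportError:
    amsr = None  # lets the sketch load even when the package is absent


def encode_without_group(abbreviation, smiles):
    """Encode smiles with one group abbreviation temporarily removed."""
    if amsr is None:
        return None
    groups = amsr.Groups()
    saved = groups.pop(abbreviation, None)
    amsr.InitializeGroups()  # required after modifying the dictionary
    try:
        return amsr.FromSmiles(smiles)
    finally:
        # restore the entry and rebuild the lookup structures again
        if saved is not None:
            groups[abbreviation] = saved
        amsr.InitializeGroups()


if __name__ == "__main__":
    if amsr is not None:
        first_key = next(iter(amsr.Groups()))
        print(first_key, encode_without_group(first_key, "CC(=O)O"))
```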

ToTokens(s)

Convert AMSR string to a list of tokens

Parameters:

s (str) – AMSR

Return type:

List[str]

Returns:

list of tokens
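
As an illustration of ToTokens, the sketch below counts token frequencies over a list of SMILES. The import name `amsr` and the helper `token_counts` are assumptions, not part of the documented API.

```python
from collections import Counter

try:
    import amsr  # assumed import name for the package documented here
except ImportError:
    amsr = None  # lets the sketch load even when the package is absent


def token_counts(smiles_list):
    """Count how often each AMSR token occurs across the given SMILES."""
    counts = Counter()
    if amsr is None:
        return counts
    for smiles in smiles_list:
        # encode to an AMSR string, then split that string into tokens
        counts.update(amsr.ToTokens(amsr.FromSmiles(smiles)))
    return counts


if __name__ == "__main__":
    print(token_counts(["CCO", "CCN", "c1ccccc1"]).most_common(5))
```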

class Morph(s, t)

Morph between two molecules by taking the minimum-edit pathway between their string representations

Parameters:
  • s (List[str]) – list of AMSR tokens

  • t (List[str]) – list of AMSR tokens

classmethod fromSmiles(s, t)

create morph from two SMILES strings

Parameters:
  • s (str) – SMILES

  • t (str) – SMILES

Returns:

Morph object

showAsSmiles()

display each mol in the morph as SMILES
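
To make the idea of a minimum-edit pathway concrete, here is a standalone sketch that walks one shortest Levenshtein path between two token lists. It is not the library's implementation, which this page does not describe; the function `edit_path` and the example tokens are hypothetical.

```python
def edit_path(s, t):
    """Return the sequences visited on one minimum-edit path from s to t.

    Each step substitutes, deletes, or inserts exactly one token.
    """
    n, m = len(s), len(t)
    # d[i][j] = edit distance between the suffixes s[i:] and t[j:]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                d[i][j] = m - j
            elif j == m:
                d[i][j] = n - i
            else:
                sub = d[i + 1][j + 1] + (s[i] != t[j])
                d[i][j] = min(sub, d[i + 1][j] + 1, d[i][j + 1] + 1)

    cur = list(s)  # invariant: cur == t[:j] + s[i:]
    path = [list(cur)]
    i = j = 0
    while i < n or j < m:
        both = i < n and j < m
        if both and s[i] == t[j] and d[i][j] == d[i + 1][j + 1]:
            i += 1  # tokens already agree, no edit needed
            j += 1
            continue
        if both and d[i][j] == d[i + 1][j + 1] + 1:
            cur[j] = t[j]  # substitute one token
            i += 1
            j += 1
        elif i < n and d[i][j] == d[i + 1][j] + 1:
            del cur[j]  # delete one token of s
            i += 1
        else:
            cur.insert(j, t[j])  # insert one token of t
            j += 1
        path.append(list(cur))
    return path


if __name__ == "__main__":
    # placeholder tokens, not necessarily valid AMSR
    for step in edit_path(["C", "C", "O"], ["C", "N", "C", "O"]):
        print(step)
```

With the library itself, the token lists would presumably come from FromSmilesToTokens, and each intermediate sequence would be decoded back to a molecule.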

class Markov(mols)

generate molecules using a simple Markov model

Parameters:

mols (List[Mol]) – RDKit Mols, from which to draw token frequencies

generate(nmax=-1)

generate an AMSR string

Parameters:

nmax (Optional[int]) – maximum length of string

Returns:

AMSR string

generateTokens(nmax=-1)

generate sequence of tokens

Parameters:

nmax (Optional[int]) – maximum number of tokens to generate

Returns:

sequence of tokens
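
A hedged usage sketch for the Markov class, assuming the package imports as `amsr` and that RDKit is installed. The helper `sample_amsr` is hypothetical, and nothing here guarantees that every generated string decodes to a valid molecule, so the sketch filters with CheckAMSR.

```python
try:
    import amsr  # assumed import name for the package documented here
    from rdkit import Chem
except ImportError:
    amsr = Chem = None  # lets the sketch load when either package is absent


def sample_amsr(smiles_list, n_samples=5, nmax=40):
    """Build a Markov model from SMILES and return valid generated AMSR strings."""
    if amsr is None:
        return []
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    mols = [m for m in mols if m is not None]  # drop unparsable SMILES
    model = amsr.Markov(mols)
    samples = [model.generate(nmax=nmax) for _ in range(n_samples)]
    # keep only strings that decode to a valid molecule
    return [s for s in samples if amsr.CheckAMSR(s)]


if __name__ == "__main__":
    for s in sample_amsr(["CCO", "CCN", "CC(=O)O", "c1ccccc1"]):
        print(s)
```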

class Modifier(model_path, nDeleteAvg=2, nAddMax=10, nReplaceAvg=3)

modify a molecule by shuffling atom order, deleting tokens, then adding tokens

Parameters:
  • mols – rdkit Mols, from which to draw token frequencies

  • nDeleteAvg (Optional[int]) – average number of tokens to delete

  • nAddMax (Optional[int]) – maximum number of tokens to add

  • nReplaceAvg (Optional[int]) – average number of token replacements

modify(mol)

modify given molecule

Parameters:

mol (Mol) – molecule to modify

Return type:

Mol

Returns:

modified molecule

modifySmiles(s)

modify given SMILES

Parameters:

s (str) – SMILES to modify

Return type:

str

Returns:

modified SMILES