API
- amsr.encode.FromMol(mol, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert RDKit Mol to AMSR
- Parameters:
mol (Mol) – RDKit Mol
useGroups (Optional[bool]) – use group symbols/abbreviations (default: True)
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules (default: True)
randomize (Optional[bool]) – randomize order of graph traversal (default: False)
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
str
- Returns:
AMSR
- amsr.encode.FromMolToTokens(mol, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert RDKit Mol to list of AMSR tokens
- Parameters:
mol (Mol) – RDKit Mol
useGroups (Optional[bool]) – use group symbols/abbreviations (default: True)
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules (default: True)
randomize (Optional[bool]) – randomize order of graph traversal (default: False)
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
list[str]
- Returns:
list of AMSR tokens
- amsr.encode.FromSmiles(s, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert SMILES to AMSR
- Parameters:
s (str) – SMILES
useGroups (Optional[bool]) – use group symbols/abbreviations (default: True)
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules (default: True)
randomize (Optional[bool]) – randomize order of graph traversal (default: False)
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
str
- Returns:
AMSR
- amsr.encode.FromSmilesToTokens(s, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert SMILES to list of AMSR tokens
- Parameters:
s (str) – SMILES
useGroups (Optional[bool]) – use group symbols/abbreviations (default: True)
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules (default: True)
randomize (Optional[bool]) – randomize order of graph traversal (default: False)
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
list[str]
- Returns:
list of AMSR tokens
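The keyword options shared by the encoding functions can be compared side by side. The following is a minimal sketch and not part of amsr: `encode_variants` and its `encode` parameter are hypothetical names introduced here, and the import of `FromSmiles` is deferred so the helper can also be exercised with a stand-in encoder.

```python
from typing import Callable, Dict, Optional


def encode_variants(smiles: str,
                    encode: Optional[Callable[..., str]] = None) -> Dict[str, str]:
    """Encode one SMILES string under several option combinations.

    `encode` is a hypothetical hook; when omitted, amsr.encode.FromSmiles
    is imported lazily and used.
    """
    if encode is None:
        # Deferred import: only needed when no stand-in encoder is supplied.
        from amsr.encode import FromSmiles as encode
    return {
        "default": encode(smiles),
        "canonical": encode(smiles, canonical=True),
        "no_groups": encode(smiles, useGroups=False),
        "no_stereo": encode(smiles, useStereo=False),
    }
```

With amsr and RDKit installed, `encode_variants("CCO")` calls `FromSmiles` four times and returns the four AMSR strings keyed by option name.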
- amsr.decode.ToMol(s, stringent=True, dihedral=None)
Convert AMSR to an RDKit Mol
- Parameters:
s (str) – AMSR
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
dihedral (Optional[dict[tuple[int,int,int,int],int]]) – return dictionary of dihedral angles, where keys are 4-tuples of indices and values are angles in degrees
- Return type:
Mol
- Returns:
RDKit Mol
- amsr.decode.ToSmiles(s, stringent=True, isomericSmiles=True)
Convert AMSR to SMILES
- Parameters:
s (str) – AMSR
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
isomericSmiles (Optional[bool]) – include stereochemistry
- Return type:
str
- Returns:
SMILES
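Encoding and decoding can be chained into a round trip. The sketch below assumes only the `FromSmiles` and `ToSmiles` signatures listed above; `round_trip` and its `encode`/`decode` hooks are hypothetical names, not part of amsr.

```python
def round_trip(smiles, encode=None, decode=None, stringent=True):
    """Encode SMILES to AMSR, then decode the AMSR back to SMILES.

    Returns (amsr_string, decoded_smiles). The hooks default to the
    amsr functions, imported lazily.
    """
    if encode is None:
        from amsr.encode import FromSmiles as encode
    if decode is None:
        from amsr.decode import ToSmiles as decode
    amsr_string = encode(smiles, stringent=stringent)
    return amsr_string, decode(amsr_string, stringent=stringent)
```

The decoded SMILES need not match the input character for character, because one molecule can be written as many different valid SMILES strings. Compare molecules through a canonical identifier rather than raw string equality.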
- amsr.check.CheckAMSR(s, stringent=True)
Decode AMSR and check for valid molecule
- Parameters:
s (str) – AMSR
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
bool
- Returns:
valid molecule?
- amsr.check.CheckMol(m1, stringent=True)
Do a round trip from RDKit Mol m1 to AMSR and back. Compare InChI strings (with -FixedH) before and after round trip.
- Parameters:
m1 (Mol) – RDKit Mol
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
bool
- Returns:
do InChI strings match?
- amsr.check.CheckSmiles(s, stringent=True)
Convert SMILES s to RDKit Mol, then do a round trip to AMSR and back. Compare InChI strings (with -FixedH) before and after round trip.
- Parameters:
s (str) – SMILES
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
bool
- Returns:
do InChI strings match?
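A typical use of the round-trip checks is to screen a data set before training or analysis. This is a minimal sketch under the assumption that `CheckSmiles` behaves as documented above; `split_by_round_trip` and its `check` hook are hypothetical names introduced for illustration.

```python
def split_by_round_trip(smiles_list, check=None, stringent=True):
    """Partition SMILES strings into (passed, failed) by a round-trip check.

    `check` defaults to amsr.check.CheckSmiles, imported lazily.
    """
    if check is None:
        from amsr.check import CheckSmiles as check
    passed, failed = [], []
    for s in smiles_list:
        # Route each string to the list matching the check's verdict.
        (passed if check(s, stringent=stringent) else failed).append(s)
    return passed, failed
```

Keeping the failures instead of discarding them makes it easier to see which structures the `stringent` setting rejects.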
- amsr.groups.DecodeGroups(s)
- Return type:
str
- Parameters:
s (str)
- amsr.groups.EncodeGroups(s)
- Return type:
list[str]
- Parameters:
s (list[str])
- amsr.groups.Groups()
Keys are functional group abbreviations; values are lists of one or more AMSR strings consisting only of atom/bond tokens. The dictionary may be modified, but InitializeGroups() must be called after modification.
- Return type:
dict[str,list[str]]
- Returns:
Groups dictionary
- amsr.groups.InitializeGroups()
Initialize tree and compile regular expression for converting between group abbreviations and tokens. Must be called after modification of the Groups() dictionary.
- Return type:
None
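The modify-then-reinitialize order described for `Groups()` and `InitializeGroups()` can be wrapped in one helper so the second step is never forgotten. This is a sketch only: `register_group` and its `groups`/`initialize` hooks are hypothetical, and it assumes `Groups()` returns the live dictionary instead of a copy.

```python
def register_group(name, patterns, groups=None, initialize=None):
    """Add or replace a group abbreviation, then rebuild the lookup state.

    `patterns` is a list of AMSR strings made of atom/bond tokens only.
    Assumption: amsr.groups.Groups() hands back the dictionary amsr uses.
    """
    if groups is None or initialize is None:
        from amsr.groups import Groups, InitializeGroups
        groups, initialize = Groups(), InitializeGroups
    groups[name] = list(patterns)
    # Required after any modification, per the documentation above.
    initialize()
    return groups
```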
- amsr.tokens.ToTokens(s)
Convert AMSR string to a list of tokens
- Parameters:
s (str) – AMSR
- Return type:
list[str]
- Returns:
list of tokens
- class amsr.morph.Morph(s, t)
Morph between two molecules by taking the minimum-edit pathway between their string representations
- Parameters:
s (list[str]) – list of AMSR tokens
t (list[str]) – list of AMSR tokens
- asGridImage()
- classmethod fromSmiles(s, t)
Create morph from two SMILES strings
- Parameters:
s (str) – SMILES
t (str) – SMILES
- Returns:
Morph object
- showAsSmiles()
Display each mol in the morph as SMILES
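To make the idea of a minimum-edit pathway concrete, the sketch below computes one shortest sequence of single-token edits (substitute, delete, insert) between two token lists using the standard Levenshtein dynamic program. It is an independent illustration, not amsr's implementation, which may break ties or order edits differently; `edit_path` is a hypothetical name and the tokens in the usage note are placeholders, not real AMSR tokens.

```python
def edit_path(s, t):
    """Return the token lists visited on one minimum-edit path from s to t."""
    n, m = len(s), len(t)
    # dist[i][j] = edit distance between the suffixes s[i:] and t[j:].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                dist[i][j] = m - j          # only insertions remain
            elif j == m:
                dist[i][j] = n - i          # only deletions remain
            else:
                sub = dist[i + 1][j + 1] + (s[i] != t[j])
                dist[i][j] = min(sub, dist[i + 1][j] + 1, dist[i][j + 1] + 1)

    cur = list(s)
    path = [list(cur)]
    i = j = 0  # invariant: cur == t[:j] + s[i:]
    while i < n or j < m:
        if i < n and j < m and s[i] == t[j] and dist[i][j] == dist[i + 1][j + 1]:
            i, j = i + 1, j + 1             # tokens match, no edit needed
            continue
        if i < n and j < m and dist[i][j] == dist[i + 1][j + 1] + 1:
            cur[j] = t[j]                   # substitute
            i, j = i + 1, j + 1
        elif i < n and dist[i][j] == dist[i + 1][j] + 1:
            del cur[j]                      # delete s[i]
            i += 1
        else:
            cur.insert(j, t[j])             # insert t[j]
            j += 1
        path.append(list(cur))
    return path
```

For example, `edit_path(["a", "b"], ["a", "c", "d"])` yields a list whose first element is the source, whose last element is the target, and whose neighbours differ by exactly one edit. In amsr itself the documented entry points are `Morph(s, t)` on token lists and `Morph.fromSmiles(s, t)` on SMILES strings.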
- class amsr.markov.Markov(mols)
Generate molecules using a simple Markov model
- Parameters:
mols (list[Mol]) – RDKit Mols, from which to draw token frequencies
- generate(nmax=-1)
Generate an AMSR string
- Parameters:
nmax (Optional[int]) – maximum length of string
- Returns:
AMSR string
- generateTokens(nmax=-1)
Generate sequence of tokens
- Parameters:
nmax (Optional[int]) – maximum number of tokens to generate
- Returns:
sequence of tokens
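The general technique behind a token-level Markov generator can be sketched in plain Python. The example assumes a first-order chain with explicit start and end markers and treats a negative `nmax` as "no limit"; both are assumptions made for illustration, and the order and details of amsr's own model are not stated in this reference. `train`, `generate_tokens`, `START` and `END` are hypothetical names.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # sentinel markers, not real tokens


def train(sequences):
    """Count token-to-token transitions over at least one token sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        prev = START
        for tok in list(seq) + [END]:
            counts[prev][tok] += 1
            prev = tok
    return counts


def generate_tokens(counts, nmax=-1, rng=None):
    """Sample tokens until END is drawn or nmax tokens have been produced."""
    rng = rng or random.Random()
    out, prev = [], START
    while nmax < 0 or len(out) < nmax:
        successors = counts[prev]
        # Draw the next token in proportion to its observed frequency.
        tok = rng.choices(list(successors), weights=list(successors.values()))[0]
        if tok == END:
            break
        out.append(tok)
        prev = tok
    return out
```

Passing a seeded `random.Random` makes the output reproducible, which helps when comparing generated sets across runs.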
- class amsr.modifier.Modifier(model_path, nDeleteAvg=2, nAddMax=10, nReplaceAvg=3)
Modify a molecule by shuffling atom order, deleting tokens, then adding tokens
- Parameters:
model_path (str)
nDeleteAvg (Optional[int]) – average number of tokens to delete
nAddMax (Optional[int]) – maximum number of tokens to add
nReplaceAvg (Optional[int]) – number of token replacements
- modify(mol)
Modify given molecule
- Parameters:
mol (Mol) – molecule to modify
- Return type:
Mol
- Returns:
modified molecule
- modifySmiles(s)
Modify given SMILES
- Parameters:
s (str) – SMILES to modify
- Return type:
str
- Returns:
modified SMILES