API
- amsr.encode.FromMol(mol, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert RDKit Mol to AMSR
- Parameters:
mol (Mol) – RDKit Mol
useGroups (Optional[bool]) – use group symbols/abbreviations (default: True)
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules (default: True)
randomize (Optional[bool]) – randomize order of graph traversal (default: False)
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
str
- Returns:
AMSR
- amsr.encode.FromMolToTokens(mol, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert RDKit Mol to list of AMSR tokens
- Parameters:
mol (Mol) – RDKit Mol
useGroups (Optional[bool]) – use group symbols/abbreviations (default: True)
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules (default: True)
randomize (Optional[bool]) – randomize order of graph traversal (default: False)
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
list[str]
- Returns:
list of AMSR tokens
- amsr.encode.FromSmiles(s, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert SMILES to AMSR
- Parameters:
s (str) – SMILES
useGroups (Optional[bool]) – use group symbols/abbreviations
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
randomize (Optional[bool]) – randomize order of graph traversal
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
str
- Returns:
AMSR
- amsr.encode.FromSmilesToTokens(s, useGroups=True, stringent=True, randomize=False, canonical=False, useStereo=True)
Convert SMILES to list of AMSR tokens
- Parameters:
s (str) – SMILES
useGroups (Optional[bool]) – use group symbols/abbreviations
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
randomize (Optional[bool]) – randomize order of graph traversal
canonical (Optional[bool]) – canonical order of graph traversal (default: False)
useStereo (Optional[bool]) – encode stereochemistry (default: True)
- Return type:
list[str]
- Returns:
list of AMSR tokens
- amsr.decode.ToMol(s, stringent=True, dihedral=None)
Convert AMSR to an RDKit Mol
- Parameters:
s (str) – AMSR
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
dihedral (Optional[dict[tuple[int, int, int, int], int]]) – return dictionary of dihedral angles, where keys are indices and values are angles in degrees
- Return type:
Mol
- Returns:
RDKit Mol
- amsr.decode.ToSmiles(s, stringent=True)
Convert AMSR to SMILES
- Parameters:
s (str) – AMSR
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
str
- Returns:
SMILES
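The encode and decode entries above can be combined into a round trip. The following is a minimal sketch, assuming the `amsr` package (and RDKit) is installed and that the functions are importable from the module paths listed here; `round_trip` is a hypothetical helper name, not part of the library.

```python
# Guarded import: the sketch can still be loaded when amsr is not installed.
try:
    from amsr.encode import FromSmiles
    from amsr.decode import ToSmiles
except ImportError:
    FromSmiles = ToSmiles = None


def round_trip(smiles):
    """Encode SMILES as AMSR, then decode the AMSR back to SMILES.

    Hypothetical helper for illustration; returns (amsr_string, smiles_out).
    """
    if FromSmiles is None:
        raise RuntimeError("the amsr package is not installed")
    amsr_string = FromSmiles(smiles)   # str, per the FromSmiles entry
    smiles_out = ToSmiles(amsr_string)  # str, per the ToSmiles entry
    return amsr_string, smiles_out
```

Calling `round_trip("CCO")` would return the AMSR string for ethanol together with the SMILES decoded from it. The decoded SMILES need not be character-identical to the input, since more than one SMILES string can describe the same molecule.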
- amsr.check.CheckAMSR(s, stringent=True)
Decode AMSR and check for valid molecule
- Parameters:
s (str) – AMSR
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
bool
- Returns:
valid molecule?
- amsr.check.CheckMol(m1, stringent=True)
Do round trip from RDKit Mol m1 to AMSR. Compare InChI strings (with -FixedH) before and after round trip.
- Parameters:
m1 (Mol) – RDKit Mol
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
bool
- Returns:
do InChI strings match?
- amsr.check.CheckSmiles(s, stringent=True)
Convert SMILES s to RDKit Mol, then do round trip to AMSR. Compare InChI strings (with -FixedH) before and after round trip.
- Parameters:
s (str) – SMILES
stringent (Optional[bool]) – try to exclude unstable or synthetically inaccessible molecules
- Return type:
bool
- Returns:
do InChI strings match?
- amsr.groups.DecodeGroups(s)
- Return type:
str
- Parameters:
s (str)
- amsr.groups.EncodeGroups(s)
- Return type:
list[str]
- Parameters:
s (list[str])
- amsr.groups.Groups()
Keys are functional group abbreviations, values are lists of one or more AMSR strings consisting only of atom/bond tokens. May be modified, but InitializeGroups() must be called after modification.
- Return type:
dict[str, list[str]]
- Returns:
Groups dictionary
- amsr.groups.InitializeGroups()
Initialize tree and compile regular expression for converting between group abbreviations and tokens. Must be called after modification of Groups() dictionary.
- Return type:
None
- amsr.tokens.ToTokens(s)
Convert AMSR string to a list of tokens
- Parameters:
s (str) – AMSR
- Return type:
list[str]
- Returns:
list of tokens
- class amsr.morph.Morph(s, t)
morph between two molecules, by taking the minimum-edit pathway between their string representations
- Parameters:
s (list[str]) – list of AMSR tokens
t (list[str]) – list of AMSR tokens
- classmethod fromSmiles(s, t)
create morph from two SMILES strings
- Parameters:
s (str) – SMILES
t (str) – SMILES
- Returns:
Morph object
- showAsSmiles()
display each mol in the morph as SMILES
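To illustrate what a minimum-edit pathway between two token lists looks like, here is a self-contained sketch based on the standard Levenshtein dynamic program. It is not the library's implementation: `edit_path` is a hypothetical name, and the tokens in the usage note are placeholders, not real AMSR tokens.

```python
def edit_path(source, target):
    """Return the token lists visited while turning source into target
    using the fewest single-token insertions, deletions and substitutions."""
    n, m = len(source), len(target)
    # dist[i][j] = edit distance between source[i:] and target[j:]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                dist[i][j] = m - j  # only insertions remain
            elif j == m:
                dist[i][j] = n - i  # only deletions remain
            elif source[i] == target[j]:
                dist[i][j] = dist[i + 1][j + 1]
            else:
                dist[i][j] = 1 + min(
                    dist[i + 1][j + 1], dist[i + 1][j], dist[i][j + 1]
                )

    # Walk the table from the front; invariant: current == target[:j] + source[i:]
    current = list(source)
    path = [list(current)]
    i = j = 0
    while i < n or j < m:
        if i < n and j < m and source[i] == target[j]:
            i += 1
            j += 1  # tokens already agree, no edit needed
            continue
        if i < n and j < m and dist[i][j] == 1 + dist[i + 1][j + 1]:
            current[j] = target[j]  # substitution
            i += 1
            j += 1
        elif i < n and dist[i][j] == 1 + dist[i + 1][j]:
            del current[j]  # deletion
            i += 1
        else:
            current.insert(j, target[j])  # insertion
            j += 1
        path.append(list(current))
    return path
```

For example, `edit_path(["A", "B"], ["A", "C", "B"])` yields the two lists `["A", "B"]` and `["A", "C", "B"]`, one step per edit. In a morph between molecules, each intermediate token list would additionally have to be decoded back into a molecule, which this sketch does not attempt.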
- class amsr.markov.Markov(mols)
generate molecules using a simple Markov model
- Parameters:
mols (list[Mol]) – rdkit Mols, from which to draw token frequencies
- generate(nmax=-1)
generate an AMSR string
- Parameters:
nmax (Optional[int]) – maximum length of string
- Returns:
AMSR string
- generateTokens(nmax=-1)
generate sequence of tokens
- Parameters:
nmax (Optional[int]) – maximum number of tokens to generate
- Returns:
sequence of tokens
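The idea of drawing token frequencies from example sequences and sampling new sequences from them can be sketched in plain Python. This is a first-order Markov chain written for illustration only; the order and internals of the library's model are not documented here. The names `train` and `generate_tokens`, the sentinel tokens, and the reading of a negative `nmax` as "no limit" are all assumptions.

```python
import random
from collections import Counter, defaultdict

START, END = "<start>", "<end>"  # hypothetical sentinel tokens


def train(sequences):
    """Count how often each token follows each other token.

    Assumes at least one sequence is given."""
    counts = defaultdict(Counter)
    for seq in sequences:
        prev = START
        for tok in seq:
            counts[prev][tok] += 1
            prev = tok
        counts[prev][END] += 1  # record where sequences stop
    return counts


def generate_tokens(counts, nmax=-1, rng=None):
    """Sample a token sequence; a negative nmax is treated as no limit (assumption)."""
    rng = rng or random.Random()
    tokens = []
    prev = START
    while nmax < 0 or len(tokens) < nmax:
        followers = counts[prev]
        # Pick the next token in proportion to its observed frequency.
        tok = rng.choices(list(followers), weights=list(followers.values()))[0]
        if tok == END:
            break
        tokens.append(tok)
        prev = tok
    return tokens
```

With placeholder training data such as `train([["A", "B", "C"], ["A", "C"]])`, every generated sequence starts with `"A"` and contains only tokens seen in training. A sequence sampled this way is not guaranteed to decode to a valid molecule.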
- class amsr.modifier.Modifier(model_path, nDeleteAvg=2, nAddMax=10, nReplaceAvg=3)
modify a molecule by shuffling atom order, deleting tokens, then adding tokens
- Parameters:
model_path (str)
nDeleteAvg (Optional[int]) – average number of tokens to delete
nAddMax (Optional[int]) – maximum number of tokens to add
nReplaceAvg (Optional[int]) – number of token replacements
- modify(mol)
modify given molecule
- Parameters:
mol (Mol) – molecule to modify
- Return type:
Mol
- Returns:
modified molecule
- modifySmiles(s)
modify given SMILES
- Parameters:
s (str) – SMILES to modify
- Return type:
str
- Returns:
modified SMILES