Pan-assay interference compounds (PAINS)

Method One:

Reference: Evaluation of DiffLinker, a molecular scaffold-hopping tool (wufeil's blog, CSDN)

PAINS (pan-assay interference compounds) are a class of compound substructures that can produce misleading results in many kinds of bioassays. Compounds carrying these fragments may appear to interact with multiple targets, giving false-positive results. To avoid being misled during drug development, researchers often screen candidate compound libraries and remove compounds that contain PAINS fragments.

PAINS are roughly divided into two categories. In the first category, the compound forms colloids in solution at the test concentration; the protein is trapped inside the colloid, the substrate cannot reach the active site of the enzyme, and the compound shows up as a false positive. In the second category, reactive groups on the compound form covalent bonds with the protein receptor, inhibiting it in a way that is difficult to reverse. At the same time, ligands with PAINS properties can react with most targets and therefore lack specificity.

If a drug-related article is to be submitted to a journal such as the Journal of Medicinal Chemistry, a PAINS calculation may be required. In that case, MolAICal can be used to calculate PAINS.

The PAINS SMARTS patterns used below are stored in a CSV file in the DiffLinker repository: https://github.com/igashov/DiffLinker/blob/main/resources/wehi_pains.csv

from rdkit import Chem
import csv
import pandas as pd

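# Load the PAINS SMARTS patterns (first column of each CSV row)
pains_smarts_loc = './resources/wehi_pains.csv'
with open(pains_smarts_loc, 'r') as f:
    pains_smarts = [Chem.MolFromSmarts(line[0], mergeHs=True) for line in csv.reader(f) if line]
# Drop any pattern that failed to parse (MolFromSmarts returns None on failure)
pains_smarts = [p for p in pains_smarts if p is not None]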

def check_pains(mol, pains):
    """Return True if mol matches at least one PAINS pattern."""
    for pain in pains:
        if mol.HasSubstructMatch(pain):
            return True
    return False



# Read the input CSV, then write the SMILES that contain no PAINS to a new CSV file
data = pd.read_csv("result_EP4_classfication85.csv")
smiles_list = data['smiles'].tolist()

passed_pains = []
for smiles in smiles_list:
    pred_mol = Chem.MolFromSmiles(smiles)
    if pred_mol is not None:
        if not check_pains(pred_mol, pains_smarts):
            passed_pains.append(smiles)


output_file = 'non_pains_smiles.csv'
result_df = pd.DataFrame({'smiles': passed_pains})
result_df.to_csv(output_file, index=False)

print("Non-pains SMILES have been written to", output_file)
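When a compound is dropped, it can help to know which pattern it matched. The sketch below is a variation on the filter above that returns the labels of all matching patterns instead of a single True/False. The two SMARTS patterns and their labels (`azo_like`, `catechol_like`) are simplified illustrations made up for this example, not entries from wehi_pains.csv, and `matching_patterns` is a hypothetical helper name.

```python
from rdkit import Chem

# Illustrative patterns only: simplified SMARTS written for this example,
# NOT actual entries from wehi_pains.csv.
EXAMPLE_PATTERNS = {
    "azo_like": "c[N;X2]=[N;X2]c",
    "catechol_like": "[OX2H]cc[OX2H]",
}

# Compile each SMARTS string once, up front
COMPILED = {name: Chem.MolFromSmarts(s) for name, s in EXAMPLE_PATTERNS.items()}


def matching_patterns(smiles, patterns=COMPILED):
    """Return the sorted labels of all patterns found in `smiles`,
    or None if the SMILES string cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return sorted(name for name, patt in patterns.items()
                  if mol.HasSubstructMatch(patt))


# Ethanol, catechol, azobenzene
for smi in ["CCO", "Oc1ccccc1O", "c1ccccc1N=Nc1ccccc1"]:
    print(smi, matching_patterns(smi))
```

With the real pattern file, the same idea would need a label for each SMARTS row; how those labels are stored in the CSV is not assumed here.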


Method Two:

Online verification tool:

The PAINS Remover

Reference: A very useful guide to free ADMET calculation tools (prediction, compound, toxicity)

Note: The number of compounds flagged by the online tool differs from the number flagged by the code, so please use Method One.
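When two tools disagree on how many compounds they flag, one possible way to investigate is to cross-check against a third implementation. The sketch below uses the PAINS filter catalog that ships with RDKit (`rdkit.Chem.FilterCatalog`). It assumes an RDKit build that includes this module; `rdkit_pains_hits` is a hypothetical helper name, and nothing here assumes that RDKit's built-in patterns are identical to wehi_pains.csv or to the online tool.

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Build a catalog from the PAINS filter sets bundled with RDKit
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS)
catalog = FilterCatalog(params)


def rdkit_pains_hits(smiles):
    """Return the descriptions of all built-in PAINS filters matched by
    `smiles`, or None if the SMILES string cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return [entry.GetDescription() for entry in catalog.GetMatches(mol)]


# Ethanol is used here only as a trivially clean example molecule
print(rdkit_pains_hits("CCO"))
```

Running this over the same input SMILES as Method One and listing the compounds on which the two approaches disagree gives a concrete set of cases to inspect by hand.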

Reference: igashov/DiffLinker on GitHub (DiffLinker: Equivariant 3D-Conditional Diffusion Model for Molecular Linker Design)

Origin blog.csdn.net/weixin_43135178/article/details/132144622