Molecule Design by Latent Prompt Transformer
NeurIPS 2024 (Spotlight) [Paper link] [Code link]
Deqian Kong*, Yuhao Huang*, Jianwen Xie*, Edouardo Honig*, Ming Xu, Shuanghong Xue, Pei Lin, Sanping Zhou,
Sheng Zhong, Nanning Zheng, Ying Nian Wu
Abstract
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task, with target biological properties or desired chemical constraints as conditioning variables. We propose the Latent Prompt Transformer (LPT), a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution modeled by a neural transformation of Gaussian white noise; (2) a molecule generation model using a causal Transformer with the latent vector as a prompt; and (3) a property prediction model that predicts a molecule’s target property and/or constraint values based on the latent prompt. LPT can be learned by maximum likelihood estimation on molecule-property pairs. During property optimization, the latent prompt is inferred from target properties and constraints via posterior sampling and then used to guide the autoregressive molecule generation. After initial training on existing molecules and their properties, we progressively shift the model distribution towards regions supporting desired target properties. Experiments show that LPT effectively discovers useful molecules in single-objective, multi-objective, and structure-constrained optimization tasks.
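The three components described above can be sketched as a toy generative pipeline: sample Gaussian white noise, transform it into a latent prompt via a learned prior, autoregressively generate tokens conditioned on that prompt, and predict the property from the same prompt. This is a minimal illustrative sketch only; the dimensions, the one-layer transforms standing in for the neural prior, the causal Transformer, and the property head are all hypothetical placeholders, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (not from the paper).
LATENT_DIM, VOCAB, MAX_LEN = 8, 5, 6

# (1) Learnable prior: the latent prompt z is a neural transformation
# of Gaussian white noise z0 ~ N(0, I). A one-layer map stands in here.
W_prior = rng.normal(size=(LATENT_DIM, LATENT_DIM))

def sample_latent_prompt():
    z0 = rng.normal(size=LATENT_DIM)
    return np.tanh(W_prior @ z0)

# (2) Generation model: a causal (autoregressive) decoder in which each
# token depends on the latent prompt z and the previous token. A single
# linear layer stands in for the causal Transformer.
W_gen = rng.normal(size=(VOCAB, LATENT_DIM + VOCAB))

def generate_molecule(z):
    tokens, prev = [], np.zeros(VOCAB)
    for _ in range(MAX_LEN):
        logits = W_gen @ np.concatenate([z, prev])
        p = np.exp(logits - logits.max())
        p /= p.sum()
        t = int(rng.choice(VOCAB, p=p))
        tokens.append(t)
        prev = np.eye(VOCAB)[t]  # one-hot of the sampled token
    return tokens

# (3) Property predictor: maps the latent prompt z to a scalar
# target-property value.
w_prop = rng.normal(size=LATENT_DIM)

def predict_property(z):
    return float(w_prop @ z)

z = sample_latent_prompt()
mol = generate_molecule(z)
prop = predict_property(z)
```

During property optimization the direction is reversed: given a target property or constraint, the latent prompt is inferred by posterior sampling and then fed to the generator, so the prompt acts as the shared conditioning variable between generation and prediction.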
Model
Example demos
Illustration of generated molecules binding to PHGDH with docking poses generated by AutoDock-GPU.
Left: the molecule generated through multi-objective optimization.
Right: the molecule generated via structure-constrained optimization.
Experiments