AI News
Managing Type 1 Diabetes Is Tricky. Can AI Help?
In a simulation, AI learned fast and helped virtual patients meet their blood glucose targets. Can machine learning be trusted to help real people too?
AI Could Change How Blind People See the World
Assistive technology services are integrating OpenAI's GPT-4, using artificial intelligence to help describe objects and people.
Generative AI in Games Will Create a Copyright Crisis
Titles like AI Dungeon are already using generative AI to create in-game content. Nobody knows who owns it.
Boost Your Bottom Line By Learning How To Use ChatGPT For Just $20
Whether you want to streamline business operations or offer more services to clients, you could employ ChatGPT to your advantage. The post Boost Your Bottom Line By Learning How To Use ChatGPT For Just $20 appeared first on TechRepublic.
How to Use Google Bard to Find Images Faster
AI can help you search smarter.
TechRepublic Premium Editorial Calendar: IT Policies, Checklists, Hiring Kits and Research for Download
TechRepublic Premium content helps you solve your toughest IT issues and jump-start your career or next project.
How AI can help process waste and increase recycling
Video cameras powered by AI are analysing work at waste processing and recycling facilities.
Microsoft’s First Generative AI Certificate Is Available for Free
Microsoft is also running a grant competition for ideas on using AI training in community building.
When computer vision works more like a brain, it sees more like people do
Training artificial neural networks with data from real brains can make computer vision more robust.
Educating national security leaders on artificial intelligence
Experts from MIT’s School of Engineering, Schwarzman College of Computing, and Sloan Executive Education educate national security leaders in AI fundamentals.
Researchers teach an AI to write better chart captions
A new dataset can help scientists develop automatic systems that generate richer, more descriptive captions for online charts.
Data Scientist Survey: Do Tech Leaders Believe the AI Hype?
Is AI hype here to stay? What problems and risks come with it? Get answers to these questions and more from this survey.
How Generative AI is a Game Changer for Cloud Security
Generative AI will be a game changer in cloud security, especially in common pain points like preventing threats, reducing toil from repetitive tasks, and bridging the cybersecurity talent gap.
The Huge Power and Potential Danger of AI-Generated Code
Programming can be faster when algorithms help out, but there is evidence AI coding assistants also make bugs more common.
Should I Use an AI to Write My Wedding Toast?
WIRED’s spiritual advice columnist on the meaning of emotional labor and how not to be the worst man.
Generating 3D Molecular Conformers via Equivariant Coarse-Graining and Aggregated Attention
Figure 1: CoarsenConf architecture.

Molecular conformer generation is a fundamental task in computational chemistry. The objective is to predict stable, low-energy 3D molecular structures, known as conformers, given the 2D molecule. Accurate molecular conformations are crucial for various applications that depend on precise spatial and geometric qualities, including drug discovery and protein docking. We introduce CoarsenConf, an SE(3)-equivariant hierarchical variational autoencoder (VAE) that pools information from fine-grained atomic coordinates into a coarse-grained, subgraph-level representation for efficient autoregressive conformer generation.

Background

Coarse-graining reduces the dimensionality of the problem, allowing conditional autoregressive generation rather than generating all coordinates independently, as done in prior work. By directly conditioning on the 3D coordinates of previously generated subgraphs, our model generalizes better across chemically and spatially similar subgraphs. This mimics the underlying molecular synthesis process, in which small functional units bond together to form large drug-like molecules. Unlike prior methods, CoarsenConf generates low-energy conformers with the ability to model atomic coordinates, distances, and torsion angles directly.

The CoarsenConf architecture can be broken into the following components:

(I) The encoder $q_\phi(z|X, \mathcal{R})$ takes as input the fine-grained (FG) ground-truth conformer $X$, the RDKit approximate conformer $\mathcal{R}$, and the coarse-grained (CG) conformer $\mathcal{C}$ (derived from $X$ and a predefined CG strategy), and outputs a variable-length equivariant CG representation via equivariant message passing and point convolutions.

(II) Equivariant MLPs are applied to learn the mean and log variance of both the posterior and prior distributions.
(III) The posterior (training) or prior (inference) is sampled and fed into the Channel Selection module, where an attention layer learns the optimal pathway from the CG to the FG structure.

(IV) Given the FG latent vector and the RDKit approximation, the decoder $p_\theta(X|\mathcal{R}, z)$ learns to recover the low-energy FG structure through autoregressive equivariant message passing.

The entire model can be trained end-to-end by optimizing the KL divergence of the latent distributions and the reconstruction error of the generated conformers.

MCG Task Formalism

We formalize the task of molecular conformer generation (MCG) as modeling the conditional distribution $p(X|\mathcal{R})$, where $\mathcal{R}$ is the RDKit-generated approximate conformer and $X$ is the optimal low-energy conformer(s). RDKit, a commonly used cheminformatics library, uses a cheap distance-geometry-based algorithm, followed by an inexpensive physics-based optimization, to achieve reasonable conformer approximations.

Coarse-graining

Figure 2: Coarse-graining procedure. (I) Example of variable-length coarse-graining. Fine-grained molecules are split along rotatable bonds that define torsion angles, then coarse-grained to reduce the dimensionality and learn a subgraph-level latent distribution. (II) Visualization of a 3D conformer, with specific atom pairs highlighted for decoder message-passing operations.

Molecular coarse-graining simplifies a molecule's representation by grouping the fine-grained (FG) atoms of the original structure into individual coarse-grained (CG) beads $\mathcal{B}$ with a rule-based mapping, as shown in Figure 2(I). Coarse-graining has been widely utilized in protein and molecular design, and, analogously, fragment-level or subgraph-level generation has proven highly valuable in diverse 2D molecule design tasks.
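Such a rule-based mapping can be sketched in plain Python: given a molecular graph and a set of bonds to sever, every connected component that remains becomes one CG bead. The function name and the toy butane graph below are illustrative, not taken from the CoarsenConf codebase.

```python
from collections import defaultdict

def coarse_grain(n_atoms, bonds, cut_bonds):
    """Group atoms into CG beads: each connected component left after
    removing the cut bonds (e.g., rotatable bonds) becomes one bead.

    n_atoms: number of atoms; bonds: list of (i, j) atom-index pairs;
    cut_bonds: subset of bonds to sever.
    Returns a list mapping each atom index to its bead id."""
    cut = {frozenset(b) for b in cut_bonds}
    adj = defaultdict(list)
    for i, j in bonds:
        if frozenset((i, j)) not in cut:  # keep only non-severed bonds
            adj[i].append(j)
            adj[j].append(i)
    bead = [-1] * n_atoms
    n_beads = 0
    for start in range(n_atoms):  # flood-fill connected components
        if bead[start] != -1:
            continue
        stack = [start]
        bead[start] = n_beads
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if bead[v] == -1:
                    bead[v] = n_beads
                    stack.append(v)
        n_beads += 1
    return bead

# Toy example: butane heavy atoms C0-C1-C2-C3 with the central C1-C2
# bond treated as rotatable -> two beads {C0, C1} and {C2, C3}.
beads = coarse_grain(4, [(0, 1), (1, 2), (2, 3)], [(1, 2)])
print(beads)  # [0, 0, 1, 1]
```

Because the number of components depends on the molecule, this mapping naturally produces a variable number of beads per input.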
Breaking generative problems into smaller pieces is an approach that can be applied to several 3D molecule tasks and provides a natural dimensionality reduction for working with large, complex systems. Compared to prior works that focus on fixed-length CG strategies, where each molecule is represented with a fixed resolution of $N$ CG beads, our method uses variable-length CG for its flexibility and its ability to support any choice of coarse-graining technique. As a result, a single CoarsenConf model can generalize to any coarse-grained resolution, since input molecules can map to any number of CG beads. In our case, the atoms in each connected component that results from severing all rotatable bonds are coarsened into a single bead. This choice of CG procedure implicitly forces the model to learn over torsion angles, as well as atomic coordinates and inter-atomic distances. In our experiments, we use GEOM-QM9 and GEOM-DRUGS, which on average contain 11 atoms and 3 CG beads, and 44 atoms and 9 CG beads, respectively.

SE(3)-Equivariance

A key consideration when working with 3D structures is maintaining appropriate equivariance. Three-dimensional molecules are equivariant under rotations and translations, i.e., SE(3)-equivariance. We enforce SE(3)-equivariance in the encoder, the decoder, and the latent space of our probabilistic model CoarsenConf. As a result, $p(X|\mathcal{R})$ remains unchanged for any rototranslation of the approximate conformer $\mathcal{R}$. Furthermore, if $\mathcal{R}$ is rotated clockwise by 90°, we expect the optimal $X$ to exhibit the same rotation. For an in-depth definition and discussion of the methods for maintaining equivariance, please see the full paper.

Aggregated Attention

Figure 3: Variable-length coarse-to-fine backmapping via Aggregated Attention.

We introduce a method, which we call Aggregated Attention, to learn the optimal variable-length mapping from the latent CG representation to FG coordinates.
This is a variable-length operation, as a single molecule with $n$ atoms can map to any number of $N$ CG beads (each bead is represented by a single latent vector). The latent vector of a single CG bead, $Z_B \in \mathbb{R}^{F \times 3}$, is used as the key and value of a single-head attention operation with an embedding dimension of three, matching the x, y, z coordinates. The query is the subset of the RDKit conformer corresponding to bead $B$, which lies in $\mathbb{R}^{n_B \times 3}$; $n_B$ is variable-length, as we know a priori how many FG atoms correspond to each CG bead. Leveraging attention, we efficiently learn the optimal blending of latent features for FG reconstruction. We call this Aggregated Attention because it aggregates 3D segments of FG information to form the latent query. Aggregated Attention is responsible for the efficient translation from the latent CG representation to viable FG coordinates (Figure 1(III)).

Model

CoarsenConf is a hierarchical VAE with an SE(3)-equivariant encoder and decoder. The encoder operates over SE(3)-invariant atom features $h \in \mathbb{R}^{n \times D}$ and SE(3)-equivariant atomistic coordinates $x \in \mathbb{R}^{n \times 3}$. A single encoder layer is composed of three modules: fine-grained, pooling, and coarse-grained. Full equations for each module can be found in the full paper. The encoder produces a final equivariant CG tensor $Z \in \mathbb{R}^{N \times F \times 3}$, where $N$ is the number of beads and $F$ is the user-defined latent size.

The role of the decoder is two-fold. First, it converts the latent coarsened representation back into FG space through a process we call channel selection, which leverages Aggregated Attention. Second, it refines the fine-grained representation autoregressively to generate the final low-energy coordinates (Figure 1(IV)).
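The single-head attention at the heart of Aggregated Attention can be sketched in NumPy as below. This is a simplified reading with raw coordinates as queries and no learned projections or equivariant layers (which the real model uses); the function name and shapes are illustrative.

```python
import numpy as np

def aggregated_attention(r_subset, z_bead):
    """Single-head attention with embedding dimension 3 (x, y, z).

    r_subset: (n_B, 3) RDKit coordinates of one bead's atoms (query).
    z_bead:   (F, 3) latent vectors of that CG bead (key and value).
    Returns a (n_B, 3) blend of latent channels per FG atom."""
    d = r_subset.shape[1]                       # embedding dim = 3
    scores = r_subset @ z_bead.T / np.sqrt(d)   # (n_B, F) similarities
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ z_bead                     # (n_B, 3) output

# Toy bead: 5 FG atoms backmapped from F = 8 latent channels.
rng = np.random.default_rng(0)
out = aggregated_attention(rng.normal(size=(5, 3)), rng.normal(size=(8, 3)))
print(out.shape)  # (5, 3)
```

Note how the output length follows the query, so one attention module serves beads with any number of atoms, which is what makes the backmapping variable-length.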
We emphasize that, by coarse-graining based on torsion-angle connectivity, our model learns the optimal torsion angles in an unsupervised manner, since the conditional input to the decoder is not aligned. CoarsenConf ensures each successive generated subgraph is rotated properly to achieve low coordinate and distance error.

Experimental Results

Table 1: Quality of generated conformer ensembles for the GEOM-DRUGS test set ($\delta = 0.75$ Å) in terms of Coverage (%) and Average RMSD (Å). CoarsenConf (5 epochs) was restricted to 7.3% of the data used by Torsional Diffusion (250 epochs) to exemplify a low-compute, data-constrained regime.

The average error (AR) is the key metric, measuring the average RMSD for the generated molecules of the appropriate test set. Coverage measures the percentage of molecules that can be generated within a specific error threshold ($\delta$). We introduce the mean and max metrics to better assess robust generation and avoid the sampling bias of the min metric. We emphasize that the min metric produces intangible results: unless the optimal conformer is known a priori, there is no way to know which of the 2L conformers generated for a single molecule is best. Table 1 shows that CoarsenConf generates the lowest average and worst-case error across the entire test set of DRUGS molecules. We further show that RDKit, with an inexpensive physics-based optimization (MMFF), achieves better coverage than most deep-learning-based methods. For formal definitions of the metrics and further discussion, please see the full paper linked below.

For more details about CoarsenConf, read the paper on arXiv.

BibTex

If CoarsenConf inspires your work, please consider citing it with:

@article{reidenbach2023coarsenconf,
  title={CoarsenConf: Equivariant Coarsening with Aggregated Attention for Molecular Conformer Generation},
  author={Danny Reidenbach and Aditi S. Krishnapriyan},
  journal={arXiv preprint arXiv:2306.14852},
  year={2023},
}
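The Coverage and average-RMSD metrics from the results discussion can be sketched as follows, given a precomputed matrix of RMSDs between reference and generated conformers. This is a simplified reading of the standard recall-style metrics; the paper's formal definitions may differ in detail, and the function name is illustrative.

```python
import numpy as np

def coverage_and_amr(rmsd, delta=0.75, reduce="min"):
    """Ensemble metrics from a pairwise RMSD matrix.

    rmsd: (n_ref, n_gen) RMSDs between reference and generated conformers.
    Coverage: percentage of references matched within delta (Angstroms).
    AMR: mean over references of the min/mean/max RMSD across generations."""
    red = {"min": np.min, "mean": np.mean, "max": np.max}[reduce]
    per_ref = red(rmsd, axis=1)  # best / average / worst match per reference
    coverage = float(np.mean(per_ref < delta)) * 100.0
    amr = float(np.mean(per_ref))
    return coverage, amr

# Toy example: 2 reference conformers, 3 generated conformers each.
rmsd = np.array([[0.5, 1.2, 0.9],
                 [1.0, 0.8, 1.5]])
print(coverage_and_amr(rmsd, delta=0.75, reduce="min"))
print(coverage_and_amr(rmsd, delta=0.75, reduce="max"))
```

Using `reduce="mean"` or `reduce="max"` instead of `"min"` reflects the post's point: robust generation should look good even without knowing in advance which generated conformer is best.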
Databricks Gains MosaicML and Its Generative AI for $1.3 Billion
Learn what the Databricks acquisition means for companies looking into public or private generative AI foundation models.
Computer vision system marries image recognition and generation
MAGE merges the two key tasks of image generation and recognition, typically trained separately, into a single system.
Elon Musk Seeks Support Against Rules on Free Speech Online
During a European tour ostensibly made to deliver a Neuralink announcement, Musk's real goal became apparent: stopping the European Commission's proposed measures on online content moderation.