AI News
Google DeepMind’s AI Weather Forecaster Handily Beats a Global Standard
Machine learning algorithms that digested decades of weather data were able to forecast 90 percent of atmospheric measures more accurately than Europe’s top weather center.
YouTube adapts its policies for the coming surge of AI videos
YouTube today announced how it will approach handling AI-created content on its platform with a range of new policies surrounding responsible disclosure as well as new tools for requesting the removal of deepfakes, among other things. The company says that, although it already has policies that prohibit manipulated media, AI necessitated the creation of new […] © 2023 TechCrunch. All rights reserved. For personal use only.
Ghostbuster: Detecting Text Ghostwritten by Large Language Models
Figure: The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text.
Large language models like ChatGPT write impressively well, so well, in fact, that they’ve become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. These models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them.
What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on. In addition, if these models falsely classify real human writing as AI-generated, they can jeopardize students whose genuine work is called into question.
Our recent paper introduces Ghostbuster, a state-of-the-art method for detecting AI-generated text. Ghostbuster works by finding the probability of generating each token in a document under several weaker language models, then combining functions based on these probabilities as input to a final classifier. Ghostbuster doesn’t need to know what model was used to generate a document, nor the probability of generating the document under that specific model. This property makes Ghostbuster particularly useful for detecting text potentially generated by an unknown model or a black-box model, such as the popular commercial models ChatGPT and Claude, for which probabilities aren’t available. We’re particularly interested in ensuring that Ghostbuster generalizes well, so we evaluated across a range of ways that text could be generated, including different domains (using newly collected datasets of essays, news, and stories), language models, or prompts.
Figure: Examples of human-authored and AI-generated text from our datasets.
Why this Approach?
Many current AI-generated text detection systems are brittle when classifying different types of text (e.g., different writing styles, or different text generation models or prompts). Simpler models that use perplexity alone typically can’t capture more complex features and do especially poorly on new writing domains. In fact, we found that a perplexity-only baseline was worse than random on some domains, including non-native English speaker data. Meanwhile, classifiers based on large language models like RoBERTa easily capture complex features, but overfit to the training data and generalize poorly: we found that a RoBERTa baseline had catastrophic worst-case generalization performance, sometimes even worse than a perplexity-only baseline. Zero-shot methods that classify text without training on labeled data, by calculating the probability that the text was generated by a specific model, also tend to do poorly when a different model was actually used to generate the text.
How Ghostbuster Works
Ghostbuster uses a three-stage training process: computing probabilities, selecting features, and classifier training.
Computing probabilities: We converted each document into a series of vectors by computing the probability of generating each word in the document under a series of weaker language models (a unigram model, a trigram model, and two non-instruction-tuned GPT-3 models, ada and davinci).
Selecting features: We used a structured search procedure to select features, which works by (1) defining a set of vector and scalar operations that combine the probabilities, and (2) searching for useful combinations of these operations using forward feature selection, repeatedly adding the best remaining feature.
Classifier training: We trained a linear classifier on the best probability-based features and some additional manually-selected features.
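The three stages above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: a unigram and a bigram model stand in for the weaker language models named in the post, the vector and scalar operations are a small hypothetical set, candidates are scored by training accuracy, the search stops when accuracy no longer improves, and the documents and labels are made-up toy inputs. All function names are invented for this sketch.

```python
import math
from collections import Counter

def fit_ngram_models(corpus):
    """Count unigrams and bigrams in a small reference corpus (list of token lists)."""
    uni, bi, ctx = Counter(), Counter(), Counter()
    for doc in corpus:
        prev = "<s>"
        for tok in doc:
            uni[tok] += 1
            bi[(prev, tok)] += 1
            ctx[prev] += 1
            prev = tok
    return uni, bi, ctx

def token_log_probs(doc, models):
    """Stage 1: per-token log-probabilities under each weak model (add-one smoothing)."""
    uni, bi, ctx = models
    vocab, total = len(uni) + 1, sum(uni.values())
    out, prev = {"uni": [], "bi": []}, "<s>"
    for tok in doc:
        out["uni"].append(math.log((uni[tok] + 1) / (total + vocab)))
        out["bi"].append(math.log((bi[(prev, tok)] + 1) / (ctx[prev] + vocab)))
        prev = tok
    return out

def mean(v):
    return sum(v) / len(v)

# Hypothetical operation sets: vector ops combine probability vectors,
# scalar ops reduce a vector to one number.
VECTOR_OPS = {
    "uni": lambda p: p["uni"],
    "bi": lambda p: p["bi"],
    "bi-uni": lambda p: [b - u for b, u in zip(p["bi"], p["uni"])],
}
SCALAR_OPS = {
    "mean": mean,
    "min": min,
    "max": max,
    "var": lambda v: mean([(a - mean(v)) ** 2 for a in v]),
}

def candidate_features(doc, models):
    """Stage 2a: each scalar op applied to each vector op gives one candidate feature."""
    probs = token_log_probs(doc, models)
    return {f"{s}({v})": s_op(v_op(probs))
            for v, v_op in VECTOR_OPS.items() for s, s_op in SCALAR_OPS.items()}

def predict(w, x):
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))  # clamp avoids overflow

def train_linear(X, y, steps=300, lr=0.1):
    """Stage 3: a linear (logistic regression) classifier fitted by gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def accuracy(w, X, y):
    return mean([float((predict(w, xi) >= 0.5) == (yi == 1)) for xi, yi in zip(X, y)])

def forward_select(rows, labels, max_features=3):
    """Stage 2b: greedily add the remaining feature with the best training accuracy."""
    chosen, best = [], -1.0
    while len(chosen) < max_features:
        scored = []
        for name in sorted(rows[0]):
            if name in chosen:
                continue
            X = [[r[m] for m in chosen + [name]] for r in rows]
            scored.append((accuracy(train_linear(X, labels), X, labels), name))
        acc, name = max(scored)
        if acc <= best:  # assumed stopping rule: no further improvement
            break
        chosen.append(name)
        best = acc
    return chosen, best

# Toy, made-up inputs purely to exercise the code (label 1 = "AI", 0 = "human").
reference = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "rug"]]
docs = [["the", "cat", "sat", "on", "the", "rug"], ["the", "dog", "sat", "on", "the", "mat"],
        ["a", "cat", "chased", "my", "dog"], ["my", "rug", "is", "old"]]
labels = [1, 1, 0, 0]
models = fit_ngram_models(reference)
rows = [candidate_features(d, models) for d in docs]
chosen, acc = forward_select(rows, labels)
print(chosen, acc)
```

In practice one would presumably score candidates on held-out data rather than the training set; training accuracy is used here only to keep the sketch short.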
Results
When trained and tested on the same domain, Ghostbuster achieved 99.0 F1 across all three datasets, outperforming GPTZero by a margin of 5.9 F1 and DetectGPT by 41.6 F1. Out of domain, Ghostbuster achieved 97.0 F1 averaged across all conditions, outperforming DetectGPT by 39.6 F1 and GPTZero by 7.5 F1. Our RoBERTa baseline achieved 98.1 F1 when evaluated in-domain on all datasets, but its generalization performance was inconsistent. Ghostbuster outperformed the RoBERTa baseline on all domains except creative writing out-of-domain, and had much better out-of-domain performance than RoBERTa on average (13.8 F1 margin).
Figure: Results on Ghostbuster's in-domain and out-of-domain performance.
To ensure that Ghostbuster is robust to the range of ways that a user might prompt a model, such as requesting different writing styles or reading levels, we evaluated Ghostbuster’s robustness to several prompt variants. Ghostbuster outperformed all other tested approaches on these prompt variants with 99.5 F1. To test generalization across models, we evaluated performance on text generated by Claude, where Ghostbuster also outperformed all other tested approaches with 92.2 F1.
AI-generated text detectors have been fooled by lightly editing the generated text. We examined Ghostbuster’s robustness to edits, such as swapping sentences or paragraphs, reordering characters, or replacing words with synonyms. Most changes at the sentence or paragraph level didn’t significantly affect performance, though performance decreased smoothly if the text was edited through repeated paraphrasing, using commercial detection evaders such as Undetectable AI, or making numerous word- or character-level changes. Performance was also best on longer documents.
Since AI-generated text detectors may misclassify non-native English speakers’ text as AI-generated, we evaluated Ghostbuster’s performance on non-native English speakers’ writing. All tested models had over 95% accuracy on two of three tested datasets, but did worse on the third set of shorter essays. However, document length may be the main factor here, since Ghostbuster does nearly as well on these documents (74.7 F1) as it does on other out-of-domain documents of similar length (75.6 to 93.1 F1).
Users who wish to apply Ghostbuster to real-world cases of potential off-limits usage of text generation (e.g., ChatGPT-written student essays) should note that errors are more likely for shorter text, domains far from those Ghostbuster trained on (e.g., different varieties of English), text by non-native speakers of English, human-edited model generations, or text generated by prompting an AI model to modify a human-authored input. To avoid perpetuating algorithmic harms, we strongly discourage automatically penalizing alleged usage of text generation without human supervision. Instead, we recommend cautious, human-in-the-loop use of Ghostbuster if classifying someone’s writing as AI-generated could harm them. Ghostbuster can also help with a variety of lower-risk applications, including filtering AI-generated text out of language model training data and checking if online sources of information are AI-generated.
Conclusion
Ghostbuster is a state-of-the-art AI-generated text detection model, with 99.0 F1 performance across tested domains, representing substantial progress over existing models. It generalizes well to different domains, prompts, and models, and it’s well-suited to identifying text from black-box or unknown models because it doesn’t require access to probabilities from the specific model used to generate the document.
Future directions for Ghostbuster include providing explanations for model decisions and improving robustness to attacks that specifically try to fool detectors. AI-generated text detection approaches can also be used alongside alternatives such as watermarking. We also hope that Ghostbuster can help across a variety of applications, such as filtering language model training data or flagging AI-generated content on the web.
Try Ghostbuster here: ghostbuster.app
Learn more about Ghostbuster here: [paper] [code]
Try guessing if text is AI-generated yourself here: ghostbuster.app/experiment
Asymmetric Certified Robustness via Feature-Convex Neural Networks
TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds.
Figure 1. Illustration of feature-convex classifiers and their certification for sensitive-class inputs. This architecture composes a Lipschitz-continuous feature map $\varphi$ with a learned convex function $g$. Since $g$ is convex, it is globally underapproximated by its tangent plane at $\varphi(x)$, yielding certified norm balls in the feature space. Lipschitzness of $\varphi$ then yields appropriately scaled certificates in the original input space.
Despite their widespread usage, deep learning classifiers are acutely vulnerable to adversarial examples: small, human-imperceptible image perturbations that fool machine learning models into misclassifying the modified input. This weakness severely undermines the reliability of safety-critical processes that incorporate machine learning. Many empirical defenses against adversarial perturbations have been proposed, often only to be later defeated by stronger attack strategies. We therefore focus on certifiably robust classifiers, which provide a mathematical guarantee that their prediction will remain constant for an $\ell_p$-norm ball around an input.
Conventional certified robustness methods incur a range of drawbacks, including nondeterminism, slow execution, poor scaling, and certification against only one attack norm. We argue that these issues can be addressed by refining the certified robustness problem to be more aligned with practical adversarial settings.
The Asymmetric Certified Robustness Problem
Current certifiably robust classifiers produce certificates for inputs belonging to any class. For many real-world adversarial applications, this is unnecessarily broad. Consider the illustrative case of someone composing a phishing scam email while trying to avoid spam filters. This adversary will always attempt to fool the spam filter into thinking that their spam email is benign, never the converse. In other words, the attacker is solely attempting to induce false negatives from the classifier. Similar settings include malware detection, fake news flagging, social media bot detection, medical insurance claims filtering, financial fraud detection, phishing website detection, and many more.
Figure 2. Asymmetric robustness in email filtering. Practical adversarial settings often require certified robustness for only one class.
These applications all involve a binary classification setting with one sensitive class that an adversary is attempting to avoid (e.g., the “spam email” class). This motivates the problem of asymmetric certified robustness, which aims to provide certifiably robust predictions for inputs in the sensitive class while maintaining a high clean accuracy for all other inputs. We provide a more formal problem statement in the main text.
Feature-convex classifiers
We propose feature-convex neural networks to address the asymmetric robustness problem. This architecture composes a simple Lipschitz-continuous feature map ${\varphi: \mathbb{R}^d \to \mathbb{R}^q}$ with a learned Input-Convex Neural Network (ICNN) ${g: \mathbb{R}^q \to \mathbb{R}}$ (Figure 1). ICNNs enforce convexity from the input to the output logit by composing ReLU nonlinearities with nonnegative weight matrices. Since a binary ICNN decision region consists of a convex set and its complement, we add the precomposed feature map $\varphi$ to permit nonconvex decision regions.
Feature-convex classifiers enable the fast computation of sensitive-class certified radii for all $\ell_p$-norms. Using the fact that convex functions are globally underapproximated by any tangent plane, we can obtain a certified radius in the intermediate feature space. This radius is then propagated to the input space by Lipschitzness. The asymmetric setting here is critical, as this architecture only produces certificates for the positive-logit class $g(\varphi(x)) > 0$. The resulting $\ell_p$-norm certified radius formula is particularly elegant:
\[r_p(x) = \frac{g(\varphi(x))}{\mathrm{Lip}_p(\varphi) \, \| \nabla g(\varphi(x)) \|_{p,*}}.\]
The non-constant terms are easily interpretable: the radius scales proportionally to the classifier confidence $g(\varphi(x))$ and inversely to the classifier sensitivity $\| \nabla g(\varphi(x)) \|_{p,*}$. We evaluate these certificates across a range of datasets, achieving competitive $\ell_1$ certificates and comparable $\ell_2$ and $\ell_{\infty}$ certificates, despite other methods generally tailoring for a specific norm and requiring orders of magnitude more runtime.
Figure 3. Sensitive class certified radii on the CIFAR-10 cats vs dogs dataset for the $\ell_1$-norm. Runtimes on the right are averaged over $\ell_1$, $\ell_2$, and $\ell_{\infty}$-radii (note the log scaling).
Our certificates hold for any $\ell_p$-norm and are closed form and deterministic, requiring just one forwards and backwards pass per input. These are computable on the order of milliseconds and scale well with network size. For comparison, current state-of-the-art methods such as randomized smoothing and interval bound propagation typically take several seconds to certify even small networks. Randomized smoothing methods are also inherently nondeterministic, with certificates that just hold with high probability.
Theoretical promise
While initial results are promising, our theoretical work suggests that there is significant untapped potential in ICNNs, even without a feature map. Despite binary ICNNs being restricted to learning convex decision regions, we prove that there exists an ICNN that achieves perfect training accuracy on the CIFAR-10 cats-vs-dogs dataset.
Fact. There exists an input-convex classifier which achieves perfect training accuracy for the CIFAR-10 cats-versus-dogs dataset.
However, our architecture achieves just $73.4\%$ training accuracy without a feature map. While training performance does not imply test set generalization, this result suggests that ICNNs are at least theoretically capable of attaining the modern machine learning paradigm of overfitting to the training dataset. We thus pose the following open problem for the field.
Open problem. Learn an input-convex classifier which achieves perfect training accuracy for the CIFAR-10 cats-versus-dogs dataset.
Conclusion
We hope that the asymmetric robustness framework will inspire novel architectures which are certifiable in this more focused setting. Our feature-convex classifier is one such architecture and provides fast, deterministic certified radii for any $\ell_p$-norm. We also pose the open problem of overfitting the CIFAR-10 cats vs dogs training dataset with an ICNN, which we show is theoretically possible.
This post is based on the following paper: Asymmetric Certified Robustness via Feature-Convex Neural Networks. Samuel Pfrommer, Brendon G. Anderson, Julien Piet, Somayeh Sojoudi. 37th Conference on Neural Information Processing Systems (NeurIPS 2023). Further details are available on arXiv and GitHub. If our paper inspires your work, please consider citing it with:
@inproceedings{pfrommer2023asymmetric,
  title={Asymmetric Certified Robustness via Feature-Convex Neural Networks},
  author={Samuel Pfrommer and Brendon G. Anderson and Julien Piet and Somayeh Sojoudi},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}
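As a rough illustration of the certified radius formula $r_p(x)$ from the feature-convex post above, the following standard-library Python sketch evaluates it for a tiny hand-written network. Everything concrete here is an assumption made for the example: the feature map $x \mapsto (x - \mu, |x - \mu|)$ and its Lipschitz constants, the one-hidden-layer convex function with hard-coded weights, and all function names. It is not the authors' code or architecture.

```python
import math

# Hypothetical feature map phi(x) = (x - mu, |x - mu|), chosen for illustration.
MU = [0.5, 0.5]
# Lipschitz constants of this particular map for p = 1, 2, inf:
# each input coordinate appears twice in the output.
LIP = {1: 2.0, 2: math.sqrt(2.0), math.inf: 1.0}

def phi(x):
    centered = [xi - m for xi, m in zip(x, MU)]
    return centered + [abs(c) for c in centered]

# A toy convex g(z) = W2 . relu(W1 z + B1) + A . z + B2.
# Convexity holds because W2 is elementwise nonnegative.
W1 = [[1.0, -0.5, 0.3, 0.2], [-0.7, 0.4, 0.1, 0.6], [0.2, 0.2, -0.3, 0.5]]
B1 = [0.1, -0.2, 0.0]
W2 = [0.8, 0.5, 1.2]
A = [0.3, -0.1, 0.4, 0.2]
B2 = 0.25

def g_and_grad(z):
    """One forward and one backward pass: value of g and a (sub)gradient at z."""
    pre = [sum(w * zi for w, zi in zip(row, z)) + b for row, b in zip(W1, B1)]
    value = (sum(w2 * max(p, 0.0) for w2, p in zip(W2, pre))
             + sum(a * zi for a, zi in zip(A, z)) + B2)
    grad = [A[j] + sum(W2[i] * W1[i][j] for i in range(len(W1)) if pre[i] > 0)
            for j in range(len(z))]
    return value, grad

def dual_norm(v, p):
    """Norm dual to the l_p norm: l_inf for p=1, l_2 for p=2, l_1 for p=inf."""
    if p == 1:
        return max(abs(a) for a in v)
    if p == 2:
        return math.sqrt(sum(a * a for a in v))
    if p == math.inf:
        return sum(abs(a) for a in v)
    raise ValueError("only p in {1, 2, inf} is handled in this sketch")

def certified_radius(x, p):
    """r_p(x) = g(phi(x)) / (Lip_p(phi) * ||grad g(phi(x))||_{p,*}) for g(phi(x)) > 0."""
    value, grad = g_and_grad(phi(x))
    if value <= 0:
        return 0.0  # certificates exist only for the positive-logit (sensitive) class
    norm = dual_norm(grad, p)
    return math.inf if norm == 0 else value / (LIP[p] * norm)

x = [0.9, 0.2]
for p in (1, 2, math.inf):
    print(p, certified_radius(x, p))
```

The radius is closed form and deterministic, matching the post's description of one forward and one backward pass per input; swapping in a different feature map would require recomputing the entries of `LIP`.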
The SAG Deal Sends a Clear Message About AI and Workers
The agreement between Hollywood actors, studios, and streamers isn’t perfect. But it could set the tone for how future labor movements confront changes brought about by artificial intelligence.
AI programs spat out known data and hardly learned specific chemical interactions when predicting drug potency
Artificial intelligence (AI) is on the rise. Until now, AI applications have generally had a 'black box' character: how AI arrives at its results remains hidden. A cheminformatics scientist has now developed a method that reveals how certain AI applications work in pharmaceutical research. The results are unexpected: the AI programs largely remembered known data and hardly learned specific chemical interactions when predicting drug potency.
The US Wants China to Start Talking About AI Weapons
As the US and China meet for the APEC summit in San Francisco this week, American officials are pushing for talks on the risks posed by military use of AI.
Netflix Killed 'The OA.' Now Its Creators Are Back With a Show About Tech’s Ubiquity
The OA had the kind of fans who held flash mobs to protest its cancelation. Now its creators are back with A Murder at the End of the World, and a warning about tech’s influence on people’s lives.
GitHub Universe: Open Source Trends Report and New AI Security Products
GitHub Advanced Security gains AI features, and GitHub Copilot now includes a chatbot option. GitHub Copilot Enterprise is expected in February 2024.
AI robotics’ ‘GPT moment’ is near
Building AI-powered robots that can learn how to interact with the physical world will enhance all forms of repetitive work.
Fei-Fei Li Started an AI Revolution By Seeing Like an Algorithm
Researcher Fei-Fei Li’s ImageNet project provided the feedstock for the deep learning boom that brought the world ChatGPT and other world-changing AI systems.
Tech Disrupted Hollywood. AI Almost Destroyed It
Streaming invigorated the film and TV industry and sent established studios scrambling. That was before AI sparked one of the biggest work stoppages in Hollywood history.
How to use AI for discovery -- without leading science astray
In the same way that chatbots sometimes 'hallucinate,' or make things up, machine learning models designed for scientific applications can sometimes present misleading or downright false results. Researchers now present a new statistical technique for safely using AI predictions to test scientific hypotheses.
Obamacare Call Center Staff Strike Over Steep Health Care Costs and Scarce Bathroom Breaks
Staff at US federal contractor Maximus claim they only get six minutes a day to use the bathroom, are monitored by an AI system that reports them for going off-script, and can’t afford health care.
Humane’s Ai Pin is a $700 Smartphone Alternative You Wear All Day
If you’re willing to clip the Ai Pin to your chest, you can talk, gesture, and tap to take photos or summon a powerful virtual assistant.
This New Breed of AI Assistant Wants to Do Your Boring Office Chores
An experimental AI helper attempts to operate a web browser in the same way a human does to take on office admin like processing invoices or screening job applicants.
How Humane’s Ai Pin Works
This week, we talk about the new Humane wearable and the future of phone alternatives.
Meta requires political advertisers to mark when deepfakes used
Advertisers will have to make clear when they use AI in political ads on the social media platforms.
OpenAI Data Partnerships
Working together to create open-source and private datasets for AI training.
Explained: Generative AI
How do powerful generative AI systems like ChatGPT work, and what makes them different from other types of artificial intelligence?