
🥟 Chao-Down #287 Is prompt engineering dead? HuggingFace to develop an open-source robotics project, Elon Musk will release xAI's Grok chatbot in the open-source

Plus, a look at the stereotypes inherent in generative AI models.

Is prompt engineering dead? Probably not, but maybe human prompt engineering will be.

Rick Battle and Teja Gollapudi from VMware recently conducted research showing that prompt engineering is better performed by the underlying large language model itself, rather than by a human engineer.

While testing an LLM’s ability to answer elementary math questions, the two were puzzled by how inconsistently performance varied across prompting techniques. For instance, asking models to show their reasoning step by step (a method known as chain-of-thought) boosted their performance on various math and logic problems. But this wasn’t always the case, and sometimes it worsened performance. Even more strangely, Battle found that using positive prompts, such as “this will be fun” or “you are as smart as ChatGPT,” sometimes gave better results.

According to the researchers:

“The only real trend may be no trend. What’s best for any given model, dataset, and prompting strategy is likely to be specific to the particular combination at hand.”

Instead, simply asking the model to design a better prompt proved to be the most effective approach. In almost every case, the prompt that was automatically generated by the model outperformed the best manually optimized prompt. The process was also much quicker, taking only a few hours instead of several days of searching.
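For the curious, here is a rough sketch of what "letting the model write the prompt" can look like as a loop: score a prompt on a small set of questions, ask for a revised prompt based on past attempts, and keep whichever scores best. This is not the researchers' actual setup. The names `score_prompt`, `optimize_prompt`, `toy_answer`, and `toy_propose` are made up for illustration, and the two `toy_*` functions are stand-ins for real LLM calls so the example runs without any API.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]            # (question, expected answer)
History = List[Tuple[str, float]]    # (prompt, score) for every attempt so far


def score_prompt(prompt: str, examples: List[Example],
                 answer: Callable[[str, str], str]) -> float:
    """Return the fraction of examples answered correctly under this prompt."""
    correct = sum(1 for q, expected in examples
                  if answer(prompt, q).strip() == expected)
    return correct / len(examples)


def optimize_prompt(seed: str, examples: List[Example],
                    answer: Callable[[str, str], str],
                    propose: Callable[[History], str],
                    rounds: int = 3) -> Tuple[str, float]:
    """Repeatedly ask `propose` for a new prompt and keep the best-scoring one."""
    best_prompt, best_score = seed, score_prompt(seed, examples, answer)
    history: History = [(best_prompt, best_score)]
    for _ in range(rounds):
        candidate = propose(history)              # in practice: an LLM call
        candidate_score = score_prompt(candidate, examples, answer)
        history.append((candidate, candidate_score))
        if candidate_score > best_score:          # keep only strict improvements
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score


def toy_answer(prompt: str, question: str) -> str:
    # Hypothetical stand-in for the model answering a question:
    # it only gets sums right when the prompt asks it to work carefully.
    a, b = map(int, question.split("+"))
    return str(a + b) if "carefully" in prompt else "0"


def toy_propose(history: History) -> str:
    # Hypothetical stand-in for asking the model to improve the latest prompt.
    return history[-1][0] + " Work carefully."


examples = [("2+3", "5"), ("10+7", "17")]
best_prompt, best_score = optimize_prompt(
    "Answer the question.", examples, toy_answer, toy_propose, rounds=2)
print(best_prompt, best_score)
```

In a real setup, `toy_answer` and `toy_propose` would each wrap a call to the model, and the examples would come from a held-out evaluation set rather than two toy sums.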

This raises questions about the future of prompt engineering, and it has made some people suspect that many prompt-engineering jobs might be short-lived, at least as they are currently conceived.

-Alex, your resident Chaos Coordinator.

What happened in AI? 📰

Generative AI Takes Stereotypes and Bias From Bad to Worse (Bloomberg)

Elon Musk takes another swing at OpenAI, makes xAI's Grok chatbot open-source (Reuters)

Hugging Face nabs Tesla scientist for open source robotics project (VentureBeat)

Should AI Be Open-Source? Behind the Tweetstorm Over Its Dangers (WSJ)

The messy, secretive reality behind OpenAI’s bid to save the world (MIT Technology Review)

AI holds tantalising promise for the emerging world (Economist)

Always be Learnin’ 📕 📖

Is Synthetic Data the Key to AGI? - by Nabeel S. Qureshi (substack.com)

AI Chat Is Not (Always) the Answer (nngroup.com)

90% of designers are unhirable? Or why your cookie-cutter portfolio doesn’t cut it and how to fix it (UX Collective)

Projects to Keep an Eye On 🛠

UniModal4Reasoning/ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning (Github)

LoRA Land: Fine-Tuned Open-Source LLMs that Outperform GPT-4 (Predibase)

bananaml/fructose - LLM calls as strongly-typed functions (Github)

The Latest in AI Research 💡

Instruction-tuned Language Models are Better Knowledge Learners (arxiv)

Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents (arxiv)

Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey (arxiv.org)

The World Outside of AI 🌎

TikTok Crackdown Shifts Into Overdrive, With Sale or Shutdown on Table (WSJ)

Elon Musk Has a Giant Charity. Its Money Stays Close to Home. (The New York Times)

Can you solve it? The word game at the cutting edge of computer science | Mathematics (The Guardian)

Your fingerprints can be recreated from the sounds made when you swipe on a touchscreen (Tom's Hardware)

A massive experiment is testing no-strings cash aid for Americans (NPR)

Why Tech Job Interviews Became Such a Nightmare (WIRED)

One Last Bite 😋