large language models

12 Nov

Virtual Personas for Language Models via an Anthology of Backstories

stp2y | 0 Comments | AI | computational social science, large language models, virtual personas

We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic backstories with rich details of individual values and experience. What does it mean for large language models (LLMs) to be trained on massive text corpora, collectively produced by millions and billions of distinctive human authors? In “Language Models as Agent Models”, compelling evidence suggests that recent language models could be considered models of agents: provided with a textual context, LLMs are capable of generating conditional text that represents the characteristics of an agent likely to have produced that context. This…

31 Oct

Distill Your LLMs and Surpass Their Performance: spaCy’s Creator at InfoQ DevSummit Munich

stp2y | 0 Comments | News | ai, Architecture & Design, development, efficient mlops llm distillation, large language models, ML & Data Engineering, MLOps, Model Distillation

In her presentation at the inaugural edition of InfoQ Dev Summit Munich, Ines Montani built on the presentation she gave earlier this year at QCon London and offered the audience practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller, faster components that you can run and maintain in-house. She began by stating that using black-box models hidden behind APIs would prevent us from satisfying the properties of good software: modular, transparent, explainable, data-private, reliable, and affordable. Further, Montani pointed out that GenAI can be helpful in multiple situations…

03 Jul

The AI Revolution Will Not Be Monopolized

stp2y | 0 Comments | News | ai, ai revolution not monopolized, artificial intelligence, chatgpt, gemini, google, large language models, ML & Data Engineering, openai, QCon London 2024, QCon Software Development Conference

Key Takeaways: Open-source initiatives are pivotal in democratizing AI technology, offering transparent, extensible tools that empower users. The open-source community quickly turns new research into practical AI tools, making them stronger and more useful. Distilling large language models during development enables the creation of accurate, fast, and private task-specific models, reducing reliance on general-purpose APIs. Effective regulation should distinguish between human-facing AI applications and underlying machine-facing components, ensuring innovation while addressing concerns about data privacy, security, and equitable access.

This is a summary of a talk that Ines Montani gave at QCon London in April 2024. Large language models…

24 May

Ines Montani at QCon London: Economies of Scale Can’t Monopolise the AI Revolution

stp2y | 0 Comments | News | ai, ai revolution monopol, Architecture & Design, artificial intelligence, Automated Machine Learning, deep learning, development, generative ai, large language models, machine learning, ML & Data Engineering, QCon London 2024

During her presentation at QCon London, Ines Montani, co-founder and CEO of explosion.ai (the maker of spaCy), stated that economies of scale are not enough to create monopolies in the AI space and that open-source techniques and models will allow everybody to keep up with the "Gen AI revolution". Montani opened her presentation by asking for a show of hands to identify the open-source users in the audience. The vast majority of the audience raised their hands, demonstrating that open source is ubiquitous ("it would be easier to ask who doesn’t use open-source"). She pointed out the multiple benefits of the…

22 May

Ghostbuster: Detecting Text Ghostwritten by Large Language Models

stp2y | 0 Comments | AI | large language models, text generation

Figure: The structure of Ghostbuster, our new state-of-the-art method for detecting AI-generated text.

Large language models like ChatGPT write impressively well—so well, in fact, that they’ve become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. These models are also prone to producing text with factual errors, so wary readers may want to know whether generative AI tools have been used to ghostwrite news articles or other sources before trusting them. What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from…