Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation

As the field of natural language processing advances and new ideas develop, we’re seeing more and more ways to use compute efficiently, producing AI systems that are cheaper to run and easier to control. Large Language Models (LLMs) have enormous potential, but also challenge existing workflows in industry that require modularity, transparency and data privacy. In this talk, I’ll show some practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components that you can run and maintain in-house.
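To make the first step of that distillation loop concrete, here is a minimal sketch using spaCy as the in-house framework: an LLM produces structured predictions at development time, and we serialize them for human review instead of shipping the LLM itself. The annotate_with_llm helper is a hypothetical stand-in for whatever model API you use (hardcoded here so the sketch runs end to end); everything else is plain spaCy.

```python
import json

import spacy

nlp = spacy.blank("en")

def annotate_with_llm(text: str) -> list[dict]:
    # Hypothetical stand-in for a real LLM call: in practice this would
    # prompt your model of choice and parse the response into character
    # offsets. Hardcoded here so the sketch is runnable.
    return [
        {"start": 0, "end": 5, "label": "ORG"},
        {"start": 12, "end": 20, "label": "PERSON"},
    ]

records = []
for text in ["Apple hired Tim Cook in 1998."]:
    doc = nlp.make_doc(text)
    spans = []
    for ent in annotate_with_llm(text):
        # char_span returns None when the offsets don't align with token
        # boundaries -- a common failure mode of LLM-produced offsets, and
        # exactly the kind of thing a human reviewer should get to see.
        span = doc.char_span(ent["start"], ent["end"], label=ent["label"])
        if span is not None:
            spans.append(
                {"start": span.start_char, "end": span.end_char, "label": span.label_}
            )
    records.append({"text": text, "spans": spans})

# Write JSONL in the {"text": ..., "spans": [...]} shape that annotation
# tools like Prodigy can load for human-in-the-loop correction.
with open("llm_predictions.jsonl", "w", encoding="utf8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```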

I’ll share real-world case studies and approaches for using large generative models at development time instead of runtime, curating their structured predictions with an efficient human-in-the-loop workflow, and distilling task-specific components as small as 6 MB that run cheaply, privately and reliably, and that you can compose into larger NLP systems.
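And a sketch of the final step, once the predictions have been human-reviewed: pack the curated annotations into spaCy's binary training format and train a small task-specific pipeline from them. The curated.jsonl filename and its {"text", "spans"} layout are assumptions carried over from the sketch above.

```python
import json

import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
db = DocBin()

# Assumed input: human-reviewed predictions in the same JSONL shape as
# the llm_predictions.jsonl file written out in the previous sketch.
with open("curated.jsonl", encoding="utf8") as f:
    for line in f:
        record = json.loads(line)
        doc = nlp.make_doc(record["text"])
        ents = [
            doc.char_span(s["start"], s["end"], label=s["label"])
            for s in record["spans"]
        ]
        doc.ents = [ent for ent in ents if ent is not None]
        db.add(doc)

# Serialize to spaCy's binary format, then train the small component:
#   python -m spacy train config.cfg --paths.train train.spacy --paths.dev dev.spacy
db.to_disk("train.spacy")
```

The resulting pipeline is a self-contained artifact on disk containing only the task-specific weights, not a general-purpose language model, which is what makes figures like "as small as 6 MB" possible.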

If you’re trying to build a system that does a particular thing, you don’t need to transform your request into arbitrary language and call into the largest model that understands arbitrary language the best. The people developing those models are telling that story, but the rest of us aren’t obliged to believe them.

Case Study #1: https://speakerdeck.com/inesmontani/workshop-half-hour-of-labeling-power-can-we-beat-gpt