39 Lessons from Industry ML Conferences in 2024

Industry machine learning conferences are intense. There’s so much information to absorb, and so much context switching between talks and posters, that each day leaves you exhausted. Nonetheless, taking a few minutes to reflect and jot down notes helps solidify the learning. Here are my notes from the conferences I attended in 2024.

P.S., I also had the opportunity to share my work at a few conferences. Slides are available for my talks at the Netflix PRS Workshop and the AI Engineer World’s Fair. Unfortunately, my oral presentation at the Amazon ML Conference is internal only.

• • •

Building effective machine learning systems

1. The real world is messy. To build systems that work, we need to design reward functions (which in turn define our labels), operationalize the world as data, find levers that make a difference, and measure what matters. Beware of those who tell you ML is a walk in the park.

2. Getting reward functions right is half the battle won. Short-term rewards don’t show the full picture. Long-term rewards are slow and tricky to attribute. Proxy rewards are finicky and may not correlate well with business metrics. And most rewards are sparse, with feedback few and far between.

3. You don’t always need machine learning. Heuristics and SQL queries are valuable baselines. Start simple and see if the juice is worth the squeeze. Remember the first rule of machine learning: Start without machine learning.
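
To make the baseline idea concrete, here’s a minimal no-ML sketch in Python: recommend the most popular items from recent interaction logs. The data shape and names are made up for illustration.

```python
from collections import Counter

# A no-ML baseline: recommend the most popular items from recent logs.
# `events` is a hypothetical list of (user_id, item_id) interactions.
def popularity_baseline(events: list[tuple[str, str]], k: int = 10) -> list[str]:
    counts = Counter(item_id for _, item_id in events)
    return [item_id for item_id, _ in counts.most_common(k)]

events = [("u1", "a"), ("u2", "a"), ("u2", "b"), ("u3", "c"), ("u3", "a")]
print(popularity_baseline(events, k=2))  # ['a', 'b']
```

If a fancier model can’t beat this, the juice isn’t worth the squeeze.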

4. Machine learning involves trade-offs. Recall vs. precision. Explore vs. exploit. Relevance vs. diversity vs. serendipity. Accuracy vs. speed vs. cost. The challenge is figuring out the right balance for your user experience.

5. Have realistic expectations. Most problems have a ceiling on what can be achieved, especially problems that involve predicting the behavior of unpredictable humans (e.g., recsys, search, fraud). It may not make sense to aim beyond the ceiling, unless you’re doing core research to push the boundaries on what’s possible.

6. Don’t overlook the dimension of time. User preferences change. Inventory gets drawn down. Content relevance shifts. Daily, seasonally, over months and years. If time is a key factor in your problem (e.g., recommendations, search, news feed), ensure your systems and models are time-aware.
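
One simple way to make a model time-aware, sketched below with an assumed 30-day half-life (pick yours from your data): exponentially down-weight older interactions.

```python
import math
import time

HALF_LIFE_DAYS = 30.0  # assumption for illustration; tune to your domain

def time_decay_weight(event_ts: float, now: float | None = None) -> float:
    """Exponentially down-weight an interaction by its age in days."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - event_ts) / 86_400)
    return math.pow(0.5, age_days / HALF_LIFE_DAYS)

# A 60-day-old click counts a quarter as much as one from today.
now = time.time()
print(time_decay_weight(now - 60 * 86_400, now))  # ~0.25
```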

7. Evals are a differentiator and moat. Over the past two years, teams with solid evals have been able to continuously ship reliable, delightful experiences. No one ever regrets investing in a robust evaluation framework.
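
As a minimal sketch of what an eval harness can look like: a fixed set of cases, a grader, and a pass rate you can track release over release. `generate` and `grade` below are placeholders for your system and your grading logic (exact match, rubric, LLM-as-judge, etc.).

```python
def generate(query: str) -> str:
    return query.upper()  # placeholder for the system under test

def grade(output: str, expected: str) -> bool:
    return expected in output  # placeholder grader; swap in your own

cases = [("hello", "HELLO"), ("world", "WORLD")]
passes = sum(grade(generate(q), exp) for q, exp in cases)
print(f"pass rate: {passes / len(cases):.0%}")  # 100%
```

Run it on every change; a dip in pass rate is your early warning before customers feel it.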

8. Brandolini’s law: The amount of energy needed to refute bullshit is an order of magnitude larger than that needed to produce it. The same applies to using LLMs. Generating content is easy relative to evaluating and guardrailing against its defects. But the latter is how we earn—and keep—customer trust. Thus, we should invest our efforts accordingly.

9. We probably won’t have one model to rule them all. Instead, each product will likely have several models supporting it, perhaps a bigger model orchestrating several smaller ones. This way, each smaller model can give its undivided attention to its task.

10. Altman’s law: When a new model drops, are you nervous, or are you super pumped? Deliberately design your product so that you’re in the latter camp. Be prepared to swap out the model anytime and reap the benefits of constantly improving models. The model isn’t your product—the system around it is.
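
In code, being prepared to swap models can be as simple as hiding the provider behind a thin interface; the classes below are illustrative stand-ins, not any particular provider’s SDK.

```python
from typing import Protocol

class Model(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"  # stand-in for a real API call

class ProviderB:
    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"  # stand-in for a real API call

def answer(model: Model, question: str) -> str:
    return model.complete(question)

# Swapping models is now a one-line change; your evals tell you if it's safe.
print(answer(ProviderA(), "What's new?"))
print(answer(ProviderB(), "What's new?"))
```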

11. Build with an eye toward the future. Flexibility beats specialization in the long run. Remember The Bitter Lesson. An LLM that’s a bit worse now will likely outperform a custom finetune later, especially as LLMs get cheaper (two orders of magnitude in 18 months!), faster, and more capable. Other examples include using a generative/extractive approach vs. named entity recognition/classification, and explore-exploit/reinforcement learning vs. supervised learning.

Production and scaling

12. Don’t underestimate the effort it takes to go from demo to production. “There’s a large class of problems that are easy to imagine and build demos for, but extremely hard to make products out of. For example, self-driving. It’s easy to demo a car self-driving around a block but making it into a product takes a decade.” — Andrej Karpathy

13. Scale makes everything harder. While we may not need to face it head-on from day one, we should be ready for it, especially when we have to support multiple markets, languages, devices, user experiences, etc.

14. Each 10x-ing of scale/traffic will uncover new bugs and issues. Prepare early for the operational challenges that come with scale and just keeping the lights on.

15. Depending on your product, LLMs are expensive or cheap. “Even the most expensive LLMs are not that expensive for B2B scale; even the cheapest LLMs are not that cheap for consumer scale.” — Will Larson. Thus, if you’re serving mostly B2B customers or internal workflows, sticking to LLM APIs could be viable. Scale when you need to, not prematurely.
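
Some back-of-envelope arithmetic makes the quote concrete. Every number below is an assumption for illustration; real prices vary by model and change often.

```python
# Back-of-envelope LLM cost sketch; every number is an assumption.
price_per_1m_tokens = 5.00   # assumed blended $/1M tokens
tokens_per_request = 2_000   # assumed prompt + completion

cost_per_request = price_per_1m_tokens * tokens_per_request / 1_000_000

for label, requests_per_day in [("B2B", 10_000), ("consumer", 10_000_000)]:
    monthly = cost_per_request * requests_per_day * 30
    print(f"{label}: ${monthly:,.0f}/month")
# B2B: $3,000/month is tolerable; consumer: $3,000,000/month is not.
```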

16. [Corollary] The economics aren’t in the way; it’s trust, reliability, security, permissions, etc. Costs will keep dropping. But faithfulness, hallucinations, prompt injections, etc. are still open problems and the real bottleneck to wider integration of LLMs in customer-facing experiences.

17. Get the fundamentals right. Robust data pipelines, instrumentation, evaluation, guardrails, experimentation, metrics. These core capabilities are essential—and reusable—across most products. Invest in them early and reap the dividends with each new product you build.

18. Start simple, always. Complexity will creep in as we patch edge cases and extend an existing system for new needs and features. A system that starts complex will inevitably buckle under its own weight.

19. Not everything needs to be in real-time. If your user experience allows it, batch and asynchronous workflows simplify the system design and reduce operational costs. Thus, when designing a new system, ask yourself: Can this be async?
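
A minimal asyncio sketch of the “can this be async?” pattern: acknowledge the request immediately and let a background worker handle the slow part. `expensive_enrichment` is a hypothetical stand-in for a slow model call or batch job.

```python
import asyncio

async def expensive_enrichment(payload: dict) -> None:
    await asyncio.sleep(1)  # stand-in for a slow model call or batch job

async def handle_request(queue: asyncio.Queue, payload: dict) -> dict:
    await queue.put(payload)       # enqueue instead of blocking the user
    return {"status": "accepted"}  # respond to the user immediately

async def worker(queue: asyncio.Queue) -> None:
    while True:
        payload = await queue.get()
        await expensive_enrichment(payload)
        queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    asyncio.create_task(worker(queue))
    print(await handle_request(queue, {"user": "u1"}))  # returns right away
    await queue.join()  # wait for background work before exiting

asyncio.run(main())
```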

20. Design for fast failure detection and recovery. No system is perfect; stuff will slip through. And there’s an optimal investment balance between prevention and cure. Monitoring, alerts, rollbacks, Andon cords—these will go a long way.
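
As a sketch of an Andon cord in code: track recent outcomes in a rolling window and trip a kill switch when the defect rate crosses a threshold. The window size and threshold are illustrative, not recommendations.

```python
from collections import deque

class AndonCord:
    """Trips when the rolling defect rate crosses a threshold."""

    def __init__(self, window: int = 100, max_defect_rate: float = 0.05):
        self.outcomes: deque[bool] = deque(maxlen=window)  # True = defect
        self.max_defect_rate = max_defect_rate

    def record(self, is_defect: bool) -> None:
        self.outcomes.append(is_defect)

    @property
    def tripped(self) -> bool:
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough signal yet
        return sum(self.outcomes) / len(self.outcomes) > self.max_defect_rate

cord = AndonCord(window=10, max_defect_rate=0.2)
for i in range(10):
    cord.record(is_defect=(i < 3))  # 3 defects in the last 10 requests
print(cord.tripped)  # True: 30% > 20%, time to roll back or page someone
```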

Execution and collaboration

21. Execution is everything. Execution is navigating from where we are today to our long-term vision. It’s everything from idea to design to implementation to launch to measurement to operations, and all the steps in between. Executing well is the difference between success and failure.

22. Your rate of iteration = your rate of innovation. Focus on experimenting fast, getting rapid feedback, and updating or pivoting quickly. Velocity matters.

23. Breakthroughs will take longer than you think. Nine women can’t give birth to a baby in a month, and we can’t rush game-changing research and innovation. Start early, be patient, and keep pushing.

24. Not every challenge is technical. Some challenges are about alignment, culture, and organizations. Working with people is hard. When you zoom out, it turns out that tech is often the easy part.

25. It takes a village to raise a machine learning system. Infra, engineering, data, ML, design, product, business. No role is more important than the others. 1 + 1 + 1 = 5.

26. Genius can come from anywhere. Not just the ivory towers of research labs. Some of the most brilliant ideas come from engineers and non-technical people who understand the user best. Tap into the wisdom of the crowd.

27. You don’t have to go it alone. Whatever you’re working on, there will be others tackling similar problems. Reach out to them and collaborate, either within your organization or on the internet. Together, we’re stronger.

28. People want to help. Especially if you’ve done your homework, bring data, and keep an open mind. Everyone, and I mean everyone, at these conferences is incredibly kind and generous. I experienced this warmth firsthand interacting with some of the leaders in the field. The strongest are also the kindest.

29. What seems obvious to you can be game-changing for others. Expertise is a ladder. No matter where you stand, there are people a few rungs below who are eager to learn from you. Thus, share what you know, even if you think it’s too basic and not worth sharing; it could help someone unlock a challenge they’re facing.

30. Tune out the noise; focus on building. Don’t get nerd-sniped by the daily barrage of shiny new techniques. Most don’t pan out anyway. For real alpha on what actually works, have hallway conversations and DMs with the practitioners who are quietly crushing it.

Building for users

31. To create winning products, dream big and sweat the small stuff. Aim for the stars but don’t forget the details. Leaders have vision and get their hands dirty with the details. You need to do both.

32. Always work backwards from the customer. Why are we solving this problem? How does it help the customer or user? If you’re not a researcher, don’t do science for the sake of science. Focus on the customer.

33. Design with the data flywheel in mind. Data alone is not the competitive advantage; the data flywheel is. How will you gather user feedback to enhance your model, system, or product and fuel a better customer experience? See this Tesla self-driving example from Andrej Karpathy. Whoever turns the data flywheel faster, wins.

34. Humans are insatiable. We automate. We simplify. But there will always be new problems to solve, and more things to build. Don’t worry; AI won’t steal your job—it’ll just make it more interesting.

Speaking at and attending conferences

35. Speaking at conferences is a tightrope act. It’s a delicate balance between sharing valuable insights and protecting the secret sauce. You’ll see this when speakers rely on public data, reference published work and public tech blogs, stick to high-level ideas, and choose their words carefully. (This is also why many of the best practitioners decline to speak.) Nonetheless, everyone understands.

36. GOATs are GOATs because they work hard. I saw them diligently taking notes, pulling up the papers being presented to review the tables and charts, and following up on points they didn’t understand. This is why they’re the best.

37. Conference insiders keep it friendly during the open Q&A. The hard questions and debates are reserved for hallway conversations, happy hours, and dinners.

38. No one really cares how good your slides look. Fancy layouts? Dark mode? Not a priority. Never really has been, from what I can tell. Some of the top speakers just use the default font on white backgrounds. Substance over style, always.

39. Put enough geeks together in one place and magic happens. The energy, the inspiration, the ideas. It’s infectious. I leave every conference on a high, fired up by ideas on what to explore next and motivated by the conversations. Highly recommend attending a good conference at least once a year.

If you found this useful, please cite this write-up as:

Yan, Ziyou. (Nov 2024). 39 Lessons from Industry ML Conferences in 2024. eugeneyan.com.
https://eugeneyan.com/writing/conf-lessons/.

or

@article{yan2024conflessons,
  title   = {39 Lessons from Industry ML Conferences in 2024},
  author  = {Yan, Ziyou},
  journal = {eugeneyan.com},
  year    = {2024},
  month   = {Nov},
  url     = {https://eugeneyan.com/writing/conf-lessons/}
}
