News

LLM Task-Specific Evals that Do & Don’t Work

If you’ve run off-the-shelf evals for your tasks, you may have found that most don’t work. They barely correlate with application-specific performance and aren’t discriminative enough to use in production. As a result, we could spend weeks and still not have evals that reliably measure how we’re doing on our tasks. To save us some time, I’m sharing some evals I’ve found useful. The goal is to spend less time figuring out evals so we can spend more time shipping to users. We’ll focus on simple, common tasks like classification/extraction, summarization, and translation. (Although classification evals are basic, having a…
OpenAI will reportedly pay $250 million to put News Corp’s journalism in ChatGPT

OpenAI and News Corp, the owner of The Wall Street Journal, MarketWatch, The Sun, and more than a dozen other publishing brands, have struck a multi-year deal to display news from these publications in ChatGPT, News Corp announced on Wednesday. OpenAI will be able to access both current and archived content from News Corp’s publications and use the data to further train its AI models. Neither company disclosed the terms of the deal, but a report in The Wall Street Journal estimated that News Corp would get $250 million over five years in cash and credits. “The pact acknowledges…
Beautiful dashboards in Python with first-class real-time integration | Deephaven

from deephaven import ui, agg, empty_table
from deephaven.stream.table_publisher import table_publisher
from deephaven.stream import blink_to_append_only
from deephaven.plot import express as dx
from deephaven import updateby as uby
from deephaven import dtypes as dht

stocks = dx.data.stocks().reverse()

def set_bol_properties(fig):
    fig.update_layout(showlegend=False)
    fig.update_traces(fill="tonexty", fillcolor='rgba(255,165,0,0.08)')

@ui.component
def line_plot(
        filtered_source, exchange, window_size, bol_bands):
    window_size_key = {
        "5 seconds": ("priceAvg5s", "priceStd5s"),
        "30 seconds": ("priceAvg30s", "priceStd30s"),
        "1 minute": ("priceAvg1m", "priceStd1m"),
        "5 minutes": ("priceAvg5m", "priceStd5m")}
    bol_bands_key = {"None": None, "80%": 1.282, "90%": 1.645, "95%": 1.960, "99%": 2.576}
    base_plot = ui.use_memo(lambda: (
        dx.line(filtered_source, x="timestamp", y="price",
                by="exchange" if exchange == "All" else None,
                unsafe_update_figure=lambda fig: fig.update_traces(opacity=0.4))
    ), [filtered_source, exchange])
    window_size_avg_key_col = window_size_key[window_size][0]
    window_size_std_key_col = window_size_key[window_size][1]
    avg_plot = ui.use_memo(lambda: dx.line(filtered_source, x="timestamp",…
Data Centers’ Doubling Power Demand Seen Stressing Energy Grids – EE Times

A projected doubling in power consumption by the world’s data centers during the next few years is expected to strain the capacity of electricity suppliers, according to experts who spoke with EE Times. Without improvements in data center efficiency, those power constraints could impede the expansion of AI. Electricity demand from data centers, AI and cryptocurrency miners will surge by 2026, the Paris-based International Energy Agency (IEA) said in a January report. After consuming an estimated 460 terawatt-hours (TWh) worldwide in 2022, data centers’ total energy intake could more…
YugabyteDB ♥️ Hashicorp Vault – Fun Times

I have been working with YugabyteDB for a while now. I am always experimenting with Yugabyte + (something). Today, it’s Vault. I have also worked with Vault for a bit and gave a lightning talk earlier this year. That talk was primarily about data masking. But today, I was exploring the database secrets engine. For the uninitiated, Vault provides you with the ability to dynamically generate database credentials for your application. It does this by leveraging the simple RBAC SQL provided by the database engine. It supports a variety of databases, including Postgres, and YugabyteDB by compatibility. What triggered this…
Imec raises €2.5B for advanced chip tech R&D pilot line

Global R&D hub Imec Flanders today announced at its annual Imec Technology Forum in Antwerp (ITF World 2024) that it has raised a €2.5 billion investment to establish a new R&D pilot line for advanced chip technology and systems-on-chip.  Imec is the world's largest independent research centre in nanoelectronics and digital technology. The NanoIC pilot line is part of the EU Chips Act's vision to accelerate innovation in Europe, stimulate economic growth, and strengthen the European chip industry's ecosystem.  The EU Chips Act focuses on four strategic pilot lines, spread across several European member states to bridge the gap between…
Looking ahead to the AI Seoul Summit

How summits in Seoul, France and beyond can galvanize international cooperation on frontier AI safety
Last year, the UK Government hosted the first major global Summit on frontier AI safety at Bletchley Park. It focused the world’s attention on rapid progress at the frontier of AI development and delivered concrete international action to respond to potential future risks, including the Bletchley Declaration; new AI Safety Institutes; and the International Scientific Report on Advanced AI Safety. Six months on from Bletchley, the international community has an opportunity to build on that momentum and galvanize further global cooperation at this week’s AI Seoul Summit.…
Microsoft announces Copilot+ PCs and AI-powered Recall feature – gHacks Tech News

At a special event on the Microsoft campus, Microsoft officially unveiled Copilot+ PCs. This new type of Windows PC, formerly known as the AI PC, marks the first step toward introducing AI capabilities in Windows devices. Much of what Microsoft revealed on Monday was already known through unverified leaks. The first batch of Copilot+ PCs is powered by Qualcomm processors and not Intel or AMD silicon; those will come later this year. As far as requirements are concerned, these match the leaks: at least 16GB of RAM, at least 256GB of SSD storage, and an integrated NPU. No word on the Copilot key requirement…
Reflections On Qualtrics X4: AI-Powered Research Is Promising If We Stick To The Research Basics

Qualtrics introduced its AI-powered Strategy & Research suite at its summit, X4, and launched its “Strategic UX” product — officially entering the experience research space. The new product supports various UX research methods like video feedback, unmoderated usability testing, card sorting, and tree testing, and it leverages AI to generate insights and recommended actions. Adding UX research to its experience management solutions, Qualtrics aims to support organizations’ research efforts at scale with a single platform.  Qualtrics AI was the highlight of the event as it aims to help users get deeper insights through interactive dashboards, data analysis, and recommendations on…
Databricks

Databricks is the second company in Generational’s late-stage company series. This was fun to write. As part of the research, I got the Lakehouse and Generative AI Fundamentals badges from Databricks Academy. Disclaimer: I have a financial interest in Databricks. Don’t take this as investment advice. In this deep dive, you’ll learn insights from conversations with many of Databricks’ customers and ex-employees. I want to thank Tegus for giving me access to their centralized expert call transcripts. With a platform as broad as Databricks, it is almost impossible to parse signal from the noise without primary research. If you’re curious about…