Viral News

Building DBRX-class Custom LLMs with Mosaic AI Training

We recently introduced DBRX: an open, state-of-the-art, general-purpose LLM. DBRX was trained, fine-tuned, and evaluated using Mosaic AI Training, scaling training to 3072 NVIDIA H100s and processing more than 12 trillion tokens in the process. Training LLMs, and in particular MoE models such as DBRX, is hard. It requires overcoming many infrastructure, performance, and scientific challenges. Mosaic AI Training was purpose-built to address these challenges and was battle-tested through the training of DBRX, the MPT series of models, and many other LLMs such as Ola’s Krutrim, AI2’s OLMo, Dynamo AI’s Dynamo 8B, and Refuel’s LLM-2. Figure 1: Yet another insightful…

Fine Tuning Qwen 1.5 for Coding

In this article, we will be fine-tuning the Qwen 1.5 model for coding. Nowadays, there are several chat-based LLMs available online. Sometimes the options are so vast that choosing a model mostly comes down to personal preference and the task to be accomplished. However, most of the chat-based models are fine-tuned on a general SFT (Supervised Fine-Tuning) dataset. The smaller chat-based LLMs (or SLMs [Small Language Models]) do not perform so well on task-specific objectives, like coding, out of the box. Fine-tuning the base models according to the task is a much better…

Data Machina #240

Foundation Models, Transformers and Time-Series. Statisticians and econometricians have been searching for the Holy Grail of time-series forecasting (TSF) for more than 40 years. “Classical” models like ARIMA still work remarkably well in some TSF scenarios. But today, the cool stuff is all about transformer-based, DL & foundation models for TSF. How “good” are these new DL models for TSF? How do we evaluate these models? Do these new models really achieve SOTA performance as some papers claim? Are DL researchers cherry-picking time-series datasets to easily fit a SOTA TSF DL model? ... Real-world time-series data is complex, messy, and it…

Research Survey: Productivity benefits from Databricks Assistant

In the fast-paced landscape of data science and engineering, integrating Artificial Intelligence (AI) has become integral for enhancing productivity. We’ve seen many tools emerge and transform the lives of data practitioners, making complex tasks easier and encouraging innovation. When we launched Databricks Assistant in Public Preview in July of 2023, we designed it exclusively for streamlining efficiency amongst data scientists, analysts, and engineers. To better understand how well we’re achieving this goal, we decided to survey some of our top users across multiple organizations, varying in experience. Purpose of the Survey: To better understand Databricks Assistant’s impact on data professionals, we meticulously…

Data Machina #241

AI World Models and Video. My world vision model for waking up at 4am to travel overseas is frankly a bit fuzzy and unreliable. But what’s an AI world model? My 2 cents explainer: It’s a model that builds an internal representation of a real-world, physical, [human] environment, and uses that to predict or simulate future events within that environment. Until recently, research in AI world models has been very much focused on video games and Reinforcement Learning. But now, the boom of GenAI and large multi-modal models has triggered a new trend in AI world models based on large scale…

How a Leading Venture Capital Firm is Building GenAI with Databricks

Successfully building GenAI applications means going beyond just leveraging the latest cutting-edge models. It requires the development of compound AI systems that integrate data, models, and infrastructure in a flexible, scalable, and production-ready way. This entails access to both open source and proprietary models, vector databases, the ability to fine-tune models and query structured data, create endpoints, prepare data, manage costs, and monitor solutions. In this blog, we’ll walk through the GenAI transformation for a leading venture capital firm (referred to as "VC" throughout the blog) that is also an investor in Databricks. In addition to driving innovation internally, this VC…

Data Machina #242

AI and Causality. The introduction of OpenAI Sora (simulate real worlds from video understanding) has sparked a bit of a debate among some prominent AI researchers. First, what do AI researchers mean by “causal”? Secondly: Do LLMs have causal reasoning capabilities? Can LLMs learn causality from just real world training data? Can LLMs learn, represent, and understand world models and physics? Judea Pearl, one of the world’s top researchers in Probabilistic AI, Bayesian Networks, and Causal Inference, once famously said in an interview: Deep Learning -albeit complex and non-trivial- it’s a curve fitting exercise. To build truly Intelligent Machines, teach them cause and…

Databricks Is a Glassdoor Best-Led Company in 2024

Databricks is pleased to announce we are ranked #2 in the inaugural annual Glassdoor Award List of Best-Led Companies in 2024! At Databricks, we're not just building cutting-edge technology; we're cultivating a culture of transparency. Our leadership mirrors our internal commitment to collaborative and transparent work practices. Databricks, a Glassdoor Best-Led Company: Our CEO and Co-founder Ali Ghodsi’s leadership is rooted in truth-seeking and first principles thinking, two of Databricks’ core values that stay true to our origins in academia. As Databricks has grown, we’ve strived to maintain the open and transparent culture we had in our early days at the AMP research lab at UC Berkeley.…

Data Machina #254

On the State of AI Coding Agents. “How could we start using AI to migrate years of messy, flimsy legacy code to a modern stack? ... Perhaps an AI Code Migration Agent ???” We’re doing AI chat & espresso at Level 39, One Canada Square. James -a veteran CTO with all the scars- is asking these rather funny, rhetorical questions. There is a deep silence in the room, pensive faces around. Everyone is staring through the massive windows overlooking The City skyline as the sunset strikes. We wonder in perplexity -in the very philosophical and information theory sense- whether AI…

Semiconductors on the Data Intelligence Platform

In the semiconductor industry, research and development tasks, manufacturing processes, and enterprise planning systems produce an array of data artifacts that can be fused to create an intelligent semiconductor enterprise. Through intelligent data use, an intelligent semiconductor enterprise accelerates time to market, increases manufacturing yield, and enhances product reliability. The Databricks Data Intelligence Platform suits semiconductor enterprises’ unique needs for performance, collaboration, and self-service access. Built on a lakehouse architecture with leading technologies of Delta Lake, Apache Spark™, MLflow, Mosaic AI, and Unity Catalog, the Data Intelligence Platform is the substrate for semiconductor companies to connect engineering technology (ET), operational technology (OT),…