Viral News

Reservoir Computing with Generalized Readout based on Generalized Synchronization

[Submitted on 3 May 2024] By Akane Ookubo and Masanobu Inubushi. Abstract: Reservoir computing is a machine learning framework that exploits nonlinear dynamics, exhibiting significant computational capabilities. One of the defining characteristics of reservoir computing is its low cost and straightforward training algorithm, i.e., only the readout, given by a linear combination of reservoir variables, is trained. Inspired by recent mathematical studies based on dynamical systems theory, in particular generalized synchronization, we propose a novel reservoir computing framework with generalized readout,…
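
For readers new to the framework: the reservoir itself is fixed and only the readout is trained, typically by ridge regression over the collected reservoir states. Below is a minimal, self-contained echo state network sketch illustrating that setting, with a toy "generalized" readout that applies a fixed nonlinear (quadratic) expansion of the state before the trained linear map. All sizes, signals, and the quadratic expansion are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal echo state network: the reservoir is random and fixed,
# only the readout weights are trained.
N, T = 200, 2000                              # reservoir size, time steps
u = np.sin(0.1 * np.arange(T + 1))            # toy scalar input signal
W_in = rng.uniform(-0.5, 0.5, N)              # fixed input weights
W = rng.normal(0, 1, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

# Drive the reservoir and collect its states.
x = np.zeros(N)
states = np.empty((T, N))
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

target = u[1:T + 1]                           # one-step-ahead prediction task

def ridge(Phi, y, lam=1e-6):
    """Fit readout weights by ridge regression."""
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)

# Conventional linear readout: y_hat = states @ w
w_lin = ridge(states, target)

# Toy "generalized" readout: a fixed nonlinear expansion of the reservoir
# state (here, quadratic terms) feeding the trained linear map.
Phi = np.hstack([states, states ** 2])
w_gen = ridge(Phi, target)

print("linear readout MSE:     ", np.mean((states @ w_lin - target) ** 2))
print("generalized readout MSE:", np.mean((Phi @ w_gen - target) ** 2))
```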
Read More
Investigating Robustness of Open-Vocabulary Foundation Object Detectors under Distribution Shifts

[Submitted on 1 Apr 2024] By Prakash Chandra Chhipa and 4 other authors. Abstract: The challenge of Out-Of-Distribution (OOD) robustness remains a critical hurdle towards deploying deep vision models. Open-vocabulary object detection extends the capabilities of traditional object detection frameworks to recognize and classify objects beyond predefined categories. Investigating OOD robustness in open-vocabulary object detection is essential to increase the trustworthiness of these models. This study presents a comprehensive robustness comparison of the zero-shot capabilities of three recent open-vocabulary foundation object detection models,…
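
As a rough illustration of what such a robustness comparison involves, the sketch below runs a zero-shot detector over clean and synthetically corrupted copies of the same images and counts confident detections. The `detect` function is a hypothetical placeholder for any open-vocabulary detector (e.g. OWL-ViT or Grounding DINO), and the corruptions and threshold are illustrative, not the benchmark protocol used in the paper.

```python
import numpy as np
from PIL import Image, ImageFilter

# Placeholder for any zero-shot open-vocabulary detector: given an image and
# free-form text prompts, return a list of (label, score, box) detections.
# Swap in OWL-ViT, Grounding DINO, etc. -- this stub is only for illustration.
def detect(image: Image.Image, prompts: list[str]):
    raise NotImplementedError("plug in a real open-vocabulary detector here")

def gaussian_noise(img, sigma=25):
    """Add pixel-level Gaussian noise as a simple distribution shift."""
    arr = np.asarray(img).astype(np.float32)
    noisy = arr + np.random.normal(0, sigma, arr.shape)
    return Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8))

# A few simple shifts in the spirit of common-corruption benchmarks.
corruptions = {
    "clean": lambda img: img,
    "blur": lambda img: img.filter(ImageFilter.GaussianBlur(radius=3)),
    "noise": gaussian_noise,
}

def robustness_report(images, prompts, score_thresh=0.3):
    """Count confident detections per corruption as a crude robustness signal."""
    report = {}
    for name, corrupt in corruptions.items():
        hits = 0
        for img in images:
            dets = detect(corrupt(img), prompts)
            hits += sum(1 for _, score, _ in dets if score >= score_thresh)
        report[name] = hits
    return report
```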
Read More
DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning

arXiv:2405.14899v1 Abstract: In-context learning (ICL) allows transformer-based language models that are pre-trained on general text to quickly learn a specific task from a few "task demonstrations" without updating their parameters, significantly boosting their flexibility and generality. ICL possesses many characteristics distinct from conventional machine learning, thereby requiring new approaches to interpret this learning paradigm. Taking the viewpoint of recent works showing that transformers learn in context by formulating an internal optimizer, we propose an influence function-based attribution technique, DETAIL, that addresses the specific characteristics of ICL. We empirically verify the effectiveness of our approach for demonstration attribution…
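
DETAIL itself is an influence-function-based technique; as a much simpler point of comparison, the sketch below scores each demonstration by a leave-one-out ablation, i.e. how much the answer's log-probability drops when that demonstration is removed from the prompt. The `answer_log_prob` scorer is a hypothetical wrapper around whatever language model you use.

```python
# Crude leave-one-out proxy for demonstration attribution in in-context
# learning. DETAIL uses an influence-function approach; this sketch only
# illustrates the question it answers: how much does each demonstration
# contribute to the prediction?

def answer_log_prob(demonstrations: list[str], query: str, answer: str) -> float:
    """Return log p(answer | demonstrations + query) under some LM (placeholder)."""
    raise NotImplementedError("wrap your language model here")

def loo_attribution(demonstrations: list[str], query: str, answer: str) -> list[float]:
    """Score each demonstration by the drop in answer log-probability when removed."""
    full = answer_log_prob(demonstrations, query, answer)
    scores = []
    for i in range(len(demonstrations)):
        ablated = demonstrations[:i] + demonstrations[i + 1:]
        scores.append(full - answer_log_prob(ablated, query, answer))
    return scores  # larger score -> that demonstration helps more
```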
Read More
Data Machina #238

Non-stop AI Innovation Every Single Week. Well yeah, that's right: there is no single week without something new, exciting, or amazing happening in AI. This is a selection of interesting, cool stuff that happened in the last 7 days or so: OpenAI introduced new, faster, and more efficient embedding models. Buried in the blog announcement, it says: “the new embedding models were trained with a technique that allows developers to shorten embeddings without the embedding losing its concept-representing properties.” Well, for some reason, the blog fails to mention that the technique is called Matryoshka Representation Learning (paper, repo),…
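
The practical upshot of Matryoshka-style training is that an embedding can be truncated to its leading dimensions and re-normalized while retaining most of its semantic content. A minimal sketch of that shortening step, with random vectors standing in for real embeddings and an assumed full size of 1536 dimensions:

```python
import numpy as np

def shorten(embedding: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` coordinates and re-normalize to unit length.

    Matryoshka-style training is what makes this simple truncation retain
    most of the embedding's semantic content; for an ordinary embedding
    model, truncation would degrade quality much more.
    """
    v = embedding[:dims]
    return v / np.linalg.norm(v)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy illustration with random vectors standing in for real embeddings.
rng = np.random.default_rng(0)
e1, e2 = rng.normal(size=1536), rng.normal(size=1536)
print(cosine(e1, e2))                              # full-size similarity
print(cosine(shorten(e1, 256), shorten(e2, 256)))  # shortened similarity
```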
Read More
Improving Text2SQL Performance with Ease on Databricks

Want to raise your LLM into the top 10 of Spider, a widely used benchmark for text-to-SQL tasks? Spider evaluates how well LLMs can convert text queries into SQL code. For those unfamiliar with text-to-SQL, its significance lies in transforming how businesses interact with their data. Instead of relying on SQL experts to write queries, people can simply ask questions of their data in plain English and receive precise answers. This democratizes access to data, enhancing business intelligence and enabling more informed decision-making. The Spider benchmark is a widely recognized standard for evaluating the performance of text-to-SQL systems. It challenges LLMs to…
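
At its simplest, a text-to-SQL system prompts an LLM with the database schema plus the natural-language question and asks for a single SQL query back. The sketch below shows that bare-bones pattern; the schema, prompt format, and `complete` function are illustrative placeholders, not the pipeline described in the post.

```python
# Minimal text-to-SQL prompting sketch. The tables, prompt wording, and the
# `complete` function are placeholders for illustration only.

SCHEMA = """
CREATE TABLE singer (singer_id INT, name TEXT, country TEXT, age INT);
CREATE TABLE concert (concert_id INT, singer_id INT, year INT);
"""

def build_prompt(schema: str, question: str) -> str:
    return (
        "You are a SQL assistant. Given the schema below, answer the question "
        "with a single SQL query and nothing else.\n\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "SQL:"
    )

def complete(prompt: str) -> str:
    """Call any chat/completion LLM here (placeholder)."""
    raise NotImplementedError

question = "How many singers are younger than 30?"
# sql = complete(build_prompt(SCHEMA, question))
# Spider-style evaluation then executes `sql` against a reference database
# and compares the result set with the gold query's output.
```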
Read More
Data Machina #239

The Power of Truly Open Source AI. The spin doctors of some big closed-AI companies have been busy inflating the “AGI is here soon, AGI will be an existential risk” bubble. But thankfully that bubble is deflating quickly, and somewhat backfiring. In the meantime, the open source AI community is stubbornly releasing truly open source, efficient, smallish, powerful AI models that match or beat the closed AI models from big companies. The reaction from these big closed AI companies: “Oh! open source AI models are dangerous, we need to regulate open source AI. And btw: We’re dropping the…
Read More
Building DBRX-class Custom LLMs with Mosaic AI Training

We recently introduced DBRX: an open, state-of-the-art, general-purpose LLM. DBRX was trained, fine-tuned, and evaluated using Mosaic AI Training, scaling training to 3072 NVIDIA H100s and processing more than 12 trillion tokens in the process. Training LLMs, and in particular MoE models such as DBRX, is hard. It requires overcoming many infrastructure, performance, and scientific challenges. Mosaic AI Training was purposely built to address these challenges and was battle-tested through the training of DBRX, the MPT series of models, and many other LLMs such as Ola’s Krutrim, AI2’s OLMo, Dynamo AI’s Dynamo 8B, Refuel’s LLM-2, and others. Figure 1: Yet another insightful…
Read More
Fine Tuning Qwen 1.5 for Coding

In this article, we will be fine-tuning the Qwen 1.5 model for coding. There are many chat-based LLMs available today, and the options are so vast that choosing a model mostly comes down to personal preference and the task at hand. However, most chat-based models are fine-tuned on a general SFT (Supervised Fine-Tuning) dataset. The smaller chat-based LLMs (or SLMs, Small Language Models) do not perform well on task-specific objectives, like coding, out of the box. Fine-tuning the base models according to the task is a much better…
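
As a rough idea of what such task-specific fine-tuning looks like, here is a minimal LoRA-based SFT sketch for a small Qwen 1.5 base model using Hugging Face Transformers and PEFT. The dataset file, prompt template, and hyperparameters are illustrative assumptions, not necessarily the article's setup.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

model_name = "Qwen/Qwen1.5-0.5B"              # small variant, for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Attach small trainable LoRA adapters; the base weights stay frozen.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM"))

# Any instruction/code dataset with "instruction" and "output" fields works;
# "code_sft.jsonl" is a hypothetical local file.
dataset = load_dataset("json", data_files="code_sft.jsonl", split="train")

def tokenize(example):
    text = (f"### Instruction:\n{example['instruction']}\n"
            f"### Response:\n{example['output']}")
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="qwen-code-sft",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           learning_rate=2e-4,
                           logging_steps=10),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```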
Read More
Data Machina #240

Foundation Models, Transformers and Time-Series. Statisticians and econometricians have been searching for the Holy Grail of time-series forecasting (TSF) for more than 40 years. “Classical” models like ARIMA still work remarkably well in some TSF scenarios. But today, the cool stuff is all about transformer-based, DL & foundation models for TSF. How “good” are these new DL models for TSF? How do we evaluate these models? Do these new models really achieve SOTA performance as some papers claim? Are DL researchers cherry-picking time-series datasets to easily fit a SOTA TSF DL model?... Real-world time-series data is complex, messy, and it…
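
In that spirit, a classical baseline is cheap to set up and is the kind of yardstick any new transformer or foundation forecaster should be compared against. A minimal ARIMA sketch with statsmodels, on a synthetic trend-plus-seasonality series (the series and the (2, 1, 2) order are illustrative):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic monthly-style series: trend + seasonality + noise.
rng = np.random.default_rng(0)
t = np.arange(300)
series = 10 + 0.05 * t + 2 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, t.size)

# Hold out the last 24 points as a test horizon.
train, test = series[:-24], series[-24:]
model = ARIMA(train, order=(2, 1, 2)).fit()
forecast = model.forecast(steps=24)

mae = np.mean(np.abs(forecast - test))
print(f"ARIMA(2,1,2) 24-step MAE: {mae:.3f}")
# A transformer-based forecaster should at least beat this kind of
# simple, well-tuned classical baseline on the same split.
```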
Read More
Research Survey: Productivity benefits from Databricks Assistant

In the fast-paced landscape of data science and engineering, integrating Artificial Intelligence (AI) has become essential for enhancing productivity. We’ve seen many tools emerge and transform the lives of data practitioners, making complex tasks easier and encouraging innovation. When we launched Databricks Assistant in Public Preview in July of 2023, we designed it specifically to streamline the work of data scientists, analysts, and engineers. To better understand how well we’re achieving this goal, we decided to survey some of our top users across multiple organizations, varying in experience. Purpose of the Survey: To better understand Databricks Assistant’s impact on data professionals, we meticulously…
Read More