Viral News

RE-Adapt: Reverse Engineered Adaptation of Large Language Models

arXiv:2405.15007v1 Announce Type: new Abstract: We introduce RE-Adapt, an approach to fine-tuning large language models on new domains without degrading any pre-existing instruction-tuning. We reverse engineer an adapter which isolates what an instruction-tuned model has learned beyond its corresponding pretrained base model. Importantly, this requires no additional data or training. We can then fine-tune the base model on a new domain and readapt it to instruction following with the reverse engineered adapter. RE-Adapt and our low-rank variant LoRE-Adapt both outperform other methods of fine-tuning, across multiple popular LLMs and datasets, even when the models are used in conjunction with retrieval-augmented…
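The three steps in the abstract (isolate what instruction tuning added, fine-tune the base model, then re-apply the isolated adapter) can be illustrated with a toy sketch. The abstract does not say how the adapter is computed, so this example assumes one plausible reading: the adapter is the element-wise difference between instruction-tuned and base weights. The function names, the `scale` parameter, and the tiny weight dictionaries are all hypothetical and are not taken from the paper.

```python
# Toy illustration only: "models" are dicts mapping a parameter name to a list of floats.
# ASSUMPTION: the adapter is the element-wise weight difference (instruct - base).

def reverse_engineer_adapter(instruct, base):
    """Isolate what the instruction-tuned weights add on top of the base weights."""
    return {
        name: [wi - wb for wi, wb in zip(instruct[name], base[name])]
        for name in base
    }

def apply_adapter(weights, adapter, scale=1.0):
    """Add the (optionally scaled) adapter back onto a set of weights."""
    return {
        name: [w + scale * d for w, d in zip(weights[name], adapter[name])]
        for name in weights
    }

# Hypothetical weights for a single parameter tensor.
base = {"layer.weight": [0.5, -1.0, 2.0]}
instruct = {"layer.weight": [0.75, -1.5, 2.0]}

# Step 1: compute the adapter; no data or training is involved.
adapter = reverse_engineer_adapter(instruct, base)

# Step 2: stand-in for the base model after fine-tuning on a new domain.
domain_tuned = {"layer.weight": [1.0, -1.0, 2.5]}

# Step 3: readapt the domain-tuned model to instruction following.
readapted = apply_adapter(domain_tuned, adapter)
print(readapted)
```

With real models the same arithmetic would run over every tensor in the two checkpoints. A low-rank variant would additionally compress each difference tensor, which this sketch does not attempt.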
Production-Quality RAG Applications with Databricks

In December, we announced a new suite of tools to get Generative AI applications to production using Retrieval Augmented Generation (RAG). Since then, we have seen an explosion of RAG applications being built by thousands of customers on the Databricks Data Intelligence Platform. Today, we are excited to make several announcements to make it easy for enterprises to build high-quality RAG applications with native capabilities available directly in the Databricks Data Intelligence Platform, including the General Availability of Vector Search and major updates to Model Serving. The Challenge of High Quality AI Applications: As we collaborated closely with our customers to build…
Best Data Annotation Tools for Machine Learning

Introduction: Just like having a massive pile of books won't make you a genius unless you read and understand them, a mountain of data won't make a powerful AI if it's not properly labeled. (Source: https://x.com/Propellorai/status/1215594158087462912) Data annotation involves labeling data points, such as images or text, with relevant information, enabling the algorithms to learn and make sense of the patterns within the data. In simple terms, data annotation helps the algorithms distinguish between what's important and what's not with the help of labels and annotations, allowing them to make informed decisions and predictions. While annotation can be performed manually by humans,…
Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining

arXiv:2405.14908v1 Announce Type: new Abstract: Large language models exhibit exceptional generalization capabilities, primarily attributed to the utilization of diversely sourced data. However, conventional practices in integrating this diverse data heavily rely on heuristic schemes, lacking theoretical guidance. This research tackles these limitations by investigating strategies based on low-cost proxies for data mixtures, with the aim of streamlining data curation to enhance training efficiency. Specifically, we propose a unified scaling law, termed BiMix, which accurately models the bivariate scaling behaviors of both data quantity and mixing proportions. We conduct systematic experiments and provide empirical evidence for the predictive power and fundamental…
Visual Deformation Detection Using Soft Material Simulation for Pre-training of Condition Assessment Models

arXiv:2405.14877v1 Announce Type: new Abstract: This paper addresses the challenge of geometric quality assurance in manufacturing, particularly when human assessment is required. It proposes using Blender, an open-source simulation tool, to create synthetic datasets for machine learning (ML) models. The process involves translating expert information into shape key parameters to simulate deformations, generating images for both deformed and non-deformed objects. The study explores the impact of discrepancies between real and simulated environments on ML model performance and investigates the effect of different simulation backgrounds on model sensitivity. Additionally, the study aims to enhance the model's robustness to camera positioning by…
Linking In-context Learning in Transformers to Human Episodic Memory

arXiv:2405.14992v1 Announce Type: new Abstract: Understanding the connections between artificial and biological intelligent systems can reveal fundamental principles underlying general intelligence. While many artificial intelligence (AI) models have a neuroscience counterpart, such connections are largely missing in Transformer models and the self-attention mechanism. Here, we examine the relationship between attention heads and human episodic memory. We focus on the induction heads, which contribute to the in-context learning capabilities of Transformer-based large language models (LLMs). We demonstrate that induction heads are behaviorally, functionally, and mechanistically similar to the contextual maintenance and retrieval (CMR) model of human episodic memory. Our analyses of…
Data Machina #236

Mix, Bind & Merge OS Small AI Models FTW! Y2024: The year that open-source, small AI model-combos beat the big boys? The open source AI community and small AI startups are releasing a plethora of open-source, small AI models that are matching or, in some instances, even outperforming AI Titans’ huge models. I’m rooting for the open-source AI community! These new OS small models are leveraging super-efficient techniques like quantisation, fine-tuning with QLoRA, and fine-tuning with DPO. There’s a whole new range of open-source tools like LLamaFactory designed to easily and efficiently fine-tune these OS models. Using e.g. the free LM…
Accelerate GenAI App Development with New Updates to Databricks Model Serving

Last year, we launched foundation model support in Databricks Model Serving to enable enterprises to build secure and custom GenAI apps on a unified data and AI platform. Since then, thousands of organizations have used Model Serving to deploy GenAI apps customized to their unique datasets. Today, we're excited to announce new updates that make it easier to experiment, customize, and deploy GenAI apps. These updates include access to new large language models (LLMs), easier discovery, simpler customization options, and improved monitoring. Together, these improvements help you develop and scale GenAI apps more quickly and at a lower cost. Databricks Model…
YUI: Day-ahead Electricity Price Forecasting Using Invariance Simplified Supply and Demand Curve

arXiv:2405.14893v1 Announce Type: new Abstract: In the day-ahead electricity market, it is crucial for all market participants to have access to reliable and accurate price forecasts for their decision-making processes. Forecasting methods currently utilized in industrial applications frequently neglect the underlying mechanisms of price formation, while economic research from the perspective of supply and demand has stringent data collection requirements, making it difficult to apply in actual markets. Observing the characteristics of the day-ahead electricity market, we introduce two invariance assumptions to simplify the modeling of supply and demand curves. Upon incorporating the time invariance assumption, we can forecast the supply…
Precise and Robust Sidewalk Detection: Leveraging Ensemble Learning to Surpass LLM Limitations in Urban Environments

[Submitted on 2 Apr 2024] By Ibne Farabi Shihab and 3 other authors. Abstract: This study aims to compare the effectiveness of a robust ensemble model with the state-of-the-art ONE-PEACE Large Language Model (LLM) for accurate detection of sidewalks. Accurate sidewalk detection is crucial in improving road safety and urban planning. The study evaluated the model's performance on Cityscapes, Ade20k, and the Boston Dataset. The results showed that the ensemble model performed better than the individual models,…