Viral News

The Vital Role of Data Governance in Communications, Media and Entertainment

Data, analytics and AI governance is perhaps the most important yet challenging aspect of any data and AI democratization effort. For your data, analytics and AI needs, you've likely deployed two different systems — data warehouses for business intelligence and data lakes for AI. And now you've created data silos, with data moving across two systems, each with a different governance model. But data isn't limited to files or tables. You also have assets like dashboards, ML models and notebooks, each with its own permission model, making it difficult to manage access permissions for all these assets consistently. The problem gets…
Read More
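The permission sprawl the excerpt describes can be made concrete with a small sketch. Below is a minimal, hypothetical unified catalog in Python: one ACL model shared by tables, dashboards, models and notebooks, so a single check answers every access question. The class and method names are illustrative assumptions, not any vendor's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical unified access-control layer: one ACL model shared by every
# asset type (tables, dashboards, ML models, notebooks), instead of one
# permission system per tool. Names are illustrative, not a real vendor API.

@dataclass
class Asset:
    name: str
    kind: str                                   # "table" | "dashboard" | "model" | "notebook"
    grants: dict = field(default_factory=dict)  # principal -> set of privileges

class UnifiedCatalog:
    def __init__(self):
        self.assets: dict[str, Asset] = {}

    def register(self, name: str, kind: str) -> Asset:
        self.assets[name] = Asset(name, kind)
        return self.assets[name]

    def grant(self, name: str, principal: str, privilege: str) -> None:
        self.assets[name].grants.setdefault(principal, set()).add(privilege)

    def can(self, name: str, principal: str, privilege: str) -> bool:
        # One check works for every asset kind: the point of unifying governance.
        return privilege in self.assets[name].grants.get(principal, set())

catalog = UnifiedCatalog()
catalog.register("sales_forecast", "model")
catalog.register("q3_dashboard", "dashboard")
catalog.grant("sales_forecast", "analysts", "EXECUTE")
print(catalog.can("sales_forecast", "analysts", "EXECUTE"))  # True
print(catalog.can("q3_dashboard", "analysts", "VIEW"))       # False
```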
How Does Bayes Error Limit Probabilistic Robust Accuracy

arXiv:2405.14923v1 Announce Type: new Abstract: Adversarial examples pose a security threat to many critical systems built on neural networks. Given that deterministic robustness often comes with significantly reduced accuracy, probabilistic robustness (i.e., the probability of having the same label within a vicinity is $\ge 1-\kappa$) has been proposed as a promising way of achieving robustness whilst maintaining accuracy. However, existing training methods for probabilistic robustness still experience non-trivial accuracy loss. It is unclear whether there is an upper bound on the accuracy when optimising towards probabilistic robustness, and whether there is a certain relationship between $\kappa$ and this bound. This…
Read More
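The probabilistic-robustness definition quoted above ($\ge 1-\kappa$ label agreement within a vicinity) is easy to instantiate with Monte Carlo sampling. The sketch below assumes a generic classifier `predict` and an L-infinity vicinity; both are stand-ins for illustration, not the paper's setup.

```python
import numpy as np

# Minimal Monte Carlo sketch of the probabilistic-robustness notion in the
# abstract: an input x is probabilistically robust if the fraction of points
# in a vicinity of x that keep the same label is >= 1 - kappa.
# `predict` is a toy stand-in for any classifier returning integer labels.

def predict(x: np.ndarray) -> int:
    # Toy classifier: label by the sign of the first coordinate.
    return int(x[0] > 0)

def prob_robust(x, radius=0.1, kappa=0.05, n_samples=1000, rng=None):
    rng = rng or np.random.default_rng(0)
    base = predict(x)
    # Sample uniformly from an L-infinity ball of the given radius around x.
    noise = rng.uniform(-radius, radius, size=(n_samples, x.shape[0]))
    agree = sum(predict(x + d) == base for d in noise)
    return agree / n_samples >= 1 - kappa

print(prob_robust(np.array([0.5, -0.2])))   # True: deep inside the decision region
print(prob_robust(np.array([0.02, 0.0])))   # False: too close to the boundary
```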
DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models

Read More
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings

Read More
Building High-Quality and Trusted Data Products with Databricks

Organizations aiming to become AI- and data-driven often need to provide their internal teams with high-quality and trusted data products. Building such data products ensures that organizations establish standards and a trustworthy foundation of business truth for their data and AI objectives. One approach for putting quality and usability at the forefront is the data mesh paradigm, which democratizes the ownership and management of data assets. Our blog posts (Part 1, Part 2) offer guidance on how customers can leverage Databricks in their enterprise to address data mesh's foundational pillars, one of which is "data as…
Read More
AnalogCoder: Analog Circuit Design via Training-Free Code Generation

arXiv:2405.14918v1 Announce Type: new Abstract: Analog circuit design is a significant task in modern chip technology, focusing on the selection of component types, connectivity, and parameters to ensure proper circuit functionality. Despite advances made by Large Language Models (LLMs) in digital circuit design, the complexity and scarcity of data in analog circuitry pose significant challenges. To mitigate these issues, we introduce AnalogCoder, the first training-free LLM agent for designing analog circuits through Python code generation. Firstly, AnalogCoder incorporates a feedback-enhanced flow with tailored domain-specific prompts, enabling the automated and self-correcting design of analog circuits with a high success rate. Secondly,…
Read More
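The feedback-enhanced, self-correcting flow the abstract mentions can be sketched as a simple generate-check-retry loop. In the sketch below, `call_llm` and `run_circuit_code` are hypothetical stubs for an LLM client and a simulator harness; the loop structure, not the stubs, is the point.

```python
# Hedged sketch of the feedback-enhanced, self-correcting loop described in the
# AnalogCoder abstract: an LLM drafts circuit-building Python code, a checker
# runs it, and failures are fed back into the prompt for another attempt.
# `call_llm` and `run_circuit_code` are hypothetical stand-ins, not the
# paper's actual implementation.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def run_circuit_code(code: str) -> tuple[bool, str]:
    # Would execute the generated code against a circuit simulator and
    # return (success, error_message).
    raise NotImplementedError("plug in a simulator harness here")

def design_circuit(task: str, max_rounds: int = 5):
    prompt = (
        "You are an analog circuit designer. Write Python code that builds "
        f"a netlist for the following task:\n{task}\n"
    )
    for _ in range(max_rounds):
        code = call_llm(prompt)
        ok, error = run_circuit_code(code)
        if ok:
            return code
        # Feedback-enhanced step: append the failure so the next attempt
        # can self-correct.
        prompt += f"\nThe previous attempt failed with:\n{error}\nFix the code.\n"
    return None  # no working design within the retry budget
```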
Dissecting Query-Key Interaction in Vision Transformers

arXiv:2405.14880v1 Announce Type: new Abstract: Self-attention in vision transformers has been thought to perform perceptual grouping where tokens attend to other tokens with similar embeddings, which could correspond to semantically similar features in an image. However, contextualization is also an important and necessary computation for processing signals. Contextualization potentially requires tokens to attend to dissimilar tokens such as those corresponding to backgrounds or different objects, but this effect has not been reported in previous studies. In this study, we investigate whether self-attention in vision transformers exhibits a preference for attending to similar tokens or dissimilar tokens, providing evidence of perceptual…
Read More
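The question the abstract raises, whether attention favours similar or dissimilar tokens, can be probed by correlating one head's attention map with plain token-to-token cosine similarity. The sketch below uses random tensors and a single head purely for illustration; it is not the paper's exact analysis.

```python
import torch

# Sketch of the question the abstract poses: do tokens attend to tokens with
# similar embeddings? Given queries Q and keys K from one attention head, we
# compare the attention map against raw token-token cosine similarity.

def attention_vs_similarity(x: torch.Tensor, w_q: torch.Tensor, w_k: torch.Tensor):
    # x: (tokens, dim) token embeddings; w_q, w_k: (dim, head_dim) projections.
    q, k = x @ w_q, x @ w_k
    attn = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)   # (tokens, tokens)
    sim = torch.nn.functional.cosine_similarity(
        x.unsqueeze(1), x.unsqueeze(0), dim=-1
    )                                                            # (tokens, tokens)
    # Correlate off-diagonal entries: positive -> attention favours similar
    # tokens (grouping); negative -> dissimilar tokens (contextualization).
    mask = ~torch.eye(x.shape[0], dtype=torch.bool)
    corr = torch.corrcoef(torch.stack([attn[mask], sim[mask]]))[0, 1]
    return corr

torch.manual_seed(0)
x = torch.randn(16, 32)
w_q, w_k = torch.randn(32, 8), torch.randn(32, 8)
print(attention_vs_similarity(x, w_q, w_k))
```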
Extracting Prompts by Inverting LLM Outputs

arXiv:2405.15012v1 Announce Type: new Abstract: We consider the problem of language model inversion: given outputs of a language model, we seek to extract the prompt that generated these outputs. We develop a new black-box method, output2prompt, that learns to extract prompts without access to the model's logits and without adversarial or jailbreaking queries. In contrast to previous work, output2prompt only needs outputs of normal user queries. To improve memory efficiency, output2prompt employs a new sparse encoding technique. We measure the efficacy of output2prompt on a variety of user and system prompts and demonstrate zero-shot transferability across different LLMs.
Read More
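The black-box setup described above, learning to map a model's outputs back to the prompt that produced them, amounts to training a seq2seq inverter on (output, prompt) pairs. The sketch below uses t5-small as the inverter purely as an assumption; the paper's architecture and its sparse encoding are not reproduced here.

```python
# Hedged sketch of the black-box setup in the output2prompt abstract: collect
# ordinary (output -> prompt) pairs from a target LLM, then train a seq2seq
# "inverter" that maps outputs back to the hidden prompt. No logits and no
# adversarial queries are needed, only normal outputs. The model choice is an
# assumption, not the paper's implementation.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")
inverter = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def training_step(outputs: list, prompts: list) -> torch.Tensor:
    # Inputs are the target model's *outputs*; labels are the prompts.
    enc = tok(outputs, return_tensors="pt", padding=True, truncation=True)
    labels = tok(prompts, return_tensors="pt", padding=True, truncation=True).input_ids
    loss = inverter(**enc, labels=labels).loss
    loss.backward()  # an optimizer step would follow in a real loop
    return loss

# One toy pair: the inverter learns to recover the prompt from the output.
loss = training_step(
    outputs=["Paris is the capital of France."],
    prompts=["What is the capital of France?"],
)
print(float(loss))
```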
Pushing the Boundaries of Innovation with Data and AI: Announcing the 2024 Finalists of the Databricks Data Team Transformation Award

The Data Team Awards celebrate the essential role enterprise data teams play in helping businesses across sectors face their most pressing challenges. With more than 200 nominations submitted, the resulting finalists showcase impressive innovation in data and artificial intelligence. As we near the Data + AI Summit, Databricks is excited to share the success stories from all of our nominees. With the Transformation Award for 2024 — a distinguished honor designed to recognize teams that have gone above and beyond to transform their organizations through data and AI — we celebrate the architects of change driving this transformation at an unprecedented scale. Reimagining the…
Read More
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

arXiv:2405.14917v1 Announce Type: new Abstract: Large language models (LLMs) achieve remarkable performance in natural language understanding but require substantial computation and memory resources. Post-training quantization (PTQ) is a powerful compression technique extensively investigated in LLMs. However, existing PTQ methods are still not ideal in terms of accuracy and efficiency, especially at bit-widths below 4. Standard PTQ methods using group-wise quantization struggle to quantize LLMs accurately at such low bit-widths, while advanced methods that retain high-precision weights element-wise find it hard to realize their theoretical hardware efficiency. This paper presents a Salience-Driven Mixed-Precision Quantization scheme for LLMs, namely SliM-LLM. The scheme exploits the…
Read More
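The salience-driven mixed-precision idea in the abstract can be sketched in a few lines: score weight groups by a salience proxy, then assign the most salient groups a higher bit-width. The magnitude-based salience metric, group size and 2/4-bit budget below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

# Minimal sketch of the salience-driven, group-wise idea in the SliM-LLM
# abstract: score weight groups by a salience proxy and give the most salient
# groups more bits. Metric and budget here are illustrative assumptions.

def allocate_bits(weights: np.ndarray, group_size=128, bits=(2, 4), frac_high=0.25):
    w = weights.reshape(-1, group_size)
    salience = np.abs(w).mean(axis=1)          # salience proxy: mean |weight| per group
    cutoff = np.quantile(salience, 1 - frac_high)
    return np.where(salience >= cutoff, bits[1], bits[0])  # bits per group

def quantize_group(g: np.ndarray, n_bits: int) -> np.ndarray:
    # Plain symmetric uniform quantization within one group.
    scale = np.abs(g).max() / (2 ** (n_bits - 1) - 1)
    levels = np.round(g / scale).clip(-(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
    return levels * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=4096)
bit_plan = allocate_bits(weights)              # 32 groups of 128 weights
groups = weights.reshape(-1, 128)
dequant = np.concatenate([quantize_group(g, b) for g, b in zip(groups, bit_plan)])
print("mean bits:", bit_plan.mean(), "MSE:", np.mean((weights - dequant) ** 2))
```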