Viral News

Explainable machine learning multi-label classification of Spanish legal judgements


[Submitted on 27 May 2024] By Francisco de Arriba-Pérez and 3 other authors. Abstract: Artificial Intelligence techniques such as Machine Learning (ML) have not been exploited to their full potential in the legal domain, partly because of the insufficient explanations they provide for their decisions. Automatic expert systems with explanatory capabilities can be especially useful when legal practitioners search jurisprudence to gather contextual knowledge for their cases. Therefore, we propose a hybrid system that applies ML for multi-label…
Read More
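The paper above frames legal-topic tagging as multi-label classification: each judgement can carry several labels at once, so prediction reduces to independent per-label binary decisions. A minimal numpy sketch of that idea, with toy weights of our own (not the paper's model):

```python
import numpy as np

# Hypothetical sketch: multi-label prediction as independent per-label
# binary decisions over document feature vectors. W (labels x features)
# and b are assumed already-trained weights, invented here for illustration.
def predict_multilabel(X, W, b, threshold=0.5):
    """Return a binary label-indicator matrix: one column per legal topic."""
    scores = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))  # per-label sigmoid
    return (scores >= threshold).astype(int)

# Toy example: 2 documents, 3 features, 2 labels.
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])
W = np.array([[ 1.0, -1.0, 0.5],   # label 0 weights
              [-1.0,  2.0, 0.0]])  # label 1 weights
b = np.array([0.0, -0.5])
print(predict_multilabel(X, W, b))  # each document may get 0, 1, or many labels
```

Unlike multi-class classification, the per-label sigmoids are not normalized against each other, so a document can legitimately receive several labels or none.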
How Open Will Snowflake Go at Data Cloud Summit?


Snowflake is holding its Data Cloud Summit 24 conference next week, and the company is expected to make a slew of announcements, which you will be able to find on these Datanami pages. Among the most closely watched questions: how far will Snowflake go in embracing the Apache Iceberg table format and opening itself up to outside query engines? And is it possible that Snowflake will try to "out-open" its rival Databricks, whose conference is the following week? Snowflake has evolved considerably since it burst onto the scene a handful of years ago as a cloud data…
Read More
On margin-based generalization prediction in deep neural networks


[Submitted on 20 May 2024] By Coenraad Mouton. Abstract: Understanding generalization in deep neural networks is an active area of research. A promising avenue of exploration has been that of margin measurements: the shortest distance to the decision boundary for a given sample, or for that sample's representation internal to the network. Margin-based complexity measures have been shown to correlate with the generalization ability of deep neural networks in some circumstances but not others. The reasons behind the success or failure of these metrics…
Read More
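For a linear classifier the margin described above has a closed form: the signed distance from a sample to the decision boundary f(x) = w·x + b = 0 is (w·x + b) / ||w||. A small sketch with made-up weights (the paper studies deep networks, where margins must be estimated; the linear case just illustrates the quantity):

```python
import numpy as np

# Signed distance from sample x to the decision boundary of a linear
# binary classifier f(x) = w.x + b. Positive margin: correct side,
# far from the boundary; near-zero margin: sample is borderline.
def sample_margin(x, w, b):
    return (np.dot(w, x) + b) / np.linalg.norm(w)

w = np.array([3.0, 4.0])   # ||w|| = 5
x = np.array([1.0, 1.0])
print(sample_margin(x, w, b=3.0))  # (3 + 4 + 3) / 5 = 2.0
```

Margin-based generalization predictors aggregate such per-sample distances (often at hidden layers) into a single complexity measure for the whole network.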
How to train your ViT for OOD Detection


arXiv:2405.17447v1 Abstract: Vision Transformers have been shown to be powerful out-of-distribution detectors for ImageNet-scale settings when fine-tuned from publicly available checkpoints, often outperforming other model types on popular benchmarks. In this work, we investigate the impact of both the pretraining and finetuning scheme on the performance of ViTs on this task by analyzing a large pool of models. We find that the exact type of pretraining has a strong impact on which method works well and on OOD detection performance in general. We further show that certain training schemes might only be effective for a specific type of…
Read More
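One of the simplest post-hoc OOD detection methods the literature benchmarks ViTs with is maximum softmax probability (MSP): confident, peaked predictions score as in-distribution, while near-uniform logits score as out-of-distribution. A minimal numpy sketch with toy logits of our own:

```python
import numpy as np

# Maximum softmax probability (MSP) OOD score: higher means the model is
# more confident, i.e. more likely in-distribution; OOD is flagged when
# the score falls below a chosen threshold.
def msp_ood_score(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

in_dist = np.array([8.0, 0.1, 0.2])   # one class clearly dominates
ood     = np.array([1.0, 1.1, 0.9])   # near-uniform logits
print(msp_ood_score(in_dist) > msp_ood_score(ood))  # True
```

The paper's point is that how the ViT was pretrained and fine-tuned changes which of these scoring methods (MSP, Mahalanobis distance, etc.) works best.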
Incremental Comprehension of Garden-Path Sentences by Large Language Models: Semantic Interpretation, Syntactic Re-Analysis, and Attention


arXiv:2405.16042v1 Abstract: When reading temporarily ambiguous garden-path sentences, misinterpretations sometimes linger past the point of disambiguation. This phenomenon has traditionally been studied in psycholinguistic experiments using online measures such as reading times and offline measures such as comprehension questions. Here, we investigate the processing of garden-path sentences and the fate of lingering misinterpretations using four large language models (LLMs): GPT-2, LLaMA-2, Flan-T5, and RoBERTa. The overall goal is to evaluate whether humans and LLMs are aligned in their processing of garden-path sentences and in the lingering misinterpretations past the point of disambiguation, especially when extra-syntactic information (e.g.,…
Read More
Building a Strong AI Foundation: The Critical Role of High-Quality Data


Whether it's manufacturing and supply chain management or the healthcare industry, Artificial Intelligence (AI) has the power to revolutionize operations. AI can boost efficiency, personalize customer experiences, and spark innovation. That said, getting reliable, actionable results from any AI process hinges on the quality of the data it is fed. Let's take a closer look at what's needed to prepare your data for AI-driven success. How Does Data Quality Impact AI Systems? Using poor-quality data can result in expensive, embarrassing mistakes, like the time Air Canada's chatbot gave a grieving customer incorrect information. In areas like healthcare, using…
Read More
CataLM: Empowering Catalyst Design Through Large Language Models


arXiv:2405.17440v1 Abstract: The field of catalysis holds paramount importance in shaping the trajectory of sustainable development, prompting intensive research efforts to leverage artificial intelligence (AI) in catalyst design. Presently, the fine-tuning of open-source large language models (LLMs) has yielded significant breakthroughs across various domains such as biology and healthcare. Drawing inspiration from these advancements, we introduce CataLM (Catalytic Language Model), a large language model tailored to the domain of electrocatalytic materials. Our findings demonstrate that CataLM exhibits remarkable potential for facilitating human-AI collaboration in catalyst knowledge exploration and design. To the best of our knowledge, CataLM stands…
Read More
Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network


arXiv:2405.17444v1 Abstract: In this paper, we explore the feasibility of using a transformer-based spatiotemporal attention network (STAN) for gradient-based time-series explanations. First, we trained the STAN model for video classification using global and local views of the data and weakly supervised labels on time-series data (i.e., the type of activity). We then leveraged a gradient-based XAI technique (e.g., a saliency map) to identify salient frames of time-series data. In experiments on datasets of four medically relevant activities, the STAN model demonstrated its potential to identify important frames of videos.
Read More
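The gradient-based saliency idea the abstract mentions attributes the model's score back to each input frame via the gradient of the output with respect to that frame. As an illustrative toy (our own linear stand-in, not the STAN model, where the gradient of f(x) = Σ_t w_t·x_t with respect to frame t is simply w_t):

```python
import numpy as np

# Toy gradient-x-input saliency for a linear per-frame scorer.
# For f(x) = sum_t w_t . x_t, d f / d x_t = w_t, so the saliency of
# frame t is taken as sum(|w_t * x_t|) — the gradient-times-input score.
def frame_saliency(frames, W):
    grads = W                            # analytic gradient for the linear model
    return np.abs(grads * frames).sum(axis=1)

frames = np.array([[0.2, 0.1],   # frame 0
                   [1.0, 2.0],   # frame 1 carries most of the signal
                   [0.1, 0.0]])  # frame 2
W      = np.array([[0.5, 0.5],
                   [1.0, 1.0],
                   [0.5, 0.5]])
sal = frame_saliency(frames, W)
print(int(np.argmax(sal)))  # frame 1 dominates the score
```

In a real network the gradient is obtained by backpropagation rather than read off analytically, but the per-frame attribution step is the same.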
Evaluating the Adversarial Robustness of Retrieval-Based In-Context Learning for Large Language Models


arXiv:2405.15984v1 Abstract: With the emergence of large language models such as LLaMA and OpenAI GPT-3, In-Context Learning (ICL) gained significant attention due to its effectiveness and efficiency. However, ICL is very sensitive to the choice, order, and verbaliser used to encode the demonstrations in the prompt. Retrieval-Augmented ICL methods try to address this problem by leveraging retrievers to extract semantically related examples as demonstrations. While this approach yields more accurate results, its robustness against various types of adversarial attacks, including perturbations on test samples, demonstrations, and retrieved data, remains under-explored. Our study reveals that retrieval-augmented models can…
Read More
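The retrieval step in retrieval-augmented ICL typically embeds the test query and a pool of candidate demonstrations, then selects the top-k most similar candidates for the prompt. A minimal cosine-similarity sketch with toy embeddings of our own (real systems use learned dense retrievers):

```python
import numpy as np

# Select the k pool examples most cosine-similar to the query embedding,
# to be used as in-context demonstrations. Embeddings here are toy 2-d
# vectors standing in for learned sentence embeddings.
def retrieve_demonstrations(query_vec, pool_vecs, k=2):
    q = query_vec / np.linalg.norm(query_vec)
    P = pool_vecs / np.linalg.norm(pool_vecs, axis=1, keepdims=True)
    sims = P @ q                      # cosine similarity to each pool example
    return np.argsort(-sims)[:k]      # indices of the k nearest examples

pool  = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.9, 0.1]])
query = np.array([1.0, 0.05])
print(retrieve_demonstrations(query, pool).tolist())  # [0, 2]
```

This is exactly the surface the paper probes: perturbing the query, the pool, or the retrieved demonstrations can change which examples land in the prompt and thus the model's answer.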