On Evaluating Explanation Utility for Human-AI Decision Making in NLP

arXiv:2407.03545v1. Abstract: Is explainability a false promise? This debate has emerged from the insufficient evidence that explanations aid people in situations they are introduced for. More human-centered, application-grounded evaluations of explanations are needed to settle this. Yet, with no established guidelines for such studies in NLP, researchers accustomed to standardized proxy evaluations must discover appropriate measurements, tasks, datasets, and sensible models for human-AI teams in their studies. To help with this, we first review fitting existing metrics. We then establish requirements for datasets to be suitable for application-grounded evaluations. Among over 50 datasets available for explainability research…
AI-Powered Super Soldiers Are More Than Just a Pipe Dream

The day is slowly turning into night, and the American special operators are growing concerned. They are deployed to a densely populated urban center in a politically volatile region, and local activity has grown increasingly frenetic in recent days, the roads and markets overflowing with more than the normal bustle of city life. Intelligence suggests the threat level in the city is high, but the specifics are vague, and the team needs to maintain a low profile—a firefight could bring known hostile elements down upon them. To assess potential threats, the Americans decide to take a more cautious approach. Eschewing…
Nothing’s budget-friendly brand CMF announced three new products, including a $200 smartphone

CMF, a budget-friendly sub-brand Nothing announced last August, has officially unveiled a trio of new products. There’s a smartphone, a watch and earbuds, all of which seem to be modest in both price and features. Let’s start with the smartphone. Nothing made a splash with its original smartphone, the Nothing Phone 1, and the appropriately named CMF Phone 1 hopes to follow suit. It wouldn’t be a Nothing-adjacent product without some quirky design elements, and the CMF Phone 1 certainly has its share. The back cover is interchangeable, so users can swap out to different colors and designs on the…
I’m an American mom living in Denmark. Here families take a long summer vacation, and I’m still getting used to that.

I'm an American living in Denmark, and for the first time I am having to play by the cultural and legal rules of Danish summer vacation. Let me explain. My oldest child just turned 6, which is when you start school in Denmark, as opposed to 5 years old in the US. If you have preschool-age kids, there is always a variation of "summer day care" where institutions join together, and you can be flexible with the weeks you take. When your child starts attending public school, schools close down during the summer period. While it's up to the individual schools…
Taming the State Beast: React, TypeScript, and the Power of Redux

In the dynamic world of React applications, managing the application's state effectively can be a real challenge. As applications grow, components multiply, and data flows become intricate, maintaining order and predictability in how your application stores and accesses data becomes paramount. This is where state management solutions come in, and one of the most popular and robust libraries for this purpose is Redux. When combined with the type safety of TypeScript, you have a powerful toolkit for building scalable and maintainable React applications.

Why State Management Matters: The Challenges of Shared State

React's component-based architecture encourages developers to think in…
Decision-Focused Evaluation of Worst-Case Distribution Shift

arXiv:2407.03557v1. Abstract: Distribution shift is a key challenge for predictive models in practice, creating the need to identify potentially harmful shifts in advance of deployment. Existing work typically defines these worst-case shifts as ones that most degrade the individual-level accuracy of the model. However, when models are used to make a downstream population-level decision like the allocation of a scarce resource, individual-level accuracy may be a poor proxy for performance on the task at hand. We introduce a novel framework that employs a hierarchical model structure to identify worst-case distribution shifts in predictive resource allocation settings by…
Lift, Splat, Map: Lifting Foundation Masks for Label-Free Semantic Scene Completion

arXiv:2407.03425v1. Abstract: Autonomous mobile robots deployed in urban environments must be context-aware, i.e., able to distinguish between different semantic entities, and robust to occlusions. Current approaches like semantic scene completion (SSC) require pre-enumerating the set of classes and costly human annotations, while representation learning methods relax these assumptions but are not robust to occlusions and learn representations tailored towards auxiliary tasks. To address these limitations, we propose LSMap, a method that lifts masks from visual foundation models to predict a continuous, open-set semantic and elevation-aware representation in bird's eye view (BEV) for the entire scene, including regions…
Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias

arXiv:2407.03536v1. Abstract: The rapid growth of Large Language Models (LLMs) has made the study of biases a crucial field. It is important to assess the influence of different types of biases embedded in LLMs to ensure fair use in sensitive fields. Although there has been extensive work on bias assessment in English, such efforts are scarce for a major language like Bangla. In this work, we examine two types of social biases in LLM-generated outputs for the Bangla language. Our main contributions in this work are: (1) bias studies on two different social…