Viral News

Improved GUI Grounding via Iterative Narrowing

arXiv:2411.13591v1. Abstract: GUI grounding, the task of identifying a precise location on an interface image from a natural language query, plays a crucial role in enhancing the capabilities of Vision-Language Model (VLM) agents. While general VLMs, such as GPT-4V, demonstrate strong performance across various tasks, their proficiency in GUI grounding remains suboptimal. Recent studies have focused on fine-tuning these models specifically for one-shot GUI grounding, yielding significant improvements over baseline performance. We introduce a visual prompting framework called Iterative Narrowing (IN) to further enhance the performance of both general and fine-tuned models in GUI grounding. For evaluation,…
Explaining GPT-4’s Schema of Depression Using Machine Behavior Analysis

[Submitted on 21 Nov 2024] Authors: Adithya V Ganesan, Vasudha Varadarajan, Yash Kumar Lal, Veerle C. Eijsbroek, Katarina Kjell, Oscar N.E. Kjell, Tanuja Dhanasekaran, Elizabeth C. Stade, Johannes C. Eichstaedt, Ryan L. Boyd, H. Andrew Schwartz, Lucie Flek. Abstract: Use of large language models such as ChatGPT (GPT-4) for mental health support has grown rapidly, emerging as a promising route to assess and help people with mood disorders, like depression. However, we have a…
Replicable Online Learning

arXiv:2411.13730v1. Abstract: We investigate the concept of algorithmic replicability introduced by Impagliazzo et al. (2022), Ghazi et al. (2021), and Ahn et al. (2024) in an online setting. In our model, the input sequence received by the online learner is generated from time-varying distributions chosen by an adversary (obliviously). Our objective is to design low-regret online algorithms that, with high probability, produce the exact same sequence of actions when run on two independently sampled input sequences generated as described above. We refer to such algorithms as adversarially replicable. Previous works (such as Esfandiari et al. 2022) explored replicability…
RadPhi-3: Small Language Models for Radiology

arXiv:2411.13604v1. Abstract: LLM-based copilot assistants are useful in everyday tasks. There is a proliferation in the exploration of AI assistant use cases to support radiology workflows in a reliable manner. In this work, we present RadPhi-3, a Small Language Model instruction-tuned from Phi-3-mini-4k-instruct with 3.8B parameters to assist with various tasks in radiology workflows. While impression summary generation has been the primary task explored in prior works on radiology reports of chest X-rays, we also explore other useful tasks like change summary generation comparing the current radiology report and its prior report,…
NewsInterview: a Dataset and a Playground to Evaluate LLMs’ Ground Gap via Informational Interviews

[Submitted on 21 Nov 2024] By Michael Lu and 4 other authors. Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities in generating coherent text but often struggle with grounding language and strategic dialogue. To address this gap, we focus on journalistic interviews, a domain rich in grounding communication and abundant in data. We curate a dataset of 40,000 two-person informational interviews from NPR and CNN, and reveal that LLMs are significantly less likely than human…
Exploring Large Language Models for Climate Forecasting

arXiv:2411.13724v1. Abstract: With the increasing impacts of climate change, there is a growing demand for accessible tools that can provide reliable future climate information to support planning, finance, and other decision-making applications. Large language models (LLMs), such as GPT-4, present a promising approach to bridging the gap between complex climate data and the general public, offering a way for non-specialist users to obtain essential climate insights through natural language interaction. However, an essential challenge remains under-explored: evaluating the ability of LLMs to provide accurate and reliable future climate predictions, which is crucial for applications that rely on…
Enhancing Bidirectional Sign Language Communication: Integrating YOLOv8 and NLP for Real-Time Gesture Recognition & Translation

[Submitted on 18 Nov 2024] By Hasnat Jamil Bhuiyan and 4 other authors. Abstract: The primary concern of this research is to capture American Sign Language (ASL) data from real-time camera footage and convert it into text. In addition, we focus on creating a framework that can also convert text into sign language in real time, which can help us break the language barrier…
Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese

[Submitted on 20 Nov 2024 (v1), last revised 21 Nov 2024 (this version, v2)] By Dat Van-Thanh Nguyen and 3 other authors. Abstract: Natural Language Inference (NLI) is a task within Natural Language Processing (NLP) that holds value for various AI applications. However, there have been limited studies on Natural Language Inference in Vietnamese that explore the concept of joint models. Therefore, we conducted experiments using various combinations of contextualized language models (CLM) and neural networks.…
The Digital Transformation in Health: How AI Can Improve the Performance of Health Systems

[Submitted on 24 Sep 2024 (v1), last revised 21 Nov 2024 (this version, v2)] By África Periáñez and 6 other authors. Abstract: Mobile health has the potential to revolutionize health care delivery and patient engagement. In this work, we discuss how integrating Artificial Intelligence into digital health applications (focused on supply chain, patient management, and capacity building, among other use cases) can improve the health system and public health performance. We present an Artificial Intelligence and Reinforcement Learning platform that…
FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting

[Submitted on 18 Nov 2024 (v1), last revised 21 Nov 2024 (this version, v2)] By Fangyu Wu and 1 other author. Abstract: In the real world, objects reveal internal textures when sliced or cut, yet this behavior is not well-studied in 3D generation tasks today. For example, slicing a virtual 3D watermelon should reveal flesh and seeds. Given that no available dataset captures an object's full internal structure and collecting data from all slices is impractical, generative methods become the obvious…