SituationalLLM: Proactive Language Models with Scene Awareness for Dynamic, Contextual Task Guidance

stp2y, January 22, 2025

[Submitted on 19 Jun 2024 (v1), last revised 20 Jan 2025 (this version, v2)]

Authors: Muhammad Saif Ullah Khan and Didier Stricker


Abstract: Large language models (LLMs) have achieved remarkable success in text-based tasks but often struggle to provide actionable guidance in real-world physical environments. This is because of their inability to recognize their limited understanding of the user’s physical context. We present SituationalLLM, a novel approach that integrates structured scene information into an LLM to deliver proactive, context-aware assistance. By encoding objects, attributes, and relationships in a custom Scene Graph Language, SituationalLLM actively identifies gaps in environmental context and seeks clarifications during user interactions. This behavior emerges from training on the Situational Awareness Database for Instruct-Tuning (SAD-Instruct), which combines diverse, scenario-specific scene graphs with iterative, dialogue-based refinements. Experimental results indicate that SituationalLLM outperforms generic LLM baselines in task specificity, reliability, and adaptability, paving the way for environment-aware AI assistants capable of delivering robust, user-centric guidance under real-world constraints.

Submission history

From: Muhammad Saif Ullah Khan
[v1] Wed, 19 Jun 2024 07:42:48 UTC (3,201 KB)
[v2] Mon, 20 Jan 2025 14:34:42 UTC (3,592 KB)

