Viral News

IFShip: A Large Vision-Language Model for Interpretable Fine-grained Ship Classification via Domain Knowledge-Enhanced Instruction Tuning

arXiv:2408.06631v1. Abstract: End-to-end interpretation is currently the prevailing paradigm for the remote sensing fine-grained ship classification (RS-FGSC) task. However, its inference process is uninterpretable, leading to criticism as a black-box model. To address this issue, we propose a large vision-language model (LVLM) named IFShip for interpretable fine-grained ship classification. Unlike traditional methods, IFShip excels in interpretability by accurately conveying the reasoning process of FGSC in natural language. Specifically, we first design a domain knowledge-enhanced Chain-of-Thought (COT) prompt generation mechanism. This mechanism is used to semi-automatically construct a task-specific instruction-following dataset named TITANIC-FGS, which emulates human-like logical decision-making.…
Leveraging Priors via Diffusion Bridge for Time Series Generation

arXiv:2408.06672v1. Abstract: Time series generation is widely used in real-world applications such as simulation, data augmentation, and hypothesis testing techniques. Recently, diffusion models have emerged as the de facto approach for time series generation, emphasizing diverse synthesis scenarios based on historical or correlated time series data streams. Since time series have unique characteristics, such as fixed time order and data scaling, the standard Gaussian prior might be ill-suited for general time series generation. In this paper, we exploit the usage of diverse prior distributions for synthesis. Then, we propose TimeBridge, a framework that enables flexible synthesis by leveraging…
ActiveNeRF: Learning Accurate 3D Geometry by Active Pattern Projection

arXiv:2408.06592v1. Abstract: NeRFs have achieved incredible success in novel view synthesis. However, the accuracy of the implicit geometry is unsatisfactory because the passive static environmental illumination has low spatial frequency and cannot provide enough information for accurate geometry reconstruction. In this work, we propose ActiveNeRF, a 3D geometry reconstruction framework, which improves the geometry quality of NeRF by actively projecting patterns of high spatial frequency onto the scene using a projector which has a constant relative pose to the camera. We design a learnable active pattern rendering pipeline which jointly learns the scene geometry and the active…
Generalized knowledge-enhanced framework for biomedical entity and relation extraction

Is the GenAI Bubble Finally Popping?

Doubt is creeping into discussion over generative AI, as industry analysts begin to publicly question whether the huge investments in GenAI will ever pay off. The lack of a “killer app” besides coding co-pilots and chatbots is the most pressing concern, critics in a Goldman Sachs Research letter say, while data availability, chip shortages, and power concerns also provide headwinds. However, many remain bullish on the long-term prospects of GenAI for business and society. The amount of sheer, unadulterated hype layered onto GenAI over the past year and a half certainly caught the attention of seasoned tech journalists,…
RW-NSGCN: A Robust Approach to Structural Attacks via Negative Sampling

arXiv:2408.06665v1. Abstract: Node classification using Graph Neural Networks (GNNs) has been widely applied in various practical scenarios, such as predicting user interests and detecting communities in social networks. However, recent studies have shown that graph-structured networks often contain potential noise and attacks, in the form of topological perturbations and weight disturbances, which can lead to decreased classification performance in GNNs. To improve the robustness of the model, we propose a novel method: Random Walk Negative Sampling Graph Convolutional Network (RW-NSGCN). Specifically, RW-NSGCN integrates the Random Walk with Restart (RWR) and PageRank (PGR) algorithms for negative sampling and…
HDRGS: High Dynamic Range Gaussian Splatting

arXiv:2408.06543v1. Abstract: Recent years have witnessed substantial advancements in the field of 3D reconstruction from 2D images, particularly following the introduction of the neural radiance field (NeRF) technique. However, reconstructing a 3D high dynamic range (HDR) radiance field, which aligns more closely with real-world conditions, from 2D multi-exposure low dynamic range (LDR) images continues to pose significant challenges. Approaches to this issue fall into two categories: grid-based and implicit-based. Implicit methods, using multi-layer perceptrons (MLPs), face inefficiencies, limited solvability, and overfitting risks. Conversely, grid-based methods require significant memory and struggle with image quality and long training times.…
A Perspective on Large Language Models, Intelligent Machines, and Knowledge Acquisition

arXiv:2408.06598v1. Abstract: Large Language Models (LLMs) are known for their remarkable ability to generate synthesized 'knowledge', such as text documents, music, images, etc. However, there is a huge gap between LLMs' and human capabilities for understanding abstract concepts and reasoning. We discuss these issues in a larger philosophical context of human knowledge acquisition and the Turing test. In addition, we illustrate the limitations of LLMs by analyzing GPT-4 responses to questions ranging from science and math to common sense reasoning. These examples show that GPT-4 can often imitate human reasoning, even though it lacks understanding. However, LLM…
COD: Learning Conditional Invariant Representation for Domain Adaptation Regression

arXiv:2408.06638v1. Abstract: Aiming to generalize the label knowledge from a source domain with continuous outputs to an unlabeled target domain, Domain Adaptation Regression (DAR) is developed for complex practical learning problems. However, due to the continuity problem in regression, existing conditional distribution alignment theory and methods with a discrete prior, which are proven to be effective in classification settings, are no longer applicable. In this work, focusing on the feasibility problems in DAR, we establish the sufficiency theory for the regression model, which shows the generalization error can be sufficiently dominated by the cross-domain conditional discrepancy. Further, to…
Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset

arXiv:2408.06507v1. Abstract: Proximally-sensed laser scanning offers significant potential for automated forest data capture, but challenges remain in automatically identifying tree species without additional ground data. Deep learning (DL) shows promise for automation, yet progress is slowed by the lack of large, diverse, openly available labeled datasets of single tree point clouds. This has impacted the robustness of DL models and the ability to establish best practices for species classification. To overcome these challenges, the FOR-species20K benchmark dataset was created, comprising over 20,000 tree point clouds from 33 species, captured using terrestrial (TLS), mobile (MLS), and drone laser…