Viral News

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

[Submitted on 9 Nov 2023 (v1), last revised 19 Nov 2024 (this version, v2)]
Authors: Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, Ting Liu
Abstract: The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP), fueling a paradigm shift in information acquisition. Nevertheless, LLMs are prone to hallucination, generating plausible…

Pentaho President Maggie Laird Discusses Company’s Return to Data Management Roots

The AI boom is forcing companies to come to grips with an uncomfortable reality: Their data is not well managed. That’s good news for data intelligence companies like Pentaho, which is refocusing its efforts on data management and governance under Maggie Laird, who was promoted to president of the Hitachi-Vantara subsidiary in April. Laird recently joined the Big Data Debrief to chat about the impact of AI on data management, the massive opportunity it poses for data intelligence, and the new direction she is leading Pentaho to monetize that opportunity and help customers come to grips with their big data…

Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation

[Submitted on 18 Nov 2024 (v1), last revised 19 Nov 2024 (this version, v2)]
Authors: Rüveyda Yilmaz and 2 other authors
Abstract: Automated cell segmentation in microscopy images is essential for biomedical research, yet conventional methods are labor-intensive and prone to error. While deep learning-based approaches have proven effective, they often require large annotated datasets, which are scarce due to the challenges of manual annotation. To overcome this, we propose a novel framework for synthesizing densely…

Masked Pre-training Enables Universal Zero-shot Denoiser

[Submitted on 26 Jan 2024 (v1), last revised 17 Nov 2024 (this version, v2)]
Authors: Xiaoxiao Ma and 7 other authors
Abstract: In this work, we observe that model trained on vast general images via masking strategy, has been naturally embedded with their distribution knowledge, thus spontaneously attains the underlying potential for strong image denoising. Based on this observation, we propose a novel zero-shot denoising paradigm, i.e., Masked Pre-train then Iterative fill (MPI). MPI first trains model via masking and then employs pre-trained weight…

Key-Element-Informed sLLM Tuning for Document Summarization

[Submitted on 7 Jun 2024 (v1), last revised 19 Nov 2024 (this version, v3)]
Authors: Sangwon Ryu and 4 other authors
Abstract: Remarkable advances in large language models (LLMs) have enabled high-quality text summarization. However, this capability is currently accessible only through LLMs of substantial size or proprietary LLMs with usage fees. In response, smaller-scale LLMs (sLLMs) of easy accessibility and low costs have been extensively studied, yet they often suffer from missing key information and entities, i.e., low relevance, in particular, when input…

Tips on Building a Winning Data and AI Strategy from JPMC

With $274 billion in revenue last year and $3.3 trillion in assets under management, JPMorgan Chase has more resources than most to devote to building a winning data and AI strategy. But as James Massa, JPMorgan Chase’s senior executive director of software engineering and architecture, explained during his Solix Empower keynote last week, even the biggest companies in the world must pay close attention to the data and AI details in order to succeed. In his Solix Empower 2024 keynote address, titled “Data Quality and Data Strategy for AI, Measuring AI Value, Testing LLMs, and AI Use Cases,” Massa provided…

4+3 Phases of Compute-Optimal Neural Scaling Laws

[Submitted on 23 May 2024 (v1), last revised 17 Nov 2024 (this version, v2)]
Authors: Elliot Paquette and 3 other authors
Abstract: We consider the solvable neural scaling model with three parameters: data complexity, target complexity, and model-parameter-count. We use this neural scaling model to derive new predictions about the compute-limited, infinite-data scaling law regime. To train the neural scaling model, we run one-pass stochastic gradient descent on a mean-squared loss. We derive a representation of the loss curves which holds over all iteration counts…

The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods

arXiv:2411.10546v1
Abstract: This paper introduces a large-scale multi-modal dataset captured in and around well-known landmarks in Oxford using a custom-built multi-sensor perception unit as well as a millimetre-accurate map from a Terrestrial LiDAR Scanner (TLS). The perception unit includes three synchronised global shutter colour cameras, an automotive 3D LiDAR scanner, and an inertial sensor - all precisely calibrated. We also establish benchmarks for tasks involving localisation, reconstruction, and novel-view synthesis, which enable the evaluation of Simultaneous Localisation and Mapping (SLAM) methods, Structure-from-Motion (SfM) and Multi-view Stereo (MVS) methods as well as radiance field methods such as Neural…

A Framework for Leveraging Partially-Labeled Data for Product Attribute-Value Identification

[Submitted on 17 May 2024 (v1), last revised 18 Nov 2024 (this version, v2)]
Authors: D. Subhalingam and 3 other authors
Abstract: In the e-commerce domain, the accurate extraction of attribute-value pairs (e.g., Brand: Apple) from product titles and user search queries is crucial for enhancing search and recommendation systems. A major challenge with neural models for this task is the lack of high-quality training data, as the annotations for attribute-value pairs in the available datasets are often incomplete. To…

Without Data Context, There is No AI

Sponsored Content by Precisely
Approximately 80% of data has a location attribute associated with it – and that location data provides a connection with the physical world. For Generali Real Estate*, the addition of alternative forms of data, such as spatial data, created greater context for its data and helped to power highly accurate AI-driven insights for data-driven decision-making. Let’s take a closer look at their journey. Generali Real Estate is one of the world’s leading real estate asset managers. Headquartered in Italy and with operations across Europe, the company has €36.9 billion assets under management (Q2 2024). When Generali…