Viral News

Computer-Vision-Enabled Worker Video Analysis for Motion Amount Quantification

[Submitted on 22 May 2024 (v1), last revised 19 Nov 2024 (this version, v2)] By Hari Iyer and 3 other authors. Abstract: The performance of physical workers is significantly influenced by the extent of their motions. However, monitoring and assessing these motions remains a challenge. Recent advancements have enabled in-situ video analysis for real-time observation of worker behaviors. This paper introduces a novel framework for tracking and quantifying upper and lower limb motions, issuing alerts when critical thresholds are reached. Using joint…
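The truncated abstract points to joint-based tracking. As a minimal sketch of how per-joint motion could be quantified from pose keypoints and checked against an alert threshold, the code below assumes keypoints from some pose estimator; the joint set, frame rate, and threshold value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical sketch: quantify limb motion from pose keypoints and flag
# frames where a motion threshold is exceeded. The joint list, frame rate,
# and threshold are illustrative assumptions, not the paper's values.
MOTION_THRESHOLD = 50.0  # assumed pixels/second of cumulative joint speed

def motion_amount(keypoints, fps=30):
    """keypoints: array of shape (frames, joints, 2) holding (x, y) per joint.
    Returns per-frame cumulative joint speed in pixels/second."""
    displacement = np.linalg.norm(np.diff(keypoints, axis=0), axis=-1)  # (frames-1, joints)
    return displacement.sum(axis=1) * fps

def alert_frames(keypoints, fps=30, threshold=MOTION_THRESHOLD):
    """Indices of frames whose motion amount crosses the threshold."""
    speeds = motion_amount(keypoints, fps)
    return np.nonzero(speeds > threshold)[0]

# Usage with synthetic trajectories standing in for a pose estimator's output:
rng = np.random.default_rng(0)
kps = np.cumsum(rng.normal(0, 1, size=(300, 4, 2)), axis=0)  # 4 tracked joints
print(alert_frames(kps)[:5])
```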
Read More
MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers

arXiv:2411.12992v1 Announce Type: new Abstract: In order to reduce the computational complexity of large language models, great efforts have been made to improve the efficiency of transformer models through techniques such as linear attention and flash-attention. However, the model size and corresponding computational complexity are constantly scaled up in pursuit of higher performance. In this work, we present MemoryFormer, a novel transformer architecture which significantly reduces the computational complexity (FLOPs) from a new perspective. We eliminate nearly all the computations of the transformer model except for the necessary computation required by the multi-head attention operation. This is made possible by utilizing…
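The abstract cuts off before describing the mechanism. Purely as an illustrative sketch of the general idea of trading matmul FLOPs for memory reads, the toy module below swaps a dense projection for a hash-indexed embedding table; the hashing scheme, bit width, and bucket count are placeholder assumptions, not MemoryFormer's actual method.

```python
import torch

# Purely illustrative toy: replace a dense projection (matmul FLOPs) with a
# memory lookup. How MemoryFormer actually achieves this is not stated in the
# truncated abstract; the sign-bit hashing below is a placeholder assumption.
class LookupProjection(torch.nn.Module):
    def __init__(self, d_in, d_out, n_bits=10):
        super().__init__()
        # A cheap random projection produces a bucket id; a learned table of
        # output vectors supplies the result (memory in place of FLOPs).
        self.register_buffer("hash_planes", torch.randn(d_in, n_bits))
        self.table = torch.nn.Embedding(2 ** n_bits, d_out)
        self.n_bits = n_bits

    def forward(self, x):
        # Sign bits of a few random projections -> integer bucket index.
        bits = (x @ self.hash_planes > 0).long()                    # (..., n_bits)
        idx = (bits * (2 ** torch.arange(self.n_bits))).sum(-1)     # (...)
        return self.table(idx)                                      # table read, no big matmul

x = torch.randn(2, 8, 64)
print(LookupProjection(64, 64)(x).shape)  # torch.Size([2, 8, 64])
```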
Read More
Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?

[Submitted on 6 Nov 2024 (v1), last revised 19 Nov 2024 (this version, v2)] By Daniel P. Jeong and 3 other authors. Abstract: Several recent works seek to develop foundation models specifically for medical applications, adapting general-purpose large language models (LLMs) and vision-language models (VLMs) via continued pretraining on publicly available biomedical corpora. These works typically claim that such domain-adaptive pretraining (DAPT) improves performance on downstream medical tasks, such as answering medical licensing exam questions. In this paper,…
Read More
Fair Distillation: Teaching Fairness from Biased Teachers in Medical Imaging

arXiv:2411.11939v1 Announce Type: new Abstract: Deep learning has achieved remarkable success in image classification and segmentation tasks. However, fairness concerns persist, as models often exhibit biases that disproportionately affect demographic groups defined by sensitive attributes such as race, gender, or age. Existing bias-mitigation techniques, including Subgroup Re-balancing, Adversarial Training, and Domain Generalization, aim to balance accuracy across demographic groups, but often fail to simultaneously improve overall accuracy, group-specific accuracy, and fairness due to conflicts among these interdependent objectives. We propose the Fair Distillation (FairDi) method, a novel fairness approach that decomposes these objectives by leveraging biased "teacher" models, each optimized…
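A minimal sketch of the distillation idea the abstract describes, with a student trained against group-specific biased teachers; the two-group setup, temperature, and equal loss weighting are illustrative assumptions rather than FairDi's actual recipe.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of distilling from group-specific "teacher" models into
# one student. The two-group setup and the kd/ce weighting are assumptions;
# FairDi's actual objective decomposition is not spelled out in the snippet.
def fair_distillation_loss(student_logits, teacher_logits_per_group,
                           group_ids, labels, T=2.0, alpha=0.5):
    """group_ids selects which group-specific teacher supervises each example."""
    teachers = torch.stack(teacher_logits_per_group)           # (G, B, C)
    matched = teachers[group_ids, torch.arange(len(labels))]   # (B, C)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(matched / T, dim=-1),
                  reduction="batchmean") * T * T
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# Toy usage: 2 demographic groups, batch of 4, 3 classes.
student = torch.randn(4, 3, requires_grad=True)
teachers = [torch.randn(4, 3), torch.randn(4, 3)]
loss = fair_distillation_loss(student, teachers,
                              torch.tensor([0, 1, 0, 1]),
                              torch.tensor([0, 2, 1, 0]))
loss.backward()
```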
Read More
Loss-to-Loss Prediction: Scaling Laws for All Datasets

[Submitted on 19 Nov 2024] By David Brandfonbrener and 4 other authors. Abstract: While scaling laws provide a reliable methodology for predicting train loss across compute scales for a single data distribution, less is known about how these predictions should change as we change the distribution. In this paper, we derive a strategy for predicting one loss from another and apply it to predict across different pre-training datasets and from pre-training data to downstream task data. Our predictions extrapolate well even at 20x…
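A minimal sketch of what "predicting one loss from another" can look like in practice: fitting a simple parametric map from losses on dataset A to losses on dataset B across scales, then extrapolating. The shifted-power-law form and the synthetic numbers are assumptions for illustration, not the paper's fitted laws.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative only: fit a simple shifted power law mapping loss on dataset A
# to loss on dataset B across model/compute scales, then extrapolate. The
# functional form and synthetic data are assumptions, not the paper's fits.
def loss_to_loss(l_a, k, kappa, e_b):
    return k * l_a ** kappa + e_b

# Synthetic (loss_A, loss_B) pairs observed at increasing scale.
loss_a = np.array([3.2, 2.9, 2.6, 2.4, 2.2])
loss_b = np.array([4.1, 3.6, 3.2, 2.95, 2.75])

params, _ = curve_fit(loss_to_loss, loss_a, loss_b, p0=[1.0, 1.0, 1.0])
print("predicted loss_B at loss_A=2.0:", loss_to_loss(2.0, *params))
```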
Read More
Trojan Cleansing with Neural Collapse

Read More
Rethinking cluster-conditioned diffusion models for label-free image synthesis

[Submitted on 1 Mar 2024 (v1), last revised 19 Nov 2024 (this version, v2)] By Nikolas Adaloglou, Tim Kaiser, Felix Michels, and Markus Kollmann. Abstract: Diffusion-based image generation models can enhance image quality when conditioned on ground truth labels. Here, we conduct a comprehensive experimental study on image-level conditioning for diffusion models using cluster assignments. We investigate how individual clustering determinants, such as the number of clusters and the clustering method, impact image synthesis across three different datasets. Given the…
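A minimal sketch of the conditioning setup the abstract describes: deriving pseudo-labels by clustering image embeddings and using them in place of ground-truth labels when training a class-conditional diffusion model. The feature source and the choice of k-means are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative sketch: cluster assignments stand in for ground-truth labels
# when conditioning a diffusion model. The feature source and k (number of
# clusters) are assumptions; the paper studies how such choices matter.
def cluster_conditions(features, n_clusters=100, seed=0):
    """features: (n_images, d) embeddings from some pretrained encoder.
    Returns an integer pseudo-label per image for conditional training."""
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    return km.fit_predict(features)

feats = np.random.default_rng(0).normal(size=(1000, 128))
pseudo_labels = cluster_conditions(feats, n_clusters=10)
# These pseudo-labels would then replace class labels during conditional
# diffusion training, e.g. eps = model(x_t, t, pseudo_labels).
print(np.bincount(pseudo_labels))
```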
Read More
Training Bilingual LMs with Data Constraints in the Targeted Language

Read More
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

[Submitted on 24 Jun 2024 (v1), last revised 20 Nov 2024 (this version, v2)] By Sean Welleck and 7 other authors. Abstract: One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the benefits of scaling compute during inference. This survey focuses on these inference-time approaches. We explore three areas under a unified mathematical formalism: token-level generation algorithms, meta-generation algorithms,…
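One concrete instance of the meta-generation algorithms such a survey covers is best-of-N sampling: call an underlying generator several times and keep the candidate a scorer ranks highest. A minimal sketch follows; the `generate` and `score` callables are placeholder assumptions, not any specific library's API.

```python
import random

# Minimal sketch of best-of-N sampling, a classic meta-generation algorithm:
# draw N candidates from a base generator and return the one a scorer (e.g.
# a reward model or log-likelihood) ranks highest. `generate` and `score`
# are placeholders, not a specific API.
def best_of_n(prompt, generate, score, n=8):
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Toy usage: the "generator" guesses digits, the "scorer" prefers 4 for 2+2.
random.seed(0)
result = best_of_n("2+2=",
                   generate=lambda p: random.randint(0, 9),
                   score=lambda guess: -abs(guess - 4))
print(result)  # the candidate closest to 4 among the N draws
```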
Read More
Calibrated and Efficient Sampling-Free Confidence Estimation for LiDAR Scene Semantic Segmentation

arXiv:2411.11935v1 Announce Type: new Abstract: Reliable deep learning models require not only accurate predictions but also well-calibrated confidence estimates to ensure dependable uncertainty estimation. This is crucial in safety-critical applications like autonomous driving, which depend on rapid and precise semantic segmentation of LiDAR point clouds for real-time 3D scene understanding. In this work, we introduce a sampling-free approach for estimating well-calibrated confidence values for classification tasks, achieving alignment with true classification accuracy and significantly reducing inference time compared to sampling-based methods. Our evaluation using the Adaptive Calibration Error (ACE) metric for LiDAR semantic segmentation shows that our approach maintains well-calibrated…
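The evaluation metric named here, Adaptive Calibration Error (ACE), replaces fixed-width confidence bins with equal-mass (adaptive) bins and averages the gap between confidence and accuracy per bin. Below is a simplified top-label sketch; the bin count is an assumed hyperparameter, and the full metric also averages over classes.

```python
import numpy as np

# Simplified top-label sketch of Adaptive Calibration Error (ACE): unlike
# ECE's fixed-width bins, ACE uses equal-mass bins so each bin holds the
# same number of predictions. The bin count is an assumed hyperparameter.
def adaptive_calibration_error(confidences, correct, n_bins=15):
    """confidences: predicted max-class probabilities; correct: 0/1 hits."""
    order = np.argsort(confidences)
    conf, hit = confidences[order], correct[order]
    bins = np.array_split(np.arange(len(conf)), n_bins)  # equal-mass bins
    gaps = [abs(conf[b].mean() - hit[b].mean()) for b in bins if len(b)]
    return float(np.mean(gaps))

# Toy usage: systematically overconfident predictions yield a nonzero ACE.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, 10_000)
hits = (rng.uniform(size=10_000) < conf - 0.1).astype(float)  # ~10% overconfident
print(round(adaptive_calibration_error(conf, hits), 3))  # roughly 0.1
```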
Read More