Viral News

Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering

[Submitted on 5 Mar 2024 (v1), last revised 21 Nov 2024 (this version, v2)] By Chenglei Si and 5 other authors. Abstract: Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This can enable a new paradigm of front-end development in which multimodal large language models (MLLMs) directly convert visual designs into code implementations. In this work, we construct Design2Code - the first real-world benchmark for this task. Specifically, we manually curate 484 diverse…
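As context for the task the benchmark evaluates, here is a minimal sketch of turning a design screenshot into HTML with a multimodal model. The `generate(prompt, image_bytes)` callable is a hypothetical stand-in for whichever MLLM API is used; the benchmark defines the task's inputs and outputs, not this code.

```python
from pathlib import Path

def screenshot_to_html(image_path, generate):
    """Ask a multimodal model to reproduce a webpage screenshot as a
    single self-contained HTML file.

    `generate(prompt, image_bytes)` is a hypothetical stand-in for an
    actual MLLM API call.
    """
    prompt = (
        "You are given a screenshot of a webpage. "
        "Reproduce it as one HTML file with inline CSS. "
        "Return only the HTML."
    )
    image_bytes = Path(image_path).read_bytes()
    return generate(prompt, image_bytes)
```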
Read More
Enhancing Medical Image Segmentation with Deep Learning and Diffusion Models

Read More
Guided MRI Reconstruction via Schrödinger Bridge

arXiv:2411.14269v1 Announce Type: cross Abstract: Magnetic Resonance Imaging (MRI) is a multi-contrast imaging technique in which different contrast images share similar structural information. However, conventional diffusion models struggle to effectively leverage this structural similarity. Recently, the Schrödinger Bridge (SB), a nonlinear extension of the diffusion model, has been proposed to establish diffusion paths between any distributions, allowing the incorporation of guided priors. This study proposes an SB-based, multi-contrast image-guided reconstruction framework that establishes a diffusion bridge between the guiding and target image distributions. By using the guiding image along with data consistency during sampling, the target image is reconstructed more…
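The "data consistency during sampling" mentioned above is a standard ingredient of MRI reconstruction. A minimal sketch of that step alone (not the paper's Schrödinger Bridge sampler) could look like this:

```python
import numpy as np

def data_consistency(x, y_measured, mask):
    """Replace the k-space samples of the current estimate `x` with the
    acquired measurements `y_measured` at the sampled locations `mask`.

    Illustrates only the generic data-consistency step, not the paper's
    SB-based guided sampler.
    """
    k = np.fft.fft2(x)                 # current estimate -> k-space
    k = np.where(mask, y_measured, k)  # keep measured samples where available
    return np.fft.ifft2(k)             # back to the image domain
```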
Read More
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization

arXiv:2411.14055v1 Announce Type: new Abstract: Large language models (LLMs) deliver impressive results but face challenges from increasing model sizes and computational costs. Structured pruning reduces model size and speeds up inference but often causes uneven degradation across domains, leading to biased performance. To address this, we propose DRPruning, which incorporates distributionally robust optimization to restore balanced performance across domains, along with further improvements to enhance robustness. Experiments in monolingual and multilingual settings show that our method surpasses similarly sized models in pruning and continued pretraining over perplexity, downstream tasks, and instruction tuning. We further provide analysis demonstrating the robustness of…
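As a rough illustration of the distributionally robust ingredient, a generic group-DRO-style domain reweighting step (not DRPruning's exact procedure) might look like this:

```python
import numpy as np

def update_domain_weights(weights, domain_losses, step_size=0.1):
    """Exponentiated-gradient update used in group-DRO-style training:
    domains with higher loss get more weight, so training optimizes a
    worst-case mixture rather than a fixed one. A generic sketch, not
    DRPruning's exact procedure."""
    w = np.asarray(weights) * np.exp(step_size * np.asarray(domain_losses))
    return w / w.sum()

# Usage: the training objective becomes sum_d weights[d] * loss[d],
# with `weights` refreshed from the latest per-domain losses each step.
```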
Read More
Microsoft Ignite 2024 Roundup: AI Innovations, Copilots, Power Platform Updates, and More

Microsoft Ignite 2024 is taking place in Chicago this week, bringing together technology leaders and innovators from around the globe. The event explores how the latest updates to Azure, Copilot, Microsoft 365, Windows, and other Microsoft tools are set to shape the future of business technology. Over 80 new products and features have been unveiled across Microsoft’s product portfolio. It’s no surprise that, just like last year, AI takes center stage at this year’s event, with a focus on agentic AI. Here is an overview of some of the key announcements at the event. Copilot Actions: Microsoft’s new Copilot Actions,…
Read More
AutoMixQ: Self-Adjusting Quantization for High Performance Memory-Efficient Fine-Tuning

arXiv:2411.13814v1 Announce Type: new Abstract: Fine-tuning large language models (LLMs) under resource constraints is a significant challenge in deep learning. Low-Rank Adaptation (LoRA), pruning, and quantization are all effective methods for improving resource efficiency. However, combining them directly often results in suboptimal performance, especially with uniform quantization across all model layers. This is due to the complex, uneven interlayer relationships introduced by pruning, necessitating more refined quantization strategies. To address this, we propose AutoMixQ, an end-to-end optimization framework that selects optimal quantization configurations for each LLM layer. AutoMixQ leverages lightweight performance models to guide the selection process, significantly reducing time…
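To make the idea of per-layer quantization configuration concrete, here is a generic greedy mixed-precision search under a memory budget. The `perf_model` and `size_of` callables are hypothetical stand-ins for the lightweight performance models the abstract describes; this is not AutoMixQ's actual optimizer.

```python
def select_layer_bits(layers, candidate_bits, perf_model, size_of, budget):
    """Greedy per-layer bit-width search under a memory budget.

    layers         : list of layer identifiers
    candidate_bits : available precisions in ascending order, e.g. [4, 8, 16]
    perf_model     : hypothetical predictor, perf_model(layer, bits) -> score
    size_of        : hypothetical size_of(layer, bits) -> bytes at that precision
    budget         : total memory budget in bytes
    """
    config = {layer: candidate_bits[0] for layer in layers}  # start at lowest precision
    spent = sum(size_of(layer, bits) for layer, bits in config.items())
    while True:
        best = None
        for layer in layers:
            idx = candidate_bits.index(config[layer])
            if idx + 1 == len(candidate_bits):
                continue  # already at highest precision
            bits = candidate_bits[idx + 1]
            extra = size_of(layer, bits) - size_of(layer, config[layer])
            gain = perf_model(layer, bits) - perf_model(layer, config[layer])
            if extra <= 0 or spent + extra > budget or gain <= 0:
                continue
            if best is None or gain / extra > best[0]:
                best = (gain / extra, layer, bits, extra)
        if best is None:
            return config  # no affordable upgrade improves the predicted score
        _, layer, bits, extra = best
        config[layer] = bits
        spent += extra
```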
Read More
Robust SG-NeRF: Robust Scene Graph Aided Neural Surface Reconstruction

arXiv:2411.13620v1 Announce Type: new Abstract: Neural surface reconstruction relies heavily on accurate camera poses as input. Despite utilizing advanced pose estimators like COLMAP or ARKit, camera poses can still be noisy. Existing pose-NeRF joint optimization methods handle poses with small noise (inliers) effectively but struggle with large noise (outliers), such as mirrored poses. In this work, we focus on mitigating the impact of outlier poses. Our method integrates an inlier-outlier confidence estimation scheme, leveraging scene graph information gathered during the data preparation phase. Unlike previous works directly using rendering metrics as the reference, we employ a detached color network that…
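One common way to act on such inlier/outlier confidences is to weight each view's contribution to the joint optimization. The sketch below shows only that generic weighting and is not the paper's exact formulation:

```python
import numpy as np

def confidence_weighted_loss(per_view_losses, confidences):
    """Downweight views whose poses are likely outliers (e.g. mirrored
    poses) so they contribute little to joint pose/scene optimization.

    `confidences` in [0, 1] would come from an inlier/outlier estimate
    such as the scene-graph-based scheme described in the abstract; this
    weighting is a generic illustration only.
    """
    c = np.asarray(confidences, dtype=float)
    losses = np.asarray(per_view_losses, dtype=float)
    return float(np.sum(c * losses) / (np.sum(c) + 1e-8))
```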
Read More
Language Models as Hierarchy Encoders

[Submitted on 21 Jan 2024 (v1), last revised 21 Nov 2024 (this version, v4)] By Yuan He and 3 other authors. Abstract: Interpreting hierarchical structures latent in language is a key limitation of current language models (LMs). While previous research has implicitly leveraged these hierarchies to enhance LMs, approaches for their explicit encoding are yet to be explored. To address this, we introduce a novel approach to re-train transformer encoder-based LMs as Hierarchy Transformer encoders (HiTs), harnessing the expansive nature of hyperbolic space. Our method…
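For background on why hyperbolic space suits hierarchies, the sketch below computes the standard Poincaré-ball distance that hierarchy-embedding methods build on; the paper's actual HiT re-training losses are not reproduced here.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the Poincare ball (assumes ||u||, ||v|| < 1).

    Hyperbolic volume grows exponentially with radius, which matches the
    exponential growth of tree-like hierarchies; shown as background
    only, not the paper's training objective.
    """
    uu = float(np.dot(u, u))
    vv = float(np.dot(v, v))
    duv = float(np.sum((u - v) ** 2))
    arg = 1.0 + 2.0 * duv / max((1.0 - uu) * (1.0 - vv), eps)
    return float(np.arccosh(arg))
```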
Read More
Prediction-Guided Active Experiments

[Submitted on 18 Nov 2024 (v1), last revised 20 Nov 2024 (this version, v2)] By Ruicheng Ao and 2 other authors. Abstract: In this work, we introduce a new framework for active experimentation, the Prediction-Guided Active Experiment (PGAE), which leverages predictions from an existing machine learning model to guide sampling and experimentation. Specifically, at each time step, an experimental unit is sampled according to a designated sampling distribution, and the actual outcome is observed based on an experimental probability. Otherwise, only a prediction for the outcome is…
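Paraphrasing the sampling scheme described above, one PGAE-style step might look like the following sketch, where `predict` and `run_experiment` are hypothetical callables and the paper's downstream estimator is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)

def pgae_step(units, sampling_probs, experiment_prob, predict, run_experiment):
    """Draw a unit from the designated sampling distribution, run the
    real experiment with probability `experiment_prob`, and otherwise
    fall back to the model's prediction of the outcome.

    `predict` and `run_experiment` are hypothetical callables used only
    to illustrate the sampling scheme paraphrased from the abstract.
    """
    i = rng.choice(len(units), p=sampling_probs)
    if rng.random() < experiment_prob:
        return units[i], run_experiment(units[i]), True   # real outcome observed
    return units[i], predict(units[i]), False             # prediction used instead
```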
Read More
MetaCropFollow: Few-Shot Adaptation with Meta-Learning for Under-Canopy Navigation

Read More