stp2y

30692 Posts
High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures

arXiv:2408.00278v1. Abstract: Convolution is the core component within deep neural networks and it is computationally intensive and time consuming. Tensor data layouts significantly impact convolution operations in terms of memory access and computational efficiency. Yet, there is still a lack of comprehensive performance characterization on data layouts on SIMD architectures concerning convolution methods. This paper proposes three novel data layouts for im2win convolution: NHWC, CHWN, and CHWN8, and introduces a set of general optimization techniques for both direct and im2win convolutions. We compare the optimized im2win convolution with the direct convolution and PyTorch's im2col-based convolution across the…
A Prior Embedding-Driven Architecture for Long Distance Blind Iris Recognition

arXiv:2408.00210v1. Abstract: Blind iris images, which result from unknown degradation during the process of iris recognition at long distances, often lead to decreased iris recognition rates. Currently, little existing literature offers a solution to this problem. In response, we propose a prior embedding-driven architecture for long distance blind iris recognition. We first propose a blind iris image restoration network called Iris-PPRGAN. To effectively restore the texture of the blind iris, Iris-PPRGAN includes a Generative Adversarial Network (GAN) used as a Prior Decoder, and a DNN used as the encoder. To extract iris features more efficiently, we then…
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation

arXiv:2408.00284v1. Abstract: Large-scale text-to-speech (TTS) models have made significant progress recently. However, they still fall short in the generation of Chinese dialectal speech. To address this, we propose Bailing-TTS, a family of large-scale TTS models capable of generating high-quality Chinese dialectal speech. Bailing-TTS serves as a foundation model for Chinese dialectal speech generation. First, continual semi-supervised learning is proposed to facilitate the alignment of text tokens and speech tokens. Second, the Chinese dialectal representation learning is developed using a specific transformer architecture and multi-stage training processes. With the proposed design of novel network architecture and corresponding strategy, Bailing-TTS is…
A New Trick Could Block the Misuse of Open Source AI

When Meta released its large language model Llama 3 for free this April, it took outside developers just a couple of days to create a version without the safety restrictions that prevent it from spouting hateful jokes, offering instructions for cooking meth, or misbehaving in other ways. A new training technique developed by researchers at the University of Illinois Urbana-Champaign, UC San Diego, Lapis Labs, and the nonprofit Center for AI Safety could make it harder to remove such safeguards from Llama and other open source AI models in the future. Some experts believe that, as AI becomes ever more powerful, tamperproofing…
Why workers who lose their jobs to AI might not stay unemployed for long

One of the big questions facing US economists is what the coming AI revolution will mean for workers. In the years ahead, there are concerns that companies' huge AI investments, and their ongoing adoption of these technologies, could eliminate jobs. A Goldman Sachs report published last year estimated that 300 million full-time jobs across the globe could be disrupted, though not necessarily replaced, by AI in the coming years. It's already costing some workers their jobs. Similar to US manufacturing workers who lost their jobs in recent decades to advancements like automation, those displaced by AI could find themselves…
The Google Pixel 8a drops to a new low of $399

Our pick for the is looking even better right now as the Google Pixel 8a has dropped to a new all-time-low price. You can . That's 20 percent off the regular price, and it's even lower than any of the deals we saw for it during Prime Day. The Pixel 8a has the same Tensor G3 chip as the rest of the Pixel 8 lineup, which means you get access to the same AI features that its higher-end siblings have. We're fans of the cameras, 120Hz OLED display and battery life too (it lasted 20-and-a-half hours on our video rundown test).…
Understanding Immutability in JavaScript: A Dive into Mutable and Immutable Data

In the vast world of programming, JavaScript stands out for its dynamic nature, allowing developers to manipulate data with ease. However, within this flexibility lies a nuanced aspect known as immutability, a concept that might seem straightforward yet holds profound implications for code quality and predictability. This article aims to demystify the realms of mutable and immutable data in JavaScript, shedding light on why certain practices are recommended over others.

The Concept of Immutability

At its core, immutability refers to the characteristic of data that, once created, cannot be altered. This principle is deeply rooted in functional programming paradigms, emphasizing…
Google’s Gemini 1.5 Pro dethrones GPT-4o

Google’s experimental Gemini 1.5 Pro model has surpassed OpenAI’s GPT-4o in generative AI benchmarks. For the past year, OpenAI’s GPT-4o and Anthropic’s Claude-3 have dominated the landscape. However, the latest version of Gemini 1.5 Pro appears to have taken the lead. One of the most widely recognised benchmarks in the AI community is the LMSYS Chatbot Arena, which evaluates models on various tasks and assigns an overall competency score. On this leaderboard, GPT-4o achieved a score of 1,286, while Claude-3 secured a commendable 1,271. A previous iteration of Gemini 1.5 Pro had scored 1,261. The experimental version of Gemini 1.5…
Mobility-Aware Federated Self-supervised Learning in Vehicular Network

arXiv:2408.00256v1. Abstract: Federated Learning (FL) is an advanced distributed machine learning approach that protects the privacy of each vehicle by allowing the model to be trained on multiple devices simultaneously without the need to upload all data to a road side unit (RSU). This enables FL to handle scenarios with sensitive or widely distributed data. However, in these fields, it is well known that the labeling costs can be a significant expense, and models relying on labels are not suitable for these rapidly evolving fields, especially in vehicular networks, or mobile internet of things (MIoT), where new…