stp2y

30696 Posts
Transformers on Markov Data: Constant Depth Suffices

arXiv:2407.17686v1. Abstract: Attention-based transformers have been remarkably successful at modeling generative processes across various domains and modalities. In this paper, we study the behavior of transformers on data drawn from $k$th-order Markov processes, where the conditional distribution of the next symbol in a sequence depends on the previous $k$ symbols observed. We observe a surprising phenomenon empirically which contradicts previous findings: when trained for sufficiently long, a transformer with a fixed depth and $1$ head per layer is able to achieve low test loss on sequences drawn from $k$th-order Markov sources, even as $k$ grows. Furthermore, this…
ISPs are fighting to raise the price of low-income broadband

A new government program is trying to encourage Internet service providers (ISPs) to offer lower rates for lower-income customers by distributing federal funds through states. The only problem is the ISPs don’t want to offer the proposed rates. A letter sent to US Commerce Secretary Gina Raimondo was signed by more than 30 broadband industry trade groups, such as ACA Connects and the Fiber Broadband Association, as well as several state-based organizations. The letter raises “both a sense of alarm and urgency” about their ability to participate in the Broadband Equity, Access and Deployment (BEAD) program. The newly formed BEAD…
22 details you might’ve missed during the Paris 2024 opening ceremony

The 2024 Paris Games kicked off with an impressive opening ceremony. There were several allusions to famous French works of art, including "Les Misérables." The bells of Notre-Dame were rung for the first time since the destructive fire in 2019. The 2024 Olympics are being held in Paris…
CoMoTo: Unpaired Cross-Modal Lesion Distillation Improves Breast Lesion Detection in Tomosynthesis

arXiv:2407.17620v1. Abstract: Digital Breast Tomosynthesis (DBT) is an advanced breast imaging modality that offers superior lesion detection accuracy compared to conventional mammography, albeit at the trade-off of longer reading time. Accelerating lesion detection from DBT using deep learning is hindered by limited data availability and huge annotation costs. A possible solution to this issue could be to leverage the information provided by a more widely available modality, such as mammography, to enhance DBT lesion detection. In this paper, we present a novel framework, CoMoTo, for improving lesion detection in DBT. Our framework leverages unpaired mammography data to…
Priming My Workspace

Separation of concerns. In the tech world, that phrase generally refers to the design principle of separating a complex system into more manageable parts. However, ever since I started working from home, I have started seeing it in other places outside of my codebase. When I worked in an office, I found it really easy to keep my space clean and organized. In hindsight that was because that space was dedicated to my job only. It was easier to disconnect from that headspace once I was done with my work because the work was literally in a different location. But…
Open Source AI Has Founders—and the FTC—Buzzing

Many of yesterday’s talks were littered with the acronyms you’d expect from this assemblage of high-minded panelists: YC, FTC, AI, LLMs. But threaded throughout the conversations—foundational to them, you might say—was boosterism for open source AI. It was a stark left turn (or return, if you’re a Linux head) from the app-obsessed 2010s, when developers seemed happy to containerize their technologies and hand them over to bigger platforms for distribution. The event also happened just two days after Meta CEO Mark Zuckerberg declared that “open source AI is the path forward” and released Llama 3.1, the latest version of Meta’s own open…
Demystifying Verbatim Memorization in Large Language Models

arXiv:2407.17817v1. Abstract: Large Language Models (LLMs) frequently memorize long sequences verbatim, often with serious legal and privacy implications. Much prior work has studied such verbatim memorization using observational data. To complement such work, we develop a framework to study verbatim memorization in a controlled setting by continuing pre-training from Pythia checkpoints with injected sequences. We find that (1) non-trivial amounts of repetition are necessary for verbatim memorization to happen; (2) later (and presumably better) checkpoints are more likely to verbatim memorize sequences, even for out-of-distribution sequences; (3) the generation of memorized sequences is triggered by distributed model…