Research Papers in Nov 2023: Tackling Hallucinations, Boosting Reasoning Abilities, and New Insights into the Transformer Architecture

This month, I want to focus on three papers that address three distinct problem categories of Large Language Models (LLMs): reducing hallucinations; enhancing the reasoning capabilities of small, openly available models; and deepening our understanding of, and potentially simplifying, the transformer architecture.

Reducing hallucinations is important because, while LLMs like GPT-4 are widely used for knowledge generation, they can still produce plausible yet inaccurate information.

Improving the reasoning capabilities of smaller models is also important. Right now, ChatGPT & GPT-4 (vs. private or personal LLMs) are still our go-to when it comes to many tasks. Enhancing the reasoning abilities of these smaller models is one…
Read More
The Onion’s Take on OpenAI’s Scarlett Johansson Disaster Is Pretty Much Perfect

Bravo.

Stupid Simulacrum

The accusations that Scarlett Johansson has leveled against OpenAI have launched a storm of controversy, turning the ChatGPT creator into a near-pariah overnight for its alleged copying of the actress's voice without her permission.

There's already plenty of good writing out there on how the incident encapsulates the AI industry's arrogance and its astounding lack of integrity. But leave it to The Onion, of course, to perfectly sum up the ridiculousness of this whole thing in just one satirical headline.

"Jerky, 7-Fingered Scarlett Johansson Appears In Video To Express Full-Fledged Approval Of OpenAI," it reads.

"'It is me, Scar Johnson, to express to…
Read More
The third New England RLHF Hackers Hackathon

At the third New England RLHF Hackathon, several interesting projects were showcased, each focusing on different aspects of machine learning and reinforcement learning. Participants and those interested in future events are encouraged to join the Discord community for more information and updates.

The highlighted projects include:

Pink Elephants Pt 3 (Authors: Sid Verma, Louis Castricato): This project aimed to train a pink elephant model via ILQL (Implicit Language Q-Learning), using the standard trlX implementation. The team faced challenges in finding optimal hyperparameters and proposed future research that includes more nuanced reward shaping and combining different…
Read More
FasterViT for Image Classification

FasterViT is a family of Vision Transformer models that is fast and provides better accuracy than other ViT models. It combines the local representation learning of CNNs and the global learning properties of ViTs. In this article, we will cover the FasterViT model for image classification.

Figure 1. FasterViT architecture, throughput, and benchmark on ImageNet1K.

We will go through image inference using the pretrained network, along with a brief overview of its architectural components. Furthermore, we will also fine-tune a FasterViT model for image classification.

We will cover the following topics in this article: We will start with a discussion…
Read More
Language Modeling Reading List (to Start Your Paper Club)

Some friends and I started a weekly paper club to read and discuss fundamental papers in language modeling. By pooling together our shared knowledge, experience, and questions, we learned more as a group than we could have individually. To encourage others to do the same, here’s the list of papers we covered, and a one-sentence summary for each. I’ll update this list with new papers as we discuss them. (Also, why and how to read papers.)

Attention Is All You Need: Query, Key, and Value are all you need* (*Also position embeddings, multiple heads, feed-forward layers, skip-connections, etc.)

GPT:…
Read More
Scarlett Johansson’s OpenAI Feud Makes Her an Uncanny Folk Hero

There is a distinct moment in the Marvel Cinematic Universe when Black Widow became a hero for the everyfan. It happens early in 2012’s The Avengers: She’s tied to a chair. Agent Coulson calls. A nondescript military leader who has been interrogating her hands her the phone. Coulson explains that S.H.I.E.L.D. needs to pull her out of the field. She kicks her questioner in the shin, smashes the chair she’s tied to, takes out three dudes, grabs her heels, and leaves.

The Avengers went on to make $1.5 billion globally and catapulted nearly everyone in it to superstardom, even the actors…
Read More
AI at the crossroads of cybersecurity, space and national security in the digital age

Technological prowess, especially regarding humanity’s increased presence in space, is increasingly becoming the linchpin of global competitiveness and national security. There, new opportunities to integrate AI are accompanied by a new generation of risks. Artificial intelligence in particular plays a crucial role in democratizing access to space exploration and research, opening it to many beyond just governmental space agencies, as evidenced by the large number of commercially financed and operated space launches over the last five years. As launch companies adopt AI-enabled autonomous flight safety systems, Space Launch Delta 45 is saving on mission control chairs and looping out about…
Read More
Enterprises Have Just Two Years to Harness the Full Potential of GenAI: Genpact and HFS Report

(Berit Kessler/Shutterstock)

The advent of GenAI has proven to be the first real innovation to disrupt industry since the advent of the internet. While GenAI is just over a year old, it has left enterprises scrambling to gain a competitive advantage. However, the window of opportunity for these enterprises may be shorter than anticipated.

Enterprises have only two years to adopt GenAI before competitive disadvantages emerge, according to a new report by Genpact and HFS Research. The report also highlights that only 5% of enterprises have mature GenAI initiatives, signaling an urgent need to accelerate GenAI adoption.

Genpact is…
Read More
Training Diffusion Models with Reinforcement Learning

Diffusion models have recently emerged as the de facto standard for generating complex, high-dimensional outputs. You may know them for their ability to produce stunning AI art and hyper-realistic synthetic images, but they have also found success in other applications such as drug design and continuous control. The key idea behind diffusion models is to iteratively transform random noise into a sample, such as an image or protein structure. This is typically motivated as a maximum likelihood estimation problem, where the model is trained to generate samples that match the training data as closely as…
Read More