How Good Are the Latest Open LLMs? And Is DPO Better Than PPO?

April 2024, what a month! My birthday, a new book release, spring is finally here, and four major open LLM releases: Mixtral, Meta AI's Llama 3, Microsoft's Phi-3, and Apple's OpenELM.

This article reviews and discusses all four major transformer-based LLM releases from the last few weeks, followed by new research on reinforcement learning with human feedback methods for instruction finetuning using the PPO and DPO algorithms.

1. How Good are Mixtral, Llama 3, and Phi-3?
2. OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
3. Is DPO Superior to PPO for LLM Alignment? A Comprehensive…
Dell Technologies building AI Factory with Nvidia, growing AI efforts with Hugging Face, Meta and Microsoft

Dell Technologies is expanding its generative AI offerings with a series of new capabilities announced today at the annual Dell Technologies World conference. The Dell AI Factory is the company’s new strategy for technologies and services designed to help make AI adoption simpler, more secure and more economical for enterprises. The offering includes a significant expansion of capabilities with Nvidia, going beyond…
Large language models can’t effectively recognize users’ motivation, but can support behavior change for those ready to act

Large language model-based chatbots have the potential to promote healthy changes in behavior. But researchers from the ACTION Lab at the University of Illinois Urbana-Champaign have found that the artificial intelligence tools don't effectively recognize certain motivational states of users and therefore don't provide them with appropriate information. Michelle Bak, a doctoral student in information sciences, and information sciences professor Jessie Chin reported their research in the Journal of the American Medical Informatics Association. Large language model-based chatbots -- also known as generative conversational agents -- have been used increasingly in healthcare for patient education, assessment and management. Bak and…
AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning

arXiv:2405.10385v1. Abstract: The SemEval 2024 BRAINTEASER task represents a pioneering venture in Natural Language Processing (NLP) by focusing on lateral thinking, a dimension of cognitive reasoning that is often overlooked in traditional linguistic analyses. This challenge comprises Sentence Puzzle and Word Puzzle subtasks and aims to test language models' capacity for divergent thinking. In this paper, we present our approach to the BRAINTEASER task. We employ a holistic strategy by leveraging cutting-edge pre-trained models in a multiple-choice architecture, and diversify the training data with Sentence and Word Puzzle datasets. To gain further improvement, we fine-tuned the…
Major Deephaven Interactive Broker Updates | Deephaven

Deephaven recently introduced major updates to deephaven-ib, our Python package that allows users to interact with Interactive Brokers (IB) using Deephaven. This update includes a new build and deployment process, bug fixes, and updates to the Java, Deephaven, and Interactive Brokers versions used by deephaven-ib.

deephaven-ib now uses the latest versions of Java, Deephaven, and Interactive Brokers, and can be run without Docker.
Updated examples that demonstrate what deephaven-ib can do.

deephaven-ib now uses the latest versions of Java, Deephaven, and Interactive Brokers:
Java has been updated from version 11 to 17.
Interactive Brokers has been updated from version 10.19.01 to 10.19.04.
Deephaven has been updated…
OpenAI’s Long-Term AI Risk Team Has Disbanded

In July last year, OpenAI announced the formation of a new research team that would prepare for the advent of supersmart artificial intelligence capable of outwitting and overpowering its creators. Ilya Sutskever, OpenAI’s chief scientist and one of the company’s cofounders, was named as the colead of this new team. OpenAI said the team would receive 20 percent of its computing power.

Now OpenAI’s “superalignment team” is no more, the company confirms. That comes after the departures of several researchers involved, Tuesday’s news that Sutskever was leaving the company, and the resignation of the team’s other colead. The group’s work will…
Discovering the future of AI – Introducing AI Parabellum (an AI tools directory) – AI News

As artificial intelligence continues to progress at an unprecedented rate, developing new and innovative AI tools has become crucial for shaping the future of the industry. However, keeping up with all the latest advancements can often feel overwhelming, with new tools emerging every day across diverse domains and applications. This is where AI Parabellum steps in as a one-stop destination to uncover the most cutting-edge AI tools from around the world. As the premier AI tools directory, AI Parabellum aims to bring all innovators and enthusiasts together on a single platform. By spotlighting the top AI tools curated from the…
Indian Voters Are Being Bombarded With Millions of Deepfakes. Political Candidates Approve

On a stifling April afternoon in Ajmer, in the Indian state of Rajasthan, local politician Shakti Singh Rathore sat down in front of a greenscreen to shoot a short video. He looked nervous. It was his first time being cloned.

Wearing a crisp white shirt and a ceremonial saffron scarf bearing a lotus flower—the logo of the BJP, the country’s ruling party—Rathore pressed his palms together and greeted his audience in Hindi. “Namashkar,” he began. “To all my brothers—”

Before he could continue, the director of the shoot walked into the frame. Divyendra Singh Jadoun, a 31-year-old with a bald head and…
2024 BAIR Graduate Directory

Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. Our Ph.D. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond. These fantastic individuals bring with them a wealth of knowledge, fresh ideas, and a drive to continue contributing to the advancement of AI. Their work at BAIR, ranging from deep learning, robotics, and natural language processing to computer vision, security, and much more, has contributed significantly to their fields and has…
Building Generative AI prompt chaining workflows with human in the loop | Amazon Web Services

Generative AI is a type of artificial intelligence (AI) that can be used to create new content, including conversations, stories, images, videos, and music. Like all AI, generative AI works by using machine learning models—very large models that are pretrained on vast amounts of data called foundation models (FMs). FMs are trained on a broad spectrum of generalized and unlabeled data. They’re capable of performing a wide variety of general tasks with a high degree of accuracy based on input prompts. Large language models (LLMs) are one class of FMs. LLMs are specifically focused on language-based tasks such as summarization, text…