Viral News

Data Machina #250

Llama 3: A Watershed AI Moment? I reckon the release of Llama 3 is perhaps one of the most important moments in AI development so far. The Llama 3 stable is already giving birth to all sorts of amazing animals and model derivatives. You can expect Llama 3 to unleash the mother of all battles against closed AI models like GPT-4. Meta AI just posted: “Our largest Llama 3 models are over 400B parameters. And they are still being trained.” The upcoming Llama-400B will change the playing field for many independent researchers, small AI startups, one-person AI shops, and also…
Read More
The Ethical Implications of AI in the Travel Industry

Artificial Intelligence (AI) is transforming the travel industry by enhancing efficiency, personalization, and customer experience. However, as with any technological advancement, the adoption of AI brings a range of ethical implications that must be carefully considered. This article explores five key areas of ethical concern regarding AI in the travel industry.
1. Privacy and Data Security
Data Collection and Usage: AI systems rely on vast amounts of personal data to function effectively, raising significant privacy concerns. The collection of data such as travel preferences, personal identification, and payment details necessitates stringent data protection measures.
Risk of Data Breaches: The travel…
Read More
Announcing simplified XML data ingestion

We're excited to announce native support in Databricks for ingesting XML data. XML is a popular file format for representing complex data structures across use cases in manufacturing, healthcare, law, travel, finance, and more. As these industries find new opportunities for analytics and AI, they increasingly need to leverage their troves of XML data. Databricks customers ingest this data into the Data Intelligence Platform, where other capabilities like Mosaic AI and Databricks SQL can then be used to drive business value. However, it can take a lot of work to build resilient XML pipelines. Since XML files are semi-structured and arbitrarily…
Read More
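Databricks' native reader handles this at scale; purely as a toy illustration of why flattening semi-structured XML takes work, here is a minimal stdlib sketch. The element and attribute names (`orders`, `order`, `item`, `sku`) are invented for the example, not a real schema.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<orders>
  <order id="1001">
    <customer>Acme Corp</customer>
    <items>
      <item sku="A1" qty="2"/>
      <item sku="B7" qty="1"/>
    </items>
  </order>
</orders>
"""

def flatten_orders(xml_text):
    """Flatten nested <order>/<item> elements into flat row dicts,
    the tabular shape an analytics pipeline typically expects."""
    root = ET.fromstring(xml_text)
    rows = []
    for order in root.iter("order"):
        for item in order.find("items").iter("item"):
            rows.append({
                "order_id": order.get("id"),
                "customer": order.findtext("customer"),
                "sku": item.get("sku"),
                "qty": int(item.get("qty")),
            })
    return rows
```

Even this tiny case needs schema-specific code to join parent attributes onto child rows; a production pipeline must also cope with missing elements, namespaces, and schema drift, which is exactly the work a native XML reader removes.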
Instruction Tuning OPT-125M

Large language models are pretrained on terabytes of language data. However, the pretraining dataset and strategy only teach the model to generate the next token or word, which is not very useful in a real-world sense, because in the end we want to accomplish a task with the LLM, either through chat or through instructions. We can achieve this by fine-tuning the LLM; generally, we call this instruction tuning of the language model. To this end, in this article, we will use the OPT-125M model for instruction tuning.
Figure 1. Output sample after instruction tuning OPT-125M on the Open Assistant Guanaco…
Read More
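A key step in instruction tuning is serialising each conversation into a single training string. As a minimal sketch, assuming the "### Human:" / "### Assistant:" turn format commonly associated with the Open Assistant Guanaco dataset (the exact format may differ from what the article uses):

```python
def build_guanaco_prompt(turns):
    """Serialise a list of (role, text) turns into one training
    string using '### Human:' / '### Assistant:' markers
    (assumed Guanaco-style format)."""
    tags = {"human": "### Human:", "assistant": "### Assistant:"}
    return "\n".join(f"{tags[role]} {text}" for role, text in turns)
```

Strings produced this way can then be tokenised and fed to a causal-LM trainer, so the model learns to continue after the assistant marker.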
Sigma Secures $200M Round to Advance Its BI and Analytics Solutions

Sigma Computing, a cloud-based analytics solutions provider, has raised $200 million in Series D funding to further advance its efforts to broaden BI use within organizations by enabling users to query and analyze data without writing code. The latest round of funding takes the vendor's total funding to $581.3 million, with a valuation estimated at around $1.5 billion, a staggering rise of 60% since the last funding round in 2021. The steep rise in valuation is partially a result of rising demand for greater productivity and monetization in the era of cloud data transition. Spark Capital and Avenir…
Read More
Data Machina #251

Three New Powerful Open AI Models. I'm told by colleagues at Hugging Face that just a week since Llama-3 was released, more than 10,000 model derivatives have been developed! The pressure on black-box, closed AI models is huge, and achieving GPT-4 performance with open, smallish models is upon us. Which is great. In the last few days, three new, smallish, powerful open AI models were released. Interestingly enough, the power of these three models is based on a combination of: 1) innovative training architectures and optimisation techniques, and 2) data quality across different types of data (synthetic, public, or private).…
Read More
How Science Fiction Shapes Tomorrow’s Tech

Below is a summary of the first episode of my Synthetic Minds podcast. Think science fiction is just for entertainment? Think again. It's time for businesses to read today's sci-fi to shape tomorrow's reality. In the debut episode of the Synthetic Minds podcast, Dr. Mark van Rijmenam chats with Karl Schroeder, a science fiction author and strategic foresight consultant. They delve into how sci-fi narratives, like Schroeder's “Stealing Worlds” and “Lady of Mazes,” provide valuable insights for navigating future technological landscapes. These stories blend AI, blockchain, and mixed reality to imagine radical shifts in governance and personal freedoms, offering…
Read More
Unveiling the Leaders in Data and AI: The 2024 Finalists for the Databricks Data Visionary Award

The Data Team Awards annually recognize the indispensable roles of enterprise data teams across industries, celebrating their resilience and innovation from around the world. With more than 200 nominations, the awards showcase the behind-the-scenes successes in data science and artificial intelligence. We look forward to highlighting these forward-thinking clients across six categories at the Data + AI Summit in June. The Data and AI Visionary Award is presented to an executive innovator who has spearheaded the integration of data, analytics, and AI into their company's strategic initiatives. These visionaries exemplify unparalleled foresight and inventiveness, charting new paths for data's role in predictive…
Read More
Instruction Tuning GPT2 on Alpaca Dataset

Fine-tuning language models to follow instructions is a major step in making them more useful. In this article, we will train the GPT2 model to follow simple instructions. Instruction tuning GPT2 on the Alpaca dataset will reveal how well very small language models perform at following instructions.
Figure 1. Instruction tuned GPT2 on Alpaca dataset inference result.
In particular, we will train the GPT2 base model, which contains just 124 million parameters. This is much smaller than what the industry considers an SLM (Small Language Model), which is typically 7 billion (7B) parameters. In fact, any language model below 3…
Read More
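Training on Alpaca starts by rendering each record into a single prompt string. As a minimal sketch of that step, using the widely circulated no-input variant of the Alpaca template (the article's exact template and field names may differ):

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{response}"
)

def format_alpaca(example):
    """Render one Alpaca-style record (dict with 'instruction' and
    'output' keys) into a single training string."""
    return ALPACA_TEMPLATE.format(
        instruction=example["instruction"],
        response=example["output"],
    )
```

At inference time the same template is used with an empty response, and the model completes the text after "### Response:".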
Breaking Down Silos, Building Up Insights: Implementing a Data Fabric 

Data is the lifeblood of modern business, but for commercial-sized companies, managing and leveraging data can feel like navigating a maze. What if there was a way to simplify the journey and unlock the full potential of a company's data? Read on to learn how a data fabric can maximize the value of a company's data infrastructure. Large, global enterprises have massive data teams set up to transfer and manage their data, using approaches like a data mesh. But commercial-sized companies are also dealing with increasingly complex data landscapes, and finding that a…
Read More