Introduction
Open-source tools have become increasingly popular over the past few decades, spanning operating systems, applications, programming languages, web servers and AI/ML libraries and frameworks. Now, in a transformative shift for the AI/ML industry, SensiML has announced that it will begin open sourcing the core IP of its flagship AutoML development product for IoT edge devices, Analytics Studio. This initiative demonstrates the company’s commitment to fostering an open, collaborative environment for the rapidly growing TinyML ecosystem and will serve as the foundation for a new open-source community collaboration project.
Analytics Studio is our server-based AutoML engine that rapidly generates sensor-based inference models from user-supplied ML datasets and optimizes the resulting embedded code for IoT edge devices to create TinyML® models. In addition to automating and speeding up the model-building process, the AutoML capability in Analytics Studio allows users of all data science skill levels to successfully create accurate sensor inference code for their bespoke IoT device applications.
What Analytics Studio Can Do
SensiML’s Analytics Studio has long been recognized as a powerful AutoML engine that facilitates the rapid development of sensor-based inference models that execute locally on low-power embedded MCUs and SoCs. It caters to a diverse range of IoT edge devices, supporting applications from acoustic event detection to anomaly and vibration classification. Historically available as a proprietary tool and cloud-based service, Analytics Studio is known for its ability to democratize ML model development, enabling users with varying levels of data science expertise to produce efficient, embedded code tailored to specific IoT applications.
Focused on time-series sensors, SensiML’s Analytics Studio can quickly create self-standing C code suitable for a variety of applications.
By Chris Rogers, CEO, SensiML 06.10.2024
Now SensiML is making a variant of Analytics Studio available as an open-source application, a decision that underscores our proactive approach to addressing some of the most pressing challenges in the IoT and TinyML landscapes.
Why Open Source?
Our decision to open source is motivated by a multifaceted strategy aimed at enhancing transparency, accelerating innovation, and expanding community engagement within the AI/ML industry. Below we’ve listed some of the reasons behind this decision and the expected benefits for the TinyML community.
Innovation and Agility: Open-source projects are natural incubators for innovation, as they allow developers worldwide to contribute to and iterate on project features rapidly. This collective development model helps ensure that the software stays at the cutting edge of technology and meets the evolving needs of the community.
Promoting open, hardware-agnostic solutions for the IoT edge: By embracing open source, SensiML is empowering users with easy-to-use, complete AI tools that avoid the pitfalls of vendor lock-in. This flexibility allows enterprises and developers to adapt their software stacks according to their needs without being constrained by a single vendor’s ecosystem.
Community and Support: One of the best consequences of open-source software is its tendency to create a vibrant user community. Our initiative is designed to foster a supportive network of developers who can share knowledge, troubleshoot issues, and collectively improve the Analytics Studio platform.
Quality and Security: Open-source software benefits from transparent, community-driven development processes that often lead to higher-quality and more secure code. The collaborative nature of these projects facilitates more thorough reviews and quicker resolutions of issues.
Tackling TinyML Ecosystem Challenges
The open-source benefits we’ve listed above are well understood among open-source adopters but are also somewhat abstract. To put them into context, let’s look more closely at two specific challenges faced by current TinyML adopters and examine how open source can help with each.
Overcoming the Dataset Bottleneck
The lack of sufficient training data is a significant hurdle for TinyML applications. Open-source contributions can help create more robust solutions to generate, augment, and utilize data more effectively, including techniques such as synthetic data generation and transfer learning.
The use of deep learning techniques to create accurate predictive models relies on the availability of sufficient model training data to cover the sources and ranges of variance that can be expected in actual use. Such training dataset requirements can thus be quite large. Well-known extreme cases are large language models (LLMs) with trillions of model parameters, hundreds of thousands of GPU training hours, and training datasets that approach the total amount of human text available from the internet.
TinyML models involve much smaller training datasets, but the nature of sensor-derived input data makes the dataset challenge arguably a more intractable problem than for LLMs. While LLMs are enormously large in scale, they at least benefit from a scalable data source of human language text acquired through the readily automated scraping of texts, documents, and Wiki pages off the internet. For sensor applications, there is typically no such equivalent readily scalable data source.
This dataset bottleneck problem spans most use cases within the TinyML realm. It demands that developers invest substantial time, effort, and cost to collect empirical data specific to their desired use case. They must do so in sufficient quantity and over a diverse enough set of conditions to effectively train the model for the full range of conditions that could be expected in actual use. In a motor-monitoring application, for example, a large multinational motor manufacturer may possess or have the means to produce enough data to develop robust models, but smaller companies and innovators lacking such resources are limited to simpler models. The result is constrained user adoption for TinyML due to the high barrier of acquiring train/test data for each application.
How Open-Source TinyML Tools Can Help Resolve the Dataset Bottleneck
Current active research into reducing the training dataset bottleneck shows promise and includes techniques such as transfer learning, data augmentation, synthetic data generation from simulations and Generative Adversarial Networks (GANs), semi-supervised learning, and model compression. Such methods are evolving rapidly, and effective approaches differ across the many use cases encompassed within the TinyML ecosystem.
As an example, data augmentation for image recognition would typically involve rotations, translations, scaling, or chromatic shifts, whereas audio data would involve a completely different set of transforms for pitch, timbre, cadence, and noise suppression. Given how rapidly state-of-the-art methods change and how widely approaches differ by application, open-source, community-based collaboration is critical.
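To make the idea of data augmentation concrete for time-series sensor data, here is a minimal Python sketch that expands one recorded window into several plausible variants using three common transforms: a time shift, an amplitude scaling, and additive noise. It is an illustration only and not part of Analytics Studio or any SensiML API; the function names (`jitter`, `scale`, `time_shift`, `augment`) and the parameter ranges are assumptions chosen for the example.

```python
import random
from typing import List, Optional

Window = List[float]  # one fixed-length window of samples from a single sensor axis


def jitter(window: Window, sigma: float, rng: random.Random) -> Window:
    """Add zero-mean Gaussian noise to every sample (mimics sensor noise)."""
    return [x + rng.gauss(0.0, sigma) for x in window]


def scale(window: Window, factor: float) -> Window:
    """Multiply every sample by a gain (mimics sensitivity or mounting differences)."""
    return [x * factor for x in window]


def time_shift(window: Window, offset: int) -> Window:
    """Circularly shift the window right by `offset` samples (mimics a different alignment)."""
    if not window:
        return []
    k = offset % len(window)
    return window[-k:] + window[:-k] if k else list(window)


def augment(window: Window, copies: int, seed: Optional[int] = None) -> List[Window]:
    """Return `copies` randomized variants of `window`.

    The ranges below (gain of 0.9 to 1.1, noise sigma of 0.01) are arbitrary
    illustrative values; real ranges depend on the sensor and the application.
    """
    if not window:
        raise ValueError("window must contain at least one sample")
    rng = random.Random(seed)  # seeding makes the augmented set reproducible
    variants = []
    for _ in range(copies):
        w = time_shift(window, rng.randrange(len(window)))
        w = scale(w, rng.uniform(0.9, 1.1))
        w = jitter(w, 0.01, rng)
        variants.append(w)
    return variants


if __name__ == "__main__":
    recorded = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    for variant in augment(recorded, copies=3, seed=42):
        print([round(x, 3) for x in variant])
```

Each variant keeps the label of the original window, so a small labeled recording can stand in for a wider range of operating conditions. Which transforms are physically meaningful differs by sensor type, which is why no single recipe fits every TinyML use case.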
By opening a common TinyML development platform for community contribution and improvement, we believe the ecosystem can benefit from the collective efforts of developers and researchers contributing to a common open codebase focused on overcoming the dataset bottleneck.
Solving Another Key TinyML Ecosystem Challenge: Reducing Fragmentation
The IoT development landscape is often fragmented by proprietary solutions that tie developers to specific platforms. SensiML’s open-source approach aims to reduce this fragmentation, providing a unified platform that supports a broad array of hardware and software configurations.
Over the past several years we’ve witnessed many AutoML development tool companies being acquired by hardware vendors seeking to lock users into their silicon offerings by creating high switching costs associated with a captive ML development tool. While that motivation is understandable from the silicon vendor’s point of view, the resulting fragmented ecosystem is far from ideal from the IoT developer’s standpoint.
Want toolkit X but need to use silicon Y for other design or business reasons? With these captive solutions, users are faced with difficult choices between software tool functionality and hardware selection criteria such as datasheet specs, cost, and second-source alternatives. When the two goals conflict, the all-too-common result is that IoT developers will simply push out planned ML features until ML tool maturity and feature support exist for the specific required hardware and application needs.
How Open-Source TinyML Tools Can Help Solve Fragmentation
Rather than being tied to the offerings of select hardware vendors, we believe that providing TinyML implementers with choice and flexibility better serves users’ needs. This flexibility can even be seen as strategic: it preserves the value of effort invested in ML tool skills and datasets, both of which can be ported across hardware and specific tool implementations.
By contributing a baseline AutoML toolchain to open source, SensiML envisions the potential for a de facto open and flexible platform, in much the same way that Eclipse serves as the common IDE technology behind both many vendor-specific implementations and the one maintained by the Eclipse Foundation itself.
SensiML’s dual licensing approach will allow for either open-source access under AGPL or commercial product licensing, so that vendor-specific derivatives can be built upon the SensiML OSS core engine. This preserves vendor-specific innovation opportunities while also supporting and benefiting from an inclusive open-source model.
SensiML’s decision to open-source Analytics Studio represents a pivotal development in the field of edge AI/ML. It not only enhances the capabilities of developers across the globe but also enables us to play a leading role in promoting open, innovative solutions in the TinyML space. As we embark on this new chapter, the potential for transformative impacts on the industry is immense, promising to accelerate the adoption and sophistication of AI technologies in edge devices.
How You Can Participate
As we open our technology, we invite developers, engineers, and industry professionals to join us. Whether you’re looking to contribute to the project, learn from the community, or simply explore the possibilities of edge AI, SensiML’s open-source initiative offers a unique opportunity to engage with cutting-edge technology and drive the future of IoT development. The SensiML OSS GitHub repository and the project website at https://sensiml.org will launch later this summer. To get involved and stay updated on the latest developments and launch date, sign up for the SensiML OSS newsletter today.