For most of Explosion’s life we’ve been a very small company, running off revenues. In 2021 that changed, and we became a slightly less small company running off venture capital. We’ve been unable to make that configuration work, so we’re back to running Explosion as an independent-minded self-sufficient company. We’re going to stay small and not look for any more venture capital. spaCy and Prodigy will continue, maintained by their original authors. We’ll keep updating our stack with the latest technologies, without changing its core identity or purpose.
We’re very grateful to Hugging Face for a $250,000 grant to support our open-source work, and we’ve applied successfully for a German R&D reimbursement grant that will give us up to €1.5m in unconditional funding. This grant money will give us a nice reserve, and we’ll stay sustainable via Prodigy sales and other offerings, including enterprise spaCy support and consulting.
Explosion was founded in 2016, a bit over a year after spaCy was first released. We started out taking some consulting projects to understand how people were using spaCy and what they needed. We saw that most people who were using spaCy seriously would need to annotate some data. For most of these annotation projects, the upfront work of developing the annotation scheme and curating the data was much more expensive than the annotation itself. This was the opposite of how most annotation solutions were designed. Most annotation solutions are built around driving the per-unit costs as low as possible, generally by outsourcing the work. This makes sense if you’re building a massive dataset, but if you’re building something like an information extraction system, it’s not the way to go.
So in 2017 we released our annotation tool Prodigy. We were able to pause the consulting work and focus on development, but the company was still really only two people – Ines and Matt. As Prodigy sales increased, we were able to add a few more people, until in 2021 we were 4-6 people.
In 2021 we decided to sell part of the company in order to get some investment capital. We had been trying to develop a new SaaS product built on top of Prodigy, but the project proved too complex for the team size we had, and there was a lot we wanted to get done for spaCy and Prodigy. By 2022 we had 10-15 people working on spaCy, and another 10-15 people working on Prodigy and Prodigy Teams.
We were able to do amazing things as hands-on developers, but unfortunately that success didn’t translate to leading a team and growing a larger company. We’re back to being a smaller company again now, just as we were for most of the company’s life. spaCy and Prodigy will continue to do the jobs you adopted them to do. They’ll stay up to date with the latest technologies, without changing their core identity or purpose.
What this means for spaCy and our other OSS
- Matt will be back working hands-on on spaCy and our other open-source projects.
- Priority will be given to installation and dependency issues, critical bugs, and new improvements, in that order. While we know a lot of people will be keen to help out, we’re not looking for pull requests that add major new functionality at this point. Reviewing new changes takes time, and we want to get our feet under us and get into a good release cadence first.
- If you’re a company interested in priority support, you can get in touch here.
- The spacy-llm initiative, which we launched in 2023, needs a lot of work to really fulfill its ambition. We’re particularly concerned about the work required to stay up-to-date with current LLM models. The best thing for the project would be a more horizontal approach with lots of contributors. If you’re interested in taking up maintenance of the repo and building that sort of collaboration, please get in touch.
- The curated-transformers initiative also needs more maintenance than we can easily provide. We use the library within spaCy, so basic maintenance of it will continue. However, we’re unlikely to be able to update the library with new functionality and new transformer models, so this is another project we’d like to turn over to the community. If you’re interested in taking over maintenance, please get in touch.
- Finally, one area that we can always use help in is usage questions. We’re unlikely to have time to answer usage questions on Stack Overflow or the discussions section of the GitHub repo at the moment. The library has been out for a long time now and there’s a lot of information already available, so we hope that the community is able to help each other when it comes to these questions.
What this means for Prodigy
Prodigy users can expect even more updates and improvements in the months to come. We expect to release v2 this year. If you’re a spaCy user and you haven’t checked it out yet, you really should. Prodigy lets you start and complete annotation tasks incredibly quickly, which is more valuable than ever now as models become more and more sample efficient. Upcoming features include workflows for human-in-the-loop distillation with LLMs, which has shown promising results, better support for interactive models in the loop, and ecosystem integrations.
What this means for Prodigy Teams
We have to pause development of Prodigy Teams. We got close to finishing it, but we didn’t quite get there. We still want to finish it and get it released, but we need to focus on our other projects at the moment.
What this means for consulting services
If you think we can help with your NLP project, you can reach out via our inquiry form. However, we’ll be an even smaller team than before, so we have to be selective about what projects we take on. We’re currently working on a few projects, but could be available to start later in the year.
We’re incredibly thankful for all your support over the years, for the amazing team that helped us get where we are and for the many developers and companies that have been putting their trust in our stack and building on top of it. We’ll continue doing what we’re doing.
For more background and reflections, see Matt’s more detailed blog post.