Thinking About Research Ideas vs. Technology
In this article, I want to share some thoughts on the difference between research ideas and technology, particularly in machine learning. This distinction is something I have been contemplating since starting my PhD. After joining Google DeepMind and being involved in product releases such as SynthID, I realized that it can be useful to distinguish between research ideas and technology in many projects, both in industry and academia.
These are very much “thoughts in progress” — reach out on Twitter or LinkedIn if you have comments or opinions.
A core component of research is the communication of research ideas. Top-tier venues, especially in machine learning, expect these research ideas to be novel, to have significant impact and influence in the community, and to be appropriately tested, empirically or theoretically. This is reflected in the way that research is communicated: through papers, talks, posters, tutorials, etc., where the idea is clearly the central element. While some venues encourage open-sourcing code or datasets, or try to shift the focus away from novelty or “subjective significance” (see TMLR), the underlying technology is usually secondary.
At this point, I want to clarify how I like to think about research ideas vs. technology. A research idea is a fairly general method or insight. For example, a new training algorithm or network architecture is a research idea. In papers, we (as reviewers or the research community) expect that these ideas are described in sufficient detail and motivated appropriately. More importantly, we want to see that the research idea results in advantages or solves a problem. This latter part is usually where technology enters the scene. In my mind, technology is the actual implementation of a research idea — often tied to a specific problem, application, or experiment. As a result, research idea and technology are complementary. Parts of the research idea, such as the underlying theory, might not be part of the actual implementation but still have to be communicated in the corresponding paper. Technology usually also includes a lot of utilities (in machine learning, typically evaluation, data loading, pre/post-processing, etc.) that are not of interest in the paper or do not affect the research idea.
So why do I like to think about this distinction? Especially in academia, the research idea is usually put first. However, I believe that adoption actually happens on the technology level. Of course, this assumes a good research idea in the first place. But I have repeatedly observed good research ideas being less “successful” if there is no adoption on the technology level — for simplicity, using citations, impact, and follow-up work as a proxy for success. In contrast, I have seen (and personally experienced) research work becoming more successful than the significance of the research idea alone would suggest, because the community adopted the open-sourced technology or datasets. In fact, I feel that very often — considering specific applications, problems, or datasets — a specific technology can quickly become dominant.
By technology being adopted quickly, I do not mean that people favor specific frameworks or programming languages. Instead, I have seen that specific pieces of code become very popular: you can find, for example, more or less the same ResNet implementation or the same data loaders for dataset X across lots of works. Of course, there are multiple ResNet implementations across frameworks, companies, etc., but looking at specific applications where researchers often build on the work of other researchers, specific implementations may become dominant. This means that a particular piece of technology (a specific implementation of a method or algorithm) is particularly widely used across a field. This can have different reasons: good coding practices, easy-to-follow documentation or tutorials, support for specific hardware, quick installation, etc. Interestingly, this implementation may not come from the original authors of the research idea.
While the above describes what I observed in academia, I also found it very useful to think about this distinction in an industrial setting. In fact, I feel it can be useful for any project or collaboration. For example, some projects or situations might benefit more from an actual piece of technology than from “just” a communication of the corresponding (research) idea. Sometimes, there is an immediate need for a technological solution to a problem. Different teams may pursue different ideas; it is not always the best idea that is adopted, but the most practical one. This is easily assessed by simply trying out a piece of software. In such situations, it might be worth implementing a quick prototype rather than drafting a nice paper or presentation. In other settings, communicating the idea clearly, convincingly, and in a timely manner can be most impactful because it motivates others to execute on these ideas. Unfortunately, I have yet to learn when to favor which approach …
Overall, when discussing with other researchers or engineers, or thinking about collaborations and projects, I find myself more and more often asking what would be most useful: putting effort into communicating an idea or implementing a first prototype. Of course, there might be a trade-off, but I think answering this question can be incredibly useful in research.