People in every new IT job role want to be as effective as possible, and so each role ends up needing its own dedicated toolset. Very often, the first set of tools is adopted from adjacent sectors and then improved and specialized over time. It’s a process that creates huge new markets every few years and opens up many opportunities for serving new communities of technical experts. Data science is still at the beginning of this process.
My favourite quote about technology is attributed to author and aviation pioneer Antoine de Saint-Exupéry: “Technology always develops from the primitive via the complicated to the simple”. It’s a pattern that you can see in all kinds of technology markets (remember how complicated smartphones used to be before the beautifully simple iPhone was released?), and that includes IT tools.
For quite some time, data science was stuck in the age of the Primitive Tool. Data scientists adopted software engineering tools like GitHub to manage their code and typically came up with some home-grown hack to manage experiments as they built out their machine learning models.
The last few years saw the rise of the Complicated Tool: End-to-end data science platforms that try to do everything. Since so many data science teams were formed only recently, this can make a lot of sense: Buy one big platform for your new data science department and you’re all set, at least in theory. Of course, the same rules as always apply to these integrated platforms: They’re typically really good at one or two things, but mediocre at the rest. They try to lock you into their ecosystem, which makes it hard to use other best-of-breed tools. And they don’t work at all for more advanced data science groups that need to build their own stack.
This is where Neptune comes in: Neptune’s founders have worked in data science for years. They know these challenges cold and have an impressive and unusually detailed understanding of the nuances of a data scientist’s daily work. This deep market insight motivated them to come up with the tool that was missing from their own toolset: A beautifully simple, but powerful collaboration platform.
Data science is not like software engineering. Its iteration cycles are much more rapid. The precise direction of a data science project is often unknown at the start and evolves over time through a nuanced process of discovery (hence “science”). Data science is at least as much about collaborative innovation and storytelling with visualizations as it is about writing code. That’s why traditional software engineering tools fall well short of data scientists’ needs.
Neptune covers the three most important pillars of data science work: Efficient and transparent experiment tracking; notebook management, comparison and versioning; and deep collaboration and documentation. As a former CTO with responsibility for a data science department, I know from my own experience how hard it is to run a data science team effectively without this kind of toolset.