14 thoughts on the economics of the open-source data space and how to become the next MongoDB or Databricks

Image by the author.

The data space is booming, with companies like MongoDB (valued at 18 billion USD), Databricks (30 billion USD), Confluent, and many others. The startup space is overflowing with money, and lots of founders want a share of the pie.

But in my opinion, the data space is set up to be dominated by open source solutions in the near future. Open source spaces have a very clear winner-takes-most dynamic, which makes them extremely hard to compete in. And that’s not even considering the fact that, a priori, you don’t get paid for providing open source solutions.

And yet, open-source-based companies…


Data Mesh at JP Morgan Chase, Meltano’s Hub launches, and a quick and dirty guide to data platforms.

Data will power every piece of our existence in the near future. I collect “Data Points” to help understand & shape this future.

If you want to support this, please share it on Twitter, LinkedIn, or Facebook.

🎁 (1) Data Mesh at JP Morgan Chase

I just watched JPMC’s talk about their data mesh journey. It’s an interesting watch with several important parts, but I’d like to highlight four things:

  1. JPMC is moving from an on-premises network to the cloud and calls it “Data Lake via Data Mesh”, so they are doing both at once: a cloud migration and a data mesh transformation.
  2. They seem…

How a commercial open-source software company should develop & price open-core products to fight the hyper-cloud “service-wrappers”.

(Photo by Tim Mossholder on Unsplash; Are you still open for contributions?)

dbt Labs, formerly Fishtown Analytics, recently raised a large Series C. In the announcement blog post, Tristan Handy, CEO of dbt Labs, outlined the major risk he currently sees for open-core products like the one dbt Labs sells: commoditization by the hyper-clouds.

It turns out that Sid Sijbrandij, CEO of GitLab, another company selling an open-core product, thinks very much alike. He has a thorough analysis of how an open-core product can be developed & priced.

This article is a deep dive into this issue and explores Sid’s thoughts with real-world examples. I believe it’s essential to uncover the assumptions…


Follow HelloFresh along their data mesh journey, see Meltano get spun out, and understand how to properly price COSS products with GitLab's CEO Sid.

Good data strategies derive from company strategy, bad data strategies go wiggle-diggle.

🔮 (1) Data Mesh at HelloFresh

The German startup HelloFresh is transitioning towards a data mesh architecture, coming from a very centralized, analytics-heavy perspective. While the transition is still very much underway, I like a couple of things about their approach and would like to highlight them.

First of all, HelloFresh identified its flywheel and its data as a strategic asset. That’s very much in line with…


It’s a fact that data people are in short supply, Fishtown Analytics raised a big Series C, and how to make data engineers love working for you.

🎁 (1) Dice Tech Job Reports

One article referred to the boom in “data engineering” hiring, so I checked the source. And indeed, the company Dice reports 50% growth in “data engineering” positions from 2018 to 2019, and still strong…


How RudderStack sees the future of data engineering, different approaches to personalization in machine learning models, and what ML observability actually is.

(1)🔮 RudderStack on the Future of Data Engineering

RudderStack highlights a few interesting points in a recent article. One is the coming rise of C-level data executives, which is already happening.

The second is a shift towards data becoming important in every single development team, something already being carried forward by the data mesh paradigm and, more generally, by the concept of platform teams.

They feel that moving data…


How TechStyle created their modern data platform, why ETL needs open-source, and whether Airflow is good enough as a data orchestrator.

Notice how completely normal the data mesh concept appears in the light of the last 15 years of decentralization in tech?

(1) TechStyle’s Modern Data Platform

The data world is in turmoil, so I love every piece of experience I can get my hands on. I really enjoyed this article by Prukalpa Sankar, featuring a modern data stack with Snowflake, Atlan, and Tableau.

I’ll just share two quotes and recommend reading the whole article. It’s really well written.

“Things are moving so fast now…” […]…


How to measure a data team’s success, why Dagster is a cool tool, and how Airbyte compares to Meltano in the EL(T) open-source space.

(1)🔮 Measuring Data Teams

Einat Orr, co-founder of lakeFS and long-time data hero, makes a good case for using meaningful metrics to evaluate data teams. The three metrics she suggests are:

  1. Data quality
  2. Data development velocity
  3. Data uptime

She argues convincingly for all three and gives some insights on how to treat each of these metrics. I like that approach and find it feasible as…


How to revive your dead dashboards, a cool new graph DB called TerminusDB, and developer experience for data guys.

(1)🔮 Make Dashboards Less Dead

This is a great post by Tristan Handy. I think there’s a good use case for dashboards just as there is for any of your “BI artifacts”. This post contains an excellent plan on making dashboards “less dead”.

The post contains a roll-out plan for trustworthy dashboards & reports written by Alexander Jia:

  • Create a single source of truth in your dwh for…


How AI creates an award-winning whisky, how data companies make money, and which tools you can use to version data.

(1)🔮 Data Open Source Business Models

I just stumbled across some weird data orchestrator business models, so I started researching…

I’m sharing this article because, as I said before, I believe the data space will be dominated by open source solutions pretty soon. As such, I think it’s interesting to understand how open source companies actually make money and survive. Something we as end-users actually have…

Sven Balnojan

Ph.D., Product Manager, DevOps & Data enthusiast, and author of “Three Data Point Thursday”: http://www.thdpth.com.
