Data as a product is a principle that requires domain teams to apply product thinking to the data they own, that is, to make sure their data is usable. Applying this principle means creating, owning, and maintaining data products that meet the organization's usability and governance standards and are ready for use with no additional work.
The following characteristics separate raw data from a fully prepared dataset:
- Sharable and discoverable: data products should be published so the rest of the organization can find and use them.
- Self-describing: data products should be clearly defined and documented.
- Addressable: data products should be accessible at the same, predictable location now and in the future.
- Trustworthy: data products should have high data quality.
- Interoperable: data products should work with other data products.
- Secure: data products should be protected from unauthorized access.
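One way to make these characteristics concrete is to record them as metadata and check which ones a dataset still lacks. The sketch below is only an illustration: the `DataProductDescriptor` class, its field names, the `SHARED_TYPES` vocabulary, and the `mesh://` address are all hypothetical and do not come from any particular data mesh tool.

```python
from dataclasses import dataclass

# Assumed organization-wide type vocabulary (hypothetical), used as a
# rough stand-in for "interoperable".
SHARED_TYPES = {"string", "integer", "decimal", "timestamp"}


@dataclass(frozen=True)
class DataProductDescriptor:
    """Hypothetical metadata record; every field name is an assumption."""
    name: str
    owner_domain: str
    address: str                  # addressable: stable, predictable location
    description: str              # self-describing
    schema: dict                  # column name -> type name
    quality_checks: tuple         # trustworthy: named quality rules
    allowed_roles: frozenset      # secure: roles allowed to read
    published: bool = False       # sharable and discoverable


def missing_characteristics(d: DataProductDescriptor) -> list:
    """Return the characteristics the descriptor does not yet satisfy."""
    checks = {
        "sharable and discoverable": d.published,
        "self-describing": bool(d.description and d.schema),
        "addressable": bool(d.address),
        "trustworthy": bool(d.quality_checks),
        "interoperable": bool(d.schema)
        and all(t in SHARED_TYPES for t in d.schema.values()),
        "secure": bool(d.allowed_roles),
    }
    return [name for name, ok in checks.items() if not ok]


orders = DataProductDescriptor(
    name="orders",
    owner_domain="sales",
    address="mesh://sales/orders",  # made-up address scheme
    description="One row per confirmed order.",
    schema={"order_id": "string", "total": "decimal"},
    quality_checks=("order_id is unique",),
    allowed_roles=frozenset({"analyst"}),
)

print(missing_characteristics(orders))  # ['sharable and discoverable']
```

Here the `orders` dataset meets every characteristic except publication, so it is prepared but not yet a data product that others can discover.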
Data products will have the following built-in features:
- A way to extract new data from sources
- A way to output data through a well-defined interface
- The code that transforms the data
- Storage infrastructure
- Control ports where you can call the data product, run transformations, and request the data
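The components listed above can be sketched as a single small class. This is a minimal illustration under stated assumptions, not a reference design: the `DataProduct` class and its `refresh` and `read` methods are hypothetical names, the source is a plain function, and storage is an in-memory list standing in for real infrastructure.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class DataProduct:
    """Hypothetical data product bundling the listed components."""

    def __init__(
        self,
        source: Callable[[], Iterable[Row]],
        transform: Callable[[Row], Row],
    ) -> None:
        self._source = source          # way to extract data from a source
        self._transform = transform    # transformation code
        self._storage: List[Row] = []  # storage (in-memory stand-in)

    def refresh(self) -> int:
        """Control port: extract, transform, and store; return the row count."""
        self._storage = [self._transform(row) for row in self._source()]
        return len(self._storage)

    def read(self) -> List[Row]:
        """Output port: return copies so callers cannot mutate storage."""
        return [dict(row) for row in self._storage]


def read_orders() -> List[Row]:
    # Stand-in for an operational source system (invented sample data).
    return [
        {"order_id": "A1", "total_cents": 1250},
        {"order_id": "A2", "total_cents": 400},
    ]


def to_dollars(row: Row) -> Row:
    return {"order_id": row["order_id"], "total": row["total_cents"] / 100}


product = DataProduct(read_orders, to_dollars)
product.refresh()
print(product.read())
# [{'order_id': 'A1', 'total': 12.5}, {'order_id': 'A2', 'total': 4.0}]
```

Keeping extraction, transformation, storage, and the ports behind one object means consumers only ever call `read`, while the owning domain team is free to change the source or the transformation.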
Having data in this ready-to-use format makes it more accessible and more efficient to work with when data users need it. It’s also an important pillar of the data mesh concept: because data ownership falls squarely on the domain teams, they must prepare their data so that it is useful across the whole organization.