As each sector of the economy undergoes a digital transformation driven by the explosion of data, we are creating a new natural resource: digital data. If we view digital data as a new commodity, much like oil or gold, and digital data flows as new representations of business processes, it is natural to ask: What is the value of the data? Many companies depend on the availability of data to provide services and decision-enabling insights to their customers. The more data a company has, the better the service it can potentially provide. When public or personal data is monetized by private industry, how do we assign value to the data? How is data accounted for in a company’s ledger? Should public data be treated like a natural resource? Should the value of personal data be established for the individual? How can the individual whose data is being collected participate in this marketplace? How can collected data be accessed, and its value shared, by every contributor in a democratic way rather than being controlled by a small number of players? How can fluidity of data be enabled among the participants in the ecosystem? What new businesses can be created when data is fluid and can be accessed with the consent of its owner? What are the legal and policy implications in this new world?
Extrapolating from individual people to individual datasets, we observe that the value of a collection of datasets is greater than the sum of the values of the individual datasets. For example, the success of state-of-the-art machine learning algorithms stems from such large collections. Fraud detection improves when banks pool data from all of their branches. The promise of precision medicine rests on access to medical records across multiple hospital datasets worldwide. What incentives can we provide for organizations to share their data for social good? What policies and mechanisms can be implemented to avoid biases in such collection processes? How can one provide guarantees of security and privacy?
Advances in technologies such as blockchain, homomorphic encryption, secure multi-party computation, and even secure hardware now make it possible for mutually distrusting organizations to share their data with each other in a secure, privacy-preserving, and tamper-proof manner. Some of these technologies do not rely on a (centralized) trusted third party. These capabilities are already leading to novel applications in supply-chain management and food security, and the potential to affect every sector, including health, finance, energy, legal, and media and entertainment, is enormous.
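To make the idea of sharing without a trusted third party concrete, the following is a minimal sketch of one building block of secure multi-party computation, additive secret sharing, used here to compute a joint total. It is an illustration only, not a description of any system the center would build. The function names, the modulus, and the example figures are all assumptions introduced for this sketch; it assumes non-negative integer inputs whose total is below the modulus, and it simulates all parties in a single process rather than over a network.

```python
import secrets

# Assumed public modulus; every party agrees on it in advance.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random shares that sum to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=MODULUS):
    """Simulate a joint sum in which no party reveals its own input."""
    n = len(private_values)
    # Party i splits its value and would send share j to party j.
    all_shares = [make_shares(v, n, modulus) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # The published partial sums combine into the total of all inputs.
    return sum(partial_sums) % modulus


# Hypothetical example: three organizations with private counts.
total = secure_sum([120, 45, 310])
```

Each share on its own is a uniformly random number, so a party that receives one learns nothing about the value it came from; only the combined total is revealed. A real deployment would also need authenticated channels and protection against parties that deviate from the protocol, which this sketch omits.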
This center aims not just to support cutting-edge research in the underlying technologies that allow secure data sharing, but also to invent new business models, services, and policies, and to incubate start-ups that would commercialize applications of these technologies. With data as a new commodity and the capability to securely share data at scale, what new business models will emerge? What new laws, policies, and regulations do we need as data marketplaces become commonplace and these new business models disrupt our economy? How will such services and applications affect user behaviors in creating, sharing, and consuming data?
The center will support deep and broad research and development in a multi-layer, multi-faceted framework. At the high level, a comprehensive team of faculty and researchers with cross-disciplinary expertise will investigate general issues of policy and regulation related to the emerging paradigm of data as a natural resource, independent of the specific technologies adopted in implementation. At the middle level, innovative business models, services, and applications will be fostered. At the practical level, cutting-edge technologies, tools, and facilities, such as those enabled by blockchain or its alternatives, will be developed and evaluated through large-scale experimentation. Furthermore, the center will address education and training of the workforce to innovate and operate in a world where data is fluid and can enable a variety of services.