Columbia-IBM Center for Blockchain and Data Transparency Supports Research To Enhance Blockchain and Related Technologies

Robert Florida
February 19, 2019

The inaugural seed-fund program of the Columbia-IBM Center for Blockchain and Data Transparency has awarded grants to three Columbia teams whose research aims to enhance blockchain technology and explore how the growing use of data science in the financial industry – especially machine learning – is reshaping America’s knowledge economy.

The Seed Funds Program supports cross-disciplinary research teams that advance innovation in blockchain and data transparency and benefit society. The Columbia-IBM Center promotes cutting-edge research that is exploratory and high-risk and not necessarily ready for funding from traditional sources. If successful, however, the research could lead to the creation of business models and start-up companies that will commercialize new technologies.

Many teams from across Columbia applied for the inaugural grants, and the Center’s steering committee selected the three winning teams.

“Professors who’ve been awarded grants from the Columbia-IBM Center for Blockchain and Data Transparency will contribute important research and innovation in key areas of technology such as blockchain, smart contracts and data transparency,” says Juliette Fisbein, Director of Strategy and Operations at the Columbia-IBM Center for Blockchain and Data Transparency.

The Center, formed in the summer of 2018, is devoted to research, education, and innovation in blockchain technology and data transparency. The Data Science Institute and Columbia Engineering are partnering with the Center, which intends to work in tandem with Columbia’s Business School, the Law School, and the School of International and Public Affairs, as well as Columbia Technology Ventures.

What follows are descriptions of the three winning research projects.

Project: Incentive Compatible Blockchains
Team members: Christos Papadimitriou, Tim Roughgarden, and Xi Chen, all from the Department of Computer Science, Columbia Engineering.

Economic incentives have always played an important role in blockchain protocols. In public blockchains such as Bitcoin, which lack a central authority to enforce rules, the primary tools available to protocol designers are cryptography and economic incentives. Yet each of these tools requires careful implementation to avoid becoming vulnerable to hackers who manipulate the protocol for their own gain. Getting the incentives right is a tricky problem; for example, Bitcoin suffers from some subtle flaws that can be exploited by hackers known as “selfish miners” to unfairly increase their reward from the protocol.

Therefore, to help create a safer and more robust blockchain technology, this team will use mathematical modeling techniques from algorithmic game theory to explore what improvements can be made to the technology to deter deviations from the intended participant behavior.

“We hope our research will offer a deeper understanding of leading blockchain technologies and the incentives they offer users to follow current protocol incentives,” says Tim Roughgarden, Professor of Computer Science and member of the Data Science Institute. “Ideally, the research will also suggest new blockchain protocols with better incentive properties that are thus more robust and less vulnerable to participant deviations.”  

In essence, a blockchain represents a distributed ledger that in theory is resistant to attack, since it is driven by network consensus and allows data to be transmitted by peer-to-peer networks. Blockchain technology also allows the transfer of data and assets between users without the need for a third party to approve that transfer. It provides a layer of trust for all transactions inasmuch as members of the network have access to the same data on the ledger, making it, in theory, easy for users to verify past transactions. This research team, however, has found flaws in the design protocols that offer incentives for users to deviate from the protocol and to disrupt the technology or cheat for profit.  
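The idea that a shared ledger makes past transactions easy to verify can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the function names, the block layout, and the sample transactions are hypothetical, and there is no network or consensus here, just the hash links that make tampering with history detectable.

```python
import hashlib
import json


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a new block that commits to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash; any edit to a past block breaks the links."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["transactions"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, [{"from": "alice", "to": "bob", "amount": 5}])
append_block(ledger, [{"from": "bob", "to": "carol", "amount": 2}])
print(is_valid(ledger))  # True

ledger[0]["transactions"][0]["amount"] = 500  # tamper with history
print(is_valid(ledger))  # False
```

Because every participant can rerun a check like `is_valid` on the same data, no third party is needed to vouch for the record. What the sketch does not capture is the incentive question the team studies: whether participants gain by deviating from the rules that decide which blocks get appended.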

They will thus explore the reward mechanisms used by the two dominant blockchain protocol paradigms, proof-of-work (PoW) and proof-of-stake (PoS), and use game theory and mathematical models to understand how to eliminate incentives that encourage some users to deviate from the rules. In PoW, the algorithm rewards users who solve computational puzzles in order to validate transactions and create new blocks. In PoS algorithms, by contrast, the creator of a new block is chosen by a rule that depends on the user’s wealth, or stake, in the currency. The researchers have discovered a new flaw in the PoW protocols, used most prominently by Bitcoin, which could make them vulnerable to attacks. Their research will seek ways to remedy this flaw.
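The difference between the two paradigms can be made concrete with a small Python sketch. It rests on simplifying assumptions and is not how any real protocol is implemented: `mine` stands in for PoW with a hash puzzle whose difficulty is a number of leading zero hex digits, and `pick_validator` stands in for PoS by drawing a block creator with probability proportional to stake, which is one common textbook simplification. The names and the stake values are hypothetical.

```python
import hashlib
import random


def mine(block_data, difficulty):
    """Toy proof-of-work: find a nonce whose hash has `difficulty` leading zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def pick_validator(stakes, rng):
    """Toy proof-of-stake: choose a block creator with probability proportional to stake."""
    names = sorted(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# PoW: the reward goes to whoever spends the computation to find a valid nonce.
nonce, digest = mine("block 1: alice pays bob 5", difficulty=3)
print(nonce, digest)

# PoS: over many rounds, block creation tracks each user's share of the stake.
rng = random.Random(0)  # fixed seed so the run is repeatable
stakes = {"alice": 60, "bob": 30, "carol": 10}  # hypothetical holdings
wins = {name: 0 for name in stakes}
for _ in range(10_000):
    wins[pick_validator(stakes, rng)] += 1
print(wins)
```

In both cases the protocol pays out according to a resource the user controls, computing power or stake. The game-theoretic question is whether a user can collect more than that fair share by deviating from the prescribed behavior.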

“What we lack right now are mathematical guarantees for blockchain technologies that prove that users are incentivized to behave as intended,” says Roughgarden. “We hope that techniques from algorithmic game theory can contribute to the evolution of blockchain protocols from its current embryonic state into a mature, robust, and easy-to-use technology.”

Project: DeepSEA Framework for Building Certified Smart Contracts  
Team members: Ronghui Gu, Computer Science Department, Columbia Engineering, with assistance from a team of graduate students.

Blockchain allows users to trade cryptocurrency without the use of lawyers or standard legal contracts. Rather, they use smart contracts built from computer code – Digital Wallet is an example – to buy or trade cryptocurrencies. The users must trust entirely in the computer-code contracts – code that in the past has been subject to error and hacking.

Gu’s platform, called DeepSEA-Blockchain, is a functional programming language that uses a mathematical technique called formal verification to produce code that can be proven to be correct, bug-free and invulnerable to cyberattack. Functional programming is a declarative kind of programming that focuses on “what to solve,” in contrast to an imperative style, where the focus is on “how to solve,” says Gu, an Assistant Professor of Computer Science with expertise in formal verification.

“The DeepSEA-Blockchain platform will allow users to define specifications associated with contracts,” adds Gu. “They can use the platform to customize contracts and add features that will safeguard their currency. They can add specifications such as ‘my money will never be decreased in value or disappear,’ and we can ensure the code satisfies the user’s specifications.”

Smart contracts on blockchain are written by average developers who sometimes make mistakes, explains Gu, who cites the case of DAO (Decentralized Autonomous Organization) – a program built on the Ethereum Blockchain platform that was breached, resulting in the theft of $50 million worth of Ether. DAO was a promising application on the Blockchain, but a lone hacker spotted one bug in the DAO code and drained 3.6 million Ether into a personal account, causing the value of Ether to plummet and triggering a crisis of confidence in the application. The price of Ether has since recovered, but the attack proved that Blockchain coding platforms are vulnerable.

“It proved that Blockchain systems are not flawless,” Gu says. “The math model I’m using to verify that the code and the user’s specifications are consistent would have removed the bug that the hacker exploited.”
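To see how a single bug can drain a contract, and what a user-level specification looks like, consider the Python sketch below. Several assumptions apply. The article does not name the bug; the DAO exploit is widely documented as a reentrancy flaw, in which a contract pays out before updating its own records, and that is what the sketch imitates. The `Vault` class, the `attack` function and the amounts are hypothetical, and this is neither Solidity nor DeepSEA code. The check in `invariant_holds` also runs on a single execution only, whereas formal verification aims to prove such a property for every possible execution.

```python
class Vault:
    """Toy contract holding deposits; not real smart-contract code."""

    def __init__(self, safe):
        self.safe = safe      # if True, update records before paying out
        self.balances = {}    # what each depositor is owed
        self.funds = 0        # total currency the contract actually holds

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.funds += amount

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.funds < amount:
            return
        if self.safe:
            self.balances[who] = 0    # record the withdrawal first
        self.funds -= amount
        on_receive(amount)            # external call: may re-enter withdraw
        if not self.safe:
            self.balances[who] = 0    # too late: the old balance was still visible


def attack(vault, attacker, stake):
    """Deposit `stake`, then re-enter withdraw from the payment callback."""
    stolen = []
    vault.deposit(attacker, stake)

    def on_receive(amount):
        stolen.append(amount)
        vault.withdraw(attacker, on_receive)  # re-entrant call

    vault.withdraw(attacker, on_receive)
    return sum(stolen)


def invariant_holds(vault):
    """Specification: funds held always cover what depositors are owed."""
    return vault.funds >= sum(vault.balances.values())


for safe in (False, True):
    vault = Vault(safe=safe)
    vault.deposit("honest_user", 90)
    taken = attack(vault, "attacker", 10)
    print(safe, taken, vault.funds, invariant_holds(vault))
# False 100 0 False
# True 10 90 True
```

In the unsafe variant the attacker deposits 10 and walks away with all 100, leaving the honest user's 90 unbacked. Reordering two lines restores the invariant, which is the kind of guarantee a verified contract would carry as a proof rather than as a test.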

Gu is delighted to have been awarded the Seed Funds grant, which will allow him to work with a team of doctoral students to develop the DeepSEA-Blockchain platform. He also intends to partner with researchers at IBM, so as to connect his new functional programming language to IBM’s blockchain platform, Hyperledger.

“I look forward to partnering with IBM’s researchers on this project and hope that the DeepSEA-Blockchain platform, once developed, will improve the security and reliability of IBM’s Hyperledger ecosystem,” he says. “In the world of Blockchain, computer code acts as the law. We need safer code and invulnerable smart contracts that will allow Blockchain, an exciting new technology, to flourish and become part of the mainstream.”

Project: Machine Learning and the Changing Economics of Knowledge
Team members: Simona Abis and Laura Veldkamp, both from Columbia Business School

This project explores the idea that machine learning is to knowledge production what industrialization was to the production of goods.

The two researchers will combine economic theory and empirical data to understand how, in a knowledge economy, firms use data to make informed decisions that influence their profitability. They argue that machine learning especially is altering this decision-making process and aim to understand how it promises to do so and what the implications would be for the value of data.

Traditional macroeconomics is well-suited to explore a manufacturing economy, the researchers say, where capital and labor produce physical goods with constant returns to scale. Nowadays, though, more and more industries do not fit this paradigm. Particularly in data-intensive industries, the key output of many firms is shifting from physical goods to knowledge. In this context, workers use small amounts of physical capital, typically a computer or cloud server, to process and evaluate data and transform it into knowledge, which may take the form of a strategic recommendation, a marketing campaign, or a new trading algorithm. This knowledge output requires two main inputs: structured data and skilled labor. The questions the researchers will seek to answer in this project are: How can these new inputs be valued and how do they work together to create knowledge and innovation? How can economists understand the effects that technological innovation, and particularly the increasing use of machine learning, will have on the macroeconomy?

In seeking to answer these questions, Simona Abis and Laura Veldkamp will adapt the standard tools of economic theory to describe modern knowledge economies. They will focus in particular on understanding the role that big data and machine learning are playing in generating innovation and returns. Economists generally represent the production of goods and services through a production function that takes capital and labor as inputs, the two say. Estimating such production functions then reveals the returns to their inputs. The two will use a similar estimation approach to assess the value of big data and machine learning and how this is transforming the demand for skilled labor. In particular, they will consider a production function for knowledge that has two inputs: labor and structured data. Firms, in turn, use that knowledge to make informed decisions that determine their profitability.

According to the researchers, the use of machine learning brings about a knowledge production function different from the one implied by traditional technologies. Traditional data analysis uses a string of data to estimate a given model. The model could be a linear regression to forecast sales growth, a capital asset pricing model (CAPM) or a search specification for finding a legal precedent. With a given model, each new type of data requires labor to adjust or reinterpret the model. Machine learning, though, eliminates the need for a person to specify a model. It takes many data types and determines the optimal model for evaluating them. This allows more data to be analyzed profitably.
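What it means to estimate a production function and read off the returns to its inputs can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration, not the researchers’ model: it supposes a Cobb-Douglas form, K = A · D^alpha · L^beta, with knowledge K, structured data D and skilled labor L. It then generates synthetic firms from made-up parameter values and recovers those values by ordinary least squares on the logged equation. The function names and the numbers are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_log_linear(data, labor, knowledge):
    """OLS on log K = log A + alpha log D + beta log L via the normal equations."""
    rows = [[1.0, math.log(d), math.log(l)] for d, l in zip(data, labor)]
    y = [math.log(k) for k in knowledge]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    log_a, alpha, beta = solve(xtx, xty)
    return math.exp(log_a), alpha, beta


# Synthetic firms generated from known parameters (illustrative values only).
rng = random.Random(1)
true_a, true_alpha, true_beta = 2.0, 0.6, 0.3
data = [rng.uniform(1, 100) for _ in range(200)]
labor = [rng.uniform(1, 50) for _ in range(200)]
knowledge = [
    true_a * d**true_alpha * l**true_beta * math.exp(rng.gauss(0, 0.05))
    for d, l in zip(data, labor)
]

a_hat, alpha_hat, beta_hat = fit_log_linear(data, labor, knowledge)
print(round(a_hat, 2), round(alpha_hat, 2), round(beta_hat, 2))
```

Under this assumed form, the estimated exponents are the returns to each input: a one percent increase in structured data raises knowledge output by roughly alpha percent, holding labor fixed. A shift in those exponents between firms that use machine learning and firms that do not is the kind of difference in the knowledge production function the paragraph above describes.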

“This analysis is particularly relevant given the unprecedented data proliferation observed in the last few years,” says Simona Abis, Assistant Professor of Business and member of the Data Science Institute.

“As of 2016, it was estimated that 90 percent of all digital data had been created in the previous two years,” adds Abis. “Without a clear mechanism that links data to firms’ profits, it is difficult to understand what determined such a dramatic increase in data production and hence how to value data. A prerequisite for any effective debate on data usage or sharing is to be able to attribute a value to data.”

Laura Veldkamp, Professor of Business, describes their approach as “fairly straightforward.” If machine learning is becoming increasingly profitable for firms, one would expect they would hire more data scientists to evaluate data and formulate profitable trading strategies.

“We’d also expect the firms to increase salaries for data scientists who can use machine learning to create innovation and profit,” says Veldkamp. “But no one knows for sure because no one has studied this. No one, as far as we know, has estimated the production of knowledge in this way. So we are trying to assess exactly how valuable machine learning is to financial firms, to guide our thinking about how valuable it could be for the global economy.”

While data science and big data are having a pervasive effect throughout the economy, Professors Abis and Veldkamp will examine one industry where machine learning is just taking hold: the financial industry. They believe finance is a good laboratory for understanding the effects of machine learning because it is a knowledge industry. The objectives of finance are relatively clear and it’s an industry with rapid and continuing adoption of new technologies. Investment banks and hedge funds primarily use data to forecast the future value of assets. They use this data to advise clients, broker deals, and trade on their own portfolios. And machine learning is allowing these banks to transform a universe of quantitative and textual data into advice and trading strategies, they say.

The two will also analyze detailed data on job postings and labor trends, to help measure changes in the demand for data scientists who have machine-learning skills. They will use the data to understand how the composition of skilled workers is changing across the financial industry and how much the different skills are being rewarded through wages. Such understanding will allow them to estimate changes in the productivity and value of data as an input into knowledge production.

Once their research is completed, the professors intend to publish their findings in one of the top economics and finance journals. Both are grateful to have the seed-funds grant from the Columbia-IBM Center for Blockchain and Data Transparency, which is enabling them to carry out their research.

“I just moved to Columbia from NYU this fall, and already received this terrific grant that will help Simona Abis and me to conduct our research with an abundance of resources,” says Veldkamp. “It’s an amazing start to my career at Columbia.”

Abis is also grateful to the Data Science Institute for its support.

“Since I started collaborating with the Data Science Institute, it has opened up so many doors for me,” she says. “My research agenda is focused on the application of data science to financial markets. Being able to access all the resources they provide has certainly had a great impact on my research.”

And in the end, the two hope to understand how the growing use of data science in finance is contributing to and reshaping the American economy.

“Firms and investors are using data in new ways that enhance the value of firms,” says Veldkamp. “How does this use of data by skilled data scientists affect financial markets and the economy, and how can we quantify the effect those new employees are having on the macroeconomy? We hope that generating answers to these questions will help to guide valuation, regulation and forecasting in the new era of big data.”