Encord Blog
Applying Principles of Quantitative Finance to Computer Vision
From my years working in quantitative finance, I know that a key idea to making money in the market is to find ways to gain alpha– in other words, to collect risk-adjusted returns above the average market over a certain time scale.
Beta is the market opportunity available to all investors– the general market trends for a given level of risk. Capturing alpha means capturing the additional opportunities and returns in the market beyond beta.
One method of capturing alpha is for traders to act on predictive signals. These signals are notoriously difficult to discover and cultivate, and doing so often requires intensive quantitative analysis and prodigious amounts of data.
Quantitative researchers and traders take information from those signals, synthesise it, and formulate strategies that enable them to act faster and smarter than the competition. They look at a bunch of market information– asset pricing information and news information and alternative data– from different sets of sources, compile it, and develop “hypotheses” that make predictions about future returns. They’ll then aggregate the successful hypotheses together into strategies for trading in the market. These strategies will execute trades and allow investors to enter or exit positions that will or will not make money over a certain time scale.
However, to come up with effective strategies that capture all that information, many traders follow certain principles. At Encord, we’ve applied these principles in a very different domain to develop a platform that enables our customers to create and manage high-quality training data for computer vision.
The Encord platform in action
- Think Modularly
To effectively come up with a trading strategy, quantitative researchers and traders often take a modular approach and research alpha signals individually. They test separate hypotheses for each signal and measure the quality of each idea in the market by using previous market data backtests and validating whether it has been historically true. They then combine the hypotheses that have merit into a strategy that can be applied to the market and used in the real world.
When working on a complicated problem, taking a modular approach and testing the solution’s components individually is much easier and more efficient than testing an aggregated solution. If a component fails a test, then researchers can remove it or perform targeted work to fix what’s broken. When an aggregated solution fails, they have to troubleshoot the entire solution, pinpoint the problem, and then attempt to remove or fix the faulty component while mitigating the impact of any changes on the solution as a whole.
At Encord, we’re solving the problem of data annotation by taking a modular approach. Rather than try to automate the entire annotation process, we’re breaking it into much smaller pieces. We break apart each labelling task into a separate, specific micro-model, training the model on a small set of purposely selected and well-labelled data. Then, we combine these micro-models back together to automate a comprehensive annotation process. With its modularity, the micro-model approach increases the efficiency of data labelling, thereby enabling AI companies to reduce their model development time.
Encord's Micro Model approach
- Be Adaptable
In the market, there’s rarely an equilibrium. Because things change constantly, traders and quantitative researchers have to adapt quickly. They have to assume that they’ll be wrong a lot, so they put mechanisms in place to verify whether their hypotheses are correct. When quantitative researchers run back tests, the hope is that the hypothesis will work, but the goal is to find out as quickly as possible if they don’t. The longer a trader moves in the wrong direction, the more time that they’re wasting not finding the right answer. Once traders have new information, they adapt. They change their hypothesis and incorporate the new learnings into their models so that they can make better, more informed predictions as soon as possible.
At Encord, we understand that in the AI world in general, and the computer vision world in particular, the ability to adapt directly impacts the iteration time. Currently, there’s a technological arms race of sorts where models, principles, and technologies are evolving rapidly. If you don’t adapt– if you can’t quickly figure out both how and why you’re wrong– you run the risk of falling behind your competitors.
Adaptability provides a competitive edge. With that in mind, Encord has created a training data platform that gives customers flexibility in annotating datasets and setting up new projects so that they can adapt as their technology evolves.
- Reduce Iteration Times
The success of a data science project– and the success of a trading desk– is mostly a function of the time it takes to iterate over an idea. The faster you can move through an iterative cycle, the more likely you are to succeed.
Similarly, the success of an AI company often depends on the time it takes to iterate on an AI application before letting it run in the wild.
This timeline includes more than just iterating on model parameters or architectures. The future of AI is data-centric. Rather than improve AI by looking only at the model, practitioners will focus on improving the training data. Therefore, the ability to iterate quickly on a model depends on having an effective pipeline for training data. This pipeline includes an efficient and accurate data labelling and review process, a well-designed management system, and the ability to query the data throughout the training process.
We developed our training data platform so that it enables users to create, manage, and evaluate high-quality training data, reducing iteration time for computer-vision model development.
---
Machine learning and data operations teams of all sizes use Encord’s collaborative applications, automation features, and APIs to build models & annotate, manage, and evaluate their datasets. Check us out here.
Power your AI models with the right data
Automate your data curation, annotation and label validation workflows.
Get startedWritten by
Eric Landau
Explore our products