Produced in partnership with AXA Investment Managers.

Artificial intelligence (AI) is poised to disrupt the investment industry. The latest AI innovations have been instrumental in improving potential outcomes for investors – and we believe they will continue to do so.

Optimisation

In finance, optimisation refers to the process of finding the best solution to a particular problem subject to a set of constraints. In quantitative equity investing this technique is used in portfolio construction, to find the portfolio that maximises expected return while minimising risk.

A simple example of an optimisation problem: if someone were organising a party, what is the optimal number of pizzas, cakes and drinks they should order? We can solve this with our brains, relying on experience and the back of an envelope to do some simple calculations.
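
For readers who like to see the mechanics, the party problem can be written as a tiny linear programme and handed to an off-the-shelf solver. The prices, serving sizes and guest numbers below are entirely hypothetical; this is a minimal sketch rather than a recipe.

    # A toy version of the party-planning problem, solved as a linear
    # programme. All prices and serving sizes are hypothetical.
    from scipy.optimize import linprog

    # Decision variables: number of pizzas, cakes and drinks to order.
    cost = [12.0, 15.0, 2.0]          # objective: minimise total spend

    # Constraints (>= requirements rewritten as <= by negation for linprog):
    #   8 slices per pizza, 10 slices per cake -> at least 40 food servings
    #   at least 30 drinks
    A_ub = [[-8.0, -10.0, 0.0],
            [0.0, 0.0, -1.0]]
    b_ub = [-40.0, -30.0]

    result = linprog(c=cost, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
    print(result.x)      # optimal (fractional) order quantities
    print(result.fun)    # minimum total cost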

But in finance, if we wanted to build a portfolio of 100 stocks from the S&P 500, there are more than 10^107 possible combinations – finite, but far too many to check one by one.

The optimiser searches this vast space of possible portfolios in the risk-return dimension until it finds the combination of stocks that should deliver the best outcome.
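
As an illustration of what the optimiser is doing, the sketch below sets up a toy mean-variance problem for three stocks, using made-up expected returns and covariances and a generic solver. It is a simplified stand-in, not a description of our production optimiser.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical inputs for a three-stock universe: expected returns and
    # a covariance matrix estimated elsewhere (e.g. from historical data).
    mu = np.array([0.08, 0.10, 0.06])
    sigma = np.array([[0.04, 0.01, 0.00],
                      [0.01, 0.09, 0.02],
                      [0.00, 0.02, 0.03]])
    risk_aversion = 5.0

    # Mean-variance objective: maximise mu'w - (lambda/2) w'Sigma w,
    # i.e. minimise its negative.
    def objective(w):
        return -(mu @ w - 0.5 * risk_aversion * w @ sigma @ w)

    constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]  # fully invested
    bounds = [(0.0, 1.0)] * len(mu)                                 # long-only

    result = minimize(objective, x0=np.full(len(mu), 1 / 3),
                      bounds=bounds, constraints=constraints)
    print(result.x)  # optimal portfolio weights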

But this is not new technology. To find the optimal portfolio the optimiser uses the Lagrange multiplier method, first published in the late 18th century by the Italian-born mathematician Joseph-Louis Lagrange. The technique involves introducing a new variable (the Lagrange multiplier) for each constraint in the optimisation problem and forming a new function called the Lagrangian.

Then, by taking the partial derivatives of the Lagrangian, the optimiser knows in which direction to look for the solution, without having to check each of the vast number of possible combinations. These techniques also play a crucial role in improving model performance in machine learning (ML) – the discipline in which computer systems use data to make decisions – from feature selection and tuning to minimising the loss function.
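
A minimal worked example, using symbolic algebra and hypothetical numbers, shows the mechanics: form the Lagrangian for a two-asset minimum-variance problem, take the partial derivatives and solve the resulting equations directly.

    import sympy as sp

    # A two-asset illustration of the Lagrange multiplier method: minimise
    # portfolio variance subject to the weights summing to one.
    # The covariance numbers are hypothetical.
    w1, w2, lam = sp.symbols("w1 w2 lam")
    variance = 0.04 * w1**2 + 0.09 * w2**2 + 2 * 0.01 * w1 * w2

    # Form the Lagrangian: objective minus multiplier times the constraint.
    lagrangian = variance - lam * (w1 + w2 - 1)

    # Setting the partial derivatives to zero points straight at the solution,
    # with no need to search over candidate portfolios.
    solution = sp.solve(
        [sp.diff(lagrangian, v) for v in (w1, w2, lam)], (w1, w2, lam)
    )
    print(solution)   # optimal weights and the value of the multiplier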

The future of AI and equity portfolios

Recent years have seen a surge in AI tool usage across industries, driven by enhanced computing power and sophisticated models. The accessibility and diversity of software platforms for testing and developing AI models have significantly increased.

But what are the implications for equity portfolio managers? Do we see a future where AI models take over from humans? The short answer is no.

ChatGPT and other mainstream AI models have been developed using very clean data, drawn from high-quality online sources where the data has little to no ‘noise’ – spelling and grammar are very good, and the sentence structure and vocabulary are of a high standard. The models can leverage this foundation to generate high-quality output.

Unfortunately, the same cannot be said of equity data. The factors that influence the return of a stock on any given day are diverse, and often obscured. In 1973, US economist Burton Malkiel famously wrote that stock prices follow a “random walk”, indicating they are highly unpredictable and difficult to model.

What this means from the practitioner perspective is that one cannot simply plug equity data into an ML framework. The data needs to be cleaned extensively, and the modelling techniques need to be thoughtfully selected to cope with the unpredictability in the data.
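
As a flavour of what ‘cleaning’ can involve, the sketch below winsorises extreme returns and handles gaps in a panel of stock returns. The thresholds are illustrative and this is not a description of our actual data pipeline.

    import pandas as pd

    def clean_returns(returns: pd.DataFrame, lower=0.01, upper=0.99) -> pd.DataFrame:
        """Winsorise cross-sectional outliers and handle gaps in a returns panel.

        `returns` is assumed to be dates x stocks; thresholds are illustrative.
        """
        # Clip each date's cross-section at its 1st and 99th percentiles
        # so that a handful of extreme prints cannot dominate a fitted model.
        clipped = returns.apply(
            lambda row: row.clip(row.quantile(lower), row.quantile(upper)), axis=1
        )
        # Drop stocks with too many missing observations, then forward-fill
        # only very short gaps rather than inventing data for long ones.
        clipped = clipped.dropna(axis=1, thresh=int(0.8 * len(clipped)))
        return clipped.ffill(limit=1)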

A constant state of evolution

The dynamics that drive markets include a multitude of factors: monetary and fiscal conditions, technological advancements, regulatory developments and shifts in investor sentiment. In addition, global events, geopolitical shifts and demographic trends contribute to the ever-changing landscape. As quantitative investors we recognise that history has a lot to teach us about the future, but we’re also very aware the future will never look exactly like the past.

This is in stark contrast to the mainstream use of AI. In large language models (LLMs) – machine learning models that can comprehend and generate human language – language is relatively stable; grammar, spelling and sentence structure do change, but over a timespan measured in decades. This ‘stationarity’ means that data from the past is still representative of the current state, and models trained on more data produce better outputs.

More data required

One of the datasets used to train the original GPT model – the precursor to ChatGPT – was BookCorpus, which contains 11,038 books, approximately 74 million sentences, by unpublished authors. Wikipedia, another training dataset, contains approximately 300 million sentences, according to our estimates.

In contrast, a global equity dataset containing monthly data going back 30 years will contain only around 3.6 million stock observations (roughly 10,000 stocks over 360 months) – a large dataset, but several orders of magnitude smaller than those available to LLMs. The next hurdle is that this dataset will not grow quickly – we can generate new data only through the slow progression of time, as we observe and measure returns going forward.

A challenging landscape but opportunities exist

These three factors – limited historical data, a low signal-to-noise ratio and non-stationarity – combine to create the biggest risk of using ML in equities: overfitting. Overfitting occurs when a model memorises historical noise rather than capturing underlying relationships. This leads to a model that performs very well on historical data but delivers poor performance in live trading.
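
One common safeguard, sketched below with a generic model and assuming the observations are ordered in time, is to compare in-sample fit against out-of-sample fit on the most recent data; a large gap between the two is a classic symptom of overfitting.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import r2_score

    def overfitting_gap(X: np.ndarray, y: np.ndarray, train_frac: float = 0.7):
        """Fit on the earliest observations, test on the most recent ones.

        X and y are assumed to be ordered in time (oldest first).
        """
        split = int(len(y) * train_frac)
        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X[:split], y[:split])
        in_sample = r2_score(y[:split], model.predict(X[:split]))
        out_of_sample = r2_score(y[split:], model.predict(X[split:]))
        # A model that memorises noise shows a strong in-sample score but a
        # much weaker (often negative) out-of-sample score.
        return in_sample, out_of_sample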

Despite these challenges, we have embraced the potential of AI and machine learning, while being mindful of managing the associated risks.

Neural network tail risk identification

We developed a neural network to identify tail risk – it estimates the probability that a stock will suffer a very sharp increase in volatility over the next month. The model, first deployed in 2017, combines a carefully selected set of features that are individually skilled at predicting increases in short-term volatility. This model successfully identified as high risk two prominent US regional banks that failed in the first quarter of 2023.
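
In spirit, such a model can be sketched as a small neural network classifier that turns a set of stock-level features into a probability of a volatility spike. The code below uses synthetic placeholder data and a generic architecture; it is an illustration of the approach, not our model.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Hypothetical feature matrix: one row per stock, columns such as recent
    # realised volatility, leverage and drawdown. The label marks whether the
    # stock's volatility spiked sharply in the following month.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(5000, 8))
    y_train = (rng.random(5000) < 0.05).astype(int)   # rare "tail" events

    model = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)
    model.fit(X_train, y_train)

    # predict_proba gives the estimated probability of a volatility spike,
    # which can be used to flag high-risk names.
    tail_risk_prob = model.predict_proba(X_train)[:, 1]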

The drawback of using neural networks is that they are a ‘black box’, i.e. it is very difficult to interpret the output of the model. As systematic, fundamental investors, it’s important that we have full transparency on the underlying data changes that drive a trade recommendation. To achieve this, we build a ‘white box’ around the neural network: we take the output of the model and regress it against the input variables. This linear regression allows us to calculate which input variables are impacting the final recommendations the most.
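
The sketch below illustrates the idea with a stand-in ‘black box’ and made-up data: regress the model’s output on its standardised inputs and read the coefficients as a measure of each feature’s influence on the recommendation.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neural_network import MLPRegressor
    from sklearn.preprocessing import StandardScaler

    # Hypothetical inputs and a stand-in "black box" model.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 6))
    black_box = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
    black_box.fit(X, X[:, 0] ** 2 + X[:, 1])          # arbitrary target for the demo
    scores = black_box.predict(X)                     # model output to be explained

    # "White box": regress the model's output on standardised inputs so the
    # coefficients show which features move the recommendation the most.
    X_std = StandardScaler().fit_transform(X)
    surrogate = LinearRegression().fit(X_std, scores)
    influence = surrogate.coef_                       # per-feature influence
    print(np.argsort(-np.abs(influence)))             # features ranked by impact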

Natural language processing sentiment identification model

To build well-diversified portfolios, we depend on harvesting insight from data – the most timely and accurate data on company fundamentals.

We use natural language processing (NLP) to analyse what company chief executives and chief financial officers say, to inform our stock selection signals. We developed an NLP model, first deployed in 2020, that reads quarterly earnings calls that have been transcribed into text. The model measures the sentiment and language precision of company leaders, and its output feeds into our quality and sentiment factor models.
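
A deliberately simple, lexicon-based sketch gives the flavour of sentiment scoring on transcript text. The word lists are illustrative and our production model is far richer – it also measures language precision – but the principle of turning language into a numerical signal is the same.

    # A simplified, lexicon-based sketch of transcript sentiment scoring.
    # Word lists are illustrative only.
    POSITIVE = {"growth", "strong", "improved", "confident", "record"}
    NEGATIVE = {"decline", "weak", "impairment", "headwinds", "uncertain"}

    def transcript_sentiment(text: str) -> float:
        """Return a score in [-1, 1] from counts of positive vs negative words."""
        words = [w.strip(".,!?").lower() for w in text.split()]
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        total = pos + neg
        return 0.0 if total == 0 else (pos - neg) / total

    print(transcript_sentiment("We delivered record growth despite some headwinds."))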

The next stage of evolution

While AI advancements are remarkable, applying AI to equity portfolios presents challenges due to complex data and overfitting risks. We believe the use of these techniques without proper safeguards can be perilous. Nevertheless, our experience in equity investing coupled with our willingness to embrace new technologies has paved the way for the adoption of these advanced techniques into our next generation of models, ensuring we remain at the forefront of leveraging AI for our clients’ benefit.

Ram Rasaratnam is the chief investment officer of equity quantitative investing at AXA Investment Managers.