Background

Early online portfolio selection (OLPS) methods relied on a priori assumptions about market dynamics and on mathematical optimization. As computing power and data availability have grown, data-driven OLPS methods, which learn directly from data, have gained attention.

While various data-driven OLPS methods have shown promise, evaluating and comparing them remains challenging. Without standardized datasets and fair comparisons, it’s unclear which method truly performs best in real-world environments, hindering progress in this field.

Researchers often assume that newer methods are superior, but this assumption is unjustified without consistent test conditions. Financial time series are highly non-stationary, so the same method can perform drastically differently on different datasets.

To address this, we introduce FinOL, a new finance benchmark platform designed for data-driven OLPS research. FinOL provides diverse financial datasets and extensive benchmark results for fair comparison.

Notation

Notation: Description

\(b_{t,i}\): Proportion of wealth invested in asset \(i\) at the beginning of the \(t\)-th trading period.

\(\mathbf{b}_t\): \(m\)-dimensional portfolio vector at the beginning of the \(t\)-th trading period.

\(\mathbf{b}_1\): Initial portfolio vector, typically a uniform distribution \((1/m, \ldots, 1/m)\).

\(\mathbf{b}_{t}^{\top} \mathbf{x}_t\): Daily return of the portfolio at the end of the \(t\)-th trading period.

\(m\): Number of assets in the financial market.

\(n\): Number of discrete trading periods.

\(S_n\): Final wealth achieved at the end of the \(n\)-th trading period.

\(S_0\): Initial value of the portfolio, normally set to 1.

\(x_{t,i}\): Price relative of asset \(i\) at the end of the \(t\)-th trading period.

\(\mathbf{x}_t\): \(m\)-dimensional non-negative price relatives vector at the end of the \(t\)-th trading period.

\(\Delta_m\): Simplex constraint indicating the portfolio is fully self-financing without leverage or shorting.

Problem Formulation

In online portfolio selection, we consider a financial market consisting of \(m\) assets observed over a time horizon of \(n\) discrete trading periods (a “period” is defined flexibly and may be, for example, a day or a week). At the end of the \(t\)-th trading period, an \(m\)-dimensional non-negative price relatives vector \(\mathbf{x}_t=(x_{t,1},\ldots,x_{t,m}) \in \mathbb{R}_{+}^{m}\) represents the performance of the \(m\) assets, where each element \(x_{t,i}\) equals the close price of asset \(i\) in the \(t\)-th trading period divided by its close price in the \((t-1)\)-th trading period, i.e., \(x_{t,i}={C_{t,i}}/{C_{t-1,i}}\).
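As an illustration of the definition \(x_{t,i}=C_{t,i}/C_{t-1,i}\), the following minimal sketch turns a table of close prices into price relatives. The function name `price_relatives`, the row layout (row 0 holds the prices just before trading starts), and the sample prices are assumptions made for this example; they are not part of FinOL.

```python
def price_relatives(close):
    """Return x[t][i] = C[t][i] / C[t-1][i] for t = 1, ..., n.

    `close` holds n + 1 rows of close prices for m assets; row 0 is the
    period just before trading starts (an assumed layout for this sketch).
    """
    if len(close) < 2:
        raise ValueError("need at least two rows of close prices")
    relatives = []
    for prev, curr in zip(close, close[1:]):
        if len(prev) != len(curr):
            raise ValueError("every row must cover the same m assets")
        if any(p <= 0 for p in prev):
            raise ValueError("close prices must be strictly positive")
        # Element-wise ratio of this period's close to the previous close.
        relatives.append([c / p for p, c in zip(prev, curr)])
    return relatives


# Made-up close prices for m = 2 assets over n = 2 trading periods.
close = [
    [10.0, 20.0],  # C_0
    [11.0, 19.0],  # C_1
    [12.1, 19.0],  # C_2
]
x = price_relatives(close)  # x[0] is x_1, x[1] is x_2
```

A value above 1 means the asset's price rose over the period, a value below 1 means it fell, and a value of exactly 1 means it was unchanged.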

Before the start of the \(t\)-th trading period, we use past information to construct a portfolio vector \(\mathbf{b}_t=(b_{t,1},\ldots,b_{t,m}) \in \mathbb{R}^m\) that allocates our wealth among the \(m\) assets. Each element \(b_{t,i}\) represents the proportion of wealth invested in asset \(i\) at the beginning of the \(t\)-th trading period. The portfolio must satisfy the simplex constraint \(\mathbf{b}_t \in \Delta_m\), that is, \(b_{t,i} \ge 0\) for all \(i\) and \(\sum\nolimits_{i=1}^{m} b_{t,i}=1\). This constraint indicates that the portfolio is fully self-financing, without leverage or shorting.
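The simplex constraint can be checked numerically. The sketch below is illustrative only: `in_simplex` and its tolerance are hypothetical helpers, and the softmax mapping is one common way to turn unconstrained model scores into a valid portfolio, not necessarily what any particular OLPS method or FinOL itself uses.

```python
import math


def in_simplex(b, tol=1e-9):
    """Check b_i >= 0 for all i and sum_i b_i = 1, up to a tolerance."""
    return all(w >= -tol for w in b) and abs(sum(b) - 1.0) <= tol


def softmax(scores):
    """Map arbitrary real scores to non-negative weights that sum to 1."""
    peak = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


uniform = [1 / 3, 1 / 3, 1 / 3]   # valid: the uniform portfolio for m = 3
leveraged = [1.5, -0.5, 0.0]      # sums to 1 but shorts the second asset
b = softmax([0.2, -1.0, 3.0])     # hypothetical model scores

print(in_simplex(uniform), in_simplex(leveraged), in_simplex(b))
```

Note that softmax always produces strictly positive weights, so it can only approach, never reach, portfolios that put zero weight on an asset.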

At the end of the \(t\)-th trading period, the daily return of the portfolio is therefore \(\mathbf{b}_{t}^{\top} \mathbf{x}_t = \sum\nolimits_{i=1}^{m} b_{t,i}x_{t,i}\). Compounding these returns over all periods gives the final wealth achieved at the end of the \(n\)-th trading period:

\[S_n = S_0 \prod_{t=1}^{n} \mathbf{b}_{t}^{\top} \mathbf{x}_t = S_0 \prod_{t=1}^{n} \sum_{i=1}^{m} b_{t,i} x_{t,i},\]

where, without loss of generality, the initial value of the portfolio is normally set to 1, i.e., \(S_0 = 1\), and the portfolio vector is initialized to the uniform distribution, i.e., \(\mathbf{b}_1 = (1/m, \ldots, 1/m)\).
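The wealth formula can be evaluated directly by compounding the per-period returns \(\mathbf{b}_{t}^{\top} \mathbf{x}_t\). The following is a minimal sketch under stated assumptions: `final_wealth` is a hypothetical helper, the price relatives are made up, and the strategy shown simply holds the uniform portfolio in every period as a stand-in for whatever a data-driven method would output. As in the formula, transaction costs are ignored.

```python
def final_wealth(portfolios, relatives, s0=1.0):
    """Compute S_n = S_0 * prod_t (b_t^T x_t)."""
    if len(portfolios) != len(relatives):
        raise ValueError("need one portfolio vector per trading period")
    wealth = s0
    for b, x in zip(portfolios, relatives):
        # Period return b_t^T x_t = sum_i b_{t,i} * x_{t,i}.
        wealth *= sum(b_i * x_i for b_i, x_i in zip(b, x))
    return wealth


m = 2
# Made-up price relatives for n = 3 periods.
x = [[1.10, 0.95], [0.90, 1.05], [1.20, 1.00]]
# Placeholder strategy: the uniform portfolio (1/m, ..., 1/m) in every period.
b = [[1 / m] * m for _ in x]

s_n = final_wealth(b, x)
print(round(s_n, 4))  # approximately 1.0993
```

Here the three period returns are 1.025, 0.975, and 1.1, so \(S_n = 1.025 \times 0.975 \times 1.1 \approx 1.0993\) with \(S_0 = 1\).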

The goal of this task is to maximize the final portfolio wealth \(S_n\), which depends entirely on the portfolio vectors that the data-driven method generates in each period.