Background
Early online portfolio selection (OLPS) methods relied on a priori assumptions about market dynamics and on mathematical optimization. As computing power and data availability have grown, data-driven OLPS methods that learn directly from historical data have gained attention.
While various data-driven OLPS methods have shown promise, evaluating and comparing them remains challenging. Without standardized datasets and fair comparisons, it’s unclear which method truly performs best in real-world environments, hindering progress in this field.
Researchers often assume by default that newer methods are superior, but that assumption is unjustified without consistent test conditions. Financial time series are highly non-stationary, so the same method can perform very differently on different datasets.
To address this, we introduce FinOL, a new finance benchmark platform
designed for data-driven OLPS research. FinOL provides diverse financial
datasets and extensive benchmark results for fair comparison.
Notation
| Notation | Description |
|---|---|
| \(b_{t,i}\) | Proportion of wealth invested in asset \(i\) at the beginning of the \(t\)-th trading period. |
| \(\mathbf{b}_t\) | \(m\)-dimensional portfolio vector at the beginning of the \(t\)-th trading period. |
| \({\mathbf{b}}_{1}\) | Initial portfolio vector, typically a uniform distribution \((1/m, \ldots, 1/m)\). |
| \(\mathbf{b}_{t}^{\top} \mathbf{x}_t\) | Daily return of the portfolio at the end of the \(t\)-th trading period. |
| \(m\) | Number of assets in the financial market. |
| \(n\) | Number of discrete trading periods. |
| \(S_n\) | Final wealth achieved at the end of the \(n\)-th trading period. |
| \(S_0\) | Initial value of the portfolio, normally set to 1. |
| \(x_{t,i}\) | Price relative of asset \(i\) at the end of the \(t\)-th trading period. |
| \(\mathbf{x}_t\) | \(m\)-dimensional non-negative price relatives vector at the end of the \(t\)-th trading period. |
| \(\Delta_m\) | Simplex constraint indicating the portfolio is fully self-financing without leverage or shorting. |
Problem Formulation
In the context of online portfolio selection, we analyze a financial market consisting of \(m\) assets observed over a time horizon of \(n\) discrete trading periods (the term “periods” is defined flexibly). At the end of the \(t\)-th trading period, we use an \(m\)-dimensional non-negative price relatives vector \(\mathbf{x}_t=(x_{t,1},\ldots,x_{t,m}) \in \mathbb{R}_{+}^{m}\) to represent the performance of the \(m\) assets, where each element \(x_{t,i}\) equals the close price of asset \(i\) in the \(t\)-th trading period divided by its close price in the \((t-1)\)-th trading period, i.e., \(x_{t,i}={C_{t,i}}/{C_{t-1,i}}\).
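As a minimal sketch of the definition \(x_{t,i}={C_{t,i}}/{C_{t-1,i}}\), the snippet below turns a small table of closing prices into price relatives. The function name `price_relatives`, the variable `close_prices`, and the price values are hypothetical and chosen only for illustration; they are not part of FinOL.

```python
from typing import List, Sequence


def price_relatives(close: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return x[t][i] = C[t][i] / C[t-1][i] for every pair of consecutive rows.

    `close` is assumed to hold n + 1 rows of strictly positive closing
    prices (row 0 is the reference period), one column per asset.
    """
    relatives = []
    for prev_row, curr_row in zip(close, close[1:]):
        relatives.append([curr / prev for prev, curr in zip(prev_row, curr_row)])
    return relatives


# Hypothetical closing prices: 3 rows, m = 2 assets, hence n = 2 trading periods.
close_prices = [
    [10.0, 20.0],
    [11.0, 19.0],
    [12.1, 19.0],
]
x = price_relatives(close_prices)
```

A value above 1 means the asset gained over the period, a value below 1 means it lost, and a value of exactly 1 means its close price was unchanged.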
Before the start of the \(t\)-th trading period, using past historical information, we can construct a portfolio vector \(\mathbf{b}_t=(b_{t,1},\ldots,b_{t,m}) \in \mathbb{R}^m\) to allocate our wealth among the \(m\) assets. Each element \(b_{t,i}\) represents the proportion of wealth invested in asset \(i\) at the beginning of the \(t\)-th trading period. A portfolio must satisfy the simplex constraint \(\mathbf{b}_t \in \Delta_m\), that is, \(b_{t,i} \ge 0\) for every asset \(i\) and \(\sum\nolimits_{i=1}^{m}{{{b}_{t,i}}}=1\). This constraint indicates the portfolio is fully self-financing without leverage or shorting.
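The simplex constraint can be checked numerically. The following is a rough sketch, not FinOL code: the helper name `is_on_simplex` and the tolerance `tol` are assumptions introduced here to absorb floating-point rounding.

```python
import math
from typing import Sequence


def is_on_simplex(b: Sequence[float], tol: float = 1e-9) -> bool:
    """Check that all weights are non-negative and sum to 1 (within `tol`)."""
    non_negative = all(weight >= -tol for weight in b)   # no short positions
    fully_invested = math.isclose(sum(b), 1.0, abs_tol=tol)  # no leverage, no idle cash
    return non_negative and fully_invested


valid = is_on_simplex([0.5, 0.3, 0.2])       # weights are non-negative and sum to 1
shorting = is_on_simplex([1.2, -0.2])        # sums to 1 but holds a short position
leveraged = is_on_simplex([0.6, 0.6])        # non-negative but sums to 1.2
```

Here `valid` is `True`, while `shorting` and `leveraged` are both `False`, matching the two ways a vector can leave \(\Delta_m\).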
Therefore, at the end of the \(t\)-th trading period, the daily return of the portfolio (a gross return, i.e., the factor by which wealth is multiplied) is defined as \(\mathbf{b}_{t}^{\top} \mathbf{x}_t = \sum\nolimits_{i=1}^{m} b_{t,i}x_{t,i}\). Based on this, the final wealth achieved at the end of the \(n\)-th trading period is:
\[{S}_{n} = {S}_{0} \prod_{t=1}^{n} \mathbf{b}_{t}^{\top} \mathbf{x}_t = {S}_{0} \prod_{t=1}^{n} \sum\nolimits_{i=1}^{m} b_{t,i}x_{t,i},\]
where, without loss of generality, the initial value of the portfolio is normally set to 1, i.e., \({S}_{0} = 1\); at the same time, the portfolio vector is initialized to a uniform distribution, i.e., \({\mathbf{b}}_{1} = (1/m, \ldots, 1/m)\).
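To make the wealth computation concrete, the sketch below compounds the per-period returns \(\mathbf{b}_{t}^{\top} \mathbf{x}_t\) starting from \({S}_{0} = 1\). The function name `final_wealth` and the price relatives are hypothetical, and holding the uniform portfolio in every period is an assumption made only to keep the example short; a data-driven method would produce a different \(\mathbf{b}_t\) each period.

```python
from typing import Sequence


def final_wealth(
    portfolios: Sequence[Sequence[float]],
    relatives: Sequence[Sequence[float]],
    s0: float = 1.0,
) -> float:
    """Compound the per-period portfolio returns b_t . x_t, starting from s0."""
    wealth = s0
    for b_t, x_t in zip(portfolios, relatives):
        period_return = sum(b * x for b, x in zip(b_t, x_t))  # b_t^T x_t
        wealth *= period_return
    return wealth


# Hypothetical price relatives for m = 2 assets over n = 2 periods.
relatives = [
    [1.10, 0.95],
    [1.00, 1.20],
]
m = 2
uniform = [1.0 / m] * m                      # b_1 = (1/m, ..., 1/m)
portfolios = [uniform for _ in relatives]    # assumption: rebalance to uniform each period
s_n = final_wealth(portfolios, relatives)
```

With these numbers the two period returns are 1.025 and 1.1, so `s_n` is approximately 1.1275.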
The goal of this task is to maximize the final portfolio wealth \(S_n\), which, for a given sequence of price relatives, depends entirely on the portfolio vector generated by the data-driven method in each period.