Online portfolio allocation with a very simple algorithm

By Yuri Fonseca


Today we will use an online convex optimization technique to build a very simple algorithm for portfolio allocation. Of course this is just an illustrative post and we are going to make some simplifying assumptions. The objective is to point out an interesting direction to approach the problem of portfolio allocation.

The algorithm used here is the Online Gradient Descendent (OGD) and we are going to compare the performance of the portfolio with the Uniform Constant Rebalanced Portfolio and the Dow Jones Industrial Average index. You can skip directly to Implementation and Example if you already know what an online algorithm is.

For those who don’t know what Online Convex Optimization is…

From now on, we will say that p represents a point in dimension N, where N is the number of possible stocks to invest. Each of it’s coordinates represent the amount invested in the respective stock. For example, suppose that N = 3 and p_t = [0.25, 0.25, 0.5]^\prime. Then, our portfolio at time $latex  t$ are composed of 25\% from stock one, 25\% from stock two, and 50\% from stock three.

Online Convex Optimization is based on the idea of minimizing the so-called Regret function, which is basically an idea of how far from a good player you are (some kind of optimal player that you want to be). In the set-up of portfolio management, the Regret function is defined as:

\displaystyle Regret=\sum_{i=1}^tlog(p^*r_i)-\sum_{i=1}^tlog(p_ir_i)

The first part on the right hand side is the cumulative log return of the portfolio p^*, and the second part, is the cumulative log return of our portfolio until time t, which we have played strategy p_i,~~~i=1,2,\dots,t. What is interesting is that p^* is a very special case of portfolio. It represents the Best Constant Rebalanced Portfolio (CRP)…it means that if someone could see the future and knew upfront all the returns of all the stocks and had to play a fixed portfolio proportion at time zero, p^* would be his choice. In the worst case scenario, the best CRP is equal to the best stock in the period, and the best CRP is the portfolio that we want to track.

The OGD algorithm is quite simple, we are going to update our position every week based on the gradient of the Regret function in the currently strategy. Here it follows:

  1. First play p_1 = \frac{1}{n} \textbf{1}
  2. For t > 1,
  3. b_{t-1} = \sum_{\tau = 1}^{t-1}\nabla(\textit{Regret})
  4. \tilde{p}_t = p_{t-1} - \eta b_{t-1}
  5. p_t = \prod_{S_n}(\tilde{p}_t)
  6. \prod_{S_n}(q) = arg\min_{x \in S_n} (x - q)^\prime(x-q)

This algorithm is proved to have Regret of order O(GD\sqrt{T}) if \eta = \frac{G}{D\sqrt{T}}. There is another similar algorithm which takes second order information and is quite similar to an online version of the Newton’s method. The name, not surprisingly is Online Newton Step. See Agrawal et al (2006) for further details.

Implementation and Example

In our set-up the set of possible portfolios will not allow us to borrow money or short selling stocks, so every position should be bigger then zero and the sum of all positions should be one. This set is called simplex and define polytope. Basically it’s a triangle in higher dimension.


symbols = c('MMM', 'AXP', 'AAPL', 'BA', 'CAT', 'CVX', 'CSCO', 'KO', 'DIS', 'DD', 'XOM', 'GE',
  'GS', 'HD', 'IBM', 'INTC', 'JNJ', 'JPM', 'MCD', 'MRK', 'MSFT', 'NKE', 'PFE', 'PG',
  'TRV', 'UTX', 'UNH', 'VZ', 'V', 'WMT')

for (i in 1:length(symbols)) getSymbols(symbols[i], from = '2007-01-03', to = '2017-06-21')

# Building weekly returns for each of the stocks
data = sapply(symbols, function(x) Ad(to.weekly(get(x))))
data = Reduce(cbind, data)
data_returns = apply(data, 2, function(x) diff(log(x))) #log returns
colnames(data_returns)= symbols
data_returns[] = 0 # VISA hasnt negotiations between 2007 and 2008

The OGD algorithm is implemented next. It uses the quadprogXT package to solve the projection onto the viable set. We are going to test two different values for \eta to check the algorithm sensibility.


OGD = function(base, eta) {

  # Gradient of Regret Function
  gradient = function(b, p, r) b + r/(p%*%r)

  # Projection onto viable Set
  proj = function(p) {

    Dmat = diag(length(p))
    Amat = cbind(diag(rep(1, length(p))), -1)
    bvec = c(rep(0, length(p)), -1)

    fit = solveQPXT(Dmat = Dmat, dvec = p, Amat = Amat, bvec = bvec)


  T = nrow(base)
  N = ncol(base)

  r = as.matrix(base) + 1 # this is because the algo doesnt work directly with log returns
  p = matrix(0, nrow = N, ncol = T); p[,1] = 1/N # initial portfolio
  b = matrix(0, nrow = N, ncol = T); b[,1] = 0

  for (i in 2:T) {
    b[,i] = gradient(b[,i-1], p[,i-1], r[i-1,]) # calculating gradient
    p.aux = p[,i-1] + eta*b[,i] # what we would like to play
    p[,i] = proj(p.aux) # projection in the viable set

  return(list('portfolio' = p,'gradient' = b))

# testing two etas
portfolio1 = OGD(base = data_returns, eta = 1/100)
portfolio2 = OGD(base = data_returns, eta = 1/1000)

Lets build a small function to calculate the compound return of portfolios taking their log return as input.

compound_return = function(portfolio, r) {

  return_OGD = c(); return_OGD[1] = portfolio$portfolio[,1]%*%r[1,]
  portfolio_value = c(); portfolio_value[1] = 1 + portfolio$portfolio[,1]%*%r[1,]

  for (i in 2:nrow(r)) {
    return_OGD[i] = portfolio$portfolio[,i]%*%r[i,]
    portfolio_value[i] = portfolio_value[i-1]*(1 + return_OGD[i])

 return(list('value' = portfolio_value, 'return' = return_OGD))

Now let’s check the performance of the OGD algorithm with both 0.01 and 0.001 values for \eta. We are going to compare the cumulative return first with the Uniform Constant Rebalanced Portfolio and the DJIA index. Them we will check the compound returns of DJIA stocks individually to see if the algorithm tracked the best stocks like we want them to do.

# Dow Jones
getSymbols('^DJI', src = 'yahoo', from = '2007-01-03', to = '2017-06-21')
## [1] "DJI"
DJIA_returns = as.numeric(cumprod(weeklyReturn(DJI) + 1))

# Individual stocks
stocks_returns = apply(data_returns + 1, 2, cumprod)

# Our portfolios
portfolio_value1 = compound_return(portfolio1, data_returns)
portfolio_value2 = compound_return(portfolio2, data_returns)

In the figures below we can see the performance of the strategies that were implemented. Both portfolios had a surprisingly perform, much better than DJIA and UCRP portfolio.

plot of chunk graphicsplot of chunk graphics

Indeed, if \eta is relatively big, the algorithm quickly found Apple as the best stock, and start pursuing it. If \eta is relative small, them the algorithm takes more time to adapt, but this makes the portfolio more stable.

Portfolio variation eta = 0.01

Portfolio variation eta = 0.001

To finish the comparison between our portfolios and DJIA, lets do a very simple risk analysis with just empirical quantiles using historical results. For a proper risk analysis we should model the conditional volatility of the portfolio, but this will be discussed in another post.

risk_analysis = function(return) {

  report = matrix(0, ncol = 2, nrow = 2)

  report[1,] = quantile(return, probs = c(0.01, 0.05))
  report[2,] = c(mean(return[which(return <= report[1,1])]), mean(return[which(return <= report[1,2])]))

  rownames(report) = c('VaR', 'CVaR')
  colnames(report) = c('1%', '5%')

  return(round(report, 3))

report1 = risk_analysis(portfolio_value1$return)
report2 = risk_analysis(portfolio_value2$return)
report_DJIA = risk_analysis(weeklyReturn(DJI))
Portfolio1 eta = 0.01
1% 5%
VaR -0.091 -0.047
CVaR -0.119 -0.074
Portfolio2 eta = 0.001
1% 5%
VaR -0.057 -0.039
CVaR -0.085 -0.055
DJIA index
1% 5%
VaR -0.060 -0.040
CVaR -0.084 -0.055

It’s quite evident that setting a big \eta, like we did in portfolio1, may have caused problems with VaR and Conditional VaR since Apple is a volatile stock. For portfolio2, where \eta is small the portfolio takes time to change, the historical VaR and CVaR are very similar to what we already observed in DJIA, but the returns were much better.


Agarwal, Amit, et al. “Algorithms for portfolio management based on the newton method.” Proceedings of the 23rd international conference on Machine learning. ACM, 2006.

This entry was posted in R and tagged , , , , . Bookmark the permalink.

10 Responses to Online portfolio allocation with a very simple algorithm

  1. Pingback: Online portfolio allocation with a very simple algorithm – Mubashir Qasim

  2. Thank you for this interesting post!

    In the beginning you write:
    “What is interesting is that p^* is a very special case of portfolio. It represents the Best Constant Rebalanced Portfolio (CRP)…it means that if someone could see the future and knew upfront all the returns of all the stocks and had to play a fixed portfolio proportion at time zero, p^* would be his choice.”

    This sounds as if something forward looking/information from the future were used. I assume this is not the case. Is p^* in each step calculated from the most recent past? Or where do we get our target from? If it is the most recent past (week) then we do some kind of momentum investing, right?

    Thank you and kind regards, Richard


    • Yuri Resende says:

      Hello Richard! Thanks for the comment.

      You are correct, to play the strategy p^* in t = 0, one would have to know the future beforehand, and only our ‘adversary’ has this hability. We can only use information t = 0, … , i-1 to play at time = i.

      What is interesting about the algorithm is that, in fact, we dont need to use any information at all about p^*. Once it is a constant strategy, when we calculated the gradient at step three of the algorithm, this term vanishes.

      The OGD algorithm only uses information about the past to define the strategy to pursuit p^*, even that we dont know which strategy is.

      The algorithm’s guarantee comes from math analysis, you can check this link to see the mathematical proof.

      Click to access thesis.pdf


  3. Lucas Paiva de Carvalho says:

    Congrats, Yuri
    Really interesting post! Leave you the Donald Award!!


    • Lucas Paiva de Carvalho says:

      Can I apply this on a portfolio with different class of assets? I would like to settle the optmal diversification of a client, being my choices between stocks, fixed income, private equity and alternatives. The returns would be the appropriate benchmark. Is it possible? Cheers!


      • insightr says:

        Sure you can! But if you use different class of assets, you will need to adapt the projection onto the viable set in a smart way to guarantee the proportion invested in each class.

        Thanks for the Donald Award, but i dont think i deserve it yet! Cheers!


  4. says:

    Dear Yuri,

    Thanks for the interesting post.

    It would be interesting to see the forecast part on out of sample data and it`s comparison with model. Any ideas?

    As I understand, here we get a picture of what rebalancing should be, given all known prices. So, this allocation praises momentum effect (that has it`s theortical basis).



  5. Pingback: Parametric Portfolio Policies | insightR

  6. Pingback: Parametric Portfolio Policies – Cloud Data Architect

  7. vic says:

    may I know the full code of the visualization part


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s