I recently did some statistical testing to see if markets were random (details in the post Another cut at market randomness). It turns out they were, at least for close-to-close returns on SPX/GSPC. My confirmation bias wasn't going to stand for that, so I thought about taking a different look.

Two things interested me. Firstly, I am looking at up vs down (i.e. a higher or lower close than the previous one), rather than trying to predict an exact price. If markets are random, then up and down each have a probability of 0.5 and are independent. A run of 5 consecutive ups (or, equally, 5 consecutive downs) has a probability of 0.5^5 = 0.03125, or roughly 3%. How would that pan out looking at historical data?
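
A minimal sketch of that arithmetic in Python, assuming independent periods with a fixed chance of an up move. The function name and the returned keys are illustrative only, not part of the original analysis.

```python
def prob_same_direction(n, p_up=0.5):
    """Probability that n given consecutive independent periods
    all move the same way, split by direction."""
    all_up = p_up ** n
    all_down = (1 - p_up) ** n
    return {"all_up": all_up, "all_down": all_down, "either": all_up + all_down}


probs = prob_same_direction(5)
print(probs)
```

The "either" figure is the relevant baseline only if a run in either direction is counted for the same 5-period window; for a run in one stated direction the baseline is 0.5^5.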

Secondly, how could one visualise seemingly random data without it ending up looking like noise?

I came up with the following chart:

Each square represents one week and each line represents one year. If the close was higher than the previous week, the square is blue, otherwise it is red. As the count of successive higher or lower weeks rises, the boxes get deeper in colour, up to a maximum of 5. As a side effect of the date calculations and the definition of "week", some years have 53 weeks, which is why some lines are longer than others.

In total there were 138 runs of 5 weeks in the same direction out of 2208 samples, or around 6%, roughly double what we might expect.
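
The streak colouring and the count can be sketched roughly as follows. This is an assumption-laden illustration: the post does not say exactly how a "run of 5" was counted, so the sketch counts every week whose streak has reached the cap of 5. The names `streak_lengths` and `count_capped` are hypothetical, and the sample closes are made up.

```python
def streak_lengths(closes, cap=5):
    """Signed streak length for each close after the first:
    positive for successive higher closes (blue), negative for
    successive lower-or-equal closes (red), clipped to +/-cap."""
    streaks = []
    current = 0
    for prev, cur in zip(closes, closes[1:]):
        if cur > prev:
            current = current + 1 if current > 0 else 1
        else:
            # "otherwise it is red": an unchanged close counts as down
            current = current - 1 if current < 0 else -1
        streaks.append(max(-cap, min(cap, current)))
    return streaks


def count_capped(streaks, cap=5):
    """Number of periods whose streak has reached the cap (assumed counting rule)."""
    return sum(1 for s in streaks if abs(s) == cap)


# Made-up weekly closes, for illustration only
sample = [1, 2, 3, 4, 5, 6, 7, 5, 4]
streaks = streak_lengths(sample)
print(streaks, count_capped(streaks))
```

The absolute value of each streak maps directly to colour depth, and the sign picks blue or red.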

Looking at that, I wondered what it would look like comparing weeks across years, comparing week 1 of year n with week 1 of year n + 1. That led to the second chart:

This time we had 175 runs of length 5 out of 2208, just under 8%, again quite a bit more than the 3% we were expecting.

That is all well and good, but these charts only represent the direction of the week-to-week moves, not their magnitude, which is probably more important. Finally, I took a look at the return over 5 periods.

Again, if the return is positive the square is blue, and if it is negative the square is red. The colours are scaled as a proportion of the largest positive and largest negative returns for blue and red squares respectively. The very pale squares are where the returns were proportionally so close to zero that they would not otherwise be visible, so I set a minimum level to ensure they displayed.
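
A rough sketch of that scaling in Python, under stated assumptions: the return is taken as a simple percentage change over 5 periods, the minimum level of 0.1 is an arbitrary placeholder, and a return of exactly zero is treated as red at the minimum level (the post does not say how zero is handled). Function names and sample data are illustrative only.

```python
def rolling_returns(closes, periods=5):
    """Simple return over the previous `periods` closes."""
    return [closes[i] / closes[i - periods] - 1.0
            for i in range(periods, len(closes))]


def colour_intensities(returns, min_level=0.1):
    """Map each return to (colour, intensity in [min_level, 1]).
    Blue is scaled by the largest positive return, red by the
    largest negative return; min_level keeps tiny values visible."""
    max_pos = max((r for r in returns if r > 0), default=0.0)
    max_neg = min((r for r in returns if r < 0), default=0.0)
    out = []
    for r in returns:
        if r > 0:
            out.append(("blue", max(min_level, r / max_pos)))
        elif r < 0:
            out.append(("red", max(min_level, r / max_neg)))
        else:
            # Assumption: a zero return is drawn as the palest red
            out.append(("red", min_level))
    return out


# Made-up closes, for illustration only
closes = [100, 101, 102, 103, 104, 110, 100.5, 102, 90, 104]
shades = colour_intensities(rolling_returns(closes))
print(shades)
```

Scaling blue and red separately means the deepest blue and the deepest red can represent moves of quite different sizes.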

We can see that positive returns tend to follow positive returns and vice versa, at least for this 5-week look-back period. This is somewhat deceptive, as a negative return, though negative, may still be higher than the previous one implying a loss.

What does all this mean? Not too much in practice, as it is another thing to know in advance whether a series will have consecutive up days or down days. In this case a tradable edge is not so easily won.

However, it does reflect my understanding of how prices move a little better, in that they trend for a while then range for a while and vice versa, and things may not be as random as we might expect. My confirmation bias somewhat sated.

The charts were done in Processing using the free weekly data from Yahoo! Finance for GSPC. If you would like a chart for a given ticker, let me know.

Very cool. In your research so far, it seems like the lower the frequency of the data you use, the less 'random' it appears to be.

Have you applied these tests to minute/hourly/etc. data? I would guess that the higher the frequency, the more random/noisy the data would appear.

What do you think?

Hey thanks, I haven't actually looked at higher frequency data, partly because I don't have it available. I do have some intraday forex data and it might be interesting to see what comes out of that.

I think lower frequency data is clearer about what is really "going on" in some respects. The lower the time frame, the noisier things get, and the more subject they are to distortions like big orders going through or whatever. That is why I tend to focus on daily/weekly timeframes.

Also with a lot of data, charts like the above get very big :) But I think I'll give it a go.

As far as working with Machine Learning, have you found the same to be true---that you tend to get better results working with lower freq. data?

It really depends on what you are doing. I have looked at lower timeframes with FX trading, but not yet utilizing machine learning. The very high frequency stuff like HFT is more like market making which is not really what I am focussed on.
