Crypto quant trading: Intro

I’m going to write a few posts on quant trading. Specifically trading crypto, since that’s what I know best. Here are a few reasons why I’m doing this:

  • I think I can benefit a lot from writing about my approach and methodology. Hopefully this will make the ideas and assumptions more clear.

  • I’d love to get input from other people in the community on their approaches to model building, data analysis, time series analysis, and trading.

  • There’s been a lot of great content on this website, and I’d love to contribute. This is the topic I currently know best, so I might as well write about it.

  • My company (Temple Capital) is also looking to hire quants, and we believe the rationalist way of thinking is very conducive to successful quant trading.

My goal here isn’t to make you think that “Oh gosh, I can become a millionaire by trading crypto!” or “Here’s the strategy that nobody else has found!” Instead, I want to give you a taste of what quant trading looks like, and what thinking like a quant feels like. EAs have been talking about earning to give for a while, and it’s well known that quant trading is a very lucrative career. I’ve known about it for a while, and several of my friends have done quant trading (e.g. at Jane Street) or worked at a hedge fund. But I never thought that it was something I could do or would find enjoyable. Turns out that I can! And it is!

I’m going to be sharing the code and sometimes the step by step thinking process. If you’re interested in learning this on a deeper level, definitely download the code and play with the data yourself. I’ve been doing this for just over a year, so in many ways I’m a novice myself. But the general approach I’ll be sharing has yielded good results, and it’s consistent with what other traders / hedge funds are doing.

Setup

Note: I actually haven’t gone through these install steps on a clean machine. I think they’re mostly sufficient. If you run into any issues, please post in the comments.

  1. Make sure you have Python 3.6+ and pip

  2. `pip install pandas numpy scipy matplotlib ipython jupyter`

  3. `git clone https://github.com/STOpandthink/temple-capital.git`

  4. `cd temple-capital`

  5. `jupyter notebook`

  6. Open `blog1_simple_prediction_daily.ipynb`

If you’re not familiar with the tools we’re using here, then the next section is for you.

Python, Pandas, Matplotlib, and Jupyter

We’re going to be writing Python code. Python has a lot of really good libraries for doing numerical computation and statistics. If you don’t know Python, but you know other programming languages, you can still probably follow along.

Pandas is an amazing, wonderful library for manipulating tabular data and time series. (It can do a lot more, but that’s primarily what we’re using it for.) We’re going to be using this library a lot, so if you’re interested in following along, I’d recommend spending at least 10 minutes learning the basics.

Matplotlib is a Python library for plotting and graphing. Sometimes it’s much easier to understand what’s going on with a strategy when you can see it visually.

Jupyter notebooks are useful for organizing and running snippets of code. They’re well integrated with Matplotlib, allowing us to show the graphs right next to the code, and they’re good at displaying Pandas dataframes too. Overall, they’re perfect for quick prototyping.

There are a few things you should be aware of with Jupyter notebooks:

  1. Just like running Python in an interactive shell, the state persists across all cells. So if you set the variable `x` in one cell, after you run it, it’ll be accessible in all other cells.

  2. If you change any of the code outside of the notebook (like in `notebook_utils.py`), you have to restart the kernel and recompute all the cells. A neat trick to avoid doing this is:
    `import importlib`
    `importlib.reload(notebook_utils)`
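
As a side note, IPython also ships an autoreload extension that can do this automatically. It’s optional and not part of the repo’s setup, but run in a notebook cell it looks like:

```python
# Optional alternative: IPython's autoreload extension re-imports edited
# modules automatically, so you don't have to call importlib.reload() by hand.
%load_ext autoreload
%autoreload 2
```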

Our first notebook

We’re not going to do anything fancy in the first notebook. I simply want to go over the data, how we’re simulating a trading strategy, and how we analyze its performance. This is a simplified version of the framework you might use to quickly backtest a strategy.

Cell 1

The first cell loads daily Bitcoin data from Bitmex. Each row is a “daily bar.” Each bar has the `open_date` (beginning of the day) and `close_date` (end of the day). The dataframe index is the same as the `open_date`. We have the `high`, `low`, and `close` prices. These are, respectively, the highest price traded in that bar, the lowest, and the last. In stock market data you usually have the open price as well, but since the crypto market is active 24/7, the open price is basically just the close price of the previous bar. `volume_usd` shows how much USD has been transacted. `num_trades_in_bar` is how many trades happened. This is the raw data we have to work with.
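
I won’t paste the notebook’s loading code here, but conceptually it looks something like this (the file name below is a made-up placeholder; the actual notebook loads its own copy of the Bitmex data):

```python
import pandas as pd

# Hypothetical file name and loading code; treat this only as a sketch
# of the shape of the data described above.
df = pd.read_csv("bitmex_btc_daily.csv", parse_dates=["open_date", "close_date"])
df = df.set_index("open_date", drop=False)  # the index matches open_date

print(df[["close_date", "high", "low", "close",
          "volume_usd", "num_trades_in_bar"]].head())
```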

From that raw data we compute a few useful variables that we’ll need for basically any strategy: `pct_change` and `price_change`. `pct_change` is the percent change in price between the previous bar and this bar (e.g. 0.05 for +5%). `price_change` is the multiplicative factor (i.e. `price_change = 1 + pct_change`), such that `new_price = old_price * price_change`; additionally, if we had a long position, our portfolio would change as `new_portfolio_usd = old_portfolio_usd * price_change`.
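
For reference, a minimal sketch of how these two columns can be derived with Pandas (the notebook may compute them slightly differently):

```python
# pct_change: percent change from the previous bar's close to this bar's close
# (e.g. 0.05 for +5%); the first row has no previous bar, so it's NaN there.
df["pct_change"] = df["close"].pct_change()

# price_change: the multiplicative factor, so new_price = old_price * price_change.
df["price_change"] = 1 + df["pct_change"]
```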

A few terms you might not be familiar with:

  • We take a long position when we want to profit from the price of an asset going up. So, generally, if the asset price goes up 5%, we make 5% on the money we invested.

  • We take a short position when we want to profit from the price of an asset going down. So, generally, if the asset price goes down 5%, we make 5% on the money we invested.
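
To make the long/short arithmetic concrete, here’s a tiny sketch (not from the repo) of the simplified model we’ll be simulating with:

```python
def update_portfolio(portfolio_usd, pct_change, position):
    """position: +1 for long, -1 for short, 0 for out of the market.

    Ignores fees and the actual mechanics of borrowing/shorting; it's just
    the simplified model used throughout these simulations.
    """
    return portfolio_usd * (1 + position * pct_change)

# The price moves +5%: a long gains 5%, a short loses 5%.
print(update_portfolio(1000.0, 0.05, position=+1))  # 1050.0
print(update_portfolio(1000.0, 0.05, position=-1))  # 950.0
```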

Cell 2

Cell 3

Here we see that indeed BTC recently crossed its 200-day SMA (Simple Moving Average). One neat thing that I hadn’t realized myself is that the SMA looks like it has done a decent job of acting as support/resistance historically.
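
Cell 3 has its own plotting code; for reference, computing and plotting a 200-day SMA with Pandas and Matplotlib looks roughly like this (assuming the `df` from Cell 1):

```python
import matplotlib.pyplot as plt

df["sma_200"] = df["close"].rolling(200).mean()  # 200-day Simple Moving Average

df[["close", "sma_200"]].plot(figsize=(12, 6))
plt.title("BTC close vs. 200-day SMA")
plt.show()
```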

Cell 4

Cell 5

Here we simulate a perfect strategy: it knows the future!

One thing to note is that the returns are not as smooth / linear as one might expect. That makes sense, since each daily bar has a different `pct_change`. Some days the price doesn’t move very much, so even if we guess the direction perfectly, we won’t make that much money. But it’s also interesting to note that there are whole periods where the bars are smaller / bigger than average. For example, even with perfect guessing, we don’t make that much money in October of 2018.
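
A “perfect” signal necessarily peeks at the future. A minimal sketch of the idea (not necessarily the notebook’s exact code) is:

```python
import numpy as np

# Deliberately cheating: the signal at bar N is the sign of bar N+1's move,
# so after the shift(1) alignment (see the "Future information" section below)
# it always trades the next bar in the right direction. Useful only as an
# upper bound on performance, never as a real strategy.
df["strat_signal"] = np.sign(df["pct_change"].shift(-1)).fillna(0)
```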

Cell 6

Here we simulate what would have happened if we had bought and held at the beginning of 2017 (first graph) versus shorted instead.
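
In this framework, buy-and-hold and a permanent short are just constant signals. Roughly (again assuming the columns from Cell 1, and the simplified short model from above):

```python
# Buy-and-hold is a constant +1 signal; a permanent short is a constant -1.
subset = df[df.index >= "2017-01-01"]

long_equity = (1 + subset["pct_change"].fillna(0)).cumprod()   # buy and hold
short_equity = (1 - subset["pct_change"].fillna(0)).cumprod()  # constantly short

print(long_equity.iloc[-1], short_equity.iloc[-1])  # final multiplicative returns
```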

Quick explanation of the computed statistics:

  • Returns: multiplicative factor on our returns (e.g. 5.2 means a 420% gain, or turning $1 into $5.20).

  • Returns after fees: multiplicative factor on our returns, after accounting for the fees that we would have paid for each transaction. (On Bitmex, each time you enter/leave a position you pay 0.075% in fees, assuming you’re placing a market order.)

  • SR: Sharpe Ratio. It’s a very common metric used to measure the performance of a strategy. “Usually, any Sharpe ratio greater than 1 is considered acceptable to good by investors. A ratio higher than 2 is rated as very good, and a ratio of 3 or higher is considered excellent.” (Source) (See the sketch right after this list for roughly how these statistics can be computed.)

  • % bars right: what percent of days we guessed correctly.

  • % bars in the market: what percent of days we were trading (rather than being out of the market). (Note that it’s expressed as a fraction here, so 1.0 = 100%.)

  • Bars count: number of days simulated.
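
Here’s a rough sketch of how these statistics can be computed. The exact conventions are my own assumptions (e.g. annualizing the Sharpe Ratio with √365 because crypto trades every day, and charging the fee whenever the position changes), so expect the repo’s numbers to differ slightly:

```python
import numpy as np

def summarize(df, fee=0.00075):
    """Rough summary stats for a df with strat_signal and pct_change columns."""
    position = df["strat_signal"].shift(1).fillna(0)  # the signal trades the *next* bar
    strat_pct = position * df["pct_change"]

    # Approximate fees: each unit of position change (enter, exit, or flip) costs `fee`.
    fees_paid = position.diff().abs().fillna(0) * fee
    strat_pct_after_fees = strat_pct - fees_paid

    in_market = position != 0
    guessed_right = strat_pct > 0  # counted over in-market bars (an assumption)

    return {
        "Returns": (1 + strat_pct).prod(),
        "Returns after fees": (1 + strat_pct_after_fees).prod(),
        "SR": strat_pct.mean() / strat_pct.std() * np.sqrt(365),
        "% bars right": guessed_right[in_market].mean(),
        "% bars in the market": in_market.mean(),
        "Bars count": len(df),
    }
```

Once `strat_signal` is filled in, calling `summarize(df)` gives numbers in the same spirit as the notebook’s output.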

Cell 7

There are more graphs in the notebook, but you get the idea.

I’m not going to discuss this particular strategy here. I just wanted to show something more interesting than constantly holding the same position.

Future information

One of the insidious bugs you can run into while working with time series is using future information. This happens when you make a trading decision using information you wouldn’t have access to if you were trading live. One of the easiest ways to avoid it is to do all the computation in a loop, where on each iteration you’re given the data you have up until that point in time, and you have to compute the trading signal from that data. That way you simply don’t have access to future data. Unfortunately, this method is pretty slow when you start working with more data or if there’s a lot of computation that needs to be done for each bar.
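
A sketch of that loop-based approach (the `compute_signal` here is a hypothetical example, not something from the repo; note how it redoes work on every iteration, which is where the slowness comes from):

```python
import pandas as pd

def compute_signal(visible_df):
    """Hypothetical example signal: long above the 200-day SMA, short below."""
    if len(visible_df) < 200:
        return 0  # not enough history yet; stay out of the market
    sma_200 = visible_df["close"].iloc[-200:].mean()
    return 1 if visible_df["close"].iloc[-1] > sma_200 else -1

signals = []
for i in range(len(df)):
    visible = df.iloc[: i + 1]               # only the data available at bar i
    signals.append(compute_signal(visible))  # impossible to peek at future rows

df["strat_signal"] = pd.Series(signals, index=df.index)
```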

For this reason, we’ve structured our code in a way where, to compute the signal for row N, you can use any information up to and including row N. The computed `strat_signal` will be used to trade the next day’s bar (N+1). (You can see the logic for this in `add_performance_columns()`: `df['strat_pct_change'] = df['strat_signal'].shift(1) * df['pct_change']`.) This way, as long as you’re using standard Pandas functions and not using `shift(-number)`, you’ll likely be fine.
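
Spelling that out as a sketch (the real `add_performance_columns()` does more than this, and `strat_equity` is just a name I made up):

```python
# The signal computed at bar N is applied to bar N+1's move, so no future
# information leaks into the simulated returns.
df["strat_pct_change"] = df["strat_signal"].shift(1) * df["pct_change"]
df["strat_equity"] = (1 + df["strat_pct_change"].fillna(0)).cumprod()

# By contrast, shift(-1) pulls future rows backwards in time. Outside of a
# deliberately "perfect" strategy, seeing it in signal code is usually a
# lookahead bug.
```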

That’s it for now!

Potential future topics:

  • What overfitting is and how it impacts strategy research

  • Filters (market regimes, entry/exit conditions)

  • Common strategies (e.g. moving average crossover)

  • Common indicators

  • Using simple ML (e.g. Naive Bayes)

  • Support / resistance

  • Autocorrelation

  • Multi-coin analysis

Questions for the community:

  • Do you feel like you understand what’s going on so far, or should I move slower / zoom in on one of the prerequisites?

  • What topics would you like me to explore?

  • What strategies are you interested in trying?