Accidental RAP

I have this pipeline of data - I do a workout, record bits as I go along, log it on Fitocracy…

Fito haven’t added any features in years; in fact, they’re barely keeping the server running at this point.

So I grabbed this script that asks the server for all my workout data, which arrives as a giant blob of JSON.

The data on my machine doesn’t need to be the most timely, so I have cron run that script once a day. [1]

There are interviews with the founders of Fitocracy that showed that they distrusted libraries and wanted to code everything from scratch. This really shows in the JSON that the server spits out. I have a little script that converts that JSON into a CSV.

Honestly, the heavy lifting is done by the jsonlite package. The rest is a bit of purrr going up and down the list of lists of lists and keeping only the useful stuff.
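A minimal sketch of that kind of conversion, not the actual script. The payload shape (workouts containing actions containing sets) and the field names `date`, `actions`, `sets`, `kg` and `reps` are assumptions made up for illustration; the real Fitocracy JSON is messier.

```r
library(jsonlite)
library(purrr)

# Hypothetical, heavily simplified payload: workouts -> actions -> sets.
raw <- '[
  {"date": "2020-07-01",
   "actions": [
     {"name": "Barbell Squat",
      "sets": [{"kg": 100, "reps": 5}, {"kg": 110, "reps": 3}]},
     {"name": "Barbell Deadlift",
      "sets": [{"kg": 150, "reps": 5}]}
   ]}
]'

# simplifyVector = FALSE keeps the nested list-of-lists shape as-is.
workouts <- fromJSON(raw, simplifyVector = FALSE)

# For one workout, build one row per set, keeping only the useful fields.
set_rows <- function(w) {
  per_action <- map(w$actions, function(a) {
    data.frame(
      date = w$date,
      name = a$name,
      kg   = map_dbl(a$sets, "kg"),
      reps = map_dbl(a$sets, "reps")
    )
  })
  do.call(rbind, per_action)
}

# Stack every workout into one flat data frame and write it out as CSV.
flat <- do.call(rbind, map(workouts, set_rows))
csv_path <- file.path(tempdir(), "workouts.csv")
write.csv(flat, csv_path, row.names = FALSE)
```

Here `fromJSON()` does the parsing and `map()` / `map_dbl()` do the walking up and down the nesting; everything else is base R.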

Now I have a (massive) CSV that’s accurate up to midnight last night. The dashboard below sort of grew out of that.

It wasn’t until someone gave a presentation on Reproducible Analytical Pipelines (RAP) that I realised I’d built one just trying to motivate myself to keep lifting heavy stuff.

Dates by PR

name                                    kg    reps
Barbell Bench Press                     110   1
Barbell Deadlift                        190   1
Barbell Squat                           180   1
Standing Barbell Shoulder Press (OHP)   70    1
[Plot: kg (0 to 200) by date (Jul 2019 to Jul 2020), one series per action: Barbell Bench Press, Barbell Deadlift, Barbell Squat, Close-Grip Barbell Bench Press, Front Barbell Squat, Overhead Barbell Squat, Paused Barbell Squat, Standing Barbell Shoulder Press (OHP)]

Volume for the year

Workout days for this year

The programs I’ve been working with while running these scripts have focused on pressing things overhead, pressing on a bench, squatting and deadlifting, which is why those lifts get more attention.

I came across Stronger By Science’s Art & Science of Lifting, which confirmed that (as long as the weight isn’t too light) the better workout is generally the one with more volume: the repetitions (reps) times the weight. So I started graphing volume.
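As a worked example of that definition, here is a small sketch in base R. The data frame and its column names (`date`, `name`, `kg`, `reps`) are made up for illustration, with one row per set.

```r
# Illustrative data only: one row per set.
sets <- data.frame(
  date = as.Date(c("2020-07-01", "2020-07-01", "2020-07-03")),
  name = c("Barbell Squat", "Barbell Squat", "Barbell Deadlift"),
  kg   = c(100, 110, 150),
  reps = c(5, 3, 5)
)

# Volume of a single set is weight times reps.
sets$volume <- sets$kg * sets$reps

# Total volume per lift per day, which is the number worth graphing.
daily <- aggregate(volume ~ date + name, data = sets, FUN = sum)
```

The two squat sets on 1 July come to 100 × 5 + 110 × 3 = 830 kg of volume for that day.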

The heatmaps of volume for those lifts are kinda inspired by GitHub’s activity graphs.
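One way to get that GitHub-style layout is a grid of tiles with weeks across and weekdays down, filled by volume. This is a sketch with ggplot2 under that assumption, not the dashboard’s actual code; the data is invented (5,000 kg every Monday, Wednesday and Friday) and the colours are arbitrary.

```r
library(ggplot2)

# Illustrative data only: twelve full weeks, one row per day.
days <- data.frame(
  date = seq(as.Date("2020-01-06"), as.Date("2020-03-29"), by = "day")
)
days$weekday <- as.integer(format(days$date, "%u"))  # 1 = Monday ... 7 = Sunday
days$week    <- as.integer(format(days$date, "%V"))  # ISO week number
days$volume  <- ifelse(days$weekday %in% c(1, 3, 5), 5000, 0)

# One tile per day: weeks across, weekdays down, darker tile = more volume.
p <- ggplot(days, aes(x = week, y = weekday, fill = volume)) +
  geom_tile(colour = "white") +
  scale_y_reverse(
    breaks = 1:7,
    labels = c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
  ) +
  scale_fill_gradient(low = "grey90", high = "darkgreen") +
  coord_fixed() +
  labs(x = "ISO week", y = NULL, fill = "Volume (kg)")
```

Using the week number as the x-axis only works within a single calendar year, which is fine for a “this year” view but would need rethinking for anything longer.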

I installed the ggalt package just so I could draw dumbbell graphs.

Further work

The dashboard gets tweaked fairly often: a table gets added, a graph removed or a colour scheme changed.

Blogging about how I accidentally built a RAP has been on the TODO list for a while…

I want my nightly cron script to pin the latest data with the pins package, which I’ll do by tweaking the JSON-to-CSV part of the pipeline.
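A rough sketch of what that pinning step might look like, since it hasn’t been built yet. It assumes a recent version of pins (1.0 or later), a folder-backed board in a temporary directory, and a pin called `workouts`; the board location, pin name and stand-in data frame are all placeholders rather than anything from the real pipeline.

```r
library(pins)

# Stand-in for the data frame produced by the JSON-to-CSV step.
workouts <- data.frame(
  date = c("2020-07-01", "2020-07-01"),
  name = c("Barbell Squat", "Barbell Deadlift"),
  kg   = c(110, 150),
  reps = c(3, 5)
)

# Assumption: a board backed by a plain folder. A real setup would point
# at a permanent path (or a different kind of board) instead of tempdir().
board <- board_folder(file.path(tempdir(), "workout-pins"))

# Write the latest data as a CSV pin, then read it back as the dashboard would.
pin_write(board, workouts, name = "workouts", type = "csv")
latest <- pin_read(board, "workouts")
```

The dashboard would then call `pin_read()` instead of reading the CSV file directly.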


  1. For one user running this script once a day, I don’t mind that it’s inefficient and takes 10 minutes. It’s far more important to me that it works.