3 minute read

A shoutout to Matthew Analytics (via R-Bloggers) for linking to the Open Powerlifting Data.

I’ve started looking at distributions of the lifts (squat, bench, deadlift) that have been performed at meets, and already submitted a bug report on the data. (They were slightly overzealous on matching records with the same name and ended up with strong babies.)

I’ve also been looking for a reason to use ggridges so there’s this, below:

powerlifting_data <- read_csv(here("static", "data", "Open Powerlifting", "openpowerlifting-2019-06-19.csv"))
powerlifting_data %>%
  filter(Sex=="M", !is.na(AgeClass)) %>%
  mutate(AgeClass = replace(AgeClass, AgeClass=="5-12", "05-12")) %>% #Cheap way to make them line up in order
  select(Sex, AgeClass,
         Bench    = Best3BenchKg,
         Squat    = Best3SquatKg,
         Deadlift = Best3DeadliftKg) %>%
  gather(lift, kg, Bench, Squat, Deadlift) %>% # Make long or 'tidy'
  filter(!is.na(kg)) %>%
  filter(kg>0) %>% # Failed lifts are logged as negative
  ggplot(aes(x=kg, y=AgeClass)) + geom_density_ridges2(aes(rel_min_height=0.005)) + facet_wrap(~lift) + theme_ridges()


powerlifting_data %>%
  filter(Sex=="F", !is.na(AgeClass)) %>%
  mutate(AgeClass = replace(AgeClass, AgeClass=="5-12", "05-12")) %>% #Cheap way to make them line up in order
  select(Sex, AgeClass,
         Bench    = Best3BenchKg,
         Squat    = Best3SquatKg,
         Deadlift = Best3DeadliftKg) %>%
  gather(lift, kg, Bench, Squat, Deadlift) %>% # Make long or 'tidy'
  filter(!is.na(kg)) %>%
  filter(kg>0) %>% # Failed lifts are logged as negative
  ggplot(aes(x=kg, y=AgeClass)) + geom_density_ridges2(aes(rel_min_height=0.005)) + facet_wrap(~lift) + theme_ridges() 

Instagram has chosen to remind me that Eddie Hall is defending his 500 kg world record deadlift. Density plots with thresholds don’t really handle single-points, but I could look up his age and add a special labelled point for him.

Does James even lift?

comparison_data <- powerlifting_data %>%
  filter(Sex=="M", AgeClass == "24-34") %>%
   select(
         Bench    = Best3BenchKg,
         Squat    = Best3SquatKg,
         Deadlift = Best3DeadliftKg) %>%
  gather(lift, kg, Bench, Squat, Deadlift) %>% # Make long or 'tidy'
  filter(!is.na(kg)) %>%
  filter(kg>0) 

lift_to_percentile <- function(lift_name, weight){
  
comparison_data %>%
  filter(lift==lift_name) %>%
  arrange(kg) %>%
  mutate(row = row_number()) %>%
  mutate(percentage = row/max(row)) %>%
  filter(kg == weight) %>%
  summarise(from = min(percentage) * 100, to=max(percentage)*100)
}

lift_to_percentile("Bench", 80)
## # A tibble: 1 x 2
##    from    to
##   <dbl> <dbl>
## 1 0.948  1.22

So somewhere around the 1st percentile for bench :)

lift_to_percentile("Deadlift", 145)
## # A tibble: 1 x 2
##    from    to
##   <dbl> <dbl>
## 1  1.28  1.42

And about the 1st percentile for deadlift. I see a pattern.

lift_to_percentile("Squat", 130)
## # A tibble: 1 x 2
##    from    to
##   <dbl> <dbl>
## 1  2.27  2.68

Trend broken! 2nd percentile for squat.

This discrepancy makes me want to look at ratioes of personal best squats to deadlifts…

Squat to deadlift ratio

powerlifting_data %>%
  group_by(Name) %>%
  summarise(squat_pr = max(Best3SquatKg), deadlift_pr = max(Best3DeadliftKg)) %>%
  mutate(squat_to_deadlift = deadlift_pr/squat_pr) %>%
  filter(squat_to_deadlift > 0) %>%
  ggplot(aes(x=squat_to_deadlift)) + geom_density() + theme_ridges()

That’s a loooooong tail, let’s zoom in.

powerlifting_data %>%
  group_by(Name) %>%
  summarise(squat_pr = max(Best3SquatKg), deadlift_pr = max(Best3DeadliftKg)) %>%
  mutate(squat_to_deadlift = deadlift_pr/squat_pr) %>%
  filter(squat_to_deadlift > 0, squat_to_deadlift < 2.5) %>%
  ggplot(aes(x=squat_to_deadlift)) + geom_density() + theme_ridges()

I like the little peak at 1.0.

comments powered by Disqus