
https://en.wikipedia.org/wiki/List_of_association_footballer...

  21 in 2021
  3  in 2020
  3  in 2019
  4  in 2018
  7  in 2017
  8  in 2016
  9  in 2015
  5  in 2014
  10 in 2013
  8  in 2012
  3  in 2011
  9  in 2010
Something is different about 2021. I'm not a data analyst, but maybe somebody can quantify how 2021 is an outlier.


That’s not a complete list for any year, so comparing counts is meaningless - again, lots of room for biases to creep in here.


>comparing counts is meaningless

Even if the data is biased, it's still useful once you account for the biases. What bias are you proposing for 2021?


One potential bias for 2021: some people are worried about a vaccine->heart link, so for any given event they may be more likely to document it online and/or add it to Wikipedia. If so, we aren't measuring the number of cardiac arrests; we're measuring the public attention paid to cardiac arrests.


What test could prove or disprove that hypothesis? Setting some criteria for whether a death gets included in the statistic (e.g., notability)?


Notability is sort of a bad metric since it’s incredibly subjective. And we’re not just worried about extraneous inclusions in 2021, we’re also concerned about missing data points for previous years - there’s no real statistical correction for “my inputs are biased in unknown ways”. You need to do real analyses with better-quality data (e.g., national death registries) to find a ground truth to compare against. But good news: since this is quite a hot topic, I’d expect people are already looking into that.



