Wikipedia people per born year with occupation

Written on December 29, 2009

Sometimes you just want to graph something. So I picked a mildly interesting question, namely what kind of people Wikipedia mostly covers, and used it to try out a CSS graphing solution.

Wikipedia people per born year with occupation

  • 1980: 8232
  • 1981: 8523
  • 1982: 8533
  • 1983: 8440
  • 1984: 8184
  • 1985: 7953
  • 1986: 7244
  • 1987: 6171
  • 1988: 5311
  • 1989: 4114
  • 1990: 2966
  • 1991: 1661
  • 1992: 828
  • 1993: 428
  • 1994: 221
  • 1995: 142
  • 1996: 126
  • 1997: 92
  • 1998: 83
  • 1999: 56
  • 2000: 39
  • 2001: 27
  • 2002: 19
  • 2003: 17
  • 2004: 9
  • 2005: 15
  • 2006: 8
  • 2007: 13
  • 2008: 4
  • 2009: 4

Legend: Athletes (mainly footballers), Artists (mainly actors), Nobles (mainly royal infants)

About the data

I crawled Wikipedia with mwclient, a Python library that does exactly this: it lets you connect to a MediaWiki-powered site and easily access the properties of its pages. Although unrelated to my goal, it can modify pages too.
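As a rough illustration of that kind of crawl, here is a minimal sketch. It is not the script I actually ran: the function names `sample_people` and `crawl` are made up for this example, and it assumes that birth years are exposed as categories named like "1993 births" and that a current mwclient release is installed.

```python
from itertools import islice


def sample_people(site, year, limit=100):
    """Map page name -> list of category names for '<year> births'.

    `site` is expected to behave like an mwclient.Site object. Only the
    first `limit` category members are examined.
    """
    people = {}
    # Assumption: birth years are categories named e.g. "1993 births".
    births = site.categories[f"{year} births"]
    for page in islice(births, limit):
        # Keep only articles (namespace 0); skip subcategories and the like.
        if page.namespace != 0:
            continue
        people[page.name] = [cat.name for cat in page.categories()]
    return people


def crawl(years, host="en.wikipedia.org"):
    """Sample every year in `years` from a live wiki (needs network access)."""
    import mwclient  # third-party package, imported lazily: pip install mwclient

    site = mwclient.Site(host)
    return {year: sample_people(site, year) for year in years}
```

Calling `crawl(range(1980, 2010))` would cover the period graphed above. Passing the site object into `sample_people` keeps the network access in one place, so the sampling logic can be tried against a stand-in object first.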

I sampled at most 100 people per birth year, looked at the categories they belonged to, and assigned each of them one or more (usually broader) categories, such as “nobles”, “art” or “sport”. There were many more categories, but these three accounted for almost all people born after 1980, the period I’m focusing on here.
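The assignment step can be sketched roughly as follows. The keyword rules are purely illustrative (I am not reproducing the real mapping here), and `RULES`, `broad_categories` and `tally` are names invented for this example. The input is assumed to be a mapping from a person to the list of Wikipedia category names on their page.

```python
from collections import Counter

# Hypothetical keyword rules: broad category -> substrings to look for.
RULES = {
    "sport": ("footballers", "skaters", "gymnasts", "weightlifters"),
    "art": ("actors", "actresses", "singers", "musicians"),
    "nobles": ("princes", "princesses", "royalty", "nobility"),
}


def broad_categories(categories):
    """Return the set of broad categories matched by a person's categories."""
    found = set()
    for name in categories:
        lowered = name.lower()
        for broad, keywords in RULES.items():
            if any(keyword in lowered for keyword in keywords):
                found.add(broad)
    return found


def tally(people):
    """Count people per broad category; one person may count in several."""
    counts = Counter()
    for categories in people.values():
        counts.update(broad_categories(categories))
    return counts
```

For instance, a page tagged "Category:English footballers" and "Category:British child actors" would count once under "sport" and once under "art", which matches the "one or more" rule above.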

Looking at the data, people on Wikipedia aged six or less are mostly aristocrats, in particular the sons and daughters of kings somewhere in Europe. Of course there aren’t many of them: about fifty in this six-year span.

From age six to seventeen, the number of people in Wikipedia grows exponentially, reaching almost half a thousand for those born in 1993. Most of them are child actors from all over the world (the USA, but also the UK, Europe, and the Far East: China and Japan).

Athletes, who start appearing earlier (figure skaters, weightlifters, gymnasts), become predominant from age eighteen to the end of the period I considered, age thirty. This is thanks to the explosion of footballers: it seems that even obscure, second-league soccer players have their fans, and so their Wikipedia pages.

About CSS bar graphs

Simple bar graphs are quite easy to do with pure HTML+CSS, if you are not too demanding about features. Using ad-hoc images allows more freedom, but producing them is obviously more tedious.

My solution draws the bars as absolutely positioned list items. Hence there is a small annoyance: you have to set the left coordinate of each bar by hand. This is not too bad in 99% of situations, namely whenever the markup is generated from the data by a script (in PHP, say). The height value sits on the bar, while the label (i.e., the year) is in another list at the bottom. Sadly, this causes the graph to degrade poorly in text-based browsers.
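To show what generating the left coordinates looks like, here is a minimal sketch in Python rather than PHP. It is not the markup of this page: the class names `bars` and `labels`, the pixel sizes and the function name `bar_graph` are all assumptions made for the example.

```python
from html import escape

# Hypothetical stylesheet: bars are anchored to the bottom of their list.
CSS = """
ul.bars, ul.labels { position: relative; list-style: none; margin: 0; padding: 0; }
ul.bars li { position: absolute; bottom: 0; background: #36c; text-align: center; }
ul.labels li { position: absolute; top: 0; }
"""


def bar_graph(values, labels, bar_width=20, gap=4, height=200):
    """Return an HTML fragment: one list of bars, one list of labels."""
    peak = max(values, default=0) or 1  # avoid dividing by zero
    step = bar_width + gap
    bars, ticks = [], []
    for i, (value, label) in enumerate(zip(values, labels)):
        left = i * step  # the coordinate you would otherwise type by hand
        bar_height = round(height * value / peak)
        bars.append(
            f'<li style="left:{left}px;height:{bar_height}px;'
            f'width:{bar_width}px">{value}</li>'
        )
        ticks.append(f'<li style="left:{left}px">{escape(str(label))}</li>')
    return (
        f'<ul class="bars" style="height:{height}px">\n' + "\n".join(bars) + "\n</ul>\n"
        '<ul class="labels">\n' + "\n".join(ticks) + "\n</ul>"
    )
```

Something like `bar_graph([9, 15, 8], [2004, 2005, 2006])` yields the two lists, to be served together with `CSS`. Because the values and the labels live in separate lists, a browser that ignores the stylesheet shows all the numbers first and all the years afterwards, which is the poor degradation mentioned above.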

Finally, there are some cosmetic CSS3 touches: rotated labels, rounded bars, and blurred drop shadows, which you should be enjoying if you have a modern browser (Safari 4, Firefox 3.5).
