What are the necessary skills for research-oriented data analysis?
The work that our team does at Our World in Data (OWID) has become a lot more visible recently. As a result, I receive more and more emails asking for advice on how to work with us, or on how to grow one’s skills towards similar positions. In this post, I summarize the skills needed to join our data team at OWID, or an “OWID-like” organization.
What we pay attention to
Data wrangling is the fundamental thing that our data team does at OWID. It is thus the most essential skill to master if you’re considering joining us. You will need to be fluent in the use of pandas, or a similar package in R (dplyr, data.table). But our entire data pipeline relies on Python, so we have a strong preference for analysts who use this language.
The fact that this is our core skill should not only make it clear what we do, but also what we don’t do. To solve most of the world’s largest problems, our team believes that machine learning isn’t the most urgent next step. We work with datasets that are small by industry standards. And our tech stack is not at the cutting edge of data science and cloud services. Instead, we want to provide the world with the cleanest, most reliable, best-documented datasets on crucial problems.
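To make "data wrangling" a little more concrete, here is a minimal pandas sketch of the kind of task described above. It is only an illustration, not our actual pipeline: the column names, the `NAME_MAP` mapping, the `tidy` helper, and all values are made up for the example.

```python
import pandas as pd

# Hypothetical raw data in "wide" format (one column per year); values are made up.
raw = pd.DataFrame({
    "Country Name": ["France", "Viet Nam", "United States of America"],
    "2019": [10.0, 20.0, 30.0],
    "2020": [11.0, None, 33.0],
})

# Hypothetical mapping from the source's country names to standardized ones.
NAME_MAP = {
    "Viet Nam": "Vietnam",
    "United States of America": "United States",
}


def tidy(df: pd.DataFrame) -> pd.DataFrame:
    """Reshape a wide table into one row per (country, year)."""
    # Wide to long: the year columns become values of a single "year" column.
    long = df.melt(id_vars="Country Name", var_name="year", value_name="value")
    long = long.rename(columns={"Country Name": "country"})

    # Harmonize country names and give the year a proper integer type.
    long["country"] = long["country"].replace(NAME_MAP)
    long["year"] = long["year"].astype(int)

    # Drop missing observations rather than silently filling them in.
    long = long.dropna(subset=["value"])

    # Sanity check: each (country, year) pair should appear only once.
    assert not long.duplicated(subset=["country", "year"]).any()

    return long.sort_values(["country", "year"]).reset_index(drop=True)


clean = tidy(raw)
print(clean)
```

Most of the effort in real datasets goes into steps like the name mapping and the sanity checks, which is why attention to detail matters more here than sophisticated modeling.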
Other necessary skills
It’s not only the technical skills for data wrangling that are essential. Research-oriented data analysis implies an ability to use data to understand the world, and to help others do the same. The “expert data wrangler” presented above would thus also need:
- good “data judgement” (attention to detail, thoughtful tradeoffs between data quantity and quality, careful and systematic thinking in situations where there is no perfect solution);
- very good knowledge of data visualization principles and good practices;
- a good understanding of our work at Our World in Data and our mission;
- fluency in English;
- strong experience with importing, transforming, and maintaining datasets for other users.
This last skill can seem difficult to showcase if you’ve recently graduated. But its presence here doesn’t mean that you need to have worked for a large company before. We love to hear from people who maintain open-source datasets on important subjects. We also highly value applications from candidates who have worked with some of our key sources (WDI, SDG, UNWPP, GBD, etc.).
If the skills listed above make up the trunk of a tree, secondary skills are the branches. You don’t need to grow all these branches to work well at OWID, but our data analysts tend to be proficient in at least one of them:
- strong knowledge of statistics;
- strong knowledge of programming;
- strong knowledge of academic research, ability to understand publications, experience with science communication;
- experience with the development, maintenance, and documentation of large public datasets.
Beyond the skills that are useful to perform well in any job, here are the ones that are the most important for what we do:
- extreme attention to detail;
- being able to assess which data is accurate and insightful, and which is not;
- recognizing shared behaviors and patterns that provide solutions to data problems;
- intellectual curiosity, openness to new ideas;
- interest in learning about novel research topics;
- flexibility to receive feedback, learn from new evidence, and change one’s mind;
- ability and drive to work without supervision;
- proactivity, assertiveness.
What we don’t particularly pay attention to
This section is only relevant to our work at OWID. Other organizations, including very similar ones, may need staff who fit these profiles.
There are a few things that aren’t on our list of criteria, although some people think that they are:
- having a PhD (this isn’t necessary to join our data team);
- strong experience with machine learning, big data, cloud services, etc.;
- knowledge of many programming languages. In fact, “Python and nothing else” is a much better profile for us than “everything but Python”.
This doesn’t mean that you won’t find people on our team who have these characteristics. Half of our current analysts have a PhD. Half used to develop machine learning models in previous jobs. All of us know other languages besides Python. But none of us joined OWID because of these things. Instead, these characteristics are merely correlated with the skills we are looking for. People with a PhD often understand academic research very well. People who have worked on ML models tend to have strong knowledge of statistics. And people who know many programming languages tend to also be experienced programmers.
Even though projects and work experience are the best way to build your CV, I remain a big fan of book learning. Reading these five books is a great way to sharpen many of these skills at once:
- Hans Rosling, Factfulness
- Nate Silver, The Signal and the Noise
- Philip E. Tetlock, Superforecasting
- Edward R. Tufte, The Visual Display of Quantitative Information
- William MacAskill, Doing Good Better
Where to work
Opportunities at OWID
Even though our team has grown a lot, we still rarely open new positions. When we do so, you’ll be able to find them on our Jobs page. You can also follow us on Twitter (OWID, myself) or LinkedIn, where we will usually advertise new positions multiple times.
Opportunities outside OWID
If you’re looking to work at an organization similar to OWID, I recommend following the 80,000 Hours job board. 80,000 Hours is a nonprofit that provides free advice and support to help you have a greater impact with your career. Opportunities listed on the job board are always interesting, and many of them are related to data and research.