Author: Matt Dickson, Capital One
There’s little dispute that data drives the world. Sessions of Congress, major court cases, and some of the largest companies in the world exist because of the power of data. Regardless of the opinions we may have about data privacy, ownership, and the ethical use of data, it’s hard to deny that data is the backbone of I/O psychology. By combining data analytic skills with our field’s inherent background in organizational science, I/O psychologists are uniquely positioned to make meaningful differences within organizations and drive innovation within this ever-evolving field.
Data is omnipresent in the world of an I/O. It’s data that helps prospective job-seekers find the roles best matched to their own skills and abilities; it’s also data that enables organizations to efficiently navigate the costly world of hiring the best possible personnel they can. Data is the ally that helps raise issues of disparate impact or discrimination in the workplace to the forefront, be it with regard to pay, performance, or any other measurable outcome. Data is everywhere in the world of an I/O, but it isn’t everything. It must be coupled with conceptual as well as technical knowledge of how to understand, manipulate, and analyze it; otherwise you’re left with just numbers on a screen.
Generally, I/O psychologists tend to be pretty good with numbers. You’ve probably opened Excel more than a few times in your life, hand-calculated some inferential statistics a time or two (or perhaps that’s a distant stats class nightmare), and whether it’s your best friend or your frenemy, you at least know what SPSS is. That said, working with data in practice as an I/O isn’t always as simple as it was in graduate school. Later in your career, you might find yourself saying things like:
- “If I try working with a file this big in Excel, it crashes!”
- “What does this part of the output mean again?”
- “I got laughed out of the room when my IT department saw what SPSS costs!”
Even getting access to data as an I/O in the real world, let alone working with it, comes with a bevy of challenges. In fact, I would argue that formal training does not consistently equip I/Os to handle data. Fortunately, there are ample knowledge, skills, and abilities that we can leverage from the world of data analytics.
Data Manipulation and Analysis: Data analysts excel at gathering, organizing, and analyzing large datasets. They have expertise in statistical methods, data querying and cleaning, and languages such as SQL, Python, or R. These skills enable them to extract meaningful insights from complex data, identify relationships, and generate predictive models. I/O psychologists, by training, are well-versed in statistical methods and are familiar with working with data. However, there is often a gap when it comes to manipulating data and combining data sources to make them usable (e.g., joins, interacting with data warehouses, knowledge of data types).
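To make the idea of a join concrete, here is a minimal sketch using Python’s built-in `sqlite3` module. The `employees` and `survey` tables, their columns, and every value in them are hypothetical, invented only to show how two data sources can be combined on a shared key and how column data types are declared.

```python
import sqlite3

# Hypothetical tables and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE TABLE survey (emp_id INTEGER, engagement REAL);
    INSERT INTO employees VALUES (1, 'Ana', 'Sales'), (2, 'Ben', 'Sales'), (3, 'Cy', 'IT');
    INSERT INTO survey VALUES (1, 4.5), (3, 3.0);
""")

# LEFT JOIN keeps every employee, even one with no survey response;
# the missing engagement value comes back as NULL (None in Python).
rows = conn.execute("""
    SELECT e.name, e.dept, s.engagement
    FROM employees AS e
    LEFT JOIN survey AS s ON s.emp_id = e.emp_id
    ORDER BY e.emp_id
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

In this toy data, Ben has no survey row, so his engagement value prints as `None`. Swapping `LEFT JOIN` for `INNER JOIN` would drop him from the result entirely, which is the kind of decision that matters when combining real organizational data sources.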
Data Visualization: Communicating data-driven findings effectively is essential for driving change within organizations. Data analysts are skilled at creating visually appealing and informative charts, graphs, and dashboards that convey complex information in a concise and understandable manner. This skill is invaluable for I/O psychologists in presenting findings to stakeholders and facilitating data-driven decision-making. While I/O psychologists are able to create data visuals that communicate findings, there is often a challenge in doing so in a way that communicates a clear and powerful (but still accurate) message to colleagues who may not be as familiar with data analysis or statistics.
Machine Learning and Predictive Modeling: Data analysts are knowledgeable about machine learning techniques, allowing them to build predictive models based on historical data. These models can help I/O psychologists anticipate future trends, identify potential issues, and make data-driven recommendations to improve organizational outcomes. Most I/O psychologists are familiar with the concepts of logistic and linear regression, and these are often keystone algorithms in the work we do in both applied and academic settings. However, there is less exposure to unsupervised machine learning algorithms, which focus on identifying hidden patterns in data. Factor analysis is a great example that I/O psychologists are likely familiar with, but increasing the exposure I/O psychologists have to other similar methodologies will only add to what the field is capable of.
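As one illustration of what an unsupervised method looks like, here is a rough sketch of k-means clustering written in plain Python. K-means is not named in this article; it is chosen here only as a familiar example of finding groups in unlabeled data. The `kmeans` function, the fixed starting centroids, and the tenure and engagement values are all assumptions made for illustration. In practice, an established library implementation would be the better choice.

```python
import math

def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters with a basic k-means loop."""
    # Simplifying assumption: start from the first k points instead of random picks.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical (tenure in years, engagement score) pairs, invented for illustration.
data = [(1.0, 2.0), (8.0, 4.5), (1.5, 2.5), (9.0, 4.0), (0.5, 1.5), (8.5, 5.0)]
labels, centroids = kmeans(data, k=2)
print(labels)
print(centroids)
```

No outcome variable is supplied anywhere. The algorithm separates the short-tenure and long-tenure points purely from their positions, which is what distinguishes this family of methods from regression.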
Acquiring these skills, however, does not always require additional coursework, schooling, or even additional financial investment. The internet has afforded us so many resources that one can go from novice to data wizard with some time, structure, intentional searches, and a lot of practice!
Here are some tips for upping your data analytics skills, from someone who developed these skills primarily through self-learning and independent practice and who uses them daily in a People Analytics role at Capital One:
- Learn SQL: I can speak firsthand to the challenges of working with data, especially large amounts of organizational data, without first having a basic knowledge of SQL (Structured Query Language). SQL is the fundamental language used to extract or query data. It is often how I/O psychologists will retrieve existing data, typically from a server or data warehouse. You don’t need to be a SQL expert, but spending an afternoon learning the fundamental commands, and then practicing them as much as possible, will make the rest of your data journey infinitely easier. YouTuber Shashank Kalanithi has a great one-hour tutorial video on getting started with SQL, which also includes accompanying notes reviewing the essential commands you’re likely to use most often.
- Reduce, Reuse, Recycle: As you build experience with programming languages like R or Python, you’ll start to recognize that reusability of code (or entire scripts) is something to optimize for. While innovation is invaluable, it doesn’t make sense to reinvent the wheel all the time. Crafting code with a mindset towards scale and reusability will save you a lot of time in the long run. One strategy I’ve found helpful here is to keep a central repository of functions I’ve written that are useful across many contexts, to reference when writing new code. Another is to annotate your scripts as much as you can! This will make it much easier to revisit your code later and identify reusable parts that maybe didn’t seem as valuable at the time.
- Learn to Search: The languages that analysts use to work with data are vast, and the packages that are developed for them only add to the depth and scope of what they can do. With that said, it’s highly unlikely that you will learn and memorize every possible command or feature that a language has to offer. Therefore, it’s super important to learn how to search for things. Many packages have built-in help documents, but oftentimes, the internet is the best source for answering the specific questions you’re most likely to have. There will be a lot of trial and error, but take note of how you phrase things when you search for how to do something. There’s an art to effectively searching for code resources, and that will come through practice.
- Online Tutorials are Fantastic: Online tutorials, in my experience, have served a dual purpose. For starters, they are great ways to get step-by-step demonstrations on how to work with a particular tool or method. They are also great ways to pick up “cheat codes” and learn others’ tips & tricks. In fact, I am hard-pressed to remember a time when I watched a tutorial and didn’t pick up a trick on how to approach a certain situation differently or apply a tool or technique in a way I wouldn’t have thought of before. You often get more than what you seek when it comes to these tutorials. Some of my personal favorites have been Shashank Kalanithi and David Robinson for some great R webinars, and Kevin Stratvert for all things Excel (just to name a few).
- Hit Those Keyboards: I appreciate you taking the time to read this, and I’m sure creators like those mentioned above also appreciate the time you spend watching them work with data, but you won’t get much value out of any of these resources until you get in there, start writing some code, and create some error messages. A quick search will turn up several free datasets to work with, but good starting options are the “mtcars” and “ToothGrowth” datasets for R (both built into the base installation). Python users can load several example datasets through the “seaborn” package (e.g., “flights”, “fmri”, “penguins”, and “titanic”). With data so prevalent in the world around us, pick a topic area that interests you, and you will likely find a dataset to start working with. Some of my favorites include Sean Lahman’s baseball database and the Star Wars survey dataset collected by the folks at FiveThirtyEight.
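As a starting point for practice that needs nothing beyond a standard Python install, the sketch below combines two of the tips above: a handful of fundamental SQL commands, and a small annotated helper of the kind worth keeping for reuse. The `run_query` function, the `applicants` table, and all of its values are hypothetical and exist only for illustration.

```python
import sqlite3

def run_query(conn, sql, params=()):
    """Run a SQL query and return the rows as a list of dicts.

    A small, annotated helper that works for any SELECT statement,
    so it can be reused across projects.
    """
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

# Hypothetical hiring data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applicants (id INTEGER, role TEXT, score REAL, hired INTEGER)")
conn.executemany(
    "INSERT INTO applicants VALUES (?, ?, ?, ?)",
    [(1, "Analyst", 82.0, 1), (2, "Analyst", 64.0, 0),
     (3, "Engineer", 91.0, 1), (4, "Engineer", 75.0, 1)],
)

# SELECT picks columns, WHERE filters rows, GROUP BY aggregates, ORDER BY sorts.
summary = run_query(
    conn,
    """
    SELECT role, COUNT(*) AS n_hired, AVG(score) AS mean_score
    FROM applicants
    WHERE hired = ?
    GROUP BY role
    ORDER BY role
    """,
    (1,),
)
print(summary)
conn.close()
```

Changing the `WHERE` condition, adding a column to `GROUP BY`, or deliberately breaking the query to read the error message are all quick ways to build intuition before moving on to a real data warehouse.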
There is an incredible amount of value in being a data-savvy I/O psychologist. Speaking both the language of I/O and the language of data is an incredibly rare skillset today, but one that will become more necessary as data becomes an essential, and possibly foundational, element of our lives at work and at home.
Author Bio: Matt Dickson is a member of the Talent Assessment team at Capital One. Sitting within People Strategy and Analytics, Matt draws from selection expertise and analytical skills to inform data-backed assessment and selection recommendations across the enterprise.