
AI for Hiring: AI DIET World 2021

“Hiring is one of the most consequential things that happens in a person’s life.”

“A rapidly emerging sub-discipline inside of recruiting is trying to figure out how we hire in both an ethical way and at scale.”


“The recruiter only has so much time, and if they only look at the top, let’s say, 10%, it’s almost like having an automated rejection for the bottom 90% according to this algorithm.”


“You can go through 100 resumes in five milliseconds, and so whatever they’re doing is amplified dramatically.”


This is a fireside chat with Greenhouse’s CTO, Mike Boufford, about how AI/ML is being applied across the hiring landscape. From algorithmic attempts to infer demographic metadata to sexist automated resume screeners, the world of hiring is fraught with meaty ethical challenges regarding the appropriate application of AI/ML. In this discussion, Mike shares where the industry is today, where it’s headed, and how to stay on the right side of these thorny issues. Mike Boufford is the Founding Engineer and CTO of Greenhouse Software, the makers of hiring software leveraged by more than 5,000 companies, from SMB to the Fortune 500.


0:12
Hello, everyone. Good morning, good afternoon, good evening, whatever part of the world you’re joining us from: welcome. Welcome to AI DIET World 2021, day two. I hope you enjoyed the excellent lineup of speakers on day one. We heard so much great food for thought from speakers like Christina, Betsy, Katherine, and Eric. What a great lineup of speakers. I had so much food for thought, I learned something new, and I hope you did as well. I can’t wait to get started on day two. We have, again, a fabulous lineup of speakers, and our keynote is going to be Mike Boufford, the CTO of Greenhouse Software. Let me introduce him: Mike is the Founding Engineer and CTO of Greenhouse, the makers of hiring software leveraged by over 5,000 companies, from SMBs to the Fortune 500. If you ever wanted to know how AI is being used for hiring today, this is the session for you. I know there is so much discussion; it is such a hot topic in today’s world. How is AI being used? True positives, false positives, false negatives, all of that stuff. How are we making algorithms better and more accurate, so that we are fair to people, we are inclusive, and all of that good stuff? So keep your questions ready. I know this fireside chat is going to be on fire. Let me bring Mike on.

2:00
Thanks so much for having me.

2:02
Welcome, Mike. I’m sure you are as excited as I am. I think this is the biggest topic of conversation today. I know there are all sorts of things that AI is doing, but hiring is something that affects so many people today.

2:19
Yeah, hiring is one of the most consequential things that happens in a person’s life. And it winds up setting the trajectory not just for them and their lives, but for future generations. You know, if your parent gets a great job that provides a stable living, and they feel engaged and happy in their work, that’s going to translate to better success for the next generation, and the one after that. So, you know, there are few things that are equally as consequential as hiring.

2:51
Absolutely, completely agree with you. I know we only have 30 minutes and so many things to cover, so I’m going to jump right in. How is AI helping talent acquisition managers automate an array of time-sucking tasks today?

3:10
Yeah, well, I think in some cases, it’s helping and in some cases, it’s probably reinforcing bad behaviors. So maybe we can start there.

3:24
So what’s the fundamental problem? As a recruiter, I have a job that’s really desirable. Let’s say 1,000 people applied for a job where I’m going to hire one person. That’s not the most common case; usually you might have, you know, 80 or 100 applicants or something like that, but 1,000 people do apply for some jobs. How do you whittle that down? The interview process has a bunch of different steps. The first thing you do is figure out who should enter the funnel. Where should I advertise it? Who should I reach out to? Should I throw an event to try to attract people to come? So there’s a huge range of different things you do to source initially. One of the things people are doing now is leveraging these AI tools to go out and try to find the list of candidates they should reach out to. They’re asking these tools to take the criteria they think are important to the role: let’s say somebody has experience in Java, or they’ve worked at a company for more than five years, or they’ve worked at a specific competitor where they think they’re going to have some valuable insight into this type of work. There are companies building tools that go out and try to find those candidates. Not all of that necessarily leverages what we might call AI; a lot of it is just going out and seeing whether these keywords are present for this person. But the tools are getting increasingly sophisticated as they leverage certain types of models that you might say are closer to machine learning. One thing they’re doing in particular, which I think raises the question of whether it’s something we should be doing at all, is taking people’s photographs and names and running them through models that will assess whether they are of a particular race or gender. The intention for companies is to identify people who would add diversity to their teams and try to bring them on board. So the intention is good in some ways; at the same time, the tools they’re leveraging create a real classic ethical conundrum about how we automatically assign labels. The last thing I’ll say on that particular point is that people are already doing the same thing manually. One of the problems with AI in recruiting is that it’s usually reinforcing whatever the humans were doing: a recruiter goes out and guesses, is this person going to add diversity to my team or not? The AI does it a different way, but its success is judged by whether it comes to a similar conclusion as the humans.
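
To make the simpler, non-ML end of that spectrum concrete, here is a minimal sketch of keyword-based candidate matching. It is purely illustrative: the fields, keywords, and data are invented, not Greenhouse’s or any vendor’s implementation.

```python
# Hypothetical sketch of keyword-based candidate sourcing; all names and
# data are invented for illustration.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    resume_text: str

def keyword_score(candidate: Candidate, keywords: list[str]) -> int:
    """Count how many target keywords appear in the resume text."""
    text = candidate.resume_text.lower()
    return sum(1 for kw in keywords if kw.lower() in text)

candidates = [
    Candidate("A", "Five years of Java experience at a fintech competitor."),
    Candidate("B", "Frontend developer working with React and TypeScript."),
]
keywords = ["Java", "five years", "competitor"]

# Sort candidates by how many criteria they match, best match first.
# Literal matching like this simply amplifies whatever criteria the
# recruiter typed in, which is the point made above.
for c in sorted(candidates, key=lambda c: keyword_score(c, keywords), reverse=True):
    print(c.name, keyword_score(c, keywords))
```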

6:08
Yeah. So, Mike, you touched upon automated sourcing a little bit there, and on how it brings in the human behavior that is being used for sourcing, and...

6:27
I’m sorry, the mic got a little poppy for a second.

6:31

[Brief crosstalk while the speakers sort out an audio echo.]

7:03
Not sure; it sounded like it was coming out of my speaker. So I’ll try to mute in between things.

7:14
Okay, no worries. I was saying that it sounds to me like human behavior is now being amplified through these tools, whether we use AI tools or algorithms to do the same tasks. Humans were doing the same thing, right? I mean, in an effort to bring more diverse people into our hiring and our teams, we look at their profiles on LinkedIn and make those judgments ourselves, and now that behavior is being amplified by these AI tools. And yeah, that’s an ethical question. Because while we say that we want to be gender-neutral, or more inclusive and diverse, it’s hard; it’s a catch-22, in my opinion, right? That’s also a discussion: if you remove the name, you remove the picture, you remove all those personal identifiers from a resume and only hire somebody based on merit, whether you do it manually or with a tool, then if it’s 100% merit, it may or may not give you the most diverse team that you’re looking for. So how do you tackle that catch-22?

8:42
I think it’s extremely challenging. This is a rapidly emerging sub-discipline inside of recruiting: trying to figure out how we bring diversity onto the team in both an ethical way and at scale. One thing that we do in our company, which I would say is maybe the right approach (not saying the others are all necessarily wrong), is that we don’t leverage AI. We just ask the question, or we allow companies to ask the question, of how people self-identify, and then they can leverage that data in reporting. That data is provided directly by the candidates themselves. Of course, somebody could make something up and say something untrue, but there’s no AI guessing at it. I think part of the ethical issue is that we just don’t feel good about a computer assessing those types of things about us; it feels icky as a person. The tricky thing with AI is that we’re now talking about doing at scale something that used to require an individual’s attention. Going through 100 resumes might take a person a few hours. An AI can go through 100 resumes in five milliseconds, so whatever it’s doing is amplified dramatically.

A good example, which was in the news, so I think a bunch of people probably read about it, was an algorithm that Amazon had developed to try to judge whether somebody is likely to be hired. That’s what a lot of recruiting AI is trying to do: predict who, among the people in this applicant pool, is likely to be hired for the role. The training data it leveraged was a bunch of decisions that recruiters actually made. Recruiters looked at a bunch of resumes and decided this person gets advanced and this person doesn’t. And it seems like the model picked up on criteria that a human wouldn’t use but a computer might. It might say: I’ve noticed that the name John is correlated with being hired as a software engineer. I’m making that example up, of course. But if you’re running a learning algorithm over that data, it’s going to say, this is predictive, I can see this is correlated with somebody being hired, so I’m going to start using this criterion to make judgments about whether somebody should be advanced. A human might not do that; a human would ignore that sort of criterion. But it can be amplified just by having an AI train on that type of dataset.
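
To illustrate that failure mode concretely, here is a small synthetic sketch (invented data and features, not Amazon’s system) of how a supervised model trained on biased historical decisions learns to reward an irrelevant proxy feature such as a first name:

```python
# Synthetic illustration only: a model trained on past recruiter decisions
# can latch onto an irrelevant proxy like a first name.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

years_java = rng.integers(0, 10, n)   # genuinely relevant feature
named_john = rng.integers(0, 2, n)    # irrelevant proxy feature

# Simulate biased historical decisions: recruiters favored "John"s.
p_advance = 1 / (1 + np.exp(-(0.4 * years_java + 1.5 * named_john - 3)))
advanced = rng.random(n) < p_advance

X = np.column_stack([years_java, named_john])
model = LogisticRegression().fit(X, advanced)

# The model faithfully reproduces the bias baked into the training labels.
print("weight on years_java:", round(model.coef_[0][0], 2))
print("weight on named_john:", round(model.coef_[0][1], 2))
```

How do you feel about creating algorithms that put different weights on different parameters? Like whether you’re an alum of a particular company, or whether you come from a particular university, for example, or whether your keywords match the keywords in the job description, so you might be ranked as more qualified. And what about the softer skills, like leadership and communication skills? Those cannot be quantified easily. How do you make sure that the algorithm is designed to pick up the best type of candidate for the role?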

12:05
Kind of a good question. I mean, I don’t know that it’s easy to assess how good somebody is at something just by having them include it in a resume. I could include the keyword Java, like, 15 times; that doesn’t necessarily translate to being good at Java, though it probably means I have a bunch of experience with it. Leadership is harder. There isn’t usually a single resume keyword for that. Instead, people describe a time when they did something that looks like leadership, and that’s a lot harder for a matching algorithm that’s just looking at discrete terms and trying to figure out which ones are most frequent.

But maybe I’ll get into ranking algorithms for a minute. I think you also touched on another topic, which I can get into later, which is AI as applied to things like video interviewing. So, ranking algorithms. If I have, say, 1,000 candidates coming into the top of the funnel, some algorithm is putting a sort order on them. It can be totally random. It can be sorted by when they applied, which is the default in a lot of these systems. Or there are companies that specifically focus on ranking candidates, and they try to use relevant criteria. One of the first things they do is ask: what are the attributes of this job? They look at the job description and do a bunch of keyword matching. Is this person similar enough to the description that they’re worth bubbling up? Then they might look at other attributes based not necessarily on pure AI, but on something like a knowledge graph. If I have a knowledge graph that represents the relatedness of companies to each other, it might include some metadata about how prestigious a company is, how much revenue it makes, how many employees it has; for a university, where does it sit in the US News rankings, or something like that. Then they might give somebody additional weight and say this person is a good candidate because they went to a top school. That’s not necessarily going to be true, but they’re taking the same types of factors that might be proxies for somebody being smart, or for staying at a company a long time and therefore being reliable, or whatever it is, and deciding to rank people off of them. Another feature I’ve seen included is whether the grammar in the resume is good. So they’re using a huge range of different features to decide who bubbles up to the top.

Are they doing a ton of automated rejecting? I think that’s a worry people have: oh, I applied to some system, and there’s a machine that’s going to automatically reject me by looking at my resume. From what I’ve seen, I actually haven’t seen a ton of that happening in our industry, and it’s certainly not something that happens in Greenhouse. People might configure some kind of rule that says: if you answered the question saying you’re under 18, and being under 18 means you’re not legally able to work in this job, they might send you an auto-rejection. But they don’t usually trust an AI to reject a bunch of people at scale based on its ranking.
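
To make the ranking idea concrete, here is a minimal, hypothetical sketch of such a weighted scoring function; every feature name and weight is invented for illustration, not any vendor’s model:

```python
# Hypothetical ranking sketch: weighted sum of hand-picked resume features.
# Every feature name and weight here is invented for illustration.

WEIGHTS = {
    "keyword_match": 0.5,   # overlap with the job description
    "school_rank": 0.3,     # proxy from a knowledge graph, e.g. a league table
    "grammar_score": 0.2,   # output of some grammar checker
}

def rank_score(features: dict[str, float]) -> float:
    """Combine normalized feature values (0..1) into a single sort key."""
    return sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)

applicants = {
    "candidate_1": {"keyword_match": 0.9, "school_rank": 0.2, "grammar_score": 0.8},
    "candidate_2": {"keyword_match": 0.6, "school_rank": 0.9, "grammar_score": 0.9},
}

# Note the implicit-rejection problem discussed below: if recruiters only
# ever read the top of this list, a low score acts like an automated rejection.
for name, feats in sorted(applicants.items(), key=lambda kv: rank_score(kv[1]), reverse=True):
    print(name, round(rank_score(feats), 2))
```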

15:38
But it seems to me that if they’re using AI to source candidates or bubble candidates to the top, then in some way they are using a selection mechanism that is not manual but automated, right? So the worry is that I’m not even going to make it to the first round, onto the table of somebody who is actually going to look at my resume. Right?

16:00
Yeah, I think that’s exactly right. It’s sort of an implicit rejection. You may not be getting an email saying you’re disqualified where the entire process was run by an AI. But the recruiter only has so much time, and if they only look at the top, let’s say, 10% that the ranking algorithm provides, it’s almost like having an automated rejection for the bottom 90% according to this algorithm. And to get into it a little bit more: most of those algorithms today are black-box algorithms, with somewhat limited feature engineering happening. Sometimes they are doing better feature engineering; they’re not picking up on “Johns are correlated with being hired, so we’re going to use John.” A lot of them are supervised learning algorithms with enough feature engineering that they’re at least trying to use relevant criteria. A thing they generally don’t do, though I have seen examples, so I know there are companies working on it, is ensure that the model is explainable, so you can say: this is how we got to this decision.

The classic example, which I’m sure has already been referenced at this conference and will be referenced again before the event is through, is the Fair Credit Reporting Act. I don’t know how familiar people are with it; I assume many are, but just to reiterate for anyone who isn’t: it’s an act that was created to ensure equal access to credit for people in the United States. An example of the criteria banks used in the past might have been: people in this zip code correlate highly with defaults, so they’re unlikely to pay their debts. The consequence was that you could be an individual with a perfect credit record, creditworthy by all measures, but because you lived in, let’s say, a low-income neighborhood where the bank had witnessed defaults before, you were automatically rejected. The outcome of that type of decision meant that people of color in the United States were less likely to have equal access to credit: a significant, life-altering, disparate impact across a bunch of different groups. So the Act came out and said you have to provide a specific reason if you decline someone, and it certainly can’t be that they live in this neighborhood; that is insufficient. There is a set of valid criteria that lenders can use and must be able to cite: you could reject someone for not having enough credit history, or for being late on payments. And so even though credit card companies are obviously leveraging machine learning to make credit decisions now (you can get a credit decision online in 30 or 60 seconds, or whatever it is), they’re doing so in an explainable way: they’re leveraging decision trees, and they’re communicating the reasons. That hasn’t fully come to hiring yet, but I think it will.
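
As a sketch of what such reason-coded, explainable decisions can look like in code, here is a toy rule list with invented criteria; real lenders may use decision trees or other interpretable models, and nothing here reflects any actual lender’s logic:

```python
# Illustrative sketch of explainable, reason-coded decisions (invented
# criteria, not any real lender's or vendor's logic). Only pre-approved,
# relevant criteria ever appear in the output.
from dataclasses import dataclass

@dataclass
class Application:
    months_of_history: int
    late_payments: int

# Each rule is (criterion, test, reason given to the applicant).
RULES = [
    ("credit history", lambda a: a.months_of_history >= 12,
     "Insufficient credit history (less than 12 months)."),
    ("payment record", lambda a: a.late_payments == 0,
     "One or more late payments on record."),
]

def decide(app: Application) -> tuple[bool, list[str]]:
    """Return (approved, reasons). Reasons cite only the failed criteria."""
    reasons = [reason for _, test, reason in RULES if not test(app)]
    return (len(reasons) == 0, reasons)

approved, reasons = decide(Application(months_of_history=6, late_payments=1))
print("approved:", approved)
for r in reasons:
    print("reason:", r)  # the applicant gets specific, actionable criteria
```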

19:09
That’s exactly my point. What I’m hearing from you is that things are moving in the direction of transparency and explainability: less black box, and more of the why. People should be able to understand the weights that were given and how the system came to its decision, and judge whether those criteria are acceptable. And that brings up an important point. You work with so many different vendors, like you were telling me, and also with companies directly, as well as vendors who make this AI hiring software. Does anyone measure the false positives or false negatives of their models’ decisions? That would be one way to understand. I don’t know if the industry is doing this, and I want to find out if they are. Because yes, we can say at one point that this person is not the right fit, and we could do that manually too. But it would be good to go back and reflect on it and say: hey, two years later, five years later, look at the trajectory of this person. This person would have made a great fit, but for whatever reason, I didn’t pick up on it and rejected

20:27
him or her. That’s a good question. I have not seen longitudinal studies that communicate anything like that, but if anyone has some, I’d certainly be interested in seeing them. I do think we’re better able to track outcomes at a group level today than at an individual level. So, again, we don’t have longitudinal analysis of whether a given individual would have been a good hire. But out of Greenhouse, for instance, because people are self-identifying and saying, I am of this particular blend of demographic characteristics, we actually report on pass-through rates all the way through the funnel for each cohort, for each group. So you can see whether there’s a drop-off or a decrease for one particular subset. I’ll give a couple of examples of the type of analysis people might be doing. This helps us detect discrimination, not necessarily at the individual level, but at the group level, or in the process design. If I see that, let’s say, Asian American men have a certain pass-through rate all the way through to interview X, and then we start seeing a higher drop-off for that group than for another group, there’s probably something to look into: what are the reasons underneath that are causing disparate drop-off rates? There’s one pattern that I think is fairly prevalent, which we’ve seen: you might see differences in how different genders handle different types of testing environments. This is something that I don’t think has been studied at enough scale for me to feel confident saying one way or another, but you might see that the drop-off rate for an in-person technical interview is higher among women than among men. So what is it about the design of the interview itself that might be creating that sort of disparate impact on one group versus another? Whether it’s explicit bias or implicit bias, those types of things can only be analyzed when we start looking at the outcomes and seeing whether there are differences between one group and another.
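
A minimal sketch of that kind of cohort pass-through analysis might look like the following; the data and group labels are made up, and this is not Greenhouse’s reporting code:

```python
# Illustrative sketch of cohort pass-through analysis (made-up numbers,
# not Greenhouse's reporting code). Demographics are self-identified.
from collections import defaultdict

# One record per candidate: self-identified group and funnel stages reached.
funnel = [
    {"group": "A", "stages": ["applied", "screen", "onsite", "offer"]},
    {"group": "A", "stages": ["applied", "screen"]},
    {"group": "B", "stages": ["applied", "screen", "onsite"]},
    {"group": "B", "stages": ["applied"]},
]

def pass_through_rates(funnel, from_stage, to_stage):
    """Share of each group's candidates at from_stage who reach to_stage."""
    reached, advanced = defaultdict(int), defaultdict(int)
    for c in funnel:
        if from_stage in c["stages"]:
            reached[c["group"]] += 1
            if to_stage in c["stages"]:
                advanced[c["group"]] += 1
    return {g: advanced[g] / reached[g] for g in reached}

# Large gaps between groups at a given stage flag a step worth investigating.
print(pass_through_rates(funnel, "screen", "onsite"))
```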

22:49
I want to take an audience question here. Beltre asks: how do you define and measure transparency?

23:03
How do you define and measure transparency? Well, maybe I can ask a follow-up question and see if the answer comes back in the chat: which part of transparency are we talking about? How a hiring decision was made? So that I narrow in on the right thing.

23:21
Beltre, please feel free to post the clarification in the chat, and we’ll pick it up; we’ll come back to it.

23:29
If I see it come in again, I can get back to it. In the meantime, I think transparency probably means saying: these are the specific criteria that were used to make the decision. Just as with the Fair Credit Reporting Act, there should probably be a set of valid criteria that can be used; there’s probably some stuff that should be included and some stuff that shouldn’t. You are trying to make some judgments, but details that are irrelevant to whether you’d be successful in the role certainly should not be considered, and I think we should be communicating to candidates what the underlying reasons were. There are some companies that do this manually. They say: here’s why we decided not to move forward. And that’s generally appreciated by candidates, to understand what the reason was. As we deal with at-scale processes where AIs are making more of the judgments, like the ranking case we discussed before, I think transparency looks like communicating the specific criteria the AI used to sort somebody to the top or the bottom of the list for a given role.

24:36
Yeah, but in today’s hiring process, all you get is a letter saying whether you were accepted or rejected, whether that decision was made manually or using a tool. They don’t give you any specifics on how and where you fell short. At best they go one step further and say: oh, it’s not you, it just wasn’t a good fit for the team. It’s a blanket statement. There is no learning experience there, no way you can improve your skills, whether that’s upskilling, or interviewing skills, or resume-writing skills. You have no clue where you fell short.

25:16
There are certain things that are hard to express to candidates, and I think people avoid them. The things that are relatively easy to express sound like: we assessed your ability in Java, and (sorry, my three-year-old was crying) we felt you didn’t do enough stepwise refinement, or didn’t define the interfaces that were necessary, whatever the technical thing is. People are totally happy to share that, and somebody can go off, learn, and get better at it the next day. That’s not necessarily the case with things like communication skills: you spoke, but you didn’t speak well, so we’re not going to hire you for that reason. Or, you know, you’re supposed to be a salesperson and I wasn’t sold. That just feels a little more personal when those criteria are used, and criteria like that are not used in credit decisions. So I think it is difficult when the criterion is whether I look at you and want to work with you every day.

26:20
It’s also hard to do these things at scale, to send a personalized rejection letter to everyone applying. But yeah, I would think that to make this a learning experience, maybe not at the first screening level, but at the second level, once the candidate has reached the interview stage, some level of personal transparency would be good. Again, this all comes back to explainability and transparency. And I want to touch on an important topic here that we’ve barely scratched the surface of: the pitfalls of using AI assessment for video interviewing. I know that is becoming more and more common, especially with COVID, when we can’t meet in person; we’re doing this whole conference virtually. How do you assess the overall personality of a person

27:12
Through a video interview? Yeah, so this is another newsworthy thing that has popped up. There are all these video interviewing platforms, and by video interviewing I’m not talking about getting on Zoom with somebody else. Instead, there’s a list of questions, you’re supposed to answer them to a camera, and somebody is supposed to watch the video later. The features that have been added to some of these systems include things like: does this person pronounce their words clearly? Do they sound confident? There are a bunch of different dimensions of personality assessment that they’re trying to leverage machine learning to identify. The problem is that it’s not just a training-set problem; some of the things they’re trying to identify are fundamentally problematic. Let’s say I’m judging whether somebody speaks clearly, and they come from a background where they’re not a native English speaker, or they are a native English speaker, even in America, but come from a region with a regionalized accent. Is that really relevant to whether somebody can do the job, or even to whether somebody speaks English well? To have an AI use a factor like that to determine how well-spoken somebody is can be fundamentally problematic and create disparate impact. So yeah, that’s the video assessment stuff. I know we’re running short on time.

28:46
So how is supervised learning used today to train ML hiring models?

28:54
Well, it probably depends on the point in the process. There are certainly conversational AI models; that’s actually a big one that we didn’t even really touch on. A lot of times on a career page now there’s a little chatbot, and it’s AI; a lot of that is trained off of watching how a real human would respond. Things like ranking algorithms are usually supervised learning: they’re looking at the choices that you made, trying to predict the same types of choices, and then applying roughly the same criteria to the problem. I think those are probably some of the bigger use cases for supervised learning. Others are things like automated technical assessments. They’ll look at a code snippet and at things like how complex your method is, and so on; but then, a level up, they’re also looking at how human reviewers made their judgments, and for what reasons.
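
As an illustration of the kind of feature an automated technical assessment might extract, here is a small sketch using Python’s ast module; the metric is invented for illustration, not any vendor’s scoring:

```python
# Illustrative feature extraction for automated code assessment (an
# invented metric, not any vendor's): count branch points in a submitted
# snippet as a crude complexity proxy.
import ast

def branch_count(source: str) -> int:
    """Number of if/for/while/try nodes in the submitted snippet."""
    tree = ast.parse(source)
    return sum(isinstance(node, (ast.If, ast.For, ast.While, ast.Try))
               for node in ast.walk(tree))

snippet = """
def fizzbuzz(n):
    for i in range(1, n + 1):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)
"""
print(branch_count(snippet))  # one loop plus the if/elif chain
```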

29:59
Yeah, I’ll take two more questions. On that note: do the AI models making these judgments self-correct?

30:10
My firm answer is: it depends. Are people actually updating their models? Are they using live models that take in new information, or are they training up a model, deploying it into production, and letting it sit for a while? If we look at things like guessing at race or gender based on an image, they’re using off-the-shelf models that try to do that type of thing, for the most part, and they’re not making major updates to those models or changing them over time. So I think it’s a mixed picture.

30:43
Next question: who gets to decide the rules of the game and establish the AI models? How does one minimize the self-bias that could get baked in when creating these? Interested in learning how you do it.

30:55
So we actually don’t create an AI model for recruiting, for precisely a bunch of these sorts of reasons. We do try to make sure that there’s human intervention in these types of consequential decisions wherever possible. If somebody is creating an automated rule, like someone under 18 cannot work in this job, that type of thing is possible. But that reflects a real, concrete rule that somebody would otherwise be implementing manually on their own. In the industry generally, though, is it true that some of these AI models are doing sketchy stuff? That’s for sure. Sorry, I see another question coming in.
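
A human-authored knockout rule of the kind described here might look like this sketch (hypothetical, not Greenhouse’s rule engine):

```python
# Sketch of a human-authored knockout rule (hypothetical): a concrete
# legal requirement encoded explicitly, not a learned model.
def auto_reject(answers: dict) -> bool:
    """Reject only when the applicant's own answer fails a hard requirement."""
    return answers.get("age_18_or_over") is False

print(auto_reject({"age_18_or_over": False}))  # True: fails legal requirement
print(auto_reject({"age_18_or_over": True}))   # False: proceeds to a human
```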

31:42
The last one we have time for: what about culture fit metrics?

31:48
Culture fit metrics. So, culture fit is actually something that is controversial in and of itself. Do you want to hire more people who fit the existing culture that you already have? Maybe yes, maybe no. Maybe there are specific attributes you want to hire for, like entrepreneurialism or something along those lines. But you don’t necessarily want to hire people just like you; it’s certainly not a good way to build a diverse team. So the lens that people have tried to shift to, from culture fit, is culture add: is this someone who’s going to add to our culture and make it better over time? When you’re thinking about culture add, you’re actually making sort of a diversity assessment: is this person going to bring something that somebody else hasn’t already brought to our organization? And the idea of leveraging data to make that type of decision gets harder and harder, because your sample size shrinks for what you’re really looking for.

32:40
Thank you, Mike. This has been a fantastic conversation. I learned something new, and I’m sure our audience did as well. There were just so many things we couldn’t cover in the interest of time, but we’d love to have you back to pick your brain on many such topics. Thanks again for being here. Take care, bye-bye.


Mike Boufford, CTO, Greenhouse


DataEthics4All hosted AI DIET World, a premier B2B event to celebrate Ethics-1st-minded people, companies, and products, on October 20-22, 2021, where DIET stands for Data and Diversity, Inclusion and Impact, Ethics and Equity, Teams and Technology.


AI DIET World was a 3-day celebration: Champions Day, Career Fair, and Solutions Hack.

AI DIET World 2021 also featured senior leaders from Salesforce, Google, CannonDesign, and Data Science Central, among others.

For media inquiries, please email us at connect@dataethics4all.org.

 

Come, Let’s Build a Better AI World Together!