# The Right To Vote

Voting is the cornerstone of democracy. It cements the right to elect representatives, to weigh in on state and federal issues, and to have a hand in the course of your country.

Everyone has a vote.

## But who actually gets to cast it?

That’s the question that determines whether a proposition passes or fails. Not how many people in the voting population support it, but how many of those people vote for it. If 60% of people are for something, and 40% are against it, but 90% of one group and only 10% of the other group get to vote, that’s going to affect the outcome.

Of course, there is already prolific documentation about the unequal accessibility of this fundamental right. This is most often framed in the light of voter turnout, meaning the percentage of possible voters who actually go to the polls. The United States has a startlingly low voter turnout as democracies go. Policy researchers are avidly exploring explanations for this discrepancy with the rest of the world.

But while I was voting by mail this election, I started asking another question.

## “Does everyone vote for everything?”

I live and vote in California. My ballot is 5 pages long. I’ve got federal, state-level, and local representatives to research and vote for; I’ve got county measures and city measures to comb through. I’ve spent a couple nights this week working on my civic duty, much to the chagrin of my laundry and dishes!

It’s one thing to show up to the polls in a country that doesn’t make election day a mandatory holiday or have a compulsory vote — that’s already hard, and we know that. Polling places can be far away, and the lines are long. Sometimes polling places close, and it’s hard to know where to go.

But once you get there, does everyone make it all the way through that long ballot? How many people leave things blank?

I decided to find out.

### Welcome to King County, WA, where voter turnout is over 80%!

This is actually about the average turnout in first-world democracies outside America. (That makes the ~55% American turnout in 2016 look staggeringly behind.)

King County is a mostly urban and sub-urban county — one of the three that make up the larger Seattle-Tacoma-Bellevue area. According to the 2010 Census, King County is majority white; according to the 2017 American Community Survey (see fig 10) a majority of the population has had some college education. Last week, the Washington Post reported the connection between education status and turnout, so it shouldn’t be surprising that such an educated place has such high voter turnout.

I downloaded precinct-level reporting data on every measure on the 2016 ballots of King County. I validated it against the county-wide totals to make sure the numbers matched up; since the file reports results for individual precincts, I have a lot more granularity to work with.

But this post isn’t about getting to the polls. This post is about what happens after.

## Voter turnout is high in King County, but not everyone votes for everything.

### People overwhelmingly vote for federal level positions.

I categorized the different races on the ballot into groups, and ordered them by the precinct-level average percentage of voters voting. You can see that (in 2016) the highest percentages of voters vote for President & Vice President, with most of the federal positions ranking high (US Senator, US Congress). But state-level representatives receive votes from about 10% less of the population on average — meaning about 10% of King County voters make it to the polls, but don’t vote for those candidates.
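The grouping step itself is simple. Here is a minimal sketch in plain Python; the race names, categories, and vote counts below are invented stand-ins for the precinct-level King County file.

```python
# Hypothetical rows: (race, category, votes cast, ballots returned)
races = [
    ("President & VP", "Federal", 950, 1000),
    ("US Senator",     "Federal", 900, 1000),
    ("State Rep 1",    "State",   800, 1000),
    ("State Rep 2",    "State",   780, 1000),
]

# For each category, collect the share of returned ballots that
# actually voted in the race
totals = {}
for _, category, cast, returned in races:
    totals.setdefault(category, []).append(cast / returned)

# Average within each category
avg_pct_voting = {c: sum(v) / len(v) for c, v in totals.items()}

# Order categories from most-voted to least-voted
ranked = sorted(avg_pct_voting, key=avg_pct_voting.get, reverse=True)
print(ranked)  # ['Federal', 'State']
```

The same calculation, run per precinct and per real category, produces the ordering described above.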

### More people abstain when candidates run unopposed.

Why don’t people vote on an issue? In some cases, it’s because their vote literally may not count. Across the board, unopposed candidates for state representative or for judgeships (more on them later) get about 15-25% less of the overall population voting for them.

### Voting is far lower for judges, and there are far more judicial races than other low-turnout positions.

Now back to the judges. Far and away, fewer residents vote for judges, even in contested races! Of course, they also vote at similar rates for the state treasurer and superintendent, but here’s the difference: there were 13 elected judicial appointments on ballots across King County, compared to just one treasurer and superintendent race. Across all these races, people just plain old vote less.

### The distributions of vote share between two sides overlap each other — a lot.

Zooming in on four county-level races (we’ll get back to these particular races later) — two judge positions and two propositions — we can look at the distributions of the vote share per precinct that each candidate or choice got. The more distinct these distributions, the clearer the gap between the two choices.

But instead, these particular races — which are contested to varying degrees, as you can tell by where the mean in the boxplot lies — have vastly overlapping distributions. Are they significantly distinct from one another? An election is just the poll that matters. Are these outcomes separated by more than a poll’s margin of error?

## Do people really vote less for ‘downballot’ races?

I actually hadn’t heard of this word until I made this figure and was trying to understand it. I showed the plot to a friend. “Oh, downballot races, that’s a known thing,” he said.

And indeed, the literature suggests that both ballot order and choice fatigue have an effect on abstention. How does that effect work? Do people simply vote less often as they vote on more things?

I got a hold of the county-wide ballot, and luckily, the first page is almost all propositions, which have similar abstention rates. Then I split it out by where the proposition was on the page.

## More people abstain as the measure moves down the page.

Here, the Y axis is ordered in the same way as the measures on the ballot, with each box representing one of the 3 columns of page 1. You’ll notice that in each one, the closer you are to the bottom, the fewer people are voting for you — it’s a literal, physical down-ballot effect.

With one HUGE GLARING EXCEPTION: in column 3, this effect instead centers around the LEAST abstained vote, i.e., the vote for US President and Vice President.

The further you are from the item of interest, the fewer people vote, i.e., the more abstain.
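That distance pattern is easy to state as a check. Below is a toy illustration with invented abstention rates: sort the ballot positions by distance from the item of interest (say, the presidential race at position 5), and abstention should never decrease as the distance grows.

```python
# Invented abstention rates by ballot position; position 5 is the
# hypothetical "item of interest" (e.g., the presidential race)
interest_pos = 5
abstention_by_pos = {1: 0.12, 2: 0.09, 3: 0.06, 4: 0.03, 5: 0.01,
                     6: 0.03, 7: 0.06, 8: 0.09}

# Sort positions by distance from the item of interest, then check
# that abstention never decreases as the distance grows
by_distance = sorted(abstention_by_pos, key=lambda p: abs(p - interest_pos))
rates = [abstention_by_pos[p] for p in by_distance]
monotone = all(b >= a for a, b in zip(rates, rates[1:]))
print(monotone)  # True for this toy data
```

The real figure shows the same shape: abstention falls toward the presidential race in column 3, then rises again past it.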

# The effects on actual outcomes

“These trends are cute,” you might say. “A neat little experiment. But what’s the consequence?”

As before, the importance of voting is really in representation. If a non-representative group of voters votes, the election doesn’t represent what people want.

Often this is the argument for voter registration drives. “If we get out the vote, and people show up, our propositions will pass.” “Our representatives will be elected.”

So how big of a difference can abstention really make alongside getting the vote out?

# 30% of county-wide races and 20% of municipal races could be flipped by abstainers.

I did a simple analysis, asking just the question, “if everyone who turned out to vote but abstained on an issue voted for the losing side, how often would the outcome reverse?”
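That question reduces to one comparison per race. Here it is as a function; the vote counts in the example are invented, not from any actual King County race.

```python
def flippable(winner_votes, loser_votes, blanks):
    """True if the trailing side plus every blank ballot would
    overtake the winner."""
    return loser_votes + blanks > winner_votes

# A 52k-48k race where 8k returned ballots left that line blank: flippable
print(flippable(52_000, 48_000, 8_000))   # True

# A blowout with few blanks: safe
print(flippable(60_000, 30_000, 5_000))   # False
```

Run over every race, the fraction of `True` results gives the 30% county-wide and 20% municipal figures above.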

Here’s a visualization of this analysis for a few county-wide races — the ones we looked at before, to see how well the vote share distributions per precinct overlapped.

The margin of actual counted votes was within about 50-100k (5-10% of turnout) for these disparate races. But that turnout includes a similar number (or, in the case of the judges, sometimes a drastically higher number) of ballots where that particular box was just left blank.

In each of these cases, the blank votes would be more than enough to reverse the outcome — and in the case of the judges, they could reverse them by a serious landslide.

And these reversals aren’t always on the level of hundreds of thousands of people — in at least one case, the margin between one legislator and the other is literally one actual vote. People’s votes literally do matter, and in a simplistic model, they could have a big impact.

## If their votes could mean so much, what’s making people abstain?

So if votes are so valuable, why aren’t people who actually DO make it to the polls … using them on every ballot measure?

There have been a lot of good pieces of interviewing journalism on why people aren’t voting this year (or at all). But I wanted to look at this question from a different perspective, by examining my dataset rather than asking voters for their personal explanations.

## The wider the margin, the more abstentions (for local races)

I’ve continued to mark races that are flippable (by my test above) here, but I’m focused on something else: the correlation between the number of blank ballots and the winning side’s margin, though only for municipal races. (The county-wide correlation is likely underpowered, so it doesn’t reach significance.)

By Spearman’s rho, there’s a statistically significant relationship indicating that the bigger the vote margin gets, the bigger the number of blank ballot lines gets ($p = 3.04\times10^{-5}$).

Why is this?

Is it because people feel less need to vote on less contentious issues? Is it because only people with strong feelings vote, and others don’t, so the vote looks extremely split?

That much we can’t tell from just this data (that’s where population-wide polling data comes in…), but we can say there is a strong relationship between these two quantities.

So we’ve learned:

1. Voters don’t vote for judges.
2. Voters lose steam as they move down the physical ballot, or away from items they’re interested in voting on.
3. Blank votes are not a trivial number of votes — in 30% of county and 20% of municipal races they could outright flip the outcome.
4. The less competitive a municipal race, the more blank ballots there likely were.

# Turnout is the beginning, but it isn’t the end.

If you’ve had all the time in the world you could possibly need to understand a measure and you say, “I’ve done all the research I want, I can’t possibly choose [candidate | some choice | yes | no ],” abstaining is a valid exercise of your right to vote.

But are voters abstaining for that reason? Would they really feel empowered to make informed decisions about every candidate, including the decision not to decide? Even if 100% of people went to the polls, do they have the opportunity to learn and think through what they’re voting for?

I’d argue the hints of ballot fatigue suggest that isn’t what’s going on.

Getting to the polls is just the first step.

It’s what we do once we get there that fulfills that civic duty and cements the act of participating in a democracy — whether that’s participating by voting, or by abstaining.

And if citizens don’t feel ready for that, even in a world of perfect, 100% turnout, we won’t have a representative voting process.

It isn’t just about turning up to the polls. It isn’t about checking a box. It’s about feeling empowered to make choices, not feeling distracted, frustrated, and throwing up your hands and saying: “Whatever, I guess I just won’t vote on this measure”.

It is not enough to just show up.

# The price of a GRFP, part 2

###### (This post is a continuation of the price of a GRFP, part 1. If you’re not caught up, you can go back and start from the beginning.)

In this next installment about the GRFP, I’ll aim to answer the second question from my first piece: how has NSF demography changed since the Dear Colleague Letter announced that graduate students could no longer apply twice?

## How has this policy change impacted diversity?

One simple way to look at this is to start with the applicants being awarded who come from prestigious undergraduate institutions. As we know from the last post, the # of awards to these institutions is already fairly lopsided. Check out this cumulative distribution function of sorts, with the % of total awards on the Y axis, given to N schools on the X axis:

The top 3 schools receive 10% of awards. The top 10 get about 25% of them. The top 30 get about 50% of all 2000 GRFPs. By the time you’re out at the top 100, you’ve given away nearly 3/4 of your awards, with about 600 schools splitting the last 25% amongst themselves each year.
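The curve itself is just a sorted cumulative sum. A sketch, with invented per-school award counts standing in for the real tallies:

```python
# Hypothetical award counts per school
awards_per_school = [200, 150, 120, 80, 60, 40, 30, 20, 15, 10]

# Sort schools from most- to least-awarded, then accumulate the share
# of all awards held by the top N schools
counts = sorted(awards_per_school, reverse=True)
total = sum(counts)

cumulative_share = []
running = 0
for c in counts:
    running += c
    cumulative_share.append(running / total)

# e.g., the share of awards captured by the top 3 toy schools
print(round(cumulative_share[2], 3))  # 0.648
```

Plotting `cumulative_share` against N gives exactly the “CDF of sorts” described above.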

But how has the share of awards from year to year to those top schools changed?

## 1. Top 10 schools receive over 50 more awards than last year.

Here schools are ranked not by selectivity but by NSF awards received. (Of course, you can tell these two things go hand in hand.) There’s been a big uptick in awards in the first year this policy came into play, and nothing in the previous year’s change foreshadows it.

## 2. In fact, top 10 schools receive quite a few more awards than they have for a while.

Since we have data going back quite a bit, you can extend the plot further back in time:

Beginning in 2011, there’s actually a persistent decrease in the number of awards going to top 10 schools. This levels off around 2013 and remains flat – until the spike in 2018. Also note that this year’s change is about 2 times larger than any change in previous years (and those were all decreases).

## But the claim was that the number of undergraduate applicants would increase, and that they’re more diverse.

These two figures only address undergraduate schools — not the applicants who are still undergraduates. We don’t directly have undergraduate status, but we can infer it imperfectly by looking for applicants who (a) list their undergraduate institution as their current school, or (b) list no current school at all. Anyone in the third category, (c) listing a different current school than their undergraduate institution, is most likely a graduate student; we assume the others are most likely undergrads.

This method is imperfect, but we can use it to understand at least some of the changes in undergraduate awardees over this same time.
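The inference rule fits in a few lines. Field names here are hypothetical stand-ins for the dataset’s columns:

```python
def likely_undergrad(undergrad_school, current_school):
    """Imperfect proxy for 'applicant is still an undergraduate'."""
    # (b) no current school listed: likely still an undergraduate
    if not current_school:
        return True
    # (a) current school matches the undergraduate institution: likely
    # an undergrad; (c) a different school: most likely a grad student
    return current_school == undergrad_school

print(likely_undergrad("State U", None))       # True  (case b)
print(likely_undergrad("State U", "State U"))  # True  (case a)
print(likely_undergrad("State U", "Other U"))  # False (case c)
```

Note the known failure modes: a grad student who stayed at their undergrad institution looks like an undergrad, and a blank field could just be missing data.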

## 3. The same # of undergraduates win the GRFP, but more of them come from top 30 schools.

Here, we see that there’s been a decrease over time in the overall number of likely undergraduate awardees — consistent with a trend that might concern the NSF, and prompt them to put in place a policy like this! But interestingly, the number of undergraduate awardees increases in 2016, not 2017 — and the only increase from before the policy (2017) to after (2018) is in the number of those awards going to undergrads from those top schools.

## At least this policy stops people from reapplying a million times, right?

I was interested in understanding the landscape of NSF reapplicants for obvious reasons — this is the applicant pool most significantly reduced by the policy. Now that graduate students can’t apply twice, who does that really stop from winning the award?

I identified probable reapplicants in the dataset by comparing the NSF awardees 2012-2018 with the honorable mentions from 2011-2017. I then split these up by undergraduate school of origin.
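The matching step is a year-shifted set intersection. The names below are invented; real name-matching needs care with collisions and formatting differences.

```python
# Hypothetical year -> set-of-names tables
awardees = {2018: {"A. Lovelace", "E. Noether", "C. Gauss"}}
honorable_mentions = {2017: {"A. Lovelace", "E. Noether", "S. Kovalevskaya"}}

def probable_reapplicants(year, awardees, hms):
    """Awardees in `year` who were honorable mentions the year before."""
    return awardees.get(year, set()) & hms.get(year - 1, set())

print(sorted(probable_reapplicants(2018, awardees, honorable_mentions)))
# ['A. Lovelace', 'E. Noether']
```

Splitting the resulting names by undergraduate school of origin then gives the comparison in the next figure.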

## 4. Reapplicants from the least selective schools are most harmed.

There’s a precipitous drop of about 60% in reapplicants winning from 2017 to 2018 — indicating that indeed they’re dropping out of the pool! But when you compare reapplicants from top and middling school backgrounds on the left, their 60% award drop pales in comparison to the 80% drop for those from the least selective schools.

There are a few more analyses I want to do, but for now, these are the main takeaways: the new policy is definitely impactful. But it seems to have resulted in a sharp increase in awards to top schools. Does that fulfill its mission of expanding diversity? Well, maybe, if the most diverse students at those top schools are the ones getting those awards.

But even for fully funded students, the NSF isn’t the difference between the dream of science and the reality of it. How many times have people called it a medal, or a feather in a cap? Should its mission be to decorate people who have funding or to change the lives of people who don’t?

Last but not least, if you’re curious about any of these things, I’ve put together a Shiny app that lets you play with your stats – and see how many GRFP slots there are for you.

If you’re interested in adding any other sliders, shoot me a comment or tweet at me – I’ve got a lot of data 🙂

# The price of a GRFP, part 1

I had some downtime a while back (literally; the cluster I work on was down) and so I cracked open some analysis I’ve been doing on the side for a while. I like to switch off analyses and work on some side projects to keep me working, but not burnt out, and so I picked up a dataset I’ve been working on for a little while: the NSF GRFP awardees.

### A dear colleague letter

About 2 years ago, the NSF made a policy change announcement, summarized here (capitalization & other emphasis mine):

NSF will limit graduate students to only one application to the GRFP, submitted either in the first year OR in the second year of graduate school. … GRFP continues to identify and to inspire the diverse scientists and engineers of the future, and especially encourages women, members of underrepresented minority groups, persons with disabilities, and veterans to apply…. This is a more diverse population than admitted graduate students.

Lots of great commentaries have touched on the diversity challenges of the GRFP and the potential effects of the policy change proposed. Others have even looked into some of the representation details themselves in their commentaries.

But I want to ask two questions in a data-driven way this year, the first year that the rule is fully in place**:

and

## How has this policy change impacted diversity?

###### **While it was announced a little while back, this is the first year that the rule is fully in effect: last year, students who’d applied as 1st years were still allowed to take the 2nd shot they’d expected to get.

Note that the NSF’s mission isn’t to award the absolute best scientists regardless of diversity. It isn’t an award sheerly on academic merit, on the number of Science and Nature and Cell papers you’ve got (or even for stellar bioRxiv preprints). It’s aiming to identify and inspire a diverse future.

The NSF GRFP is not an R01. It doesn’t fund high-risk experiments and expect grand returns. For 2000 people a year, give or take a few, it provides a chance to do research without financial pressure, as it is — with a scant living wage and freedom from burden.

But is it providing those resources to people who genuinely don’t have them? Is it funding fully-funded scientists with star-studded credentials or people who had to take student loans through college?

I downloaded data about 28,106 NSF recipients from 2011-2017, and matched the 700+ institutions they hail from with the College Score Card, a resource from the US Department of Education which has extremely comprehensive** information on accredited colleges in the United States.

###### **When I say this I’m not kidding. They’ve got the percentage of Pell Grant recipients who died within 2 years of starting at the university. All kinds of intensely random stuff. It’s really cool.

And I used this to study only the undergraduate institutions of NSF GRFP recipients. I’d theorize that properties of these reflect their opportunities prior to the beginning of their scientific career as graduate students. While I don’t know the story of any individual NSF recipient, I can say a lot about how diverse their undergraduate school’s populations are.

## 1. The most expensive undergraduate schools have an extreme excess of recipients.

The median school with any undergraduates who win the NSF has 2 winners in any given year. However, if you stratify schools by cost, the most expensive schools have over a fourfold excess of winners. The cheapest schools have extremely few. These correspond to yearly tuition differences of nearly $10k at private schools and $5k at public schools – fairly large differences, compounded over 4 years. It could cost an undergraduate $40k extra to attend a school that has any shot at getting them an award. How are they paying for that?

## 2. The schools GRFP undergrads go to have smaller Pell grant populations.

The Pell grant is a subsidy that the US Government provides for students who need it in order to afford tuition. Pell grants are limited to first-time bachelor’s degree recipients in genuine financial need. And notably, although quite a few students get them across the entire College Score Card dataset, the proportion needing them at schools that GRFP recipients come from is strikingly lower, for both public and private schools.

### To get a sense of the starkest differences, I look at schools which had even 1 undergrad receive an NSF from 2011 through 2017.

## 3. There is a difference of nearly $30k between family incomes at schools with and without even a single NSF recipient.

Each College Score Card school has a reported family income for dependent students (i.e., students claimed as dependents by their parents). In both public and private schools, the difference in mean family income between schools with even one NSF recipient (to say nothing of those with outlandishly many) and schools with none is $30k – ironically, about the size of the extra graduate stipend the recipients are about to win.

## 4. There are 16% more first-generation undergraduate students at schools with no NSF GRFP undergraduate recipients.

At both public and private schools, according to the College Score Card, there’s a difference of 12-14% in the number of first generation college students in the school population between the schools whose undergrads earned even 1 award during 2011-2017 and those who earned none at all.

## “But Natalie, what if the students at these high-income schools who are winning awards are there on scholarships, and your plots don’t represent them?”

Sure, that’s obviously possible, and it’s a big caveat. I think there are an incredible number of deserving, hardworking people of diverse backgrounds at top institutions. Certainly everyone who gets the GRFP, and many people who don’t, deserve it.

So really we want to ask: Should we consider the NSF GRFP a success if it by and large gives resources to schools that already have the resources to recruit and inspire diversity? What about the incredibly deserving people at other institutions who could truly be inspired by the opportunity to attend graduate school?

## “These are honestly just a few measures of economic opportunity and equality. I’d rather see…”

I like your style! To placate you, check out this Shiny app I built where you can look at a lot more about the schools and compare award winners and non-winners.

## “So then what are you saying?”

This part of the mission statement has stuck with me throughout this analysis:

GRFP continues to identify and to inspire the diverse scientists and engineers of the future, and especially encourages women, members of underrepresented minority groups, persons with disabilities, and veterans to apply.

This is a great and noble goal. But do I buy that the entire pool of outstanding diverse future scientists is hiding inside the same few halls of learning? No. There are graduate institutions that win an extreme excess of GRFPs where, as I know myself (since I trained at one!), students are already fully funded, mostly on RAships.

There, the NSF GRFP becomes a feather in a cap, not the guarantor of stability it could be in another circumstance.

So does that fulfill its mission?

---

But let’s get back to the bigger question. We may not be surprised to find that the NSF GRFP is not awarded to the most diverse, most needy group of future inspired scientists (regardless of its mission).

But how has the new policy affected that?

For more on that, stay tuned for part 2, as I crunch the numbers for the 2018 NSF GRFP relative to the 2011-2017 classes.

# Interactive data visualizations

As a scientist when I read and review, I don’t feel satisfied seeing the visuals in print. I pull up the paper on my computer and I have the urge to push an axis, to add a variable, to do my own discovery.

Of course, papers and publishing have a place in our scientific community and discourse that’s extremely important. They tell stories. But we’re all scientists and reading stories gives me (us?) the urge to make stories.

So I wanted to reflect that in my visuals, and give people the opportunity to see more of the story than you can share with a fixed image.

Were this a poster, I wouldn’t have that luxury. But this is the internet, after all. So try out some of my interactive visuals here, and write your own stories.

# ASHG2017 Question Portal

###### Looking for the data entry portal? It has closed for ASHG 2017; contact me about data entry or adapting the portal for your own use via Twitter (@NatalieTelis), email (first name dot last name at gmail dot com), or the form on my About page.

Recently I’ve had the opportunity to present my work on a project asking questions about question-answer behavior at conferences. I’ve been asking two big questions — who’s here, and who’s asking? — but I also get asked a lot of questions about the project, so I wanted to write more about its history.

### The first question

The beginning was simple. The first conference I ever went to, I participated. I asked questions. And at the end of the day, I realized I was the only woman who had.

“That’s weird, isn’t it?” I remember remarking to a friend. She had won the same fellowship I had, the one that sent me to the conference. “I don’t know,” she told me. “It seems nerve-wracking to ask a question.”

Not to me, I thought. I saw questions as a way of learning. I wrote down questions at every talk I went to, though I didn’t ask them all – sometimes they were answered, and sometimes I decided I wasn’t as interested in knowing. It was a device to get myself to engage and participate.

But was I alone?

### Being in the room

Ultimately, what I was asking was whether gendered participation differed from representation. In essence, was the population of people participating in the meeting (in this one, very simple way) a random sample of the population there?

It wasn’t simple to decide if the answer was yes or no. My background is in math, I reasoned, and sometimes I was the only woman. But was that still true here? I couldn’t answer the question quantitatively without knowing more.

And of course I wanted a well-powered quantitative answer. Without a quantitative answer, I reasoned, I’d miss out on being able to measure the effect. And that meant I couldn’t understand whether it was present, or more interestingly, whether it was perturbable. If participation and representation weren’t the same along this axis, could I make them the same?

Quantification meant I needed data. And data was simple to come by. When you sit at a meeting, you observe every question — as long as you aren’t always rushing for coffee between talks. Coffee safely in hand, I was free to record whatever I wanted.

So I started recording the answer to my first question:

###### ** I’ll say here the biggest caveat of this work is that, without asking people to identify themselves and gender themselves (problematic for many reasons!), I’m limited to the constructs society leaves us to make assumptions about speakers and askers. These characteristics (like names) are flawed for obvious reasons (as is any binary simplification of a spectrum). Although they provide statistical power at scale, they don’t capture any individual person’s truth.

But I also recorded auxiliary information I was interested in. For example, was the question-asker a moderator? (Moderators are supposed to ask questions, but I reasoned their questions could affect non-moderator questions, so I recorded them too.) And what did the question-asker actually say?

But in order to analyze the information I was collecting, I needed to know more about the audience.

And that’s different from meeting to meeting, so take note – from here on out, I’m talking about the American Society of Human Genetics meeting.

### Knock knock; who’s there?

###### Figure 2: Number of abstracts from each state in the Bioinformatics category at ASHG2014-2016. Remarkably, the information we collect to know “who’s there” includes abstracts and affiliations, and in addition to learning gender proportions, this information gives us a surprising amount of information about fields all across the US.

To know who was there, I made a big but simple assumption: people presenting talks or posters were definitely there.

This makes sense on principle: since they’ll obviously present, they’re guaranteed to be at the meeting at some point. Unlike their co-authors or their last author, who might not attend as many talks, I reasoned that the authors of abstracts and posters were a good representation of what an audience might look like at a talk in that same field. They’ll literally be there to attend talks, and so might actually ask questions; most importantly, they represent the gender ratios across their field.

### There is no one answer to “who’s there.”

When looking across the poster sessions, many of them differ significantly from the overall proportion of women, which is around 45-49% (depending on the year). But then we have fields like bioinformatics, clocking in at 28%; at the opposite end of the spectrum, genetic counseling, ELSI, education, and health services research come in at 69% female.

But the fact that the variation was so broad meant I could ask who was asking questions in a few contexts. I collected data in person, in bioinformatics and statistical genetics sessions (my subfields). And my collaborator Emily Glassberg joined the project and collected data from invited sessions across those specialties.

We set our expectation for questions based on the women in our estimated audience. At an ELSI session, we’d expect 69% of questions to come from women. At a bioinformatics only session, we’d expect 30% of questions to come from women. At a statistical genetics session, we’d expect 40% of questions to come from women.
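The expectation itself is one multiplication per session: the session field’s female fraction times the number of questions asked. The question counts below are invented for illustration; the fractions come from the text.

```python
# (female fraction of the field, hypothetical number of questions asked)
sessions = {
    "ELSI":                 (0.69, 40),
    "Bioinformatics":       (0.30, 50),
    "Statistical genetics": (0.40, 30),
}

# Expected questions from women under the "audience looks like the
# poster presenters" assumption
expected = {name: frac * n for name, (frac, n) in sessions.items()}

for name, e in expected.items():
    print(f"{name}: expect {e:.1f} questions from women")
```

Comparing those expected counts to the observed counts per session is the test behind the next figure.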

But we were wrong.

### Present, but silent

We found that overall, women ask 2/3rds of the questions we’d expect. But we were able to ask even more nuanced questions:

Did women ask fewer questions when they were underrepresented? They did, as demonstrated by the Stat Gen / Bioinformatics bar.

But it wasn’t enough to increase representation. They still asked fewer questions than expected even in the most female-biased sessions.

And we tested our assumptions about audience by studying plenary talks. The plenaries have no competing ASHG events, so theoretically, the 45% – 49% of women attending each year should all have the opportunity to attend the plenary talks. This meant we could be very certain the audience was nearly 50% female.

And yet, across each category, we found again and again: women ask fewer questions.

### More nuance, more power

But we had a lot more than just who asked a question, and during what talk. We also knew a lot about the speakers. And as always, looking for the nuance paid off. We found, curiously, that women preferred to ask questions to women.

And men also preferred to ask questions to men.

###### Figure 5: Relative to the overall proportion of female speakers in our dataset (0.4), women prefer to ask questions to women, and men prefer to ask questions to men.

This difference is significant after controlling for how many women and men (speakers) were available for women and men (askers) to ask questions to.
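One way to phrase that control: compare the share of each asker group’s questions that went to female speakers against the overall share of female speakers (0.4 in our dataset). The question counts below are invented; only the 0.4 baseline is from the data.

```python
overall_female_speakers = 0.40

# Hypothetical question counts by (asker gender, speaker gender)
counts = {("F", "F"): 120, ("F", "M"): 130,
          ("M", "F"): 150, ("M", "M"): 300}

# Fraction of each asker group's questions directed at female speakers
share_to_female = {}
for asker in ("F", "M"):
    to_f = counts[(asker, "F")]
    share_to_female[asker] = to_f / (to_f + counts[(asker, "M")])

print(share_to_female["F"] > overall_female_speakers)  # True: 0.48 > 0.40
print(share_to_female["M"] < overall_female_speakers)  # True: 0.33 < 0.40
```

When both groups’ shares sit on opposite sides of the availability baseline, as in this toy example, that is the same-gender preference pattern in Figure 5.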

There are other trends we find – like trends in word use, and trends in follow-up questions – but the most important takeaway is that there’s a lot of nuance to participation and representation. It’s not as simple as saying “50% of the people here are women, we’re done.” And it might not even be as simple as saying women asked questions — when they might not be asking them to the same network of scientists.

One of our aims was quantification of the trend. But we also wanted to understand — how malleable is this trend?

I had the opportunity to present these results at ASHG 2017. And both to encourage a kind of democratic science that I really believe in (more on this later), and to measure the effects of a major presentation as intervention, we developed a crowd-sourcing platform to collect and record questions.

The platform was really successful — more successful than we were. Against the six hundred or so questions we had collected ourselves over 3 years, we received over 1,000 questions covering nearly 50% of the talks at ASHG 2017, from almost a hundred different participants who each recorded anywhere from one to almost one hundred questions.

###### Figure 6: All the crowdsourced questions recorded at ASHG. Each individual recorder has their own line across those questions. These questions cover just under 50% of the talks presented at ASHG.

This gives us incredible power to ask questions even wider-ranging than our initial ones. Of course, the data isn’t perfect. But with so much of it, we might have the power to look at even more:

1. Questions were overwhelmingly recorded by raters staying in sessions, and we know the order of talks, so we can estimate how long each talk lasted: from when the previous talk’s questions end to when the next talk’s questions start. Do men present longer talks than women?
2. This information also lets us ask whether men ask longer questions than women do, by looking at the gaps between questions.
3. How do the ratios change after the plenary? Do they change at all?

### Participation isn’t representation, and vice versa

We’re continuing to analyze the data from ASHG 2017 to understand how much impact talking about these trends and crowdsourcing data collection could have.

But what became really clear to us is that quantitatively, the group of people represented at a meeting is not always the same as the group of scientists participating at that meeting.

It is mathematically accurate to say that ASHG is nearly 50% female. But that’s not a sufficiently nuanced quantification of ASHG diversity. Overrepresentation in one field doesn’t change underrepresentation in another field.

And even given the context of representation, we can tell that the people asking questions at a meeting aren’t the same** as the people attending the talks.

###### **We’ve thought about how they may differ and some of our detailed methods can be found here at our FAQ.

So a more nuanced quantification of demographics gives us the power to dig past summary, deeper into the statistics of representation. And along the way we find, regardless of the context: participation isn’t representation.

Which is great for us. We get to keep trying to ask questions about questions, and drilling down into the quantitative, measurable mechanics of these phenomena.

Want to know the answers? Well… stay tuned. =)

# Online Methods (e.g., an FAQ)

There’s a wealth of incredibly interesting questions about questions, as you can imagine! We figured we’d take some of the most common ones we get, and condense them down into one big FAQ.

Do you record/account for question seniority?

The principle underlying this question is as follows: “Who’s in the room” varies along many axes outside of gender. These include things like academic seniority. Perhaps the population of question-askers is actually a smaller subset of who is literally in the room, along such an axis. For instance, maybe only faculty ask questions.
This is challenging for us to literally evaluate at ASHG on a per-question basis, as this would require identifying question-askers.
However, in smaller study environments, we’ve been able to approximate this by stratifying “who’s in the room” along the axis of seniority. For instance, at the Biology of Genomes meeting, the abstract booklet contains PhD / non-PhD status. This means it’s possible to separate out faculty and postdocs, and look at both of those attendee proportions. As you can see, they are different (PhDs are less female), but not different enough to explain the observed effect.
###### Figure: Question-askers (left) at Biology of Genomes 2015 (total questions: 147), compared to proportions of attendees (right). BoG 2015 is chosen because it predates any publicization of our data collection. Note on the right that non-PhD-holding attendees are somewhat more female than PhD-holding attendees; however, this difference is substantially smaller than the difference required to explain the proportion of female question-askers.

Is your gender classifier accurate for names from other countries?

In short, yes, as much as possible. We use genderizer, available for both Python and R, which draws on hundreds of thousands of names from almost 100 countries. As a result, our classification is as complete as possible given this information, and we achieve a classification rate of about 70% (see below), which we use to estimate the proportions of women and men present.

How can you be sure your estimated proportions are correct?

Of course, we couldn’t be certain unless we had a perfect ground truth. But luckily, we’re close! Since 2016, ASHG has internally allowed people to report gender on registration. We compare the inferred vs. reported genders for 2016 and 2017, and see that our pipeline estimates are extremely similar.

How are the people who ask questions chosen? Could the people choosing them be biased?

This question is undoubtedly informed by the large body of literature confirming that teachers in the classroom spend more time speaking to and interacting with male students. Correspondingly, they also call on female students less and interrupt them more. (This is mostly the work of Sadker and Sadker, and is described well in either David Sadker’s book or this broader textbook.)

However, ASHG is remarkably egalitarian in this respect, as there are self-selecting microphone lines. Admittedly, not every session leaves time for every person to ask all the questions they want. (However, we record these sessions.) We also record the positioning of microphones and of speakers at microphones. Since the lines are self-selecting, there’s no need for a moderator or any other potentially biased figure to be choosing hands amongst a crowd.

At the Biology of Genomes meeting, where we also collect data, the microphones are held by individuals and move. In this different scheme (which might be slightly more biased, as the individual with the microphone has to move towards someone soliciting it) we still record a similar magnitude of effect [binom(16,147,0.35), p=2e-11] prior to any intervention on our part.
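That bracketed test is an exact one-sided binomial test (16 questions from women out of 147, against an expected female proportion of 0.35), and can be reproduced with, for example, SciPy — our choice of library here, not necessarily the original analysis code:

```python
from scipy.stats import binomtest

# 16 of 147 questions came from women, against an expected proportion of 0.35.
result = binomtest(k=16, n=147, p=0.35, alternative="less")
print(result.pvalue)  # on the order of the quoted p = 2e-11
```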

How do you figure out that men ask men, and women ask women, if not all speakers and audiences are in the same room at the same time?

In essence, this question gets at the following idea: what if most women are in rooms with mostly female speakers, such as ELSI sessions, and most men are in rooms with mostly male speakers, such as Bioinformatics sessions?
Wouldn’t this create a (not-perfectly-symmetric) bias for women to ask questions towards women, and men to ask questions towards men?
Yes, that’s absolutely right, it would! To account for this, we wanted to test for consistent, within-category bias. In essence, imagine a contingency table for each category with frequencies, set up like this:

| asker \ speaker | Male | Female |
| --- | --- | --- |
| **Male** | p | 1-p |
| **Female** | 1-q | q |

What you’d expect, regardless of the session, is to see p and q (the male-to-male and female-to-female question frequencies) carrying a little more weight than (1-p) and (1-q).

In particular, you can measure this by looking at the difference pq – (1-q)(1-p): the difference between the products of the frequencies of same-gender questions and of cross-gender questions. (Algebraically, this simplifies to p + q - 1.) Under the null, the expectation of this difference is 0. A difference greater than zero would suggest there are more same-to-same questions.

To test this, we take the questions within each invited sub-category and re-assign them. Say you have 20 questions: we randomly assign each one to come from a female asker to a female speaker, or female to male, and so on, and we do 10,000 such permutations for each sub-category. From this, we calculate the statistic for each permuted set and look at the distribution of those statistics.
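For concreteness, here’s a minimal sketch of that permutation scheme in Python. The counts here are toy data chosen to show a same-gender excess, not our real dataset, and the function names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def same_minus_cross(askers, speakers):
    """pq - (1-q)(1-p): same-gender minus cross-gender frequency products."""
    askers, speakers = np.asarray(askers), np.asarray(speakers)
    p = np.mean(speakers[askers == "M"] == "M")  # male-to-male frequency
    q = np.mean(speakers[askers == "F"] == "F")  # female-to-female frequency
    return p * q - (1 - p) * (1 - q)

# Toy sub-category of 20 questions with a same-gender excess.
askers   = ["M"] * 15 + ["F"] * 5
speakers = ["M"] * 12 + ["F"] * 3 + ["F"] * 4 + ["M"] * 1
observed = same_minus_cross(askers, speakers)

# Null: re-assign speaker genders to questions at random, 10,000 times.
null = [same_minus_cross(askers, rng.permutation(speakers))
        for _ in range(10_000)]
p_value = np.mean(np.asarray(null) >= observed)
```

On this toy input the observed statistic is 0.6, well into the right tail of the permuted distribution.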

Of course, we calculate the same statistic for our own dataset, and as you can see, there’s a significant skew observed in our data (pink) over the permuted sets (black).

So we subsequently conclude there’s a session-stratification-controlled significant bias towards female-to-female and male-to-male (same-gender) questions, as opposed to female-to-male and male-to-female (cross-gender) questions (p=8.1e-5).

We verify the accuracy of this statistic by performing a similar test, not on the frequencies but on the raw contingency tables of counts of questions in each category. We use the Mantel-Haenszel test (say that one three times, fast, out loud!) to look at the combined odds ratio, and again, across sessions, we see the same consistent trend (p=0.004).
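For readers who want the mechanics, the (uncorrected) Cochran-Mantel-Haenszel statistic can be sketched in a few lines of numpy. The per-session counts below are made up for illustration; this is not our analysis code:

```python
import numpy as np
from scipy.stats import chi2

def cmh_pvalue(tables):
    """CMH test across 2x2 tables (one per session), no continuity correction.
    Rows: asker gender (F, M); columns: speaker gender (F, M)."""
    tables = np.asarray(tables, dtype=float)
    a = tables[:, 0, 0]                    # female-to-female counts
    row1 = tables[:, 0, :].sum(axis=1)     # questions from women
    col1 = tables[:, :, 0].sum(axis=1)     # questions to women
    n = tables.sum(axis=(1, 2))
    expected = row1 * col1 / n
    var = row1 * (n - row1) * col1 * (n - col1) / (n ** 2 * (n - 1))
    stat = (a.sum() - expected.sum()) ** 2 / var.sum()
    return chi2.sf(stat, df=1)

# Two toy sessions, each with a same-gender excess.
sessions = [
    [[8, 2],    # F->F, F->M
     [3, 7]],   # M->F, M->M
    [[6, 4],
     [2, 8]],
]
p = cmh_pvalue(sessions)
```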

Since you’re crowdsourcing ASHG2017 collection, how do you know whether people are recording the same talks?

Great question! (Note: this answer pertains ONLY to the new crowdsourcing dataset). Participate at our crowdsourcing portal!

Each device that logs data into our database is anonymized and recorded (and controlled by a human, via CAPTCHA). This is how we build our question-entry leaderboard.

But wait. How do you match all the different recordings together for one talk?

(Note: this answer pertains ONLY to the new crowdsourcing dataset). Our entry-tracking means we can actually do a kind of string-alignment — something many of us geneticists should be familiar with — to ensure we’re matching questions correctly.

For example, imagine that user 1 records the whole question session, user 2 leaves early for another talk and stops recording partway through, and user 3 comes in midway and only catches the end. As a result, you have something like this:

```
True String   M M F M M F F M M M
User 1        M M F M M F F M M M
User 2        M M F M
User 3                  F F M M M
```

You can even see that User 2 and User 3 don’t overlap at all!
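Once the recordings are placed at their aligned offsets, deriving a consensus reduces to a per-position majority vote over whichever recorders covered that position. A minimal Python sketch on toy strings, using “-” for stretches a recorder missed (the real pipeline uses alignment packages, as described below):

```python
from collections import Counter

def consensus(aligned):
    """Majority vote per position; '-' marks positions a recorder missed."""
    length = max(len(s) for s in aligned)
    padded = [s.ljust(length, "-") for s in aligned]
    result = []
    for column in zip(*padded):
        votes = Counter(c for c in column if c != "-")
        result.append(votes.most_common(1)[0][0])
    return "".join(result)

aligned = [
    "MMFMMFFMMM",   # user 1: full session
    "MMFM------",   # user 2: left early
    "-----FFMMM",   # user 3: came in midway
]
print(consensus(aligned))  # MMFMMFFMMM
```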

However, in computational biology, we’ve developed a lot of methods to align strings and derive a consensus. And in fact, that’s actually what we do! We borrow standard Bioconductor packages to do a multiple sequence alignment and derive a “consensus” question string. As we continue