Increasing participation of women in computing fields: A data science perspective

We live in an age when everyone understands the importance and pervasiveness of data: the pictures we take, the activities we track, the transactions we make, and the websites we visit. Data can often be contentious in terms of ownership and access. Today, businesses must deal with the availability, pervasiveness, control, and usage of data in their day-to-day operations. Data and its related concerns now touch everyone’s life in one way or another. This calls for a sea change in the way we view technology jobs.

As an increasing number of jobs focus on data and require data analysis capabilities, we need to train a new data-capable workforce to respond to this demand. There is a need to train a diverse population to fill these jobs, and create an inclusive and equitable playing field that nurtures and retains that population.

As two women with careers that intersect with the burgeoning landscape of data science, we are frequently among the few women in the room. This experience hints at some of the challenges women face in data science, and in computing fields more broadly. Some of the challenges we face as women are not unique to computing. For example, women tend to be interrupted more than men; women are viewed as less credible, less expert, and less competent than men; and despite speaking less in mixed-gender settings, women are perceived as dominating the conversation.

However, many of these challenges may be more pronounced in computing fields. Additionally, stereotypes (both explicit and implicit) exist regarding women’s abilities and perceptions of their belonging in computing fields, and these stereotypes have been demonstrated to impair performance in testing situations. Some of these challenges are likely shared across multiple underrepresented minority groups. Given its inherent connections to multiple disciplines, the field of data science appears on the surface to be more diverse than computing; based on our experiences attending and participating in data science conferences and workshops, we believe this is a misconception. Despite the perceived diversity in data science, all-male panels and non-diverse program committees, organizing committees, and journal editorial boards are unfortunately still prevalent.

Given the expansion happening in the field of data science, and the multidisciplinary views needed to tackle data science problems, the issue of gender diversity, and of diversity more broadly, is becoming an urgent problem.

Why should we care about diversity, you might ask? Diversity is important not only because it is socially just, but also because there is a business case for it. According to some reports (e.g., McKinsey 2015), diversity in executive management teams has been associated with higher profit margins.

We need not only people who can discover and execute data science algorithms, but also people who can think of data in different ways and manage data enterprises differently. One possibility is that a diverse set of views can lead to innovative new ideas and products. One example is the company ThirdLove, which was named in Forbes' Next Billion-Dollar Startups List for 2018 and incorporates personal survey data into fit recommendations for women’s bras. Another is the collaboration between ClickMedix and DC Greens, which created a mobile platform for low-income residents to redeem vouchers for fruits and vegetables at farmers markets. That project is also helping farmers set up shop in locations where data patterns reveal food deserts, increasing access to healthy produce.

In addition to such examples, which unfortunately are rare, there is a great need for people to grapple with the deeper questions and processes that need to be resolved prior to the data crunching. Given the complexity of these issues, greater diversity in teams can only be beneficial to solving some of the pernicious problems arising in data science.

The pervasiveness of data has also made data science an area of study that cuts across many silos. Most data studies require data science plus the “X” factor, where X comes from any of the fields where the data are being generated and where research questions may arise. Indeed, many CS+X programs are emerging (e.g., at Stanford University; at the University of Illinois, Urbana-Champaign; and at Caltech), not just in education but also in research. This opens opportunities for an influx of women from other areas that have traditionally been more diverse than computing and information systems.

Opening up these pathways for cross-pollination between computing and other fields can broaden participation in data science. This could encourage more multidisciplinary, humanistic, and even empathetic thinking, where we start to view data not just as a resource to test an algorithm, or as gold nuggets, or as the new oil, but also as people, because data increasingly capture deeply personal aspects of real individuals. There is also an opportunity for data scientists to work in more team-based settings, where the impact of the whole team is greater than the sum of its parts. It is well known that women are valuable and important contributors to team-based environments. Although it may not be possible to discern whether individual characteristics play a role, some suggest that female scientists are more likely to engage in interdisciplinary research projects.

A call to action

To foster broader engagement of women in data science, we still need to fight the stereotypes we see in everyday society, and particularly in computing fields, but we also have to open up paths for cross-pollination between computing and the other scientific fields driving data science studies. Some such fields (for example, biology, the humanities, and the arts), which have historically been more diverse than computing fields, may be inherently more conducive to broadening the participation of women and minorities. Areas of study such as CS+X, where X might refer to an area within the humanities, to biology, or to human-centered computing, should be leveraged to expand the horizon of data science work.

Broad systemic change through peer mentoring and other supportive approaches should be explored as well. Lacking a sense of belonging can be harmful. Peer support through local and even virtual networks may mitigate some of these feelings, in addition to providing instrumental support in tackling technical and experiential challenges. Systemic changes in culture need to be made at each step of the way, starting in infancy and early childhood, when gender socialization and stereotypes about ability begin to be instilled, such that by age 6, girls believe they are intellectually inferior to boys. Another example, from Girls Who Code, is that roughly 74% of girls express interest in STEM fields in middle school, but only about 0.4% take computer science courses in high school. Other evidence supports this: for example, an NSF study indicates that female students enroll in AP computer science courses at a much lower rate (19%) than male students (81%). The gap carries through to the workforce, where only 26% of data professionals are women. Thus, culture change is also needed during secondary and post-secondary education.

At the faculty level, the trends indicate a more difficult scenario for women and minorities than for individuals from majority groups. Although women and minorities join in similar numbers at the assistant professor rank, they are often not promoted to full professor at similar rates. Women and minority faculty often experience microaggressions throughout their careers, even after tenure. In addition, the intersection of these factors creates complex challenges for individuals who belong to more than one minority group (e.g., women of color). These issues need to be addressed through faculty support and retention efforts.

Multidisciplinary data science teams can be leveraged as a new tool to expand participation. Departmental units and workplaces can increase diversity and provide environments conducive to individual and departmental growth by creating partnerships and networks across disciplines that provide support structures. Since data science is increasingly used as a tool to extract meaning from data in multiple scientific disciplines, teams can be built that capture diversity from these partnering disciplines to solve not only data science challenges but also diversity challenges. In these teams, women and underrepresented minorities who join from various scientific domains can share their expertise, and in the process, perhaps cultivate additional interests and strengths in data science to bring back to their respective disciplines.

Recommended further reading

  1. A Growing Demand for Data Expertise, February 06, 2018, Harvard Business Analytics Program Blog

  2. 5 Ways To Support Diversity in Data Science, October 5, 2017, Meredith M. Lee

  3. Data Science Education Lags Behind in Diversity, September 25, 2017, General Assembly

  4. Keeping Data Science Broad, Negotiating the Digital and Data Divide Among Higher Education Institutions, January 16, 2018, Renata Rawlings-Goss et al.

  5. National Academies of Sciences Engineering & Medicine, Data Science for Undergraduates: Opportunities and Options. Washington D.C.: The National Academies Press, 2018.

  6. National Science Foundation, “Committee on Equal Opportunities in Science and Engineering (CEOSE) Biennial Report to Congress, Broadening Participation in America’s STEM Workforce,” Arlington, VA, 2013.

  7. National Academies of Sciences, Engineering, and Medicine. 2018. Sexual Harassment of Women: Climate, Culture, and Consequences in Academic Sciences, Engineering, and Medicine. Washington, DC: The National Academies Press.  

Note: The two authors contributed equally to this piece. 