US Social Security: Data Availability and Ethical Use
Did you know that Social Security data is used for much more than computing benefits?
Social Security datasets are now in demand from more kinds of organizations, and for more purposes, than ever before. Social Security was originally intended only to provide benefits for retired, unemployed, and disadvantaged Americans. Over time its coverage has expanded, for example to include health insurance and its trust fund, which explains how universal its use has become.
Gradually, the Social Security number became the most widely adopted identification number in the United States, used by government agencies and the private sector alike. The significance of this 9-digit number carries over to the data connected to it, which is held by the Social Security Administration (SSA). That data covers not only Social Security beneficiaries but also their employers, their income levels, and much more.
Besides computing benefit amounts for citizens eligible for social insurance programs, Social Security administrative data is relevant to private businesses, governmental and non-governmental organizations, and research labs that conduct studies for policy evaluation, innovation, and development. The datasets provided by the SSA fall into 3 categories: Public, Restricted Public, and Non-Public.
Despite the turbulent events that the United States has witnessed at both the national and global level over the previous 5 years, the Social Security Administration has managed to provide a growing number of datasets. The nature and proportion of these datasets, however, vary over time.
Figure 1: Social Security datasets’ percentages (2015-2020)
Figure 1 shows a clear pattern: the Public datasets’ share of all SSA datasets rose from 41.01% in 2015 to 71.51% in 2020, while the Non-Public and Restricted Public shares each fell to about half of their 2015 values over the same period.
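As a quick sanity check on these shares, the arithmetic can be sketched as follows. This is a minimal illustration that uses only the two Public percentages quoted above; the combined remainder for the other two categories is derived here by subtraction and is not a figure published by the SSA.

```python
# Public share of all SSA datasets, in percent, as quoted in the text.
public_2015 = 41.01
public_2020 = 71.51


def remainder_share(public_pct):
    """Combined share of Restricted Public and Non-Public datasets (derived)."""
    return round(100.0 - public_pct, 2)


rest_2015 = remainder_share(public_2015)
rest_2020 = remainder_share(public_2020)

# Gain of the Public share, in percentage points.
public_gain_points = round(public_2020 - public_2015, 2)

# How large the 2020 remainder is relative to the 2015 remainder.
rest_ratio = round(rest_2020 / rest_2015, 3)

print(f"Public share gain: {public_gain_points} percentage points")
print(f"Restricted + Non-Public: {rest_2015}% -> {rest_2020}%")
print(f"2020 remainder relative to 2015: {rest_ratio}")
```

The combined remainder comes out a little under half of its 2015 value, which is consistent with the "about half" reading of the chart.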
Whatever the nature of the events that occurred during these 6 fiscal years, they leave room for research on a wide range of topics. R&D professionals tend to link Social Security data with their own survey data to conduct their studies, because SSA’s public datasets cover a restricted set of variables.
Restricted public and non-public datasets, on the other hand, cover more ground in terms of features, which is why they are not open to the general public. Figure 1 shows these two categories at their largest shares, 50.73% and 8.26% respectively, of a total of 10,791 datasets in 2015.
The steady change in these numbers raises a question: what has happened in the US since 2016 that could have driven this change or been affected by it?
Ethical Data Use.
Figure 2: Number of SSA Public Datasets (2016-2020)
The chart shows a recurring pattern in the number of public datasets from 2016 to 2020, specifically in the 4th quarter of every year. It also highlights a few events that were alarming for American citizens and that could plausibly explain the observed pattern.
The SSA minimizes data politicization during presidential elections
SSA restricts over 100 datasets prior to Trump’s presidential election in 2016
One month before the 2016 presidential election won by Donald Trump, the number of publicly released datasets dropped from 969 to 811. That is a decrease of 158 datasets, which were either restricted to a limited set of people or organizations or classified as non-public, in order to minimize the risk of data exploitation and politicization for election polls, or else disposed of entirely by the Social Security Administration.
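To put the size of this drop in perspective, here is a minimal sketch of the calculation using the two counts quoted above. The helper name drop_stats is hypothetical and chosen only for illustration.

```python
def drop_stats(before, after):
    """Return the absolute drop and the drop as a percentage of the starting count."""
    absolute = before - after
    percent = absolute / before * 100
    return absolute, round(percent, 2)


# Public dataset counts before and after the drop, as quoted in the text.
absolute_drop, percent_drop = drop_stats(969, 811)

print(f"Datasets no longer public: {absolute_drop}")
print(f"Drop relative to the starting count: {percent_drop}%")
```

Expressing the drop as a share of the starting count makes it easier to compare with the later drops, which start from a different number of public datasets.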
SSA shows a decrease of 52 public datasets prior to Biden’s election in 2020
The drop in the number of public Social Security datasets recurred one month before the 2020 presidential election won by Joe Biden, with a decrease of 52 datasets from September to November 2020. Between the two elections, however, these drops gradually shrank to fewer than 100 datasets, especially after the Cambridge Analytica data scandal of March 2018.
Documents from Cambridge Analytica in London revealed that the firm improperly obtained and used data from up to 87 million Facebook user profiles in its work for Donald Trump’s presidential campaign, where the scraped private Facebook data was used to build voter profiles and assist the candidate in the 2016 US presidential election. This data privacy crisis showed that data exploitation is not limited to commercial purposes and that it can target American citizens without their consent.
Social Security data contributes to making the workplace safer
152 public datasets were restricted during the “Me Too” Movement in the US
The same phenomenon occurred, with a drop rate of 13.18%, at the beginning of the US “Me Too” movement, from September to October 2017. “Me Too” is a global movement that condemns sexual harassment and gives a voice to its victims.
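The text gives this drop both as a count (152 datasets) and as a rate (13.18%), so the public dataset totals before and after it can be estimated by working backwards. The sketch below is only that estimate: the resulting totals are derived from the two quoted figures and are not numbers reported by the SSA, and the helper name implied_counts is hypothetical.

```python
def implied_counts(drop, drop_rate_pct):
    """Estimate the counts before and after a drop from its size and its rate."""
    before = drop / (drop_rate_pct / 100)
    after = before - drop
    return round(before), round(after)


# 152 datasets and 13.18% are the figures quoted in the text.
estimated_before, estimated_after = implied_counts(152, 13.18)

print(f"Estimated public datasets before the drop: {estimated_before}")
print(f"Estimated public datasets after the drop: {estimated_after}")
```

Because the quoted rate is rounded to two decimals, the estimates are approximate.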
Social Security Administration reacts to data privacy
SSA gives access to 1312 public datasets on Data Privacy Day in January 2019
Crisis & Social Impact.
Figure 3: Social Security public datasets released in 2020
Social Security data becomes more available during Covid-19 crisis
SSA’s monthly public datasets reach their maximum of 1469 since the pandemic
Until the 2020 presidential election, the number of public datasets released by the SSA in 2020 increased continuously from 1430 to a maximum of 1469, in step with research needs during the pandemic. Moreover, non-governmental organizations needed Social Security data in 2020 as they advocated for equal health care opportunities, such as first Covid-19 vaccinations, starting in December 2020, for all US Social Security beneficiaries. This will probably result in a new rise at the beginning of 2021.
Social Security data contributes to the #BLM movement
SSA publicly releases 1457 datasets during #BLM movement
The Black Lives Matter protests that began in the USA shortly after the death of George Floyd in May 2020 generated a large number of petitions and fundraising campaigns for #BlackLivesMatter across the internet, along with numerous awareness-raising events about ending racism globally. The movement also took the BLM cause to an international level, which perhaps pushed the SSA to move some datasets from restricted or non-public status to being publicly available to everyone everywhere.
This is reflected in fiscal year 2020 by the exceptionally high number of public datasets, 1457, released in May 2020. The data was used to highlight the gap between white and Black people’s wages, benefits, and much more, with the aim of raising awareness of how racism shows up in an American’s everyday activities.
TikTok App challenges data protection values of the SSA
SSA releases 1463 public datasets during TikTok US data harvest allegations
The Social Security Administration played an active role in 2020 in releasing public data to support Research & Development departments in different fields across the United States. These include global health, the world’s priority during the Covid-19 crisis, and many other fields. The data that is publicly released each year also contributes to recovery from other types of crises, to raising awareness about data privacy threats from social media and content creation giants, and to data transparency in US policy making and elections. As for the restrictions on datasets that the administration keeps private, some protect US citizens’ data from political exploitation. Others, however, prevent NGOs from tackling certain topics and keep the data from speaking for the voiceless. The SSA needs to open calls for datasets, in which any external organization can submit a data usage proposal for specific restricted or non-public datasets held by the administration. How else would Social Security data be as transparent as the SSA claims? How else would the world use the data for positive social change and ethical purposes?
- Source: data.gov
- Publisher: Social Security Administration
- Retrieved: June 26th, 2021.
- Last Update: July 2nd, 2021. Updated monthly by the SSA to reflect new datasets (public, restricted, non-public)
- Description: Social Security continues to release data in support of the Open Data Initiative.
- URL: https://catalog.data.gov/dataset/enterprise-data-inventory-progress-information