First Impressions: Do Intelligent Systems Really Tell the Whole Story in Big Data?
Hana Okasha, Evelyn Baskaradas and Suay M. Ozkula explore big data, AI and the role of ethics in the context of hiring practice and gender profiling.
In recent years, big data has become a currency, used more than ever before in efforts to rapidly obtain large-scale data on societal trends and behaviours. It has given rise to the ubiquity of advanced technologies such as machine learning (ML) and artificial intelligence (AI) systems that surface intrinsic information not immediately visible to users. Such information is commodified, presenting a competitive advantage and a form of informational power to the entities that possess it. In the urgency to lead in the adoption of new technological capabilities and actuate this form of power, many organisations leverage advanced methods to understand societal demographics, consumption habits, and even employability profiles. On the surface, these insights have proven useful in consumer scenarios and have been used to gain an edge in various sectors. Out of them, however, arise challenges and implications of using big data in conjunction with intelligent systems that need to be brought to light and examined more closely.
Data of any volume is fundamentally a collection of distinct facts describing an event or environment. To make sense of data, the relationships and connections between its attributes need to be analysed, established and contextualised appropriately, usually through a combination of tacit knowledge, domain expertise and relevant statistical methods; this is particularly true when processing big data. AI has, in recent years, become a central topic in this effort. It was also the subject of much discussion at the 10th World Summit on the Information Society (WSIS) Forum 2019 in Geneva.
In the high-level dialogue on The Ethical Dimensions of Artificial Intelligence hosted by UNESCO, Amandeep Singh Gill, Executive Director of the Secretariat of the High-level Panel on Digital Cooperation, called for reflection and a correct understanding of the potentials and limits of AI. One of several cultural assumptions is the belief that data is objective and rational, incapable of being influenced by societal or individual biases. The fallacy in this assumption is that processed data carries inherent biases, which become evident through the implementation of advanced techniques. Neural networks, for example, are black- or grey-box in nature, and their opacity does not allow a clear understanding of how outputs are determined from input data. This can perpetuate unintended consequences that are not immediately apparent and may cause societal harm before they are identified and resolved.
The internet is a tool of information, but it is a representative of who we are, and we are contradictory. (Manuel Castells, 2019)
This issue has become apparent in recruiting, where human resource executives are increasingly assisted by automated systems. Curriculum vitae (CV) or resume filters can sift through large pools of applicants, or proactively identify potential candidates and deliver them directly to recruiters' inboxes. The neutrality of these systems, however, needs to be closely reviewed for the implicit biases within the data used to train the algorithms carrying out these classification tasks. Monique Morrow, President of the VETRI Foundation and a senior member of the Institute of Electrical and Electronics Engineers (IEEE), highlighted this reality at the UNESCO dialogue, noting that behavioural and sentiment analytics already exist and are capable of evaluating interview processes. This, in turn, may interfere with the employability of certain groups of people.
One aspect of this is exemplified by Amazon's recent decision to shut down its AI screening process when it was discovered to be biased against women. The cause was a decade of historical training data in which the typical successful hire in the technology industry was male. Once the system detected words like "women" or "female", applicants were automatically assigned a lower score. This created a subclass of disadvantaged job seekers who were mostly female, replicating implicit societal biases. When algorithms are trained on such biased data, old hiring practices are inadvertently incorporated into the models and perpetuated indefinitely, typically going undetected for long periods under the premise of "objective data".
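To make the mechanism concrete, the toy sketch below shows how a simple word-scoring model trained on historically skewed outcomes learns to penalise a single token. The data, token choices and scoring rule are all hypothetical illustrations, not Amazon's actual system:

```python
# Illustrative toy example: a naive word-scoring model trained on
# hypothetical, historically biased hiring outcomes.
from collections import Counter

# Hypothetical history: 1 = hired, 0 = rejected. Because past hires
# skew male, the token "women's" appears mostly in rejections.
history = [
    ("software engineer chess club", 1),
    ("software engineer robotics team", 1),
    ("software engineer women's chess club", 0),
    ("software engineer women's coding society", 0),
]

def train(history):
    """Score each word by the hire rate of resumes containing it."""
    hired, seen = Counter(), Counter()
    for text, label in history:
        for word in set(text.split()):
            seen[word] += 1
            hired[word] += label
    return {w: hired[w] / seen[w] for w in seen}

def score(weights, resume):
    """Average the learned word scores over a new resume."""
    words = resume.split()
    return sum(weights.get(w, 0.5) for w in words) / len(words)

weights = train(history)
# The model has absorbed the historical bias: "women's" now carries a
# score of 0.0, dragging down any resume that mentions it.
print(score(weights, "software engineer chess club"))
print(score(weights, "software engineer women's chess club"))
```

Nothing in the scoring rule mentions gender; the bias enters entirely through the labels, which is why such models can pass for "objective" until audited.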
Another issue relates to the ethics of big data. There is currently no comprehensive ethical framework for AI applications, and such considerations cannot be reliably codified into intelligent systems. Ethics can be viewed as the boundary within which legal limits reside, playing an important role in "ambiguous situations in regulating human behaviour", as stated by Gill. While the ethics of big data has been a long-standing concern, AI poses new challenges due to its existence outside of specific platforms (and, therefore, outside prespecified legal and ethical frameworks) and its application in sensitive areas such as healthcare. In response to some of these issues, the IEEE has introduced a global initiative aimed at ensuring that all participants building such systems are knowledgeable and "empowered to prioritize ethical considerations so that these technologies are advanced for the benefit of humanity". Even so, the ethics of AI was highlighted as a persistent concern throughout WSIS 2019.
These current debates make it clear that there are many points of deliberation to be taken on board when working with big data, AI and autonomous systems. Without a multifaceted approach that takes into account rules, norms and practices within society, intelligent systems could fail to interpret the reality of the measured environment contained within collected data. It is important for stakeholders to acknowledge and evaluate their contribution to the generation of knowledge and to incorporate this awareness into daily practice. Active measures should be taken to recognise cognitive biases contained within training data, and human agency should not be conceded in favour of a fully automated system, especially where social elements are concerned. Pre-existing prejudices and discrimination towards certain demographics are absorbed by machine learning systems, which can affect and even amplify perceptions on sensitive topics such as race, culture, religion or gender. Algorithms applied to big data should therefore be designed to minimise unfair biases, for example by ensuring the training data is representative of the target population, selecting algorithmic techniques that best address the issue at hand, and exercising discernment in the selection of input variables relevant to the problem. Additionally, ethical dimensions should be considered in big data collection processes, particularly in newer digital technologies that incorporate AI and are therefore comparatively more unpredictable.
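One such active measure can be sketched as a simple paired audit: score each resume with and without a sensitive term and flag cases where the term alone shifts the result. The audit function, the stand-in model and all names below are hypothetical illustrations:

```python
# Illustrative sketch of a paired bias audit for a resume-scoring
# model. All names and the stand-in model are hypothetical.

def audit(model_score, resumes, term, tolerance=0.05):
    """Compare each resume's score with and without a sensitive term.

    Returns the resumes whose score shifts by more than `tolerance`,
    i.e. cases where the term alone changes the outcome.
    """
    flagged = []
    for resume in resumes:
        if term not in resume:
            continue
        stripped = " ".join(w for w in resume.split() if w != term)
        gap = model_score(stripped) - model_score(resume)
        if abs(gap) > tolerance:
            flagged.append((resume, round(gap, 3)))
    return flagged

# A deliberately biased stand-in model that penalises one token.
biased_model = lambda text: 1.0 - 0.3 * text.split().count("women's")

flagged = audit(biased_model, ["captain women's chess club"], "women's")
print(flagged)  # the audit surfaces the penalised resume and its score gap
```

A check like this treats the model as a black box, so it applies even to opaque neural systems; it cannot prove fairness, but it can surface the kind of single-token penalty described above before deployment.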
All in all, current debates show that big data analytics continue to hold exciting possibilities and innovations that are required for societal progress. It is, however, essential that a reasonable balance be achieved in integrating new technology into existing cultures and practices, within legal and ethical frameworks. Ongoing initiatives facilitated in support of the WSIS action line C10, relating to the ethical dimensions of the Information Society, will therefore be crucial in directing constructive conversations towards good governance of a more diverse and inclusive global community.
This post is part of a series from the Global Leadership Initiative's team of eight students at the World Summit on the Information Society 2019 from 8th to 12th April. All their outputs can be viewed here.