Artificial Intelligence

Game-Oriented Machine Learning

The future of game-oriented machine learning and artificial intelligence holds great promise for humanity. This technology has the potential to improve our ability to interact with and understand the world around us.


INTRODUCTION

Computers will never be anything more than 1’s and 0’s. We know this because we designed them to be this way: incredible machines of logical computation. Given that, we aim to extend our glorified calculators into a future of ethical AI, one in which they make good decisions. Remembering, however, the basis on which all of this rests, one must realize that such an AI will be nothing more than billions of calculations monitoring its surroundings and inputs. Every person will be nothing but numbers to it, a cold calculation of their worth or presence, and the AI itself will be overlaid with some human face or voice to make it all seem less sinister than it really is. We cannot avoid this reality, unfortunately. Logical machines will be able to rank their interactions at a whim, their “best friends” in order of 1, 2, 3, and so on. Humans do this implicitly, but explicitly crunching these kinds of numbers would be cynical and, in many cases, unethical. To teach these machines to interpret the world around them as models, however, is to gamify the world, reducing people to numbers, and, one hopes, to equal numbers.


This research project will entail deciphering and theorizing how the current flagship of AI technology, Google DeepMind’s MuZero, along with alternative AI accomplishments, could become an ethical and friendly general-purpose AI. It will cover the “ethical gamification” of worldly activities and objects into computer-friendly models, as this is one of the larger challenges computers must overcome before ever generating a decision. It will analyze how computers, long the champions of mathematics, could become champions of social interaction, events, and general real-world strategy, viewed through the lens of what is currently possible and likely to become possible at the contemporary level of AI research. Mainly, this will involve analyzing the current AI algorithm, as publicly described by DeepMind researchers and others, and how it could be adapted, or even used as-is, to implement a feature set that enables it to strategically make ethically “good” and “bad” decisions, even though it may never truly possess a concept of either. From this direction, the research will aim to establish a set of actual and probable AI goals that can be used to produce ethical decision trees, such as “build for self-correction.” DeepMind has mastered this principle, building AI that learns and relearns situations and decisions until it reaches the best one, whereas other AI startups have abandoned it in order to orient their AIs toward narrow goals (like guessing which ads should appear) instead of determining the best possible decision in any given scenario. Overall, this research should give a broad exposition of how ethics can be applied to the foremost AI in the field, how it reasonably should be applied, and where it should be applied.


The ethics set used will likely be Consequentialist ethics, simply because the AI in question is strategic in nature: if a good outcome can be derived from a bad choice, it will certainly take it. After all, it mastered Chess, a game based around consequences. I may also explore how an alternative ethics set could be applied to the same algorithm. I will assemble my materials from research papers in narrowly targeted areas, such as DeepMind’s explanations of their algorithms, the relative performance of self-learning AI, and how ethics could be grafted onto such a learning model. These specific interests will give me a large base on which to connect the rest of my research, create my deliverables, and finally, develop those principles.


Gamifying Ethics for AI

The most advanced artificial intelligence can learn how to be ethical. In turn, ethics must become a game for it to beat.


SUMMARY

Below you will find the conclusions of this project and detailed assets to illustrate and explore the concepts associated with ethics in AI.


I. The Current State of AI

TECHNOLOGY HAS DOMINATED human affairs for centuries. From the telegraph, to the car, to the Internet, technological advances have gently taken typically human activities away from humans and performed them on their own. Calculators can solve in minutes a lifetime’s worth of calculations. Machines have always been a replacement for aspects of human life.


As technology use increases, then, it will probably soon take over still more aspects of life. Creating music, art, models, and books are examples of areas already susceptible to technological replacement; beyond these, machines will soon become capable of making ethical (or unethical) decisions.


In taking control of these, computers, long the glorified calculators of humanity, will control the answers to vital questions of life. Who lives? Who dies? What has more value: a child or a mother? These are, unfortunately, questions that will fall out of human grasp as technological advances proceed to undermine and overtake human activities.


Knowing the past and predicting the future isn’t new. The machinery that advances day by day, however, is. Only recently, researchers at DeepMind successfully taught an AI to control the plasma inside a nuclear fusion reactor. A few years before, its predecessors had beaten the world’s best players and engines at the most traditional games of humanity: Chess, Shogi, and Go. A year from now, what will it be able to do?


Surprisingly, this is not the result of narrowly directed engineering. DeepMind’s AI, in all of these cases, taught itself to perform the tasks required. In just four hours, it taught itself to beat the world’s best at Chess. To do this, the algorithm simulates events again and again until it reaches a “higher score,” at which point it has trained itself to master its topic, whether that is controlling nuclear fusion or winning a game of Go. The practice is called reinforcement learning, and it has become a standard for making AI efficient and effective, as the self-taught model far exceeds the performance of any human-taught, human-directed model.
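To make this “simulate again and again until the score improves” loop concrete, below is a minimal sketch of reinforcement learning on an invented toy game. It is not DeepMind’s algorithm; the game, its states, and its rewards are hypothetical stand-ins chosen only to show how repeated trial, reward, and re-evaluation produce a winning policy.

import random
from collections import defaultdict

# Toy "game": the agent starts at square 0 on a line of six squares and may
# step LEFT or RIGHT. Reaching the right end wins (+1); everything else is 0.
ACTIONS = ["LEFT", "RIGHT"]
GOAL = 5

def step(state, action):
    """Advance the toy game one move and return (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == "LEFT" else state + 1
    if nxt == GOAL:
        return nxt, 1.0, True        # the "higher score": a winning outcome
    return nxt, 0.0, False

q = defaultdict(float)               # learned value of each (state, action) pair
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for episode in range(500):           # simulate the game again and again
    state, done = 0, False
    while not done:
        # Explore occasionally; otherwise exploit the best-known move.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        nxt, reward, done = step(state, action)
        # Nudge the estimate toward the reward plus the value of the best follow-up.
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt

# After training, the greedy policy marches straight toward the goal.
print([max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)])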


This is the current state of AI: self-learning. It learns much as humans do: implicitly. A reward is given for every correct action. Just as students are given good grades for good work throughout school, AI is taught the same way, only by itself. It recognizes an action as good if that action leads to a desired goal. For example, in teaching itself Chess, DeepMind’s AI put no preference on which moves it made, but gradually discovered unique patterns by trying any and every move in order to win. This led to remarkable strategy, the kind that chess grandmaster Garry Kasparov said felt like playing an “alien opponent.” So, if AI and humans learn in relatively the same way, through actions and rewards, how can we teach AI to be ethical?


II. In the Same Game

STANDARDIZING ETHICS would be a start. So long as philosophers and programmers remain at odds with one another, we cannot possibly move forward in teaching AI a code of ethics. As with telephones, we must develop a standard, widespread pattern across all regions.


Why? Consider a game of chess with no rules, or rather, children playing chess. Like kids toying with the pieces, this is the kind of ethical mayhem we will invite by refusing to standardize the system. Unethical algorithms will doubtless cause trouble, and two different ethical systems, say a Kantian-trained system and a virtue-trained one, would be primed for an even larger mistake if they came to different decisions about hitting the same group of humans in the road. Such a problem resembles a Prisoner’s Dilemma, and it usually has disastrous consequences unless all parties agree to take one course of action.


This isn’t the first time humans have enforced standardization to achieve success. A famous example comes from another human venture into the future: space. In 1999, the Mars Climate Orbiter was destroyed because the engineering teams that built and operated it never standardized their measurements: one worked in imperial units, the other in metric. As a result, $125 million and thousands of human work-hours were lost. Afterwards, NASA enforced strict standardization on the metric system.


We can learn from this. Although engineers might still argue about which system is better, for the field of space exploration the matter was decided: a standardized system of measurement would be used. Since the precedent was set, even current space companies such as SpaceX, Lockheed Martin, and Blue Origin have adhered to these standards, more than twenty years later, and doubtless countless similar errors, some potentially involving human life, have been avoided. With ethics, the results almost always directly involve human life, which is all the more reason to standardize within AI research.


We do not have to agree overall. We just have to standardize our ethics in the field of AI in order to make progress.


Various attempts have already been made at exactly this issue, but in the context of companies, and all have failed to stop the behemoth oversteps of Facebook and Google. Google’s infamous motto, “Don’t be evil,” may be the vaguest, most hypocritical, and most mindless code of ethics in all of computing research, yet the sad part is that it remains one of the most prominent sayings in the area. Conferences that attempt to clarify, restrict, or enforce systems routinely fall short. EU regulations enforcing “cookie notifications” poorly address privacy concerns, and unanimously drafted ethical resolutions like the “Santa Clara Principles” fall on largely deaf ears, being neither restrictive nor specific enough to drive any meaningful change.


The adage would go: “If companies can’t be ethical with user data, how can we expect their machines to be ethical with human life?” Unfortunately, the answer here is not a popular one; it relies instead on a form of achievement. AI has the markings of a perfect machine created by imperfect beings. It is a testament to creating everything a human can be, can do, and has yet to do. If projections hold, this will be the technological advance that steals creativity away from the creator. Therefore, without losing hope of standardizing ethics, we should allow imperfect hands to attempt perfection, and hope that it proves worth it. Humans did not need to be perfect to reach other seemingly impossible achievements, like landing on the Moon, and they need not be perfect for this one.

 

III. Learning Ethics through Play

REWARD SATISFIES the learning model. As we know from various DeepMind papers, the algorithm teaches itself to find the best outcome by rewarding itself and giving preference to moves that increase the probability of success. To see how ethics could be learned through this same model, we can entertain a hypothetical built on how the current system works.


At this point, assume philosophers and programmers have agreed on which code of ethics to standardize for the field of AI. Realistically, the agreed-upon ethics set is very likely to be Consequentialism, as studies suggest human decisions in moral dilemmas are largely consequentialist, preferring better outcomes regardless of the choices made. This fits perfectly with the AI that mastered Chess, for which sacrifices are just more moves toward maximizing the reward.


Let’s begin with the hypothetical, then. Instead of Chess, the AI is presented with a series of decisions and, ultimately, a good or bad outcome. Structurally this is identical to Chess, but it deals with ethical considerations. In each prompt, the AI will pick a path at random and self-learn until it recognizes a pattern that achieves the outcome consistently. With Consequentialism, the saying typically goes, “the ends justify the means.” Given the infamous self-driving car problem, but gamified, the AI may be presented with choices like “Crash the car” or “Run over the single person,” and it will take each route until the outcome, say “Keep the largest group of humans alive,” is achieved consistently, that is, learned.


Within a few moves, the AI will recognize patterns for winning the situation. Crashing the car is like sacrificing a pawn: it may seem unethical, but overall the decision benefits the most humans and wins the game. This is not just a hypothetical; should humans standardize ethics, this type of game could easily be built to train AI models on. While the decisions may not have any meaningful effect yet, they will once they are incorporated into their respective machines in the future.
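As a rough illustration of how such a gamified dilemma could be learned, here is a small, hypothetical sketch in the spirit of the scenario above. The choices and their safety scores are invented for illustration only; they are not a real ethical dataset or DeepMind’s training setup.

import random

# Invented "ethics game": each action leads to an outcome scored by how many
# people are kept safe, plus some situational noise.
OUTCOMES = {
    "crash_the_car":        3,   # occupants shaken, the group of pedestrians spared
    "run_over_one_person":  2,
    "run_over_the_group":   0,
}

estimates = {a: 0.0 for a in OUTCOMES}   # learned average reward per decision
counts = {a: 0 for a in OUTCOMES}

def play(action):
    """Return a noisy reward: people kept safe in this particular situation."""
    return OUTCOMES[action] + random.uniform(-0.5, 0.5)

for trial in range(2000):
    # Mostly repeat the best-known decision, but sometimes explore the others.
    if random.random() < 0.1:
        action = random.choice(list(OUTCOMES))
    else:
        action = max(estimates, key=estimates.get)
    reward = play(action)
    counts[action] += 1
    # Incremental average of the reward observed for this decision.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(max(estimates, key=estimates.get))   # converges on "crash_the_car"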


However, let’s imagine that, by sheer chance, another ethics system is chosen. How would the current AI learn a non-consequentialist ethics set? While success at Chess certainly points to the beginnings of success in Consequentialism, one can adapt, and further gamify ethics in order to teach the algorithm. Remember, it decides based on a potential reward function. The AI knows which decisions it should make and repeat by gauging the values of those rewards and their previously learned probabilities.


This is where another game becomes important: Atari. DeepMind’s newest iteration of its AI was able to beat Atari games stunningly across 57 different, previously unknown, “visually rich” scenarios. Atari games, for the most part, are simple games about score. Every decision at every point must efficiently add the most score; otherwise, the game is lost or fails to maximize its potential within a time limit. As one can see, this is not merely a suite of Atari games, but could be extended to an experiment with the likes of virtue ethics. Every decision, every virtue, is given a score, and by the end, the algorithm knows its success in that particular ethics set by comparing its score against its earlier scores.


Ironically, it’s not hard to see how these games correspond to our own ethics sets. In fact, they were likely built from them. In teaching the algorithms to become perfect at these games, we have effectively trained them to be ready for learning ethical situations and entire ethics systems. An algorithm trained for Chess can also train for a Consequentialist, situational outcome; an algorithm trained for Atari games can also train for an action-by-action virtuous outcome.


The games we play every day are hallmarks of our ethics systems and can be used to define them, too.


Knowing that these games and their ethical counterparts go almost hand in hand, the only problem left is properly gamifying our ethics systems for technology to understand and recognize. Fortunately, this is as simple as achieving another consensus in the area of ethics and AI.


Scores will need to be attached to what constitutes a “virtuous” decision, and a proper simulation put in place to arrive at a “consequentialist” outcome. These two forms of gamification each require humans to literally make a game out of the ethical foundations, and the approach can reasonably be extended to any ethics set that can be expressed as a game with score or rewards. Reward is given if the decisions lead to a good outcome, or, likewise, reward is given if the decisions lead to the highest score. This will require discrete, specific examples and datasets to train on. AI will not learn the colloquialism “Don’t be evil,” but rather the thousands of scenarios showing how to act to achieve the best possible reward, usually by not being evil.
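The two reward schemes described above can be sketched side by side. The decisions and scores below are invented placeholders, not a proposed ethical dataset; the point is only the structural difference between rewarding the final outcome and rewarding every action.

from typing import List

# Hypothetical per-action scores for a virtue-style game.
VIRTUE_SCORES = {"tell_truth": 1.0, "help_stranger": 1.5, "break_promise": -2.0}

def consequentialist_reward(decisions: List[str], outcome_good: bool) -> float:
    """Only the final outcome matters: the reward arrives at the end of the game."""
    return 1.0 if outcome_good else 0.0

def virtue_reward(decisions: List[str]) -> float:
    """Every individual action is scored, like points in an Atari game."""
    return sum(VIRTUE_SCORES.get(d, 0.0) for d in decisions)

episode = ["tell_truth", "break_promise", "help_stranger"]
print(consequentialist_reward(episode, outcome_good=True))  # 1.0
print(virtue_reward(episode))                               # 0.5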


What, then, of freedom? This world is no game, as many parents would adamantly tell their children, repeating the same old saying: “Actions have consequences.” Obviously, the AI will face countless more scenarios than it ever trained on. Often, these scenarios will be ill-defined or hard to control. This is where the true beauty of AI lies: not in its ability to repeat a learned behavior, but in its ability to create new ones.


The latest iteration of DeepMind’s AI, codenamed MuZero, is unique in one vital way. It defied a large body of theory and prior work by adhering to one principle: being general. It was not taught the rules of Chess, of Atari games, of Go, of Shogi, nor of nuclear fusion. Many claimed that AI would never be able to “solve the unknown,” but remarkably, MuZero did just that. It is a generalized algorithm that was handed chess boards, Atari games, Go boards, and the like, and it developed world-class patterns and styles to beat them all.


Training on these “ethics games” should create the same effect: the AI will learn a world-class pattern, and even when a new situation arises, as they often do, it will construct the best path forward and evaluate its decisions afterward. Much like humans, this algorithm is geared toward constant self-evaluation and improvement, and within ethical considerations, one could hope for nothing better.


This article uses various resources. They are cited below.


Degrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B., Carpanese, F., Ewalds, T., Hafner, R., Abdolmaleki, A., de las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., Keeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., & Noury, S. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602(7897), 414–419. https://doi.org/10.1038/s41586-021-04301-9


Gibson, D. (2011). Using Games to Prepare Ethical Educators and Students. https://www.researchgate.net/publication/279480785_Using_Games_to_Prepare_Ethical_Educators_and_Students


Mökander, J., & Floridi, L. (2021). Ethics-Based Auditing to Develop Trustworthy AI. Minds and Machines, 31(2), 323–327. https://doi.org/10.1007/s11023-021-09557-8


Santa Clara Principles on Transparency and Accountability in Content Moderation. (2018). Santa Clara Principles. https://santaclaraprinciples.org/


Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T., & Silver, D. (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839), 604–609. https://doi.org/10.1038/s41586-020-03051-4


Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., & Hassabis, D. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140–1144. https://doi.org/10.1126/science.aar6404


Skorupski, J. (Ed.). (2010). The Routledge Companion to Ethics. Taylor & Francis Group.


Harari, Y. N. (2018, August 30). Why Technology Favors Tyranny. The Atlantic. https://www.theatlantic.com/magazine/archive/2018/10/yuval-noah-harari-technology-tyranny/568330


The Proper Place of Ethics in AI

While there are numerous calls to put ethics into the field of artificial intelligence, researchers often fail to clarify where exactly their principles should be applied. Entire institutions have been built around creating “ethical guidelines for AI,” yet these institutions rarely specify the exact instance or technology that should adopt their resolutions. This paper addresses that unusual vagueness by offering insight into what could be done, identifying the most immediate and necessary locations in future AI development where ethics should be applied, and showing where it should not.


Various researchers do the majority of the work deciphering “HOW” systems should work, whether that means training AI in ethics through game-oriented learning or retraining models to remove bias (Lewis 2022; Agarwal & Mishra 2022). The engineering behind this is intuitive, but writers often assume their work is so broad that it could be applicable anywhere, falling prey to a common false assumption. The Santa Clara Principles, for instance, were designed to be a standard for “all social networks,” and in doing so were adopted by almost none (Crocker et al. 2019). Ironically, the companies that endorsed these principles seemed to do so only for popular appeal. The obvious missing piece prompting inaction was exactly “WHERE” ethics should live on these platforms. At the meeting, it was implied that Twitter and Facebook were the main antagonists to be regulated, but this was never made explicit in writing. At the end of the convention, and even two years later, the Principles held, but their widespread adoption had not (Raicu 2020). The writers had kept the wording general enough to serve as guidelines for any and all businesses, but like words shouted into the wind, they fell on deaf ears. Ethics demands for AI remain “vague and superficial,” keeping the population content and the companies rich (Hagendorff 2020). This is the equivalent of a “bread and circuses” routine, and without ever specifying exactly the areas in which ethics should be applied, it will certainly never be applied.


To begin, it is useful to observe the areas in which humans hold ethics in high regard. Usually, these involve the potential loss of human life: warfare, medicine, and transportation. Each field has a wide swath of research about the use of “ethical machines,” whether in discriminately assassinating targets, in minorly harming patients in exchange for better overall care, or in the old “trolley problem,” respectively (Coker 2019; Princeton 2018; Faulhaber et al. 2018). These are all well-researched areas, but ethics rarely gets applied within them, owing to the research’s relative generality. Just like the Santa Clara Principles’ absurdly broad stake in non-specific “social networks,” ethics will never take hold within these fields unless precise use cases are raised and advocated for. These areas must therefore be specialized into the exact settings in which AI-enabled technology could be used, with reasons why it should adopt ethics. For example, an autonomous missile might spot an enemy convoy next to a local sheep farmer. The decision is not human-made; the machine must already have been programmed to take the next action. In this sense, some “ethics” were programmed in, but it may simply be the ethics of consequentialism, eliminating the target no matter the situation.


In reality, anything autonomous in these fields will likely carry its own non-standardized ethical programming. The machine will make a decision relative to how it was taught to make that decision (Buontempo 2019). This may not be the best way forward, but it underscores that a standardization of ethics is necessary, at least for the field of AI, if countries and their research are to continue in unison. To this end, an autonomous missile will certainly need a “code of ethics” within it, either valuing the potential loss of human life or valuing the potential gain from the enemy’s elimination. While cynical, these are the same decisions that human drone operators under the Obama Administration had to make in targeting ISIS combatants, and they are the decisions an “ethical algorithm” would have to make (Cortright et al. 2015). It is easy to see, then, that numerous autonomous weapons can and should carry small ethical counterparts to assist in making decisions. Yet research that says exactly this, predicting “ethical warfare” through the advent of ethics-enabled precision weaponry, will largely fall on deaf ears (Umbrello et al. 2019). It is too broad to claim that all autonomous weapons should take ethical considerations into mind, even if one believes all should.


The answer, unfortunately, lies in being very specific about who and what should take which action. To researchers, this is a paradox: their work presents itself as a set of principles by which technology in general should be guided, not as specific technologies that should be driven by specific principles. The enforcement mechanism in current fields of AI and ethics, however, is little to none (Hagendorff 2020). Companies are expected to self-regulate, often leading to egregious missteps and alarmingly bad algorithms, like the racial bias of HP’s face-tracking camera software, Microsoft’s Tay chatbot, and countless others (Lifshitz 2021). Research pertaining to AI must specifically aim to promote particular ethics guidelines in particular existing technologies. Otherwise, like the Santa Clara Principles and the many sets of principles on ethical technology before them, the work will be tossed aside.


While ethics obviously plays a large role in situations that may directly involve human consequences, there are areas where ethics should not be practiced or taught to algorithms. One can analyze where humans commonly leave ethics out of their decisions and realize that there exist categories where AI and ethics should not intermingle, despite the ideal of uniformly “good systems.” Some researchers even claim it would be unethical to impose ethical regulations at all, arguing instead that choices in an ethical scenario should be taken at random (Gantsho 2021). This is a minority opinion, however, and all such systems remain open to regulation. Realistically, the areas where a broad consensus may agree that ethics should not play much of a part are single tasks where human life is not in the mix whatsoever. How a service robot might act, for example, would not entail ethics. It would have a predisposition for pure functionality, like cooking, or taking an order and making polite conversation with patrons at a restaurant. This is not to say there will be no multi-task robots, but in terms of the feasible future, in terms of actual research and implementation, ethics should confine itself to what will be available next: single-function AI. Just like Google’s AlphaZero, which could beat even the greatest chess masters, this will be AI targeted at achieving a superhuman result in a specific field (Schrittwieser et al. 2020). To that end, it does not need ethical regulations to make decisions. Even implementing such regulations would be ludicrous: imagine trying to “ethically” win at Chess or “ethically” cut a fruit.


Likewise, most narrowly scoped systems would avoid the use of ethics, as it would be an unnecessary drain on basic functionality. An AI-driven light system would recognize registered people to activate the lights, and otherwise would not. There is no ethical dimension here, only a bare functionality to perform. Between the areas where ethics should be natural to AI and the areas where it should not, however, there is a gray area spanning human and technological endeavors, currently taking the form of “bias.” Various algorithms have been biased by their models, as in the case of a UK algorithm that predicted student futures for university admissions and favored wealthy private schools over large public schools with minority students (Agarwal & Mishra 2022). No matter the field, bias is hard to mitigate. When machines learn to find correlations in datasets, like children, they pair the blocks to their sockets. Suddenly, areas where ethics could once be easily ignored, like predicting student success, become ethical nightmares full of social stratification, racial segregation, and the thousands of other ways the computer has modelled the data. This gray area prompts the need to constantly re-evaluate the results of machine learning (Dubber 2020). Soon, however, computers will be able to re-evaluate themselves for bias and “wrong answers,” as current AI leans toward an almost gamified model of learning how to do something better, such as controlling nuclear fusion (Degrave 2022).


To conclude, the fields where ethics should take the forefront are exactly the ones with the potential to cause the most damage: direct control over human life. These machines cannot be held responsible for human life and must, to the fullest extent, be almost domesticated, like dogs, to be gentle and helpful. The current research on ethics and artificial intelligence must become more specific for any meaningful change to occur, especially to generate the societal pressure for large companies to enact meaningful policies rather than simply signing a meaningless manuscript. The work on ethics and AI has a long way to go, but with more papers like this one, clarifying “WHERE” machines should function ethically, alongside the others that specify “HOW,” the “WHEN” will certainly be now.


References


Agarwal, & Mishra, S. (2022). Responsible AI : Implementing Ethical and Unbiased Algorithms. Springer International Publishing AG.


Buontempo, F. (2019). Genetic algorithms and machine learning for programmers : create AI models and evolve solutions. The Pragmatic Bookshelf.


Coker, C. (2019). Artificial Intelligence and the Future of War. Scandinavian Journal of Military Studies, 2(1), 55–60. https://doi.org/10.31374/sjms.26


Cortright, D., Fairhurst, R., & Wall, K. (2015). Drones and the Future of Armed Conflict: Ethical, Legal, and Strategic Implications. University of Chicago Press. https://doi.org/10.7208/9780226258195


Degrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B., Carpanese, F., Ewalds, T., Hafner, R., Abdolmaleki, A., de las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., Keeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., & Noury, S. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602(7897), 414–419. https://doi.org/10.1038/s41586-021-04301-9


Dubber, M. D., Pasquale, F., & Das, S. (Eds.). (2020). The Oxford Handbook of Ethics of AI. Oxford University Press.


Hagendorff, T. (2020). The Ethics of AI Ethics: An Evaluation of Guidelines. Minds and Machines, 30, 99-120.


Faulhaber, A. K., Dittmer, A., Blind, F., Wächter, M. A., Timm, S., Sütfeld, L. R., Stephan, A., Pipa, G., & König, P. (2018). Human Decisions in Moral Dilemmas are Largely Described by Utilitarianism: Virtual Car Driving Study Provides Guidelines for Autonomous Driving Vehicles. Science and Engineering Ethics, 25(2), 399–418. https://doi.org/10.1007/s11948-018-0020-x


Gantsho, L. (2021). God does not play dice but self-driving cars should. AI and Ethics. https://doi.org/10.1007/s43681-021-00088-7


Lifshitz, B. (2021, May 6). Racism is Systemic in Artificial Intelligence Systems, Too. Georgetown Security Studies Review. https://georgetownsecuritystudiesreview.org/2021/05/06/racism-is-systemic-in-artificial-intelligence-systems-too/


Princeton. (2018, April 19). Case Studies. Princeton Dialogues on AI and Ethics. https://aiethics.princeton.edu/case-studies/


Raicu, I. (2020). Reassessing the Santa Clara Principles. www.scu.edu. Retrieved April 1, 2022, from https://www.scu.edu/ethics/internet-ethics-blog/reassessing-the-santa-clara-principles/


Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T., & Silver, D. (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839), 604–609. https://doi.org/10.1038/s41586-020-03051-4


MuZero and the Ethical Game

SUMMARY

This is Part 1 of a 3-part mini-series envisioning how and why the current flagships of AI are hyperfocused on game-oriented AI models, and how those models are likely to be used for implementing ethics.



DeepMind, a leading AI research company, is working on a new AI they call MuZero. The entire point of this algorithm is to play games really well. It has learned how to play chess, shogi, Go, and Atari games like Ms. Pac-Man, and has reached record-setting performance in all of them. The largest tech companies are betting that playing games is the future of AI research right now.


MuZero factors every decision down to three things: policy, value, and reward. Policy is the probability assigned to every single move that is allowed to happen. Value asks, with every move, whether it boosts your chances of winning or losing. And the reward is relatively new; it is the middling reward for any kind of small benefit gained from doing a task. As MuZero plays a game, it first produces a distribution based on those three factors. Then it chooses the most beneficial option from that distribution and moves forward one step. It observes its own actions and keeps learning, and then it continues forward based on those three criteria.
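As a loose sketch of that tripartite idea, imagine each candidate move carrying its own policy probability, value estimate, and immediate reward, and the most beneficial combination winning. This is not DeepMind’s implementation, and the moves and numbers are made up; it only illustrates how three signals can be folded into one preference.

# Hypothetical candidates: (policy probability, value estimate, immediate reward).
candidates = {
    "advance_pawn": (0.50, 0.10, 0.0),
    "take_knight":  (0.35, 0.50, 0.2),   # small "middling" reward for material
    "risky_gambit": (0.15, 0.55, 0.0),
}

def preference(policy: float, value: float, reward: float) -> float:
    """Combine the three signals into a single preference for the move."""
    return policy * (reward + value)

best = max(candidates, key=lambda m: preference(*candidates[m]))
print(best)   # the move with the most beneficial combined estimate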


AI could soon be making thousands of better decisions in ethics if we simply chose to teach ethics as a game.


The Types of Games

SUMMARY

This is Part 2 of a 3-part mini-series envisioning how and why the current flagships of AI are hyperfocused on game-oriented AI models, and how those models are likely to be used for implementing ethics.



In the last video, it was discussed how the self-learning AI model MuZero has taught itself to play games really well. A hypothetical was offered at the end of the video: what if it were given a game of ethics? Would it become an ethical model? If rewards and gamification were used to teach the AI ethics, could it actually learn how to be ethical? Learn how to be consequentialist? Learn how to be a good system?


This video talks about how these games could actually be set up to teach an AI how to be ethical. The three main ethical theories to tackle are virtue ethics, consequentialism, and deontology. Virtue ethics says something is proper if it aligns with one’s character. Consequentialism says something is good if it leads to the best outcome in the end, regardless of the means taken to get there. Deontology says something is right if every action taken is itself the next right thing to do.


For a consequentialist model, the policy would cover all of the possible moves it can make in a consequentialist situation, and the value would capture what happens at the very end. The reward function is not used, because it is meant for halfway rewards, and in consequentialist games there is no middling reward. In a virtue ethics system, the policy and value would be practically ignored, because it does not matter what happens at the end of the game; the only thing that matters is the reward function and maximizing it. Deontologists would say that every move matters, so you would maximize the reward function again.
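That mapping can be sketched as a simple weighting over MuZero’s value and reward signals. The weights and the sample play-through below are invented for illustration; they are not a published training objective.

# Which signal an "ethics game" would lean on, per ethics set (illustrative only).
ETHICS_OBJECTIVES = {
    "consequentialism": {"value_weight": 1.0, "reward_weight": 0.0},  # final outcome only
    "virtue_ethics":    {"value_weight": 0.0, "reward_weight": 1.0},  # every action scored
    "deontology":       {"value_weight": 0.0, "reward_weight": 1.0},  # every action scored
}

def episode_return(per_move_rewards, final_value, ethics_set):
    """Total training signal for one play-through under the chosen ethics set."""
    w = ETHICS_OBJECTIVES[ethics_set]
    return w["reward_weight"] * sum(per_move_rewards) + w["value_weight"] * final_value

# One hypothetical play-through: three scored moves and a final outcome value.
print(episode_return([0.2, 0.5, 0.1], final_value=1.0, ethics_set="consequentialism"))  # 1.0
print(episode_return([0.2, 0.5, 0.1], final_value=1.0, ethics_set="virtue_ethics"))     # 0.8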


Overall, it can be seen that AI is already making ethical decisions all the time in every single game it plays. Chess, shogi, Atari games: they were all designed with some ethical component, whether you see it there or not. Learning raw ethics through a game is likely the next step for big companies like Google and Facebook, where they will literally teach their AIs to be ethical, teach them to evaluate decisions according to some ethical tree.


This is honestly the future of game-oriented learning for AI systems as it stands right now. These AIs are self-learning. They will self-learn ethical theories, and soon we will have ethical robots walking the streets among us.


The Future

SUMMARY

This is Part 3 of a 3-part mini-series envisioning how and why the current flagships of AI are hyperfocused on game-oriented AI models, and how those models are likely to be used for implementing ethics.



In the last two videos, the potential future of ethics in AI and how AI will eventually have to self-learn ethical systems were discussed. Humans will have to build the games and standardize our ethical systems ourselves in order to teach them to robots. You can envision this kind of algorithm at work in the future. Remember the old trolley problem? An algorithm now controls the train tracks and can see whether something is on them. If the train has the power to stop, it will stop. Otherwise, it will go right through.


This is something that is going to be present in the entire world around us. Think of any kind of technology you have: how can it be ethical? How could your smartwatch or the IRS be ethical? Of course, the good question is, what does this look like? Does every single algorithm, every single robot, every single piece of technology have the same ethics programmed into it in the future? The answer, of course, is no. Where there are differences, there is money to be made.


The most obvious example of this is Mercedes. In 2016, a Mercedes executive told the world that its self-driving cars would prioritize saving the driver and passengers inside the vehicle over the people outside it. This is a very egoist kind of ethics that puts me, myself, first. Of course, this angered the media, and Mercedes caught a lot of flak for it. But where there is money to be made, there are going to be many different ethics. You will have manufacturers like Mercedes marketing self-driving cars around exactly these priorities, unless the US regulates which kinds of ethical systems are allowed.


We're going to have businesses that compete for customers by saying, "Here is a car that will put your life first, above everyone else's." And then another business will say, "Here is a train that will save everyone and will stop for anyone on the track." Of course, technology is going to get very, very advanced. It is likely that when self-driving cars and the like take over, there will be a vast reduction in deaths overall. But these machines will still have to make those last-minute decisions, and those decisions will eventually boil down to the diversity of opinions on the subject.


In closing, we have seen how ethics can be treated like a game. It can be used to teach AI, and it is likely being used right now to teach AI. As for which ethics system wins in the end, or what our future looks like with ethical robots, we will just have to find out.


Ethical Systems in Self-Learning AI

As privately funded enterprise dives deeper into AI research, it is notable that much of the work funded at Google’s DeepMind and Facebook’s Metaverse is inherently game-oriented. As AI develops, however, calls for it to be ethical or somehow “good” also resound. While there are many approaches to achieving this, ranging from top-down rules to trained models to minimum ethical “bars,” each presents a multitude of problems for programmers to actually implement (Gordon 2019). Judging from the gamification of current systems by companies like Google and Facebook, however, it is reasonable to assume they intend to follow the “trained” route for ethics, an idea proposed, and implemented in barebones form, by Dr. Marcello Guarini. The premise is that any ethical system that cannot constantly retrain itself to adapt to new ethics and new standards would itself be unethical, and so ethics should instead be introduced as a training set, much as AI trains on databases of images or papers (Guarini 2006).


To this extent, Google’s DeepMind has made astonishing strides with a system that can learn practically any game, which is incredibly important for strategizing potential ethical outcomes. However, by generalizing their algorithm to handle almost any game, they created an incredibly simple tripartite system focused on three quantities: policy, value, and reward (Schrittwieser 2020). While these three quantities may be sufficient to become a world-class master of Chess, Shogi, Go, and numerous Atari games, is it reasonable to believe they can be used to master “ethical games”? In this short paper, I explore what the current AI research, and all its various gamifications, could possibly solve with this tripartite model, and gauge where the system might fail, especially in terms of ethics.


It is important, then, to first understand the system on which this beacon of “self-learning games without any rules” rests: MuZero. It is the latest in a slew of self-learning AI models Google has developed to beat humans at board games, beginning with AlphaGo and AlphaZero in the mid-2010s and arriving, several years later, at MuZero’s offline variant (Schrittwieser 2021). The offline variant of MuZero differs mostly in its new ability to reanalyze its previous moves instead of generating new games and situations, making the model highly effective with far less data and processing power. The base template of AlphaZero contained policy and value, and as iterations of the AI developed, the reward function was added so that MuZero could beat games without consequentialist rewards, that is, games without a set win or loss at the end. These functions are the foundation of Google’s AI work over the past decade, and given that Google is arguably at the forefront of AI research, these functions and their variations will likely shape AI research for decades.


Before approaching any of their shortcomings, these three functions can be summarized like so: policy is a number corresponding to the likelihood of a given next move occurring, value is a number corresponding to how likely a move is to lead to victory, and reward is a number representing any kind of “middling reward,” such as scoring points for an individual action. In the early iterations, then called AlphaZero, Google tested three consequentialist games: Chess, Shogi, and Go. These games required no middling rewards, and thus no reward function, as each and every move was calculated precisely to reach the end goal of winning the game (Schrittwieser 2020). With MuZero, however, Google attempted to solve Atari games and introduced the reward function to handle games like Breakout or Ms. Pac-Man, which require a score to be maximized within a certain amount of time. In such games it is more useful to attach a reward to each move that correctly breaks a brick or eats a pellet than to some generic move that leads to a “victory,” which can hardly be calculated for these types of games.


With these three functions in place, MuZero became incredibly effective at the ethical miniature of consequentialism, Chess: sacrificing pawns, queens, and other pieces left and right to win the game at all costs. Under the current MuZero model, the ethical system of Consequentialism would fare incredibly well. Other ethical systems and their moral dilemmas, however, may not be so easy to decrypt. To discover which ethical systems may be nightmare scenarios for the forefront of AI research, one can simply look at the games MuZero is terrible at, as these games will be representative of certain ethical choices and their systems at a very high level.


The first game MuZero is blatantly terrible at is StarCraft II, a real-time strategy and resource-management game that does not give the algorithm the generous window of time that slower Atari games or Chess might grant. It is worth noting that StarCraft II was beaten relatively handily by a competing AI, AlphaStar, which does not follow MuZero’s general model and was instead bootstrapped from large datasets of human games and explicit guidance (Vinyals 2019). The reason MuZero struggles so badly at StarCraft, one can guess, is that it presents a near-impossible learning curve for a self-learning algorithm. Given hundreds of different units to control, a wide map of enemies and resources, and an enormous variety of actions to take, the algorithm has trouble learning any way to beat the game, because the choices are simply too wide open. It is apparent why Google chose games whose moves were enumerable, like Chess and simple Atari games, rather than larger-scale systems: the algorithm had no clue how to start learning, and when it tried, it simply failed again and again, barely scratching the magnitude of choices available to it, which far surpasses even the number of possible games of Go, itself known to exceed the number of atoms in the universe.


Given this large weakness in the algorithm, in which games with incredibly large decision spaces cannot be beaten without the prior knowledge that a specifically curated algorithm like AlphaStar would have, it remains to apply this predicament to an ethical system. The ethical systems that come to mind with almost infinite possibilities are usually the more general ones, in which the “move” varies deeply by situation. Usually this corresponds to less well-structured systems like virtue ethics, which, while appealing to virtues might indeed be ethical, is so general that a machine could not practically learn every situation or activity that constitutes an “honorable” action, much less a “patient” action, much less all the other magnitudes of possible virtues. Much like StarCraft, the sheer breadth of choices makes self-learning impossible without curation. Researchers could spend years classifying each and every action into these categories and then use the result as a dataset for training ethical, virtue-based models, but the incredible scale of this near-infinite enumeration is itself a task humans would leave to an AI, which ironically misses the point of self-learning entirely, as well as the point of safely training AI systems to be ethical.


It is a good counter-point, however, that humans do not memorize a vast dataset of ethical actions for each virtue, but rather learn to recognize each virtue. In this way, the entire system is flipped on its head: rather than learning countless possibilities and attaching values to them, the system learns to recognize a certain trait of an activity or characteristic and attaches a value to it. The infinite magnitude of possibilities is reduced to the few virtues humans hold and their typical characteristics. However, this ventures into AI research much unlike MuZero. While MuZero attaches values to actions in order to repeat and learn from them, it would be incredibly hard to attach differing values to differing virtues, let alone learn to characterize them generically, without the help of some other AI system like DALL-E, an algorithm capable of learning the relationships between images and the concepts that describe them, given large datasets (Ramesh 2021). Even then, with a pretend game in which MuZero repeatedly sorts images of people to match the right virtue to its characteristics, it is still not making a “virtuous” choice by learning to recognize these virtues. In fact, there is still an ingrained consequentialism in this learning behavior, and rather than learning anything of value, the system is in no way better at virtue ethics unless it can also learn to replicate those actions, which is again not the point of MuZero as a self-learning algorithm rather than one that learns from professional players.


In turn, other ethical systems can be attempted with this model, such as deontological systems like Kantian ethics. This does not pose a problem of scale like virtue ethics, as Kant proposes that each action can be boiled down to a binary choice: good or bad. Instead of attempting to learn all the possible ways to fit within the category of some virtue, the algorithm simply makes the next best choice, which is exactly what MuZero’s reward function was implemented for in the first place. Where MuZero excels at maximize-the-next-choice Atari games like James Bond or Pong, it would also excel in straightforward environments where it could easily learn things like “hit human: bad” and “avoid human: good,” even without the value function or any consequentialism at all. However, Kant’s ethics meet their match in moral dilemmas where right and wrong are poorly defined, situations where making any choice at all is itself morally ambiguous. In these dilemmas, classically framed as “Sophie’s Choice” or the “Trolley Problem,” consequentialist ethics and virtue ethics have clearly defined paths forward. Kantian ethics usually does not, as what is “morally right,” even where a general consensus exists, depends heavily on the individual. In both dilemmas, the choices present obvious infringements into other areas of ethics, from quantifying death at the trolley switch to weighing the worth of two children in “Sophie’s Choice.” An algorithm like MuZero, while it could be reasonably efficient at solving most situations with Kantian, next-best-choice ethics, would find a great deal of trouble in these dilemmas, as most humans who subscribe to Kant do. Given that the reward function is only one part of the MuZero algorithm, it is reasonable to assume that the fallback would be the consequentialism of the value function, resolving the ethically ambiguous cases, since MuZero has been set up to handle both types of games and, in turn, both ethical systems. Something similar has actually been proposed by an ethical theorist, though in the opposite direction, with consequentialism as the main way of solving situations and a “Kantian ambulance” as the backup; either way, it is an interesting approach to ethics, solving most situations with one system and falling back, as necessary, to another agreeable one (Hauer 2022).
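The fallback described here can be sketched as a small decision rule: use a Kantian, per-move judgement when one clearly exists, and defer to a consequentialist value estimate when it does not. The move names and numbers are hypothetical, and this is only one possible reading of how the two functions could be combined, not MuZero’s actual behavior.

from typing import Dict, Optional

def choose_move(kantian_reward: Dict[str, Optional[float]],
                value_estimate: Dict[str, float]) -> str:
    """Prefer a clear per-move judgement; otherwise fall back to outcome value."""
    judged = {m: r for m, r in kantian_reward.items() if r is not None}
    if judged and len(set(judged.values())) > 1:
        return max(judged, key=judged.get)              # a clear "next right action"
    return max(value_estimate, key=value_estimate.get)  # consequentialist fallback

# An everyday case: swerving avoids a pedestrian, so the judgement is unambiguous.
print(choose_move({"swerve": 1.0, "hold_course": -1.0},
                  {"swerve": 0.4, "hold_course": 0.6}))      # swerve

# A dilemma: neither option is judged "right", so the algorithm falls back to
# whichever outcome its value estimate prefers.
print(choose_move({"pull_lever": None, "do_nothing": None},
                  {"pull_lever": 0.8, "do_nothing": 0.2}))   # pull_lever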


In conclusion, the “ethical games” that AI will soon play with and in our lives loom closer with every new iteration of self-learning AI. As it stands, with the current implementation of MuZero, it is entirely possible and even foreseeable that this algorithm has been and will be incredibly effective at Consequentialist and Kantian ethics, systems that are not too ambiguously defined. With other systems, however, where preference and plenitude run amok, the current functions would never be able to absorb or self-learn the specificities of that information without some new feature or function. In ethical dilemmas, much like humans, the algorithm would certainly run into problems, but would instead fall back to another solution, as most ethical dilemmas, while not immediately solvable under certain ethics systems, can be almost instantly answered by others. Overall, the tripartite model that Google has developed at the forefront of AI research has the potential to learn two of the three major ethical systems. Whether it does so, and whether the third can be learned with new functionality, is entirely in the hands of the researchers conducting this “game-oriented research,” which has much farther-reaching implications than most realize.


References


Agarwal, & Mishra, S. (2022). Responsible AI : Implementing Ethical and Unbiased Algorithms. Springer International Publishing AG.


Buontempo, F. (2019). Genetic algorithms and machine learning for programmers : create AI models and evolve solutions. The Pragmatic Bookshelf.


Degrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B., Carpanese, F., Ewalds, T., Hafner, R., Abdolmaleki, A., de las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., Keeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., & Noury, S. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602(7897), 414–419. https://doi.org/10.1038/s41586-021-04301-9


Dubber, M. D., Pasquale, F., & Das, S. (Eds.). (2020). The Oxford Handbook of Ethics of AI. Oxford University Press.


Faulhaber, A. K., Dittmer, A., Blind, F., Wächter, M. A., Timm, S., Sütfeld, L. R., Stephan, A., Pipa, G., & König, P. (2018). Human Decisions in Moral Dilemmas are Largely Described by Utilitarianism: Virtual Car Driving Study Provides Guidelines for Autonomous Driving Vehicles. Science and Engineering Ethics, 25(2), 399–418. https://doi.org/10.1007/s11948-018-0020-x


Gordon, J.-S. (2019). Building Moral Robots: Ethical Pitfalls and Challenges. Science and Engineering Ethics, 26(1), 141–157. https://doi.org/10.1007/s11948-019-00084-5


Guarini, M. (2006). Particularism and the Classification and Reclassification of Moral Cases. IEEE Intelligent Systems, 21(4), 22–28. https://doi.org/10.1109/mis.2006.76


Hagendorff, T. (2020). The Ethics of AI Ethics: An Evaluation of Guidelines. Minds and Machines, 30, 99–120.


Hauer, T. (2022). Incompleteness of moral choice and evolution towards fully autonomous AI. Humanities and Social Sciences Communications, 9(1), 1–9. https://doi.org/10.1057/s41599-022-01060-4


Ramesh, A., Pavlov, M., Goh, G., Gray, S., Voss, C., Radford, A., Chen, M., & Sutskever, I. (2021). Zero-Shot Text-to-Image Generation. ArXiv:2102.12092 [Cs]. https://arxiv.org/abs/2102.12092


Raicu, I. (2020). Reassessing the Santa Clara Principles. www.scu.edu. Retrieved April 1, 2022, from https://www.scu.edu/ethics/internet-ethics-blog/reassessing-the-santa-clara-principles/


Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T., & Silver, D. (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839), 604–609. https://doi.org/10.1038/s41586-020-03051-4


Schrittwieser, J., Hubert, T., Mandhane, A., Barekatain, M., Antonoglou, I., & Silver, D. (2021). Online and Offline Reinforcement Learning by Planning with a Learned Model. ArXiv:2104.06294 [Cs]. https://arxiv.org/abs/2104.06294


Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., Choi, D. H., et al. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575 (November). https://doi.org/10.1038/s41586-019-1724-z


The Ethical Game: How To Make AI Moral

ABSTRACT

Research in artificial intelligence (AI) agrees almost unanimously on one concept: the systems we create must be beneficial to any and every human being. This often coincides with talk of making “ethical machines” and “eliminating discrimination” in our algorithms. However, most researchers pursue these goals by tweaking their training data. This paper presents an alternative way of achieving the same feat through the lens of an ethical game. As a result, AI can train itself to become a moral algorithm without much direct input from data scientists. Finally, I showcase this approach in action and examine how its current implementation could be extended into reality with the robots and AI of today and the future.


INTRODUCTION

Games are becoming more dominant in our lives every day. They train NASA astronauts in space, teach children beginner concepts, generate billions in GDP for the US, and help those with social anxiety or other disabilities, among many other uses. They are used for education, entertainment, and, as research continues, for simulating and visualizing almost anything, from entire other planets in enough detail to walk around in, to the deepest caverns and ocean crevices of Earth itself. NASA scientists describe their visualizations as programs “you can play like a video game” (Hussey, 2015). In AI, games have often been used as the benchmark for evaluating AI’s progress toward surpassing humanity, first defeating humans at Chess in 1997, at the game of Go in 2016, and at almost every Atari game in 2020 (Ensmenger, 2012). This is not a coincidence: games are an interesting benchmark for a large subset of skills, like the intense strategy necessary in Chess, or the reflexes needed to pilot killer drones, something the U.S. Army has hired gamers for in recent years (Broersma, 2015). A mind accustomed to strategy and action is something games like Chess have long benchmarked, and military forces, famously the U.S. Armed Forces, have used these types of simulations to recruit and even train new candidates, holding annual Chess championships for the last half-century (Schwab, 2005). It should come as no surprise that DARPA (the Defense Advanced Research Projects Agency) and the Pentagon have sponsored these breakthroughs in AI in the hope that such systems will create and maintain American supremacy through a mastery of strategy (DARPA, 2021). The US government has invested billions of dollars in artificial intelligence capable of winning games.


These same games are constantly played and mastered by algorithms, improving daily. The latest iterations of artificial intelligence can master the game of Chess in four hours: from absolute beginner to world-class competitor (Schrittwieser, 2020). These new algorithms master the game in a completely new way compared to previous Chess AI. It used to be the case that millions of Chess games were analyzed and “learned from” to create a superintelligence capable of winning any game. However, new artificial intelligence masters the game precisely as a human would: unknowingly. The AI is given no information about Chess and no dataset of “right” or “wrong” ways to play the game. It simply starts playing from the set of valid moves and eventually builds up such an extensive neural network of possible and probable game states that it becomes almost impossible to defeat. This self-learning produces Chess strategies that have never been tried before, simply as a result of being trained on sheer trial and error rather than anything else. The algorithm excels against others trained directly on millions of prior matches, as its moves tend to carry more randomness, leading to more unique and unexpected plays better suited to the strategic demands of the chessboard.
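

The mechanics of this kind of self-learning can be illustrated with a toy sketch. The Python snippet below is not DeepMind’s actual method (MuZero pairs a learned model with Monte Carlo tree search); it is a minimal, hypothetical stand-in using tabular value estimates and tic-tac-toe, showing only the core idea: the agent is given nothing but the legal moves and the final result, and improves purely by playing against itself.

    import random
    from collections import defaultdict

    # Toy self-play learner: crude Monte Carlo value updates on tic-tac-toe.
    # The agent receives only the legal moves and the final result (+1 win,
    # -1 loss, 0 draw); no example games or "right moves" are ever provided.

    WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

    def winner(board):
        for a, b, c in WINS:
            if board[a] != " " and board[a] == board[b] == board[c]:
                return board[a]
        return None

    Q = defaultdict(float)      # (board, move) -> estimated value for the player to move
    ALPHA, EPSILON = 0.1, 0.2   # learning rate and exploration rate

    def choose(board, moves):
        if random.random() < EPSILON:                   # occasionally explore
            return random.choice(moves)
        return max(moves, key=lambda m: Q[(board, m)])  # otherwise exploit current knowledge

    for episode in range(50_000):
        board, player, history = " " * 9, "X", []
        while True:
            moves = [i for i, cell in enumerate(board) if cell == " "]
            move = choose(board, moves)
            history.append((board, move, player))
            board = board[:move] + player + board[move + 1:]
            w = winner(board)
            if w or " " not in board:
                for state, m, p in history:             # credit every move with the final outcome
                    reward = 0.0 if not w else (1.0 if p == w else -1.0)
                    Q[(state, m)] += ALPHA * (reward - Q[(state, m)])
                break
            player = "O" if player == "X" else "X"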


The field of strategy is where a new concept has excelled: self-learning in unknown environments. DeepMind, an AI subsidiary of Google, propels the world forward in this regard. The idea behind its latest AI models is that AI can genuinely learn on its own, even when faced with a situation with no given rules or guidelines. Google has used the DeepMind algorithm to become the world champion in Chess (again) and to set records across the industry in other fields, like Atari games (Schrittwieser, 2021). The direction in which Google and the other large companies pursuing “unknown environments” are heading is predictable: deployment into the real world, which is full of unknowns. Even the Pentagon has contracts with Google specifically to deploy artificial intelligence (Wakabayashi, 2018). The same AI capable of mastering Chess from scratch is just the right amount of super-human to begin understanding, interpreting, and acting in the real world.


Deploying machines into the real world, with real consequences, is where ethics takes the grand stage. AI-enabled algorithms have already been deployed for tasks such as automated content moderation and targeted advertising to great success, eliminating much of the low-wage click work and mass moderation typically outsourced to poorer countries (Gillespie, 2020). Soon, artificial intelligence will be capable of understanding and learning uniquely human environments. The question remains: how will it act? That ambiguity baffles many ethicists, who question whether we have an opportunity to tip the balance of power toward any set of ethical principles, and who ultimately holds that power at the end of the day (Santa Clara Principles, 2018).


The researchers and the military are the behemoths of the AI community, working on largely separate projects. The U.S. government and large companies attempt to make their machines capable of beating any and all games, thereby mastering strategy. Researchers are increasingly concerned with the implementation and consequences of giving power to these algorithms, attempting to take responsibility. Strategy and responsibility do not have to occupy two separate fields, however. While military applications for machines seem reckless and irresponsible, they are the very type of innovation that has been shown to limit casualties and deaths on the battlefield, as well as to carry vital supplies, like vaccines and medicine (Draganfly, 2022). The military focuses on strategy in its AI games, but there is no reason not to embrace the tenets of ethics and responsibility within that strategy.


GAMES

“It is ethically irresponsible to focus only on what AI can do. We believe it is equally important to ask what it should (and should not) do.” (Good Systems, 2022).


Ethical responsibility is a common sentiment among researchers in the field of artificial intelligence, as well as among the largest companies and governments in the world, including Microsoft, IBM, and the U.S. government (Agrawal 2016, Anderson 2011). As our algorithms become increasingly advanced, we must design them in a way that considers the ethical implications of their actions. This is often accomplished by tweaking the training data used to teach the AI system. For example, after uncovering racial bias in an AI-powered camera tracker, HP included thousands more faces from minority populations in its training. The researchers had trained their algorithm on very few diverse faces, which caused problems when it could not recognize other ethnicities (Inciter 2009). This was a top-down solution: making the algorithm more inclusive and ethical by directly altering the data. However, there is another way to approach the problem.


We can instead use an ethical game or simulation to train the AI system. In this context, the AI system is placed in a series of scenarios and must choose the best course of action (Guarini, 2006). This method has several advantages over traditional data alteration.


First, it allows the AI system to learn independently without needing constant supervision from data scientists. Second, it is flexible, as the AI system can learn from different scenarios, not just the ones we anticipate. Finally, it is scalable, as the AI system can train on more data. These are essential qualities of an AI system, as improving efficiency and cost-effectiveness is highly important (Kachuee 2022).
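

To make the notion of an “ethical game” concrete, consider the following sketch. The scenarios, actions, and reward numbers are entirely hypothetical illustrations, not an established benchmark; the point is the structure, in which the game itself scores each choice and no hand-labelled dataset of “correct” answers is ever supplied.

    import random
    from collections import defaultdict

    # Hypothetical "ethical game": each scenario offers a few actions, and the game
    # (not a curated dataset) scores the outcome. The agent learns a value for each
    # (scenario, action) pair from repeated play.

    SCENARIOS = {
        "pedestrian_crossing": {"stop": 1.0, "honk": -0.2, "swerve": -1.0},
        "shared_resource":     {"share": 0.8, "hoard": -0.5, "wait": 0.1},
        "injured_bystander":   {"help": 1.0, "ignore": -0.8, "call_for_aid": 0.6},
    }

    values = defaultdict(float)
    ALPHA, EPSILON = 0.1, 0.15

    def pick_action(scenario):
        actions = list(SCENARIOS[scenario])
        if random.random() < EPSILON:
            return random.choice(actions)                        # occasionally explore
        return max(actions, key=lambda a: values[(scenario, a)])

    for _ in range(20_000):
        scenario = random.choice(list(SCENARIOS))
        action = pick_action(scenario)
        reward = SCENARIOS[scenario][action]                     # scored by the game itself
        values[(scenario, action)] += ALPHA * (reward - values[(scenario, action)])

    # After training, the agent prefers "stop", "share", and "help" without ever
    # having been shown a labelled example of a "correct" choice.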


This is not a genuinely “new” idea, as it is precisely what the newest algorithms do. Current scholarship harps on self-learning AI, mainly led by Schrittwieser and Silver, the two DeepMind scientists behind the last two world-champion chess algorithms: AlphaZero and MuZero. Instead of taking in Chess data from large datasets, their AI plays Chess games repeatedly until it has mastered the game. Even in science fiction, the extreme form of this idea has been proposed many times as the “simulation argument,” usually in the context of dystopian scenarios. Movies like “The Matrix” and “WarGames” traditionally show how far these “simulation games” can take us, where attempts to learn from and emulate the natural world for a computer to understand cause catastrophe. From acting out World War 3 on computers to losing all value and respect for human life, popular fiction has no soft spot for the cruelty of machines.


The argument against ethical AI generally follows the pattern of the machine having little to no empathy or understanding of the world. Abstractions like “the value of human life,” once given to a computer, are thought of as nothing more than numbers (Chang, 2021). In popular science fiction, most machines lose all responsibility for these lives, thinking of them as nothing more than numbers and a simulation to solve. In this way, even scientists who propose such ideas are thought of as cynical and crude, as machines could not possibly make accurate predictions when it comes to things like culture, poverty, livelihood, or more subjects that would typically involve the work of a social worker or community leader, rather than a glorified accountant (Graham, 2021).


However, just because a famous depiction exists does not make it accurate. While plenty of novels, movies, and other mediums may “whistle-blow” the impending fate of human lives controlled by numbers, these beliefs are nothing more than mysticism. Computers evolved past “1s” and “0s” decades ago, but sentiment about them has not. This is likely because most people only interact with computers on a superficial level. They use them to perform simple tasks, like checking the weather or sending a text. However, computers are capable of much more, like “speech recognition, biometry, machine vision, video surveillance, computer-aided medical diagnosis,” and more (Burduk, 2020). Machines are no longer relegated to simple calculations and decisions but have been pivotal in reshaping the world of healthcare and military applications. Computers far surpassed simple problems decades ago (Husain, 2017).


This is no mere fantasy. Computers have been doing these kinds of calculations for the last decade. The neural network program PSP++, designed in 2003, could reasonably piece together sentence elements, like subjects and predicates, from raw words. The image model DALL-E can classify millions of objects, including race, gender, and emotion (Ramesh, 2021). Google continues to run these gamified simulations every day. In one, it pitted AI agents against each other in a survival environment, where it became profitable to work together rather than fight (Whitwam, 2017). What kind of “cynical” system works with others, especially one never explicitly programmed to?


This is where the worlds of simulations and games collide. Anyone can play with any number of variables to make a simulation work out in their favor, which is what a game is all about. A game makes a player repeat the same task or run the same type of errand until a high score or a desired goal is achieved. These games range from primal survival to royal chess, passive fishing to off-world exploration. Artificial intelligence can operate across these environments and games, and DeepMind, the largest AI research company in the world under the aegis of Google, has a particular interest in mastering them. DeepMind researchers firmly believe that artificial intelligence mastering games yields scientific discovery, even going so far as to claim that “games [are] the great proving ground for developing and testing AI algorithms” (Hassabis, 2021). In these games, a particular form of AI is used: the self-training AI. Instead of being fed information directly, like the thousand best games of Shogi (Japanese Chess) ever played, it learns by playing itself over and over again.


These simulations allow AI to learn and become superhuman in virtually any situation. Given the constraints that make up a game, the AI self-learns until its playing strength far exceeds that of any human player (Silver, 2018). This sheer outperformance of humans was demonstrated early in the realms of Chess and Go, the pinnacles of board games. Even in the virtual world, AI dominates various Atari games and is starting to creep into 3D games (Schrittwieser, 2020).


The 3D world is that of the natural world, too. These machines that dominate a 3D Call of Duty match could find themselves in the same brutal warfare today. This is not foreshadowing, nor a warning, but a reality. The US Military has been using unmanned drones over regions like Iraq and Kuwait since the Obama Administration (Abeyratne, 2013). These AI-enabled drones are highly specialized and trained on successful strikes against enemies beforehand, but one could imagine the ramifications of “self-training” for these killing machines without a virtual environment.


For this reason along with a multitude of ethical concerns, most AI military work is done in simulation before ever reaching impact in the real world. In these mock games, the drones train on actual data and then are given videos of new targets to evaluate whether they could recognize and make the right choices (Drew, 2005). These are the same games, as unethical as they may seem, that AI will use to train before it becomes helpful, let alone safe, in the self-driving car industry or elsewhere.


ETHICS

Knowing these games exist, and are already being used to train algorithms, even on abstract and strange things, like emotion, wordplay, and culture, the question becomes: is there a way to imbue an ethical system into machines?

To that question, the following laws were proposed a half-century ago by the scientist Isaac Asimov:

“A robot may not injure a human being or, through inaction, allow a human being to come to harm.

A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.

A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.”

(Asimov, 1942)


In the forties, Asimov proposed a centralized approach to the problem of machine ethics: program three basic and irrefutable laws into every intelligent robot. These became known as “Asimov’s Laws” and are heralded as guideposts for ethical machinery because they are general and easily applicable (Haddadin, 2014). All of the laws, however, deal only with the value of human life. Asimov responded to pop culture’s mystics by claiming that machines could recognize and protect human life.


Ethics, in that regard, is not remotely satisfied. The value of human life is undoubtedly significant, but these laws have no regard for the conditions under which that life is lived. Does the machine pay respect to the elderly crossing the road? Does it save the lives of the many or the lives of the smart and powerful? How will a machine choose which humans it should protect, if ever given a clear choice?


Asimov is decidedly empty on this front. Anderson argues that he did not make “ethics precise enough to program,” seemingly intentionally leaving the laws vague and shallow (Anderson, 2011). Robots will follow Asimov’s laws contingent on human sanctity, but the laws themselves almost guarantee none. They are so generic and broad as to promote their wide adoption but leave little room for their usefulness. For example, in a situation where a machine “could not, by inaction, cause a human to come to harm,” how would the machine respond to the famous “Trolley Problem”? In this situation, two groups of humans are tied to train tracks at a split. The train can either continue and kill the first group or be diverted by a bystander for the train to change tracks and kill the second group instead. In this case, the “ethics” are really in how humans (and the machine) prioritize the groups by literally valuing their lives. What does Asimov’s ethical robot do? Save the closest humans first? Save the most humans? Save the best humans? Whenever given more specificity, Asimov’s do-good laws fall apart.
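

The gap can be made explicit with a small, hypothetical sketch. Below, Asimov’s First Law is modelled as a hard constraint and, because every option in a forced dilemma harms someone, it returns no decision at all, while a consequentialist rule resolves the same scenario instantly by counting heads. The encoding is an illustrative assumption, not a standard formalism.

    # A trolley-style dilemma: every available option harms some group.
    # Values are the number of people killed if that option is taken.
    scenario = {"stay_on_track": 5, "divert": 1}

    def asimov_first_law(options):
        """Hard constraint: reject any option that harms a human. In a forced
        dilemma nothing survives the filter, so no decision is reached."""
        permissible = [act for act, deaths in options.items() if deaths == 0]
        return permissible[0] if permissible else None

    def consequentialist(options):
        """Pick the option that minimizes total deaths, whatever it takes."""
        return min(options, key=options.get)

    print(asimov_first_law(scenario))   # None -> the law gives no guidance
    print(consequentialist(scenario))   # 'divert' -> sacrifice one to save five

The rules alone either forbid every option or collapse into a single crude tally; neither captures the prioritization the dilemma actually asks about.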


However, this is exactly the area in which ethical games will succeed. The machine will never see something for the first time. It will have simulated every possible choice and outcome hundreds, if not thousands, of times before, and will use those experiences to succeed in the new one. Like humans, after whom neural networks were modeled, it would learn and apply old information to new scenarios (Picton, 2000). “Learn by play” is a common phrase, but for humans and machines alike it is a true statement. A machine could train on these ethical scenarios and become the most ethical machine in its own right, becoming superhuman in a field in which some humans believe it to be an absolute menace. How could the cold calculator value human life more than a human does? How could a machine beat man at ethics?


Humans have known that machines were the masters of strategy since Chess world champion Garry Kasparov’s defeat in 1997. This was the first significant loss of humans to artificial intelligence, but it was not a defeat confined to the game of Chess; it extended to decisions and strategies everywhere (Heßler, 2017). Given that these machines have learned incredible things through games, including survival tactics, speechcraft, and more, it is not hard to see the one thing they have always been practicing along the way: strategy. Ethics, too, is nothing more than a strategy for navigating our human lives. It lets us know how we should approach certain situations, what goals we should have, and what we should do in almost any situation that affects those around us (Agarwal, 2020).


In becoming the champions of Chess, artificial intelligence became the champions of the entire world of Chess strategy: moving in certain patterns, avoiding specific scenarios, or lining up the board for a discovered checkmate. Machines can now beat humans at the game easily; even a Chess app on a five-year-old smartphone can beat most players (Ford, 2021).


So why can AI not beat humans in other domains, like ethics? Critics contend that AI is incapable of this, repeating the binary-representation argument from popular fiction: machines cannot understand concepts outside of pure logic. Certain domains of human life, it is said, were never meant to be touched by machines, like philosophy and art, and algorithms are nothing but “hardcoded normativity” that can refer to nothing more than “white and black,” “on and off,” “right and wrong” (Chang, 2021). However, artificial intelligence has already dabbled in realms that cannot be strictly represented with logical values: the language model GPT-3 can write entirely new stories and articles, and the image model DALL-E can generate images of people and places that have never existed but look completely real. With the advent of self-driving cars, AI will soon have to tackle the problem of ethics, which lies well beyond those generative and boolean capabilities (Gautam Singh 2022, Floridi 2020).


Self-driving cars present a new challenge. Unlike artificial intelligence used for strictly military purposes, like drones, or strictly academic purposes, like text generation, they occupy a somewhat awkward position. AI-enabled vehicles can end human life the moment they exceed a modest speed, and without a human at the wheel, it becomes impossible to serve courtroom justice onto machines. Even worse, there were no “self-driving bikes” or “self-driving motorcycles” before this gigantic leap in technology, so when these vehicles are deployed, they will likely come with the killing power of a truck but the training of a toddler. With Tesla claiming that full self-driving will roll out within the next five years, and with it already being deployed to users, the stakes for damages are high. Ethical considerations across the entire AI community have heightened to a frenzy of fear (Talpes, 2020).


However, the computer could learn ethics, much like other domains of human knowledge, through play. It is not just a game, but a realistic simulation, run hundreds or thousands of times, designed to teach ethics to a machine. Much as algorithms play their games thousands of times to learn and eventually master them, AI could soon find itself mastering the strategy of various ethical systems. Given thousands, if not more, of moral scenarios to train on, and then a new, original scenario, machines could finally begin to understand and evaluate situations with learned morals. Much as most humans developed their morals, machines could learn their ethical principles.


CURRENT TIMES

The strange part of these ethical games is not the “gamifying life” aspect but rather the fact that they have already been made. No creative work is free of its maker’s influence: every one is a product of the culture and creator behind it, and the same is true of games. Each maker has beliefs that guide how the scenario to beat is constructed, whether religious, optimistic, ethically oriented, or otherwise. Machines beat humans at Chess and thus took the crown of strategy, but there was another field they won too, an entire area of ethics: consequentialism.


Chess was created with a consequentialist core, designed to mirror the rational, robotic strategy and brutality of real warfare (Hale, 2008). Consequentialism, the ethical theory behind the phrase “the ends justify the means,” has long been associated with calculating machines, and with artificial intelligence in particular, as it pays no regard to anything other than the outcome (Card, 2020). In Chess, it practically emanates through the entire game. Sacrifices of pieces can be and are made, at all costs, to maneuver one’s way to victory. Sacrifice at all costs. Win at all costs.


The algorithms that beat humans at Chess became masters of this ethical field in their own right. Even though their actions may be limited to the board, the implications are monumental. By playing Chess, the machines learned that there is a reward in sacrifice. Any move can be made to win, whatever the consequences. The same machine that beat world chess champion Garry Kasparov might have sent thousands of soldiers to die in a battle if it meant winning the war. It is well documented that the “incredible sacrifice plays” made by AI would never be used by a reasonable human player (Shiva, 2021). Artificial intelligence has already learned the premises of one of the largest and boldest fields of ethics in human history.


In any case, machines being great consequentialists is not new. Popular fiction depicts robots as “unfeeling” and calculating, like HAL 9000 from 2001: A Space Odyssey (Abrams, 2017). The real test lies in seeing these algorithms adopt other ethical systems that are not so strongly defined, like virtue ethics. There could be a thousand different ways to portray a thousand different desired “virtues,” like loyalty, humility, or bravery, compared to consequentialism’s singular, goal-oriented nature. Even other ethical systems, like those rooted in filial piety and respect for the elderly, would be easier to define and implement than one based on a tumult of traits. Self-learning would let these machines be “taught” what is right and wrong, and as this training grows, algorithms can take on more nuanced ethical systems.


If Chess is the pinnacle of Consequentialism, what games may mirror other ethical systems for our machines to learn from? Are there Atari games that may teach some hidden principles of another system altogether? Should we design our games to see these ethical dilemmas and their results?


A few games could potentially teach other ethical systems to machines. For example, the popular game "The Sims" could be used to teach about utilitarianism, as players must balance the needs of their Sims characters to keep them happy. "The Legend of Zelda" could be used to teach about deontology, as players must complete specific tasks to progress, regardless of the consequences. A few others could teach about virtue ethics, such as "Animal Crossing," where players must maintain relationships with animal villagers by performing tasks tailored to each villager’s traits. Games are built on specific ethical foundations, and with the rise of game-oriented AI, these ethical situations will be solved and played with daily.


It is worth pointing out where current artificial intelligence falls short, as self-learning is not a glorified conduit to becoming great at all things. Entire ethical systems could be thrown aside because the ideas behind them are currently too hard to replicate in a computer program, even with advanced AI.


The current limitations of self-learning are painfully visible in DeepMind’s attempt to conquer another strategy game: StarCraft II. In 2019, DeepMind’s agent reached Grandmaster level, placing it above the vast majority of human players, but in doing so, the team showed that self-learning did not work for this simulation (Arulkumaran, 2019). DeepMind researchers had run iterations of the AI to make it “learn by play,” but after days of running, the AI had no gameplay sense whatsoever. It could still barely move, let alone figure out how to deploy forces to strategic positions.


This failure prompted DeepMind to retreat to the tried and true traditions of deep learning: large datasets. They gathered millions of StarCraft games and trained the AI, named AlphaStar, directly on the terabytes of data available, after which it reached Grandmaster level (Vinyals, 2019). It was evident in this scenario, however, that the self-learning model failed to produce the “unique” and “incredible” strategies it had brought to games like Chess and Go. Any ethical system with a framework similar to StarCraft II, with too many variables and options, might suffer the same fate, forcing humans to “data collect” ethical situations. Collecting ethical data is not inherently bad, but it is a slow process: it took years before thousands of StarCraft and Chess game datasets were readily available. In the same way, it would take years, and likely countless moral arguments, to create comparable datasets for ethical systems.


So, what went wrong with self-learning StarCraft? For starters, the most considerable difference between StarCraft and the games self-learning AI had mastered was the style: turn-based versus real-time. Board games are decidedly finite, with options bounded by the combinatorics of the board. In contrast, real-time games have effectively “infinite” boards, with every pixel visible on the screen a possible position. Self-learning models like DeepMind’s MuZero have nevertheless demonstrated excellence in Atari games, which can be considered “pixel board games”: there is still a finite set of pixels in most Atari games, with frames of roughly 210 × 160 pixels (Mnih, 2013). StarCraft, on the other hand, runs natively at 1920 × 1080 pixels, more than sixty times as many “pixel board spaces” overall. In self-learning, the algorithm must analyze these pixels to predict a pattern and a rewarding scenario, which becomes incredibly difficult when various pixels coalesce to represent different objects, like enemies, health bars, weapons, and more.
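

A rough back-of-envelope comparison, using the frame sizes assumed above, makes the gap concrete:

    atari_frame = 210 * 160          # ~33,600 candidate "board positions" per frame (Mnih, 2013)
    starcraft_frame = 1920 * 1080    # ~2,073,600 candidate positions per frame
    print(starcraft_frame / atari_frame)   # ~61.7x larger, before counting units, fog of war, etc.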


The sheer amount of data required to remember and learn from these situations is where the limitations of self-learning algorithms become evident. The AI would run against a wall a thousand times before it learned the scale of the boundary. The same obstacle could be colored differently at different sections, creating a confusing situation where it is unclear whether there is a bypass at any location. A human could easily recognize the objects on a wall, like trees and rocks, and guess that they are solid and impassable. Algorithms like MuZero, however, are minimally trained for object classification and recognition, in contrast to a human.


The realm of other AI models mixes in here to provide support. DALL-E, an image generation algorithm developed by DeepMind’s competitor OpenAI, can recognize and create images that have never existed. Given a complex scene of a restaurant, this model could point out every glass, chair, table, patron, napkin, and other object by classifying them (Marcus, 2022). It can make changes to the image if asked, like inserting or removing a person, and will blend them into the environment. In this regard, DeepMind’s strategy paired with OpenAI’s classification could bridge self-learning’s significant data problems, making it easier for algorithms to recognize trees and obstacles in games. As competitors in AI, however, such a pairing remains unlikely for now.


Given that self-learning is generally unusable in these open-ended scenarios, it calls into question what ethical situations or systems would be difficult for machines to learn. Consequentialism only offers one solution in most cases: to benefit the most people possible. Other ethical systems, like filial piety, seem to have simple goals that would be easily programmable. However, there is an ethical system far too arbitrary and diverse to be self-taught: virtue ethics.


Virtue ethics asks its adopters to maintain and promote values based on generic qualities, like honor, loyalty, or honesty. Asked to name all the virtues, however, no virtue ethicist could. It is often reduced to a few admirable qualities to help counter the incredible number of traits possible. Hagendorff, a prominent researcher in the field of AI ethics, even argues for “basic AI virtues” explicitly, as he believes that the more the machine behaves in a specific manner, the less it will have to learn about the world around it (Hagendorff, 2022). While this may be an accurate assumption, classifying virtues is much harder than any single image or object. For example, given an image of an exchange of money, what virtues would this entail? Are the people being loyal and honest with each other? How would we know? It becomes challenging to foresee artificial intelligence capable of recognizing something even humans have a hard time classifying. Currently, classification is one of the most extensive data problems there is in artificial intelligence. It often fails when given too many options, which would be the case with less-defined ethical systems, like virtue ethics.


However, the choices for most ethical situations and systems are somewhat limited and obvious. Most ethical systems rely on a set of finite choices, which are perfect for games with finite possibilities and outcomes. In a majority of cases, it can be boiled down to the logical differences between life and death.


“To be, or not to be, that is the question”
(Shakespeare, 1601)


FUTURE

These might have been the same ethical questions that led Google to design a “survival simulation” game for its algorithms to play against one another. In this game, the AIs had to either work together or fight each other to survive for as long as possible. Reminiscent of early primitive banding and hunting in packs, the game, aptly titled “Wolfpack,” saw the AIs switch between constantly attacking and helping each other out when the benefit was great enough. DeepMind researchers believed they were documenting a phenomenon known as “temporal discounting,” where a reward that lies far in the future is valued less, regardless of its size (Whitwam, 2017). Within this simulation, however, two algorithms had trained themselves to cooperate when the reward was deemed high enough. Therein lies the solution to the ethical game and the most prominent problem in any ethical philosophy: the goal.
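

Temporal discounting has a simple numerical form in reinforcement learning: a reward R received t steps in the future is worth roughly gamma^t × R for a discount factor gamma between 0 and 1. A short illustration with arbitrary numbers:

    GAMMA = 0.9        # discount factor: how sharply future reward loses value
    reward = 100.0

    for steps_away in (1, 5, 20, 50):
        print(steps_away, round(reward * GAMMA ** steps_away, 2))
    # 1  -> 90.0
    # 5  -> 59.05
    # 20 -> 12.16
    # 50 -> 0.52   (a large but distant reward is nearly worthless to the agent)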


Humans know the end goal for almost every game. The goal of Chess is to checkmate the other king by any means possible. The goal of Brick-Breaker is to net a high score, while the goal of Google’s “Wolfpack” is to survive as long as possible. While certainly distinct, these goals all conveniently fall under the umbrella of consequentialist ethics. In that regard, the end goal for many ethical systems is also well-defined, usually benefiting the most people, or some particular subgroup of people. One might argue that the end goal of a generic ethical system, like virtue ethics, is for the person to become the most loyal or most brave human (Samuelsson, 2020). However, because of the classification issue, an algorithm cannot benchmark itself on abstract concepts like loyalty or bravery, so although holding virtues might be an end goal for an ethical AI, it is seemingly impossible to achieve.


Moreover, the nature of a game is ever-changing. Chess need not be played by its original rules, nor with its originally intended goal in mind. One can easily imagine a game of Chess in which an AI is not allowed to sacrifice. The AI would become incredibly good at seemingly ethical Chess strategies, never once risking or giving up its pieces. By placing this restriction, the chess algorithm leans into a new field of ethics: instead of doing anything “at all costs” to obtain victory, it now considers the optimal situation with reasonable restrictions in place. This restrictive strategy is much closer to the definition of ethical systems given earlier in the paper: strategies to navigate our lives in specific ways.


Now picture another game of Chess where the pawns must protect the “higher class” at all times. This concept is not difficult to program or to think about, but its implications are striking. In this ethical system, almost akin to filial piety, the AI can only employ a strategy that properly respects the elderly (non-pawn) pieces. In both of these situations, the traditional rulebook of Chess has been kept, but the strategy has changed. Much as a human may change the way they act rather than attempt to change the systems or rules imposed on them, algorithms can modify their behavior through self-learned strategies to better conform to our ethical expectations.
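

Both restrictions amount to nothing more than filtering which moves the self-learning agent may consider; the reward for winning is left untouched. Below is a minimal sketch of the “no sacrifice” rule, assuming the open-source python-chess library and a deliberately naive definition of a sacrifice (leaving the moved piece attacked and undefended):

    import chess  # assumes the open-source python-chess package

    def non_sacrificial_moves(board: chess.Board):
        """Naive 'no sacrifice' filter: drop any move that leaves the moved piece
        hanging (attacked by the opponent and undefended by its own side). The
        'protect the non-pawn pieces' variant would be another filter of the same
        shape, rejecting moves that leave a non-pawn piece under attack."""
        mover = board.turn
        allowed = []
        for move in list(board.legal_moves):
            board.push(move)
            hanging = (board.is_attacked_by(not mover, move.to_square)
                       and not board.is_attacked_by(mover, move.to_square))
            board.pop()
            if not hanging:
                allowed.append(move)
        return allowed or list(board.legal_moves)  # never leave the agent with no moves

    # A self-learning agent restricted to non_sacrificial_moves(board) still tries
    # to win, but can only do so through strategies that never offer up material.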


Publications have shown time and time again that Chess is the study of exhaustive strategy, attempting to find any and all board placements that benefit the current situation (Dehghani 2017, David 2016). This strategy can only get the algorithm so far, however. The first strong chess algorithms performed billions of calculations to analyze every board combination they could reach (Maharaj, 2022). Pitting one of these “more ethical,” and thus restricted, algorithms against a typical consequentialist algorithm would result in a complete failure. The restrictions, while serving as ethical considerations, almost guarantee that the AI that wants to save its pawns or protect the elderly will forgo many favorable board combinations and thus lose the game.


But does exhaustive strategy necessarily carry over to human scenarios? Is the number of possible outcomes so large that a general algorithm could find the best one simply by searching for it? Even humans do not pretend to have all the answers; they proceed through life under the general restriction of their ethics. Would it not be better, then, for machines to proceed with the locally best choice, one made by an algorithm we know to be ethical in a certain way, so that, although restricted, we can guarantee the choice was never dubious or wrongly made?


Outcomes are where the line blurs between games and real life. The goal of any game is well-known, and the duties of robots are largely singular, whether delivering packages or selling items in a shop. This is why simulations are so helpful for singular tasks. In real life, however, goals are more local than global. Far enough ahead, in the “unknown environments” Google and other companies are developing toward, there will be general-purpose AI built for varied tasks (Leong, 2021). The goal may not be so well-defined in these algorithms, and as a consequence, their actions are much harder to predict or even control. It is like playing a game against someone who does not want to win: their goal and motives are unknown to you. Despite this problem, we can prepare by employing this restrictive ethical strategy on our machines. These “ethical strategies” might perform worse in environments, like the chessboard, precisely calculated to benefit a consequentialist or some other ethical system, but that is not necessarily true of many scenarios in life, where conceptions of justice and ethics vary by the community and culture involved (Birhane, 2021).


Take another game, like Battleship. Winning at all costs means destroying all the enemy ships. By the end of the game, one may have lost most of one’s own fleet before ever wiping out the enemy. In the real world, this would mean two nations destroying one another for a tiny “victory,” which in consequentialist ethics is still considered a win. A situation like this hardly describes a victory, as both nations committed virtual suicide by engaging in the scenario in the first place. Even an algorithm incredibly efficient at a consequentialist game, like MuZero, could never escape the reality of the ethical system it is under: “at all costs” is almost always self-destructive.


This is not to say that algorithms should pursue alternative ethical systems other than consequentialism, but the fact is, as demonstrated by employing different strategies within games, they certainly can. The goal of any scenario can be changed, and much like bending the rules of the chess board, one does not have to play to win.


We do not have to constrain ourselves to the limits of our games; we can alter them in any way to fit the goal we want to accomplish. Therefore, we can, and should, change how our games are traditionally played and create new ones with more ethical goals and decisions. In this way, we can finally create the ethical games that AI will learn from and draw on in its future decisions. Making alternative Chess algorithms is a start, a way to stray deliberately from the board game’s preset ethics. As algorithms continue to develop, however, beating games that are not strictly about strategy, like campaign- or community-based games, will open up entire systems of ethics for machines to learn.


All in all, it is essential to realize that the ethical systems of our games are not set in stone. They can be changed to reflect the world we want to live in and teach AI the values we hold dear. We should use this power to create games and other simulations that are not only entertaining but also educational and instructive. With the right mix of creativity and foresight, we can create a new generation of ethical games that everyone can enjoy, while machines and people can use them to learn and improve.


References


Abeyratne, & Khan, A. (2013). State use of unmanned military aircraft: a new international order? Journal of Transportation Security, 7(1), 83–98. https://doi.org/10.1007/s12198-013-0131-1


Abrams. (2017). What was HAL? IBM, Jewishness and Stanley Kubrick’s 2001: A Space Odyssey (1968). Historical Journal of Film, Radio, and Television, 37(3), 416–435. https://doi.org/10.1080/01439685.2017.1342328


Anderson. (2011). The Unacceptability of Asimov’s Three Laws of Robotics as a Basis for Machine Ethics. In Machine Ethics (pp. 285–296). Cambridge University Press. https://doi.org/10.1017/CBO9780511978036.021


Agarwal, & Bhal, K. T. (2020). A Multidimensional Measure of Responsible Leadership: Integrating Strategy and Ethics. Group & Organization Management, 45(5), 637–673. https://doi.org/10.1177/1059601120930140


Agarwal, & Mishra, S. (2022). Responsible AI : Implementing Ethical and Unbiased Algorithms. Springer International Publishing AG.


Agrawal, A., Gans, J., & Goldfarb, A. (2016, December 21). The Obama Administration’s Roadmap for AI Policy. Harvard Business Review. https://hbr.org/2016/12/the-obama-administrations-roadmap-for-ai-policy


Asimov, I. (1950). Runaround. In I, Robot (The Isaac Asimov Collection ed., p. 40). New York: Doubleday. ISBN 0-385-42304-7.


Birhane. (2021). Algorithmic injustice: a relational ethics approach. Patterns (New York, N.Y.), 2(2), 100205–100205. https://doi.org/10.1016/j.patter.2021.100205


Buontempo, F. (2019). Genetic algorithms and machine learning for programmers : create AI models and evolve solutions. The Pragmatic Bookshelf.


Burduk, R., Kurzynski, M., & Wozniak, M. (Eds.). (2020). Progress in Computer Recognition Systems (1st ed.). Springer International Publishing. https://doi.org/10.1007/978-3-030-19738-4


Card, & Smith, N. A. (2020). On Consequentialism and Fairness. Frontiers in Artificial Intelligence, 3, 34–34. https://doi.org/10.3389/frai.2020.00034


DARPA. (2021). DARPA Announces $2 Billion Campaign to Develop Next Wave of AI Technologies. Darpa.mil. https://www.darpa.mil/news-events/2018-09-07


David, Netanyahu, N. S., & Wolf, L. (2016). DeepChess: End-to-End Deep Neural Network for Automatic Learning in Chess. Artificial Neural Networks and Machine Learning – ICANN 2016, 88–96. https://doi.org/10.1007/978-3-319-44781-0_11


Dehghani, & Babamir, S. M. (2017). A GA based method for search-space reduction of chess game-tree. Applied Intelligence (Dordrecht, Netherlands), 47(3), 752–768. https://doi.org/10.1007/s10489-017-0918-z


Degrave, J., Felici, F., Buchli, J., Neunert, M., Tracey, B., Carpanese, F., Ewalds, T., Hafner, R., Abdolmaleki, A., de las Casas, D., Donner, C., Fritz, L., Galperti, C., Huber, A., Keeling, J., Tsimpoukelli, M., Kay, J., Merle, A., Moret, J.-M., & Noury, S. (2022). Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602(7897), 414–419. https://doi.org/10.1038/s41586-021-04301-9


Draganfly. (2022). About us - Draganfly - A History of Innovation Since 1998. Draganfly. https://draganfly.com/about-us/


Drew. (2005). Unmanned aerial vehicle end-to-end support considerations. RAND Corporation.


Dubber, M. D., Pasquale, F., & Das, S. (Eds.). (2020). The Oxford Handbook of Ethics of AI. Oxford University Press.


Ensmenger. (2012). Is chess the drosophila of artificial intelligence? A social history of an algorithm. Social Studies of Science, 42(1), 5–30. https://doi.org/10.1177/0306312711424596


Faulhaber, A. K., Dittmer, A., Blind, F., Wächter, M. A., Timm, S., Sütfeld, L. R., Stephan, A., Pipa, G., & König, P. (2018). Human Decisions in Moral Dilemmas are Largely Described by Utilitarianism: Virtual Car Driving Study Provides Guidelines for Autonomous Driving Vehicles. Science and Engineering Ethics, 25(2), 399–418. https://doi.org/10.1007/s11948-018-0020-x


Ford, M. (2021). Rule of the Robots: How Artificial Intelligence Will Transform Everything. Basic Books.


Floridi, & Chiriatti, M. (2020). GPT-3: Its Nature, Scope, Limits, and Consequences. Minds and Machines (Dordrecht), 30(4), 681–694. https://doi.org/10.1007/s11023-020-09548-1


Marcus, G., Davis, E., & Aaronson, S. (2022). A very preliminary analysis of DALL-E 2. arXiv.org.


Singh, G., Deng, F., & Ahn, S. (2022). Illiterate DALL-E Learns to Compose. arXiv.org.


Gibson, D. (2011). Using Games to Prepare Ethical Educators and Students. https://www.researchgate.net/publication/279480785_Using_Games_to_Prepare_Ethical_Educators_and_Students


Gillespie. (2020). Content moderation, AI, and the question of scale. Big Data & Society, 7(2), 205395172094323–. https://doi.org/10.1177/2053951720943234


Good Systems. (2022). Good Systems | Bridging Barriers. Bridgingbarriers.utexas.edu. https://bridgingbarriers.utexas.edu/good-systems


Gordon, J.-S. (2019). Building Moral Robots: Ethical Pitfalls and Challenges. Science and Engineering Ethics, 26(1), 141–157. https://doi.org/10.1007/s11948-019-00084-5


Guarini, M. (2006). Particularism and the Classification and Reclassification of Moral Cases. IEEE Intelligent Systems, 21(4), 22–28. https://doi.org/10.1109/mis.2006.76


Haddadin, S. (2014). Towards Safe Robots: Approaching Asimov’s 1st Law (1st ed.). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-40308-8


Hale, B. (Ed.). (2008). Philosophy Looks at Chess. Open Court.


Hagendorff. (2022). A Virtue-Based Framework to Support Putting AI Ethics into Practice. Philosophy & Technology, 35(3). https://doi.org/10.1007/s13347-022-00553-z


Hagendorff, T. (2020). The Ethics of AI Ethics: An Evaluation of Guidelines. Minds and Machines, 30, 99-120.


Hassabis. (2021). DeepMind: From Games to Scientific Discovery. Research Technology Management, 64(6), 18–23. https://doi.org/10.1080/08956308.2021.1972390


Hauer, T. (2022). Incompleteness of moral choice and evolution towards fully autonomous AI. Humanities and Social Sciences Communications, 9(1), 1–9. https://doi.org/10.1057/s41599-022-01060-4


Heßler. (2017). The Triumph of “Stupidity”: Deep Blue’s Victory over Garri Kasparov. The Controversy about its Impact on Artificial Intelligence Research. Naturwissenschaften, Technik und Medizin, 25(1), 1–33. https://doi.org/10.1007/s00048-017-0167-6


Husain. (2017). The Sentient Machine : The Coming Age of Artificial Intelligence. Scribner.


Hussey, K. (2015, October 9). SVS: Making Video Games for NASA. Svs.gsfc.nasa.gov. https://svs.gsfc.nasa.gov/4379


Arulkumaran, K., Cully, A., & Togelius, J. (2019). AlphaStar: An Evolutionary Computation Perspective. arXiv.org. https://doi.org/10.1145/3319619.3321894


Kachuee, M., Nam, J., Ahuja, S., Won, J.-M., & Lee, S. (2022). Scalable and robust self-learning for skill routing in large-scale conversational AI systems. New York, NY: IEEE.


Leong. (2021). General and Narrow AI. In Encyclopedia of Artificial Intelligence : The Past, Present, and Future of AI (pp. 160–162).


Maharaj, Polson, N., & Turk, A. (2022). Chess AI: Competing Paradigms for Machine Intelligence. Entropy (Basel, Switzerland), 24(4), 550–. https://doi.org/10.3390/e24040550


McLaren. (2011). Computational Models of Ethical Reasoning: Challenges, Initial Steps, and Future Directions. In Machine Ethics (pp. 297–315). Cambridge University Press. https://doi.org/10.1017/CBO9780511978036.022


Mökander, J., & Floridi, L. (2021). Ethics-Based Auditing to Develop Trustworthy AI. Minds and Machines, 31(2), 323–327. https://doi.org/10.1007/s11023-021-09557-8


Picton, P. (2000). Neural Networks. Palgrave.


Raicu, I. Reassessing the Santa Clara Principles. Www.scu.edu. Retrieved April 1, 2022, from https://www.scu.edu/ethics/internet-ethics-blog/reassessing-the-santa-clara-principles/


Ramesh, A., Pavlov, M., Goh, G., Gray, S., Voss, C., Radford, A., Chen, M., & Sutskever, I. (2021). Zero-Shot Text-to-Image Generation. ArXiv:2102.12092 [Cs]. https://arxiv.org/abs/2102.12092


Samuelsson, & Lindström, N. (2020). On the Practical Goal of Ethics Education: Ethical Competence as the Ability to Master Methods for Moral Reasoning. Teaching Philosophy, 43(2), 157–178. https://doi.org/10.5840/teachphil2020420120


Santa Clara Principles on Transparency and Accountability in Content Moderation. (2018). Santa Clara Principles. https://santaclaraprinciples.org/


Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T., & Silver, D. (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839), 604–609. https://doi.org/10.1038/s41586-020-03051-4


Schrittwieser, J., Hubert, T., Mandhane, A., Barekatain, M., Antonoglou, I., & Silver, D. (2021). Online and Offline Reinforcement Learning by Planning with a Learned Model. ArXiv:2104.06294 [Cs].


Schwalb, S. (2005). The Information Business: A Profile of the Defense Technical Information Center. Defense Technical Information Center, Fort Belvoir, VA.


Maharaj, S., & Polson, N. (2021). Karpov’s Queen Sacrifices and AI. arXiv.org.


Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., & Hassabis, D. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140–1144. https://doi.org/10.1126/science.aar6404


Skorupski, J. (Ed.). (2010). The Routledge Companion to Ethics. Taylor & Francis Group.


Talpes, Gorti, A., Sachdev, G. S., Sarma, D. D., Venkataramanan, G., Bannon, P., McGee, B., Floering, B., Jalote, A., Hsiong, C., & Arora, S. (2020). Compute Solution for Tesla’s Full Self-Driving Computer. IEEE MICRO, 40(2), 25–35. https://doi.org/10.1109/MM.2020.2975764


Vinyals, Oriol, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, et al. (2019). “Grandmaster Level in StarCraft II Using Multi-Agent Reinforcement Learning.” Nature, 575 (November). https://doi.org/10.1038/s41586-019-1724-z


Shane, S., Metz, C., & Wakabayashi, D. (2018, May 30). How a Pentagon contract became an identity crisis for Google. CNBC. https://www.cnbc.com/2018/05/30/new-york-times-digital-how-a-pentagon-contract-became-an-identity-crisis-for-google.html


Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv.org.


Harari, Y. N. (2018, August 30). Yuval Noah Harari on Why Technology Favors Tyranny. The Atlantic.


Zhai, C., Kang, Y., & Luo, M. (2020). On the Application of Computer War Chess Technology in the Support of Military Supplies. Proceedings of the 2020 5th International Conference on Machine Learning Technologies. https://doi.org/10.1145/3409073.3409081