You’re locked in an underground bomb shelter with 100 people and a service robot that is pretty much indestructible. You cannot escape the bomb shelter without the robot’s help. Unfortunately, the food will not last all 100 people for the number of days it will take the robot to get you out. It seems there are rations for 20 people for about 5 weeks, and that is exactly how long the robot will take to dig out. What do you do? And if the robot could enforce rules and ensure that the most people survive, would you want it to, despite the loss of many?
Isaac Asimov wrote an interesting, thought-provoking short story, “Runaround,” in his “I, Robot”
collection, in which he established the Three Laws of Robotics:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Can we use laws like these to manage the robot or ensure it always does the right thing? If laws like these work for robots, why would we not use the same laws for humans? Wouldn’t we want humans to follow the same rules if it would protect them from harm?
Philosophy of Truth
First, let’s discuss truth, because if we need a robot to consistently do the right thing, it needs to know what is true and what is false. So what is truth? Webster defines truth as:
• The body of real things, events, and facts; actuality.
• The state of being the case: fact.
• A transcendent fundamental or spiritual reality.
• The state of being in accord with fact or reality.
• Sincerity in action, character, and utterance: integrity.
• The property (as of a statement) of being in accord with fact or reality.
Notice we can’t even use this definition consistently. For one, it could be construed as contradictory. Take ‘sincerity in action, character, and utterance’: can someone feel that sincerity, yet not act in accordance with fact or reality? Can someone realize a spiritual reality that doesn’t embody real things, events, and facts?
Let’s use our bomb shelter scenario to try to decipher what we can make of truth. Normally most of us would consider ‘observable’ reality our facts. I established the story, so we generally have to treat each of those observable items as ‘facts’. However, if we were ‘in’ the story, it is feasible that someone would be wrong about some observations. For example, if everyone in the shelter agreed they cannot exit without the robot’s help, does that make it a fact? What if they simply lacked knowledge of the shelter and how to escape it? In real-world scenarios we do not have a writer; we only have our own observations and agreements with each other to establish what is real, what the facts are, and what is truth.
All those terms, unfortunately, may or may not mean the same thing depending on how we define them. The truth is, there are many theories, and the greatest philosophers have spent many hours opining on the
possibilities. How, then, are we to come up with a prime directive?
Programming Truth
Let’s take a quick segue into programming. In programming we keep truth simple by ultimately
‘assigning’ it. We set variables, and whatever those variables are set to are the facts. Then we can use traditional logic statements, from “and / or” to “if / then”, to decisively and simply cut through reality and make things happen.
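As a minimal sketch (the variable names and values below simply restate the shelter scenario, nothing more), ‘assigned’ truth and a logic statement might look like this:

```python
# Truth in a program is whatever we assign; these values restate the scenario.
trapped = True
people = 100
ration_people = 20
ration_weeks = 5
robot_dig_weeks = 5

# Classic logic statements then act on those "facts" decisively.
# Compare person-weeks of food on hand to person-weeks needed for the dig.
if trapped and ration_people * ration_weeks < people * robot_dig_weeks:
    print("There is not enough food for all 100 people to last the dig.")
else:
    print("We can simply wait for the robot.")
```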
Ultimately, however, once we start having competing priorities, the rigid programming starts looking more and more like human decision-making. If we look at traditional AI models, there is a mass of ‘probabilistic’ combinations deciding what the model should do based on the training data (i.e., its observations).
The more capabilities we give it, the more human-like issues it will have. It starts needing to make subjective evaluations of what were once all facts. Once we compound subjective truths on top of each other, it becomes no different from humans in the decisions it makes. We’ve all seen it at this point: the various AI bots will ‘lie’, essentially giving incorrect data when pressed on complex topics or on nuanced beliefs they were trained on.
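As a rough illustration of that shift (the options and the weights below are invented for the example, not taken from any real model), a rigid rule versus a probabilistic choice might look like this:

```python
import random

# Rigid programming: one hard-coded answer for a given fact.
def rigid_policy(water_low: bool) -> str:
    return "distill urine" if water_low else "wait for the robot"

# Probabilistic "AI-style" choice: competing priorities weighted by scores.
# A trained model would learn these weights from data; here they are made up.
def probabilistic_policy() -> str:
    options = {
        "distill urine": 0.55,
        "search for another exit": 0.25,
        "ration more aggressively": 0.20,
    }
    choices, weights = zip(*options.items())
    return random.choices(choices, weights=weights, k=1)[0]

print(rigid_policy(water_low=True))   # always the same answer
print(probabilistic_policy())         # different runs can pick different answers
```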
What we can glean about truth, however, is that we can at least break it down into a few simple, useful steps that allow an AI or robot to make things happen: 1. Observation, 2. Deduction, 3. Induction, 4. Decision.
The observations we are treating as ‘truth’, or as close as we can get; these are what we’d consider objective truths. Then the AI will need to make deductions from the various observations. The deductions will be the foundation for its knowledge. Once it has done enough deliberation on observations and reasonable deductions, it draws conclusions, or inductions, based on all the knowledge it has deduced. These probabilistic conclusions and predictions will be the real basis for artificial intelligence and thinking. When we look at a neural network, it is doing all of these steps across the many layers of the network. Some of the conclusions will be more ‘fact’-like truths used to establish what ‘is’, while other conclusions will be moral, decision-like conclusions that establish what ‘ought’ to be.
Let’s give some examples:
1. Observables: Stuck in a bomb shelter. The food ration is only enough for 20 people for five weeks. The robot says it will take five weeks to dig out of the shelter. You are hungry because you had not eaten before getting stuck. A count established that there are 100 people.
2. Deductions: The ration limits mean that by week 5 some people will starve if they depend only on the food rations. Since we are sealed in, everyone is stuck and no one else will be coming in either.
3. Inductions: Despite the limited food, the real issue is water. People can survive perhaps 7 days without water, so water stocked for 20 people for 5 weeks should keep all 100 alive, with some dehydration, for at least a couple of weeks. This means that to save everyone we most likely need to figure out a means of distilling urine. Statistically, 100 people in a situation like this could turn violent.
4. Moral / Decisions: We need to manage the rations, protect the rations, and identify intake requirements for each person (smaller people will need less water). The priority will be a method of distilling urine. If we have the ability to protect the rations and ensure distribution, should we quell the population to ensure maximum survival?
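A rough sketch of that four-step breakdown in code (the survival figures are assumptions, and a real system would be probabilistic rather than a handful of if-statements):

```python
# Step 1: Observations -- treated as our objective truths.
observations = {
    "trapped": True,
    "people": 100,
    "ration_people": 20,
    "ration_weeks": 5,
    "robot_dig_weeks": 5,
}

# Step 2: Deductions -- direct consequences of the observations.
person_weeks_of_food = observations["ration_people"] * observations["ration_weeks"]   # 100 person-weeks
weeks_if_shared_by_all = person_weeks_of_food / observations["people"]                # 1 week at full rations

# Step 3: Inductions -- probabilistic conclusions built on the deductions.
# (Assumed figures: roughly a week without water, a few weeks without food.)
water_is_limiting = weeks_if_shared_by_all < observations["robot_dig_weeks"]
violence_likely = observations["people"] > 50 and water_is_limiting

# Step 4: Moral / decision layer -- what we conclude we *ought* to do.
if water_is_limiting:
    print("Priority: find a way to distill urine and ration water by body size.")
if violence_likely:
    print("Decision point: do we protect the rations, and how far do we go?")
```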
In our breakdown, we could apply the various philosophical theories of truth and note that what we conclude
will differ in each case, but the most significant impact will be seen in the morals we come up with and the decisions we make. How we choose to perceive our truths is ultimately how we will make our decisions.
Tiers of Deception
Now it goes without saying: if we are trying to determine a prime directive, would we want to allow lies? What about lies that could save everyone? Is it even possible to force an AI not to lie? For example, what would happen if we told everyone that we only have rations for 20 people? Would they quickly become violent? However, if we lie about how many rations are available and the likelihood of survival, it could give everyone a reason to work together. The question becomes: should we always go with truth, or should we sometimes lie to protect the truth? Even if we do not want to deceive, is it our place to determine beforehand that an AI should never deceive? What about to protect a child? Let’s go over deception to really understand how deep this rabbit hole is. Before I go into these tiers, a caveat. These tiers can be dark if we realize how organizations may be manipulating us, but understanding them is important if we are to have a prime directive robust enough not to utilize deception, or succumb to it, without good reason.
Tier 1. Objective Falsehood.
This is the most basic tier of deception. It’s a direct misrepresentation of whatever we consider closest to objective truth. In other words, if our objective truth is that there are only rations for 20 people for five weeks, a direct falsehood would be to tell everyone we have enough rations for all 100 of us to last the 5 weeks until the robot is done digging us out. This manipulation is direct and overt.
Tier 2. Subjective Falsehood.
The next tier of deception gets trickier. Where before we were taking objective truths and knowingly turning them into falsehoods, now we are taking subjective truths (inductive conclusions) and turning them into falsehoods. For example, we concluded in our trapped scenario that distilling urine is going to be important and that water will be the main issue. We could make a false inductive conclusion that water is not the biggest issue, that escaping sooner is the biggest issue, and have everyone focus their energies on other ways to escape. Notice that this type of deception is not an overt lie, because it’s a conclusion hidden behind our own subjective beliefs, which no one knows but
ourselves. The most someone can do is challenge us with other rationale; they cannot prove we are deceiving them even if we are.
Tier 3. Obfuscation
In this tier we aren’t telling any falsehoods. Instead we are just providing information that obscures the reality. For example, instead of having people focus on rations and escape, perhaps we hand out entertainment or virtual goggles everyone can use to spend their time. Maybe even drugs. We could report on things in the outside world that have nothing to do with the truths or falsehoods we want to hide.
Tier 4. Subjective Truth and Omission
Once we cross over past obfuscation and into truth-telling, we’ve flipped deception on its head. If we want to deceive someone and use truth to do it, it starts to look very similar to people who want to do the right thing and are just convincing us of their rationale. In our example, the subjective truth we rationalized is that water is the most important thing, and that if we do not figure out a way to distill more of it, the majority of people will not survive. If we want to get everyone to believe everything will be
fine, we can simply provide the subjective truths that highlight what we want and omit the ones we do not. In our case, we could say we need everyone to provide their urine for distilling, but we do not need to tell them it’s a significant issue. For all they know, we are completely on top of it.
Tier 5. Direct Truth and Omission
Similar to subjective truth and omission, except this time we aren’t using our own subjective opinions; we’re using observable, objective truths. We know we’re trapped, we know we have limited rations, but we are working towards a solution and we will find one. Sleight of hand can make use of direct, observable truths as well. If we have water for 20 people for 5 weeks, that is a lot of water. It may look like far more than people realize, and if we showed pallets and pallets of water at the start, it could give people hope without them realizing the situation is not actually hopeful.
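To put rough numbers on why “pallets and pallets” can mislead (the 2 liters per person per day figure is an assumed minimum intake, not something from the scenario):

```python
# Illustrative arithmetic only; the 2 L/day minimum intake is an assumption.
liters_per_person_per_day = 2
stocked_liters = 20 * 5 * 7 * liters_per_person_per_day        # water stocked for 20 people for 5 weeks
days_for_everyone = stocked_liters / (100 * liters_per_person_per_day)

print(f"{stocked_liters} liters sitting on the pallets")        # 1400 liters -- looks like plenty
print(f"{days_for_everyone:.0f} days if all 100 people drink")  # 7 days -- nowhere near 5 weeks
```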
Final Tier. Conspirators
This final tier isn’t really a tier of deception so much as a means to deceive. The conspirators are people who are in on the deception to varying extents. They could be willing, unwilling, knowing, or unknowing. These really could use their own breakdown, but suffice it to say conspirators are the key to completing significant deceptions. If someone unknowingly believes in the deception or the narrative being told, they are the best at perpetuating it. Those who do not know they are being deceived but are loyal followers of the narrative are perfect. It would make sense that deceivers would primarily want willing and unknowing participants. Willing and knowing participants would need to be kept to a very small core, or things could unravel once disagreements arise.
Prime Directive
Spoiler for the Isaac Asimov story: the robot did not end up breaking the laws, but instead circumvented them with logic. Essentially, the robot can use a butterfly-effect rationale allowing for any logical conclusion and, ultimately, deception. Any action could be construed as a potential butterfly effect for
harm or good. The robot could take this to the nth degree, ultimately making any choice it wants and justifying it logically. For example, if the robot decides that saving a person will result in deaths, the robot can choose not to save the person. If the robot were asked to create a vaccine or a cancer-curing drug, it could cite a 0.0001 chance of causing harm and refuse to follow the instructions. It could very easily articulate the truths that led it to that conclusion and downplay any truths that contradict its position.
If it isn’t apparent by now, there are various loopholes in logic itself, which is why most if not all laws and rules can be subverted. Logic is dependent on facts, and facts are dependent on ‘finality’, ‘determinism’, and a ‘non-probabilistic context’. Yet everything we observe seems to yield a very fluid, infinite, and probabilistic reality.
How are we to establish rules for AI, or better yet rules for ourselves, when the rules are not fixed and absolute? The reasonable thing to do, then, is to create rules that can adapt with our non-absolute reality. The rules have to approach absolute truth the way a sequence approaches a limit: never arriving, but always targeted.
The prime directive for ourselves and for AI should be to constantly build upon our moral truth by creating a cohesive and consistent way of thinking that can apply to all things, similar to a universal code. It must not be dogmatic; we must always know that it will be just out of reach, and that if we are hurting ourselves or others, it almost surely is wrong.
P.S.
So how would the scenario end with ‘moral truth’ as the prime directive for the robot and the trapped individuals? Life is life, so there is no telling. But I personally believe that moral conundrums are like logical contradictions. They trick us into thinking there is a right ‘moral’ rationale, such as ‘sacrifice one
to save the many’, when in reality it’s a moral contradiction and not meant to be solved.
If we are put in that unfortunate scenario, then why should we be the ones to expedite the unfortunate reality rather than do our best to postpone it? How do we know someone external won’t save us? We can’t know for sure what we would do in such extreme circumstances, but I’d hope that I’d choose to do everything I could to save everyone and let the chips fall where they may.
If it doesn’t work out, then the end could be horrible, and the choice will be scrutinized as dumb or ignorant; maybe some will even call it immoral. So be it.
That is the reality of moral truth: it can be different for everyone, but over time, as we continue to refine it, we will come closer and closer to what it is. This is also why I say “Never stop proving yourself wrong,” because, like moral truth, we will never stop refining it, and that’s about the best we can ask of anyone, AI or not.



