Imagine being a 7-year-old. You are standing at the entrance of a maze-like land that has two paths. One path is candy land: it has all your favorite food and candy and the music you love. The other is the boring path, which has all the boring food, low-fat yogurt, sugar-free cereal and the most boring music you can think of. Chances are, just about every kid is going to choose the path filled with things they love. Now if we were to try to convince the child not to take the candy land path, how would we do it? We would have to identify something in the future that is greater than candy land. Addiction in many ways is like candy land, a sea of good feelings, while the alternative is withdrawal and dull, undesirable tasks. Let’s talk about ways to break the chains.
In the artificial intelligence space, one paradigm is to create an algorithm that selects the best action by calculating the rewards earned (or lost) for each action as it relates to predicted states. In our 7-year-old example, what we predict will happen as a result of taking each path would govern our decision. We may take a few simple ideas into consideration: 1. How good is each path? 2. Could something bad happen to me if I choose candy land? 3. How likely is it that something bad will happen?
Everything else can be wrapped into those simple questions, which ultimately are very similar to reinforcement learning concepts: determine the positive or negative reward for going into a particular state (candy land vs boring land), extrapolate the predicted states out as far as possible based on probabilities, and make the choice. In the reinforcement learning space, the ideal path is considered the one with the highest “utility”. So we may see a lot of “individual” rewards (the candy), but it is the overall utility, which combines all the predicted states and rewards, that actually drives the decision.
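As a rough illustration of the idea above, the sketch below adds up probability-weighted rewards over a couple of predicted steps for each path. The function name `expected_utility`, the two-step layout and every number are assumptions made up for this example; they are not taken from any real model.

```python
def expected_utility(predicted_steps):
    """Add up probability-weighted rewards across a sequence of predicted steps.

    predicted_steps is a list of steps; each step is a list of
    (probability, reward) pairs describing what might happen at that step.
    """
    total = 0.0
    for outcomes in predicted_steps:
        total += sum(probability * reward for probability, reward in outcomes)
    return total


# Hypothetical numbers, chosen only to illustrate the idea.
candy_land = [
    [(1.0, 10.0)],                # step 1: candy, guaranteed
    [(0.7, 10.0), (0.3, -20.0)],  # step 2: more candy, or a stomach ache
]
boring_land = [
    [(1.0, -2.0)],                # step 1: low-fat yogurt
    [(1.0, 5.0)],                 # step 2: a modest payoff
]

paths = {"candy land": candy_land, "boring land": boring_land}
best = max(paths, key=lambda name: expected_utility(paths[name]))
print(best)
```

With these made-up numbers candy land still comes out ahead, even with a 30% chance of a bad second step, because the individual rewards along that path add up to more overall.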
When programming a simple reinforcement learning bot, the candy land scenario will always win if staying in candy land forever is an option. The only way to break the cycle would be to make staying in candy land bad, or to make boring land somehow better overall. For both a bot and a typical 7-year-old, the only way to predict a higher utility for boring land would be to see either a large number of negative rewards after candy land or a large number of positive rewards after boring land. We essentially need to make boring land look great and candy land look bad.
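A minimal sketch of that tipping point, assuming a bot that simply totals the rewards along each path and picks the larger total. The helper names `path_utility` and `choose`, and all the reward values, are hypothetical.

```python
def path_utility(reward_per_step, steps, reward_afterwards=0.0):
    """Total reward for staying on a path for `steps` steps, plus whatever follows."""
    return reward_per_step * steps + reward_afterwards


def choose(paths):
    """Return the name of the path with the highest total utility."""
    return max(paths, key=lambda name: path_utility(**paths[name]))


# Hypothetical rewards: candy land pays +1 per step, boring land costs 1 per step.
plain = {
    "candy land": {"reward_per_step": 1.0, "steps": 10},
    "boring land": {"reward_per_step": -1.0, "steps": 10},
}

# Same setup, but a large positive reward follows boring land.
boring_pays_off = {
    "candy land": {"reward_per_step": 1.0, "steps": 10},
    "boring land": {"reward_per_step": -1.0, "steps": 10, "reward_afterwards": 25.0},
}

# Same setup, but a large negative reward follows candy land.
candy_backfires = {
    "candy land": {"reward_per_step": 1.0, "steps": 10, "reward_afterwards": -25.0},
    "boring land": {"reward_per_step": -1.0, "steps": 10},
}

print(choose(plain), choose(boring_pays_off), choose(candy_backfires))
```

In this toy setup the choice only flips away from candy land once the reward that follows one of the paths is large enough to outweigh everything collected along the way.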
Probability of Random
Notice, however, that to really tip the scales, the child would have to firmly believe that something even more amazing is likely to happen after the boring path. If we randomized what happens after the boring path, that would immediately change the decision process. Without a greater reward guaranteed, we may as well just go for candy land right now. As the probability of something good after boring land gets lower and lower, the boring land decision becomes harder and harder to make.
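One way to picture this is as a break-even probability: how likely does the later reward have to be before the boring path is worth it? The sketch below assumes the boring path has one immediate (negative) reward and one uncertain later reward, and that the later reward is positive. The function names and numbers are invented for illustration.

```python
def boring_path_utility(immediate_reward, later_reward, p_later):
    """Expected utility of the boring path when the later reward is uncertain."""
    return immediate_reward + p_later * later_reward


def break_even_probability(candy_utility, immediate_reward, later_reward):
    """Probability of the later reward at which both paths look equally good.

    Assumes later_reward is positive.
    """
    return (candy_utility - immediate_reward) / later_reward


# Hypothetical values: candy land is worth 10, boring land costs 2 up front
# and might lead to something worth 30.
CANDY_UTILITY = 10.0
IMMEDIATE_REWARD = -2.0
LATER_REWARD = 30.0

threshold = break_even_probability(CANDY_UTILITY, IMMEDIATE_REWARD, LATER_REWARD)
for p in (0.9, 0.5, 0.2):
    boring = boring_path_utility(IMMEDIATE_REWARD, LATER_REWARD, p)
    verdict = "boring land" if boring > CANDY_UTILITY else "candy land"
    print(f"p={p}: boring path utility {boring:.1f} -> choose {verdict}")
```

Under these assumptions the child needs to believe the later reward is more likely than the threshold (here 12 divided by 30) before the boring path wins.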
The effect of probability and randomness is why consistency is so important when working with children, animals and really everyone. The better we are able to assess probable outcomes, the more we can govern our own choices. If rewards or punishments arrive at random and we can’t predict how they relate to our current state and actions, they become useless at best and counterproductive at worst.
A child will endlessly eat up cookies, ice cream and good food until they see a highly probable predicted future that is way worse or way better than eating cookies and ice cream all day. The farther out a child can predict, the higher the chance they will see other possible paths.
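The effect of looking farther ahead can be sketched by totaling rewards only up to a prediction horizon. The reward sequences below are invented: the cookie path feels great for three steps and then turns sour, while the alternative is mildly positive throughout.

```python
def utility_within_horizon(rewards, horizon):
    """Total only the rewards the decision maker can predict (the first `horizon` steps)."""
    return sum(rewards[:horizon])


# Hypothetical per-step rewards.
cookies_all_day = [5, 5, 5, -4, -4, -4, -4, -4]
balanced_day = [1, 1, 1, 1, 1, 1, 1, 1]

for horizon in (3, 8):
    cookies = utility_within_horizon(cookies_all_day, horizon)
    balanced = utility_within_horizon(balanced_day, horizon)
    better = "cookies" if cookies > balanced else "balanced"
    print(f"horizon={horizon}: cookies={cookies}, balanced={balanced} -> {better}")
```

With a short horizon the cookie path looks far better; only when the prediction reaches past the third step does the other path become visible as the better one.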
Now imagine that we have, or know someone who has, an addiction; we’ll use tobacco as an example. If smoking makes us feel a little better, a little more relaxed, and lets us think without being as anxious and snappy, while not smoking does the opposite, we have a similar situation: good things on one side and bad things on the other. Just like the child and candy land vs boring land, there is no reason to choose the bad path without the ability to make a prediction as to what happens afterwards.
Chances are, if someone is addicted and smoking regularly, they have a history of smoking with nothing bad happening. They may acknowledge that lung cancer is a possibility, but see it as so many years into the future that they don’t care. In fact, if we look at statistics, only 15% of people who smoke develop lung cancer. That’s a very low percentage to weigh against an everyday reward. Imagine trying to program that into a computer; the overall “utility” would literally favor continuing to smoke if those were the only factors and predictions taken into consideration.
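To see what “programming that into a computer” might look like, here is a deliberately narrow toy model that considers only the two factors named above: a small daily reward and a 15% chance of one large penalty far in the future. Only the 15% figure comes from the text; the function name `habit_utility`, the reward and penalty sizes, the 20-year span and the `weight_on_distant_future` parameter are all assumptions made up for this sketch.

```python
def habit_utility(daily_reward, days, p_penalty, penalty, weight_on_distant_future=1.0):
    """Utility of a habit with a small daily reward and one uncertain, distant penalty.

    weight_on_distant_future (0 to 1) models how much the person cares about
    something that is many years away; 1.0 means they care fully.
    """
    return daily_reward * days + weight_on_distant_future * p_penalty * penalty


DAYS = 20 * 365   # hypothetical 20-year span
P_PENALTY = 0.15  # the 15% figure quoted in the text

# Hypothetical scale: each day is worth +1, the distant penalty is -10,000.
cares_fully = habit_utility(1.0, DAYS, P_PENALTY, -10_000.0)
barely_cares = habit_utility(1.0, DAYS, P_PENALTY, -10_000.0, weight_on_distant_future=0.1)

# The same model only turns negative if the penalty is imagined as far larger.
much_larger_penalty = habit_utility(1.0, DAYS, P_PENALTY, -100_000.0)

print(cares_fully, barely_cares, much_larger_penalty)
```

In this toy model the total stays positive unless the distant penalty is imagined as enormous, which is exactly the problem the paragraph describes: with only these factors in view, the numbers favor the habit.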
We’d have to start stacking other reward structures on top, for example making a significant other happy. If a partner hates the smell of smoke, that may have some effect. But we can similarly stack more options that support the addiction, like nicotine gum, vapes, dip, patches, etc. In fact, with some options we eliminate the risk of lung cancer altogether and we’re back at square one: what is the real detriment, and how do we get the overall utility of using nicotine to be lower than the alternative?
Power of Imagination
Whenever we make predictions about what could happen, we are using our imagination. We don’t know what will happen, we just imagine based on our experience and understanding. Our imagination could be based on logic and reasoning, or it could be much more random and free from limits. For better or worse, to fight an addiction that doesn’t have immediately apparent negative effects, we have to use our imagination.
The child who was told they would get a trip to their favorite theme park if they chose the boring land immediately uses their imagination to determine “What will be in this theme park?”, “Are we really going to go?”, “Does my mom always do what she says we’re going to do?”, on and on they imagine all these different possibilities. For a child who has never been to a theme park, that could make the theme park sound amazing, or it could make it sound just as boring. It all depends on their imagination.
Believe it or not, it is no different for adults, who often just have a more constrained and logical imagination. We use our imagination every time we make a prediction about what hasn’t happened or what could happen, the “counterfactual”. Some believe that our ability to imagine, to see and articulate counterfactuals, is the difference between humans and most other animals.
If it isn’t clear, what this ultimately means is the imagination is immensely powerful. A lot of good can come from the imagination, but so can a lot of bad. We can convince ourselves that we’re 10 times luckier when we don’t smoke. Just as easily, we can convince ourselves to do bad things. Even if we know logically that it wouldn’t make sense, we can “imagine” the predicted outcomes in a way that suits our own desires and beliefs.
A baby elephant can be tied to a tree with a small rope to keep it from roaming too far. In the beginning the elephant will test the rope and try to break free. But after a while, it “learns” and decides there is no need to try to break the rope. The same elephant, grown up to 12,000 lbs, can be tied to a tree with that same rope, even though it could easily break it with a strong tug. If it never needed to break the rope, it often won’t even try. The real truth behind this story isn’t necessarily that the elephant doesn’t think it can break the rope; it’s possible that it can’t imagine why it would even need to.
Someone who has an addiction has a major problem to overcome: they can only perceive using as their best option. In fact, given their situation, the positive rewards they get from the addiction versus the negative rewards of withdrawal, combined with an unpredictable future, give us candy land vs boring land. Those in the worst situations have very little ability to predict a positive future and as such naturally relegate themselves to what is more immediately available and known.
To summarize, to break mental chains we have to make our desired path the one with the most “utility”, in other words the path that yields the highest overall rewards. To do so we have an infinite number of options, but they can be summarized in a few areas:
1. Identify there is a problem, either through key indicators like job problems, bad relationships, inability to save money, inability to think clearly, or by simply “knowing” and desiring change.
2. Determine a path that is positive and that can be applied for an entire lifetime. Ensure that the identified path is truly positive by getting differing perspectives from at least two qualified individuals (people who are where we want to be) who do not support the undesired path in any way.
3. Understand how large the addictive rewards are in comparison to the desired path. Associate negative rewards and more problems with the addicted/undesirable option. Associate more rewards and solutions with the desired option.
4. Learn to predict and imagine well beyond the immediate future. The greater the ability to create a long sequence of predictions that is in line with the desired option, the easier it will be.
These points sound simple, but they require an immense imagination. Addictions and negative spirals can be extremely powerful, so the desired positive path will almost always require a large amount of creative imagination and a very long timeline to convince ourselves that the addiction isn’t worth it. If we know someone who has an addiction, we should try to be patient and understand that their sober choices may not make sense unless we see how candy land works.
As always, if you or someone you know truly has a problem, get in touch with the professionals. No one should bottle it up and try to do it alone.
SAMHSA’s National Helpline