To say that mathematics and artificial intelligence are interlinked is an understatement. This is not just because binary data processing is itself a form of calculation, but because mathematical concepts underpin both the principles and the capabilities of AI as currently understood.
The law of large numbers is one theorem in particular that captures the fundamental relationship between data analysis and repetition. In this article, we take a brief look at the law’s specific connection to AI.
What is the Law of Large Numbers?
Put simply, the Law of Large Numbers is an intuitive theorem stating that as you repeat a trial or experiment with a probabilistic outcome, for example a coin toss, the observed average of the results gradually approaches the expected value.
In our coin toss example, if you toss a coin ten times, you will quite likely get a result that deviates from the calculated probability. Instead of an even 50% split between heads and tails, you could get, for instance, three tails and seven heads, or six tails and four heads.
However, when we repeat the coin toss many, many times, on the order of thousands or even millions of times, the observed proportion settles closer and closer to the expected 50% for each side. This is where the “large numbers” part of the theorem comes into play. It demonstrates the reliability of a calculated probability when given a large number of repeated trials.
Of course, the actual theorem comes in more precise forms, which differ in how strongly the average is guaranteed to converge. The weak law says that, for a large enough number of trials, the average is very likely to be close to the expected value; the strong law says that the average converges to the expected value with probability one. But we more or less get the idea: the law of large numbers states that the more trials we run, the closer the observed average gets to its calculated probability.
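The coin toss example can be tried out directly. The following is a minimal sketch, assuming a fair coin simulated with Python’s standard `random` module; the function name `heads_fraction` and the fixed seed are illustrative choices, not part of the theorem.

```python
import random


def heads_fraction(num_tosses, seed=0):
    """Simulate fair coin tosses and return the fraction that land heads."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same tosses
    heads = sum(1 for _ in range(num_tosses) if rng.random() < 0.5)
    return heads / num_tosses


# Few tosses can stray far from 0.5; many tosses tend to land close to it.
for n in (10, 1_000, 100_000):
    print(f"{n:>7} tosses: {heads_fraction(n):.4f} heads")
```

With only ten tosses the printed fraction can sit well away from 0.5, while the larger runs should fall much nearer to it.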
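For readers who want the weak and strong versions in symbols, here is one standard way of writing them, assuming independent, identically distributed trials with a finite mean. The symbols are notation introduced here for illustration.

```latex
% X_1, X_2, ... are independent, identically distributed trials with finite mean \mu.
% For a fair coin: X_i = 1 for heads and 0 for tails, so \mu = 0.5.
\[
  \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i
\]

% Weak law: the sample average converges to \mu in probability.
\[
  \lim_{n \to \infty} P\left( \left| \bar{X}_n - \mu \right| > \varepsilon \right) = 0
  \quad \text{for every } \varepsilon > 0
\]

% Strong law: the sample average converges to \mu with probability one.
\[
  P\left( \lim_{n \to \infty} \bar{X}_n = \mu \right) = 1
\]
```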
Applications of the Law of Large Numbers
Since the law of large numbers deals with any sort of trial with a probabilistic outcome, its applications are very wide and technically considered generic. However, there are a few applications where this theorem may matter more than in others, such as:
- Statistics – one trial represents a variation within a data entry. The large number of samples determines the average outcome. For example, the likelihood of a disease can be estimated from its occurrence in a sample population.
- Logistics – one trial represents each point within an operation. The resulting efficiency of all points within the entire organization determines the average outcome. Reliable calculation of product transportation and delivery logistics, for instance, requires averaging operation variables, such as weather conditions, route availability, number of vehicles, modes of transport, and so on.
- Experimentation – the most direct case in probability theory. One trial is literally one experiment. Repeated experiments determine the average outcome. This is where our previous coin toss example falls.
The basic reason why this theorem is so important lies in its reliability. Even if certain random outcomes produce a completely opposite or unintended result, because the results are averaged over a long period and a high number of trials, you still get a more or less accurate estimate of what was initially calculated.
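To make this reliability concrete, the sketch below starts a coin toss experiment with a deliberately unlucky streak of tails and then continues with fair tosses. It is only an illustration under assumed values: the streak length, the checkpoints, and the name `running_heads_fraction` are all hypothetical.

```python
import random


def running_heads_fraction(forced_tails, fair_tosses, seed=1):
    """Begin with a streak of tails, then toss fairly; report the heads fraction."""
    rng = random.Random(seed)
    heads = 0
    total = forced_tails  # the unlucky streak contributes tosses but no heads
    checkpoints = {}
    for i in range(1, fair_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1, False as 0
        total += 1
        if i in (10, 1_000, 100_000):
            checkpoints[total] = heads / total
    return checkpoints


# Ten tails in a row, followed by 100,000 fair tosses.
for total, fraction in running_heads_fraction(10, 100_000).items():
    print(f"after {total:>7} tosses: {fraction:.4f} heads")
```

The early streak is never cancelled out by later tosses. It simply carries less and less weight as the total number of trials grows.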
In other words, it is fundamental to data analysis, both in the accumulation and gathering of data, as well as parsing the information to arrive at a precise conclusion.
Law of Large Numbers and AI
At this point, the importance of the law of large numbers to artificial intelligence is probably becoming more apparent. After all, if AI is designed to work with data in order to complete tasks where a human should or could, it will be necessary that the system arrives at its most logical interpretation using all available relevant information.
In this regard, machine learning may perhaps be the best and most straightforward example. By our own previous definition:
“A machine learning system is capable of analyzing an enormous amount of data in a short time, creating solutions or conclusions from it.”
Consider the phrase “enormous amount of data.” Machine learning systems adapt to this huge amount of information in order to optimize their algorithms. Data entries therefore represent trials, while the resulting averages are the patterns and collated pieces of data that the machine learning AI uses to take a corresponding action or decision.
This may sound repetitive because it actually is. The basic structure of a machine learning AI is precisely what the law of large numbers represents as a mathematical theorem, only translated in a more operative format. There are, of course, variations to the exact information given, as well as its practical applications, but the core concept remains consistent.
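The idea of data entries acting as trials can be sketched with a deliberately tiny “model” that learns a single number, the average of its data. This is a toy illustration rather than how a real machine learning system is built: the data is synthetic, and `TRUE_MEAN`, `NOISE`, and `typical_error` are hypothetical names and values chosen for the example.

```python
import random
import statistics

TRUE_MEAN = 5.0  # hypothetical quantity the "model" is trying to learn
NOISE = 2.0      # hypothetical standard deviation of each data entry


def typical_error(sample_size, repeats=200, seed=2):
    """Average absolute error of the learned mean over many repeated datasets."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        sample = [rng.gauss(TRUE_MEAN, NOISE) for _ in range(sample_size)]
        learned = statistics.fmean(sample)  # "training" here is just averaging
        errors.append(abs(learned - TRUE_MEAN))
    return statistics.fmean(errors)


# Larger datasets should leave the learned value closer to the true one.
for n in (10, 100, 1_000):
    print(f"{n:>5} data entries: typical error {typical_error(n):.4f}")
```

In this toy setting the typical error shrinks as the number of data entries grows, which is the same effect the coin toss shows on a larger scale.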
Special Applications in AI
We now understand that the law of large numbers is basically the schematics of modern AI. What are its applications then? Again, due to the fundamental nature of this theorem to data science in general, almost any AI that requires data input can be used as an example given enough experimentation and number of trials.
However, there are three specific applications that we would like to mention. These examples are by no means the best technical representation of AI using the law of large numbers. But they are nonetheless exceptional, because of how they are integral, or how they may soon become integral, to modern 21st century living.
- Statistical Arbitrage
To start with the most obvious and straightforward of the three, this application is perhaps more popularly known for its derivative form: stock trading. Stock trading AI is one of the best-known types of big data AI that demonstrates the law of large numbers directly.
Automated trading strategies are built from trading algorithms, which are in turn built from data collected from trading histories and observed trends in the stock market. Generally, machine learning systems in this application use the averaged variables to pinpoint potential correlations, and thus provide buying and selling suggestions depending on constantly changing economic variables.
The mathematical intricacies are, of course, quite complex. But as a user, all you need to understand is that the more (trustworthy) trading data is acquired, the higher the probability that the AI extrapolates market trends accurately.
- Vehicle Automation
AI development for self-driving vehicles takes the law of large numbers quite literally, and runs with it (pun intended). Tesla, for example, parses and collates data from countless Tesla car users, “using billions of miles to train neural networks”.
In this example, car mileage data is averaged to plot out and optimize paths and driving policies. Recorded video and images are repeatedly analyzed by the AI, so that it eventually predicts visual elements with a reliable rate of probability. Even data involving the driving decisions of other cars on the road is averaged to help the AI make better predictions of what other drivers are most likely to do in the near future.
Moreover, self-driving AI systems also have to contend with more abstract variables, such as the unpredictability of human behaviour, which is something we can all agree is far more complex than a mere coin toss.
- Medical Diagnosis and Treatment
Providing the right cure for the correctly identified illness or disease is the crux of medical technology today, especially now that medical professionals identify far more physiological variables than doctors of past centuries would have been able to visualize.
An AI trained specifically to handle even the tiniest variations in medical data would be the perfect tool to find patterns and consistencies that might otherwise have eluded a human specialist. Watson, IBM’s Jeopardy!-winning powerhouse AI, may have eventually made errors due to bad initial data. But it did at least deliver the main concept of using machine learning to diagnose medical issues and to recommend treatment.
In this case, the law of large numbers served to fulfill the tasks of extraction, classification, and prediction. Extraction via organizing unstructured information based on seemingly unrelated medical records. Classification via analysis of repeating patterns and occurrences no matter how tiny, insignificant, or even unnoticeable. And lastly, the prediction towards a method of treatment based on what the AI thinks is the most likely information to be proven accurate and true.
Other notable demonstrations of the law of large numbers in AI are potential game changers as well. Deep learning-based weather prediction and the ever-improving gambling AI, for example, are also bound to shape the future of our world in some way, and could take us in directions we have yet to even begin to consider.
As one Google Translate engineer put it, “when you go from 10,000 training examples to 10 billion training examples, it all starts to work. Data trumps everything.”
Garry Kasparov, yes, the man defeated in chess by the AI Deep Blue, cites this quote in his book Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins. This one sentence sums up succinctly why the law of large numbers is inevitably intertwined with AI.
Reliability eventually gained through sheer number of attempts.
Data, indeed, trumps everything… in large numbers.
Also published on Medium.