p(A_i|n_i) = \frac{n_i+1}{n+m}

where p(A_i|n_i) means "the probability that the next event will be in category i, given that n_i such events have occurred", n_i is the number of times category i was observed, m is the number of categories, and n is the total number of observations.

When nothing has been observed, n_i = 0 and n = 0. To use an example from roulette, a single-zero wheel has 37 "categories" (numbers that could hit), so m = 37, and the probability of, say, #13 hitting is

P(13|0) = \frac{0+1}{0+37} = \frac{1}{37}

which is what you would expect, and it is the same for every other number. Once you have some data, though, the probabilities change. For example, if you have observed 50 spins and #13 has hit 3 times, the probability that it will hit on the next spin is:

P(13|3) = \frac{3+1}{50+37} = \frac{4}{87} \approx 0.046
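As a rough sketch, the formula and the two roulette examples can be checked in a few lines of Python. The function name next_event_probability is just an illustrative choice, and Fraction is used so the results come out as exact fractions.

```python
from fractions import Fraction

def next_event_probability(n_i: int, n: int, m: int) -> Fraction:
    """Return (n_i + 1) / (n + m) as an exact fraction.

    n_i: times category i was observed
    n:   total number of observations
    m:   number of categories
    """
    if m < 1 or n < 0 or not 0 <= n_i <= n:
        raise ValueError("need m >= 1 and 0 <= n_i <= n")
    return Fraction(n_i + 1, n + m)

# Single-zero roulette, nothing observed yet.
print(next_event_probability(0, 0, 37))      # 1/37

# 50 spins observed, #13 has hit 3 times.
p = next_event_probability(3, 50, 37)
print(p, round(float(p), 3))                 # 4/87 0.046
```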

It's important that the categories are mutually exclusive and exhaustive, i.e., that only one event can occur and that one of the events must occur. For example, the categories red and even are not mutually exclusive because the event could be red and even, and the events red and black, although mutually exclusive, are not exhaustive because the event zero has not been included.
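One way to see why the categories must be exhaustive: when every observation falls into exactly one listed category, the probabilities from the formula add up to exactly 1. The sketch below assumes a made-up list of spins and a hypothetical helper category_probabilities, which refuses data that falls outside the category list.

```python
from collections import Counter
from fractions import Fraction

def category_probabilities(observations, categories):
    """Apply (n_i + 1) / (n + m) to every category in a list of distinct categories."""
    counts = Counter(observations)
    unknown = set(counts) - set(categories)
    if unknown:
        # The category list is not exhaustive for this data.
        raise ValueError(f"observations outside the category list: {unknown}")
    n, m = len(observations), len(categories)
    return {c: Fraction(counts[c] + 1, n + m) for c in categories}

# Made-up data, purely for illustration.
spins = ["red", "black", "red", "zero", "black", "red"]

probs = category_probabilities(spins, ["red", "black", "zero"])
for category, p in probs.items():
    print(category, p)
print("total:", sum(probs.values()))   # total: 1
```

Leaving "zero" out of the category list makes the helper raise an error for this data, which is the roulette red/black problem described above.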

If an event makes no difference in terms of payoffs, it can be ignored. The tie in baccarat is an example, so there would be just two categories in this case: P (Player) and B (Banker).
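In code, ignoring an event just means dropping it from the data before counting. This is a minimal sketch with a made-up run of hands, where "T" marks a tie; the function name baccarat_probabilities is hypothetical.

```python
from fractions import Fraction

def baccarat_probabilities(hands):
    """Estimate P and B probabilities with ties removed, so m = 2."""
    decided = [h for h in hands if h in ("P", "B")]   # ties ("T") are dropped
    n, m = len(decided), 2
    return {side: Fraction(decided.count(side) + 1, n + m) for side in ("P", "B")}

# Made-up hands: 3 Player wins, 5 Banker wins, 2 ties.
hands = list("PBBTPBBPTB")
print(baccarat_probabilities(hands))
# {'P': Fraction(2, 5), 'B': Fraction(3, 5)}
```

Note that n counts only the 8 decided hands, not all 10.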

An important assumption behind this formula is that nothing is known about the cause of the events. Of course, in casino games we do know in a sense what causes the outcomes, but not in any detailed way. Most of the time in roulette we don't know why the ball falls into a particular number, because the train of events leading up to that outcome is too complex, although in principle it could be known, given enough information about the "initial conditions" (ball speed, wheel characteristics, ball type, environmental conditions, etc.).

So if we do have some evidence that the wheel is biased, say, then we ought to take this into account, but we cannot use the above formula because it will give misleading results (the full machinery of Bayesian statistics is required, which is much more complex). So this formula only applies when we have data in terms of the frequencies of the events, nothing more - as far as we're concerned, the events are "random" and all we have are the frequency counts.

If you haven't realised yet, the formula is basically a trending system. It always gives those events which are "hotter" a higher probability. Obviously, this is controversial in regard to casino games, which are designed to have a fixed probability at all times. But are we justified in going along with this? Even if we are, and even if it's reasonable to conclude that casino games are deliberately set up to be as "fair" as possible (meaning no bias), variance is undeniable. So even if "in the long run" the probabilities do converge to those predicted by the standard theory, the conventional probability formulas are of no help in the short term. What's needed is a way of tracking the short-term fluctuations (biases, if you prefer).
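The trending behaviour is easy to see by recomputing the estimate after every spin. The following is only a sketch using a made-up sequence of five spins; running_estimates is a hypothetical helper name.

```python
from fractions import Fraction

def running_estimates(outcomes, category, m):
    """Return the estimate for `category` before any data and after each outcome."""
    n = n_i = 0
    estimates = [Fraction(n_i + 1, n + m)]
    for outcome in outcomes:
        n += 1
        if outcome == category:
            n_i += 1
        estimates.append(Fraction(n_i + 1, n + m))
    return estimates

# Made-up spins on a 37-number wheel; #13 hits on spins 1, 3 and 5.
spins = [13, 4, 13, 22, 13]
for step, p in enumerate(running_estimates(spins, 13, 37)):
    print(step, p, round(float(p), 4))
```

The estimate for #13 rises after each hit and drifts down after each miss, which is exactly the "hotter gets a higher probability" behaviour.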

In the long run, the formula does predict what the other formulas say will happen, because as n grows the +1 and +m terms matter less and less and the estimate approaches the raw frequency n_i/n. For example, if after 1000 spins you have 27 zeros, 496 reds, and 477 blacks, then m = 3, n = 1000, and for black n_i = 477, so the probability of the next spin being black is:

P(black|477) = \frac{477+1}{1000+3} = \frac{478}{1003} \approx 0.477

which is not far off what it "should" be (18/37, or about 0.486).
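To illustrate the long-run behaviour, the sketch below compares the formula with the raw frequency n_i/n as the number of spins grows. It assumes, purely for illustration, that black hits exactly 18 times in every 37 spins; gap_to_raw_frequency is a hypothetical helper name.

```python
from fractions import Fraction

def gap_to_raw_frequency(n_i, n, m):
    """Size of the difference between (n_i + 1) / (n + m) and n_i / n."""
    return abs(Fraction(n_i + 1, n + m) - Fraction(n_i, n))

m = 3  # categories: red, black, zero
for k in (1, 10, 100, 1000):
    n, n_i = 37 * k, 18 * k          # assumed: black hits exactly 18/37 of the time
    estimate = Fraction(n_i + 1, n + m)
    print(n, round(float(estimate), 5), float(gap_to_raw_frequency(n_i, n, m)))
```

The gap shrinks roughly in proportion to 1/n, so with enough data the formula and the plain frequency count say practically the same thing.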

For some more background reading, see:

https://archive.org/stream/theprinciplesof00jevoiala#page/256/mode/2up

In particular, I like the passage on page 260.

There is a more mathematical account of the formula here:

http://en.wikipedia.org/wiki/Rule_of_succession