Friends of mine asked what is the probability of being a "miracle baby", for example a baby born on 9/14 and weighting 9lb14oz? In other words what is the probability that a baby's birthday expressed as MM/DD corresponds to his birthweight expressed as MM pounds and DD ounces? (Finally in my life I found a use of imperial units: they make theoretical number problems more interesting ;)
Some complications need to be taken into account. A miracle baby cannot be born on the 16th (or later) day of the month because weights are expressed with a number of ounces <= 15. A miracle baby cannot have a weight ending in "0oz" because there is no 0th day of the month.
The estimated probability of being a miracle baby is the probability of the birth weight being a valid date, multiplied by the probability of being born on this date.
To calculate the probability of the birth weight being a valid date, the first observation we can make is that almost all babies are between 1lb1oz and 12lb15oz. All intermediary weights are valid dates (1/1, 1/2, …, 12/15), except weights ending in "0oz". Readings ending in "0oz" are approximately as common as those ending in "1oz", or "2oz", …, or "15oz", because the bell curve of the distribution of weights is quite wide. So the probability of a realistic birth weight being a valid date (ie. NOT ending in "0z") is approximately 15/16. The exact Gaussian distribution parameters (eg. mean and standard deviation of Asian babies vs. American babies) have little influence on this, again because the bell curve is quite wide for all baby populations.
To calculate the probability of being born on any date, we can approximate it to 1/365 (this assumes birthdays uniformly distributed through the year).
Therefore the estimated probability of being a miracle baby is:
P(miracle baby) = P(birth weight being valid date) * P(being born on this date) P(miracle baby) = 15/16 * 1/365 P(miracle baby) = 0.257%
To verify this estimation, we can calculate the exact probability from real-world data. P(miracle baby) is exactly the sum of 365 terms:
P(miracle baby) = P(born 1/1) * P(weight 1lb1oz) + ... + P(born 12/31) * P(weight 12lb31oz)
(With invalid weights like P(weight 12lb31oz) equal to 0.)
To calculate P(born MM/DD) we will use birthday distribution data from the CDC. The most recent data set I can find is from year 2003. This shows for example that 4089950 babies were born that year, 7783 were born on Jan 1, so P(born 1/1) = 7783/4089950 = 0.190%.
To calculate P(weight XlbYoz) we need Gaussian distribution parameters. We use a mean of 3.39 kg and a standard deviation of 0.55 kg (source: Medical Statistics: A Textbook for the Health Sciences). P(weight XlbYoz) is the probability of the weight falling in a range that rounds to this value: Xlb(Y-0.5)oz to Xlb(Y+0.5)oz.
I wrote a little Python program (miracle.py) to calculate all P(born MM/DD) and P(weight XlbYoz), and add up the 365 terms. The CDC did not make it easy to import their data (sigh), but googling around I found the same numbers in a text file, so it was easy to copy/paste them into the code. The program prints the real-world probability of a US baby born in 2003 being a miracle baby:
$ ./miracle.py P(miracle baby) = 0.262%
This is very close to the estimation of 0.257%. This is because the distribution of birthdays is not that funky —so 1/365 was a good enough approximation. And as expected changing the mean weight and its standard deviation has little influence: a mean varying from 3.0kg to 3.8kg and a standard deviation varying from 0.3kg to 0.7kg put P(miracle baby) between 0.259% and 0.267%.