Stats, ML, Data - Programs and code related to the Central Limit Theorem


Question 1
A large truck can carry a maximum of 9800 pounds. Suppose a load of cargo containing 49 boxes must be transported via the truck. The box weight of this type of cargo follows a distribution with a mean of µ = 205 pounds and a standard deviation of σ = 15 pounds. Based on this, what is the probability that all 49 boxes can be safely loaded onto the truck?

Python code snippet to solve this problem (note the use of math)


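from math import sqrt, erf

max_weight = 9800
avebox = 205
sigmabox = 15
no_boxes = 49

# Total weight of all boxes: mean n*mu, standard deviation sigma*sqrt(n)
exptot = avebox * no_boxes
sigmatot = sigmabox * sqrt(no_boxes)

# Normal CDF evaluated at max_weight, written with erf
prob = 0.5 * (1 + erf((max_weight - exptot) / sigmatot / sqrt(2)))
print(prob)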

R code snippet to solve this problem (note the use of pnorm)

tot_mean = 205*49
tot_var = 15*15*49
p <- pnorm(9800, mean=tot_mean, sd=sqrt(tot_var))
cat(sprintf("%.4f\n", p))


Question 2

There is a sample of 100 values from a population where mean µ = 500 and with standard deviation σ = 80. 

What is the probability that the sample mean will be in the interval (490, 510)?
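
As a worked sketch of the reasoning (assuming the CLT approximation holds at n = 100, with Φ denoting the standard normal CDF), the sample mean is approximately normal with mean µ = 500 and standard error σ/√n = 80/√100 = 8, so the probability can be standardized as:

```latex
\[
P(490 < \bar{X} < 510)
  = \Phi\!\left(\frac{510 - 500}{8}\right) - \Phi\!\left(\frac{490 - 500}{8}\right)
  = \Phi(1.25) - \Phi(-1.25)
  = 2\,\Phi(1.25) - 1
  \approx 0.7887
\]
```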


R code snippet to solve this problem based on the CLT (note the use of pnorm)

se <- 80/sqrt(100)  # standard error of the sample mean
cat(pnorm(510, mean=500, sd=se) - pnorm(490, mean=500, sd=se))

Python code snippet to solve this problem (computing a Gaussian fraction). Note the use of erf (the error function)
#!/usr/bin/python
import math
def gaussianfraction( mu, sigma, a, b ):
    if a=="neginf":
        valA = 0.5
    else:
        valA = 0.5*math.erf( (mu-a)/(2**0.5*sigma) )
    if b=="posinf":
        valB = -0.5
    else:
        valB = 0.5*math.erf( (mu-b)/(2**0.5*sigma) )
    return "%0.4f" % (valA-valB)

mu = 500
sigma = 80./10
a = 490
b = 510
print(gaussianfraction(mu, sigma, a, b))



Question 3

There is a sample of 100 values from a population where the mean µ = 500 and the standard deviation σ = 80. Given this info, compute the interval that covers the middle 95% of the distribution of the sample mean, i.e., compute A and B such that P(A < x < B) = 0.95, where x is the sample mean.

R program to solve this problem based on the Central Limit Theorem (note the use of qnorm)

n=100
miu = 500
sigma = 80
se = sigma/sqrt(n)

write(qnorm(0.025, mean=miu, sd=se), stdout())
write(qnorm(0.975, mean=miu, sd=se), stdout())


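Python program to solve this problem based on the CLT (note the hard-coded z value, the 0.975 quantile of the standard normal distribution)

import math

z = 1.95996398454005385560443065  # 0.975 quantile of the standard normal
mu = 500
sigma = 80
n = 100

# Work with the sum of the n values, then scale back to the mean
mu *= n
sigma *= math.sqrt(n)
A = mu - z*sigma
B = mu + z*sigma
print("{:0.2f}".format(A/n))
print("{:0.2f}".format(B/n))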
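
As an alternative to hard-coding z, the quantiles can be computed directly, mirroring what qnorm does in the R version. This is a minimal sketch, assuming Python 3.8 or later, where the standard library provides statistics.NormalDist; the variable names se and sample_mean are illustrative and not part of the original programs.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 80, 100

# Standard error of the sample mean under the CLT
se = sigma / sqrt(n)

# Approximate distribution of the sample mean (illustrative name)
sample_mean = NormalDist(mu, se)

# Cut off 2.5% in each tail to cover the middle 95%
A = sample_mean.inv_cdf(0.025)
B = sample_mean.inv_cdf(0.975)

print("{:0.2f}".format(A))
print("{:0.2f}".format(B))
```

Both approaches should agree to two decimals, since inv_cdf(0.975) of the standard normal is the same z value used above.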

Question 4
The amount of gas purchased weekly at a gas station follows a normal distribution with a mean of 50000 gallons and a standard deviation of 10000 gallons. The starting supply of gasoline is 74000 gallons, and there is a scheduled weekly delivery of 47000 gallons. Compute the probability that, after 11 weeks, the supply of gasoline will be less than 20000 gallons.

R program to solve this CLT based problem


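n=11
miu = 50000
sigma = 10000

# Starting supply plus n weekly deliveries
delivered_fuel = 74000 + 47000*n
# The supply ends below 20000 if total consumption exceeds this amount (571000)
max_consumption = delivered_fuel - 20000

# Total consumption over n weeks: mean n*miu, standard deviation sigma*sqrt(n)
miu_all = miu*n
sigma_all = sqrt(sigma^2*n)

write(1 - pnorm(max_consumption, mean=miu_all, sd=sigma_all), stdout())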


Python program to solve this Central Limit Theorem based problem

import math

# z = (571000 - 550000) / (10000 * sqrt(11)) = 2.1 / sqrt(11)
res = (1.0 + math.erf((2.1/math.sqrt(11.0)) / math.sqrt(2.0))) / 2.0
print("{:.4f}".format(1.0-res))
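
The analytic answer to Question 4 can be sanity-checked with a simulation. The following is a rough Monte Carlo sketch, not part of the original solution: the function name simulate_shortfall, the trial count, and the seed are all assumptions chosen for illustration. It uses the same model as the analytic approach, so weekly sales are drawn as independent normals with no truncation at zero.

```python
import random

def simulate_shortfall(trials=100_000, weeks=11, seed=0):
    """Estimate P(supply after `weeks` weeks < 20000) by simulation."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    start, delivery = 74_000, 47_000
    mean_sold, sd_sold = 50_000, 10_000
    threshold = 20_000

    below = 0
    for _ in range(trials):
        # Total gallons sold over all weeks, one normal draw per week
        sold = sum(rng.gauss(mean_sold, sd_sold) for _ in range(weeks))
        supply = start + delivery * weeks - sold
        if supply < threshold:
            below += 1
    return below / trials

estimate = simulate_shortfall()
print("{:.3f}".format(estimate))
```

The estimate should land close to the value printed by the programs above, with some sampling noise that shrinks as the number of trials grows.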