UECM1633 Probability and Statistics for Computing, Chapter 3

CO2. Evaluate event probabilities, or chances of different results.
CO3. Select a suitable model for a phenomenon containing uncertainty.

Chapter 3 Discrete Random Variables and Their Distributions

3.1. Random Variables

Random Variable
A random variable is a function of an outcome, X = f(ω). In other words, it is a quantity that depends on chance.

Discrete Random Variable
A random variable that assumes countable values is called a discrete random variable.
Example:
1. The number of jobs submitted to a printer.
2. The number of errors, error-free modules, number of failed components.
3. The number of heads obtained in three tosses of a coin.
Note. Discrete variables don't have to be integer-valued; they just assume a finite or countable number of values.

Continuous Random Variable
A random variable that can assume any value contained in one or more intervals is called a continuous random variable.
Example:
1. The software installation time, code execution time, connection time.
2. The time taken to complete an examination.
3. The weight, height, voltage, temperature, distance.
Note. Rounding a continuous random variable, say, to the nearest integer makes it discrete. Sometimes we see mixed random variables that are discrete on some range of values and continuous elsewhere.

3.2. Probability Distribution of a Discrete Random Variable

Probability Mass Function
The collection of all the probabilities related to a discrete random variable X is the distribution of X. The function
    P_X(x) = P(X = x)
is the probability mass function, or pmf.

Two characteristics of a probability mass function:
1. 0 ≤ P_X(x) ≤ 1 for each value of x
2. Σ_x P_X(x) = 1

Cumulative Distribution Function
The cumulative distribution function, or cdf, of a random variable X is defined as
    F_X(x) = P(X ≤ x) = Σ_{y ≤ x} P_X(y).
The set of possible values of X is called the support of the distribution F.

Some characteristics of a cumulative distribution function:
1. 
F_X(x) is a non-decreasing function of x.
2. F_X(x) is continuous from the right-hand side.
3. lim_{x → −∞} F_X(x) = 0 and lim_{x → +∞} F_X(x) = 1.
4. 0 ≤ F_X(x) ≤ 1.

Example 3.1.
Consider an experiment of tossing 3 fair coins and counting the number of heads. Let X be the number of heads obtained. The probability mass function P_X(x) and the cumulative distribution function F_X(x) of X are shown in the following figure. White circles denote excluded points.

Example 3.2.
According to a survey, 60% of all students at a large university suffer from math anxiety. Two students are randomly selected from this university. Let X denote the number of students in this sample who suffer from math anxiety. Compute the probability mass function (pmf) and the cumulative distribution function (cdf) of X. Then draw a graph of its pmf and cdf.

Example 3.3.
For each of the following, determine the constant k so that P_X(x) satisfies the conditions of being a pmf for a random variable X, and then develop the probability distribution of X.
(a) P_X(x) = x/k for x = 2, 3, 4, 5.
(b) P_X(x) = kx for x = 0, 2, 4, 6.

3.3. Distribution of a Random Vector

Often we deal with several random variables simultaneously. We may look at the size of a RAM and the speed of a CPU, the price of a computer and its capacity, temperature and humidity, technical and artistic performance, etc.

Joint Distribution and Marginal Distributions

If X and Y are random variables, then the pair (X, Y) is a random vector. Its distribution is called the joint distribution of X and Y. The individual distributions of X and Y are then called the marginal distributions.

Similarly to a single variable, the joint distribution of a vector is a collection of probabilities for the vector (X, Y) to take a value (x, y). Recall that two vectors are equal,
    (X, Y) = (x, y),
if X = x and Y = y. 
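As a quick numerical check of the pmf and cdf definitions, the distribution of Example 3.1 (number of heads in three fair coin tosses) can be built by enumerating outcomes. This is a minimal sketch; the helper name `cdf` is ours, not from the notes.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 2^3 equally likely outcomes of three fair coin tosses
# (Example 3.1) and count heads to build the pmf of X.
pmf = {}
for outcome in product("HT", repeat=3):
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 8)

def cdf(x):
    # F_X(x) = P(X <= x) = sum of P_X(y) over y <= x
    return sum(p for y, p in pmf.items() if y <= x)

assert sum(pmf.values()) == 1        # property 2 of a pmf
assert pmf[1] == Fraction(3, 8)      # P(X = 1) = 3/8
assert cdf(1) == Fraction(1, 2)      # F_X(1) = 1/8 + 3/8
```

The same enumeration idea, with success probability 0.6 instead of 1/2, reproduces the pmf asked for in Example 3.2.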
This “and” means the intersection; therefore, the joint probability mass function of X and Y is
    P_{X,Y}(x, y) = P((X, Y) = (x, y)) = P(X = x and Y = y).
Again, the events {(X, Y) = (x, y)} are exhaustive and mutually exclusive for different pairs (x, y); therefore,
    Σ_x Σ_y P_{X,Y}(x, y) = 1.

Addition Rule
The joint distribution of (X, Y) carries the complete information about the behavior of this random vector. In particular, the marginal probability mass functions of X and Y can be obtained from the joint pmf by the Addition Rule, namely,
    P_X(x) = P(X = x) = Σ_y P_{X,Y}(x, y),
    P_Y(y) = P(Y = y) = Σ_x P_{X,Y}(x, y).
That is, to get the marginal pmf of one variable, we add the joint probabilities over all values of the other variable.

In general, the joint distribution cannot be computed from the marginal distributions because they carry no information about interrelations between the random variables. For example, marginal distributions cannot tell whether variables X and Y are independent or dependent.

Independence of Random Variables
Random variables X and Y are independent if
    P_{X,Y}(x, y) = P_X(x) P_Y(y)
for all values of x and y. This means the events {X = x} and {Y = y} are independent for all x and y; in other words, the variables X and Y take their values independently of each other. To prove dependence, we only need to present one counterexample: a pair (x, y) with
    P_{X,Y}(x, y) ≠ P_X(x) P_Y(y).

Example 3.4.
A program consists of two modules. The number of errors X1 in the first module has the pmf P1(x), and the number of errors X2 in the second module has the pmf P2(x), independently of X1, where

    x       0     1     2     3
    P1(x)   0.5   0.3   0.1   0.1
    P2(x)   0.7   0.2   0.1   0

Find the pmf and cdf of Y = X1 + X2, the total number of errors.

Solution:
P(Y = 0) = P1(0)P2(0) = 0.5(0.7) = 0.35
P(Y = 1) = P1(0)P2(1) + P1(1)P2(0) = 0.5(0.2) + 0.3(0.7) = 0.31

Example 3.5.
A program consists of two modules. 
The number of errors X in the first module and the number of errors Y in the second module have the joint distribution
    P(0,0) = P(0,1) = P(1,0) = 0.2,  P(1,1) = P(1,2) = P(1,3) = 0.1,  P(0,2) = P(0,3) = 0.05.
Find
(a) the marginal distributions of X and Y,
(b) the probability of no errors in the first module,
(c) the distribution of the total number of errors in the program,
(d) whether errors in the two modules occur independently.

Solution:
(a) Fill in the table using the Addition Rule:

    P_{X,Y}(x, y)   y = 0   y = 1   y = 2   y = 3   P_X(x)
    x = 0
    x = 1
    P_Y(y)

3.4. Expectation and Variance

Expectation
The expectation of a discrete random variable X is its mean, the average value, and is computed as
    μ = E(X) = Σ_x x P_X(x).

Properties of expectations.
For any random variables X and Y and any non-random numbers a, b and c:
1. E(X + Y) = E(X) + E(Y)
2. E(aX) = a E(X)
3. E(c) = c
4. E(aX + bY + c) = a E(X) + b E(Y) + c
5. E(XY) = E(X) E(Y) for independent X and Y

Example 3.6.
Reconsider Example 3.5. Compute E(X), E(Y) and the expected total number of errors.

Variance and Standard Deviation
The variance of a discrete random variable X is defined as the expected squared deviation from the mean, namely,
    σ² = Var(X) = E[(X − E(X))²] = E(X²) − [E(X)]² = Σ_x x² P_X(x) − (Σ_x x P_X(x))².
The standard deviation of a discrete random variable X is given by
    Std(X) = √Var(X).

Example 3.7.
The following bar graph depicts the probability distribution of the number of breakdowns per week for a machine based on past data. [Bar graph of P_X(x) for x = 0, 1, 2, 3; values not reproduced here.]
Find the mean and standard deviation of the number of breakdowns per week for this machine.

Solution:
Let X denote the number of breakdowns for this machine in a given week. Read P_X(x) off the bar graph and complete the table:

    x            0    1    2    3
    P_X(x)
    x·P_X(x)                          Σ x·P_X(x) = 1.8
    x²·P_X(x)                         Σ x²·P_X(x) = 4.3

3.5. 
Covariance and Correlation

Covariance
The covariance σ_XY = Cov(X, Y) is defined as
    Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X) E(Y).
It summarizes the interrelation of two random variables.

Correlation
The correlation coefficient between variables X and Y is defined as
    ρ = Cov(X, Y) / (Std(X) Std(Y)),   −1 ≤ ρ ≤ 1.
It measures the association of two random variables.

Properties of variances and covariances.
For any random variables X, Y, Z, W and any non-random numbers a, b, c, d:
1. Cov(X, Y) = Cov(Y, X)
2. ρ(X, Y) = ρ(Y, X)
3. Var(aX + b) = a² Var(X)
4. Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
5. Cov(aX + b, cY + d) = ac Cov(X, Y)
6. Cov(aX + bY, cZ + dW) = ac Cov(X, Z) + ad Cov(X, W) + bc Cov(Y, Z) + bd Cov(Y, W)
7. ρ(aX + b, cY + d) = ρ(X, Y)
For independent X and Y:
8. Cov(X, Y) = 0
9. Var(X + Y) = Var(X) + Var(Y)

3.6. Application to Finance

Chebyshev's inequality shows that, in general, higher variance implies higher probabilities of large deviations, and this increases the risk for a random variable to take values far from its expectation.

This finds a number of immediate applications. Here we focus on evaluating risks of financial deals, allocating funds, and constructing optimal portfolios. This application is intuitively simple. The same methods can be used for the optimal allocation of computer memory, CPU time, customer support, or other resources.

Example 3.8.
We would like to invest $10,000 into shares of companies XX and YY. Shares of XX cost $20 per share. The market analysis shows that their expected return is $1 per share with a standard deviation of $0.5. Shares of YY cost $50 per share, with an expected return of $2.50 and a standard deviation of $1 per share, and returns from the two companies are independent. 
In order to maximize the expected return and minimize the risk (standard deviation or variance), is it better to invest (A) all $10,000 into XX, (B) all $10,000 into YY, or (C) $5,000 in each company?

Solution: Let
    X be the actual (random) return from each share of XX,
    Y be the actual return from each share of YY.

(A) With $10,000 we can buy 500 shares of XX at $20/share, collecting a profit of A = 500X.
(B) With $10,000 we can buy 200 shares of YY at $50/share, collecting a profit of B = 200Y.
(C) With $5,000 in each company we can buy 250 shares of XX and 100 shares of YY, collecting a profit of C = 250X + 100Y.

In terms of the expected return, all three portfolios are equivalent: since 1/20 = 2.5/50 = 0.05, each share of each company is expected to return 5%, so
    E(A) = 500(1) = E(B) = 200(2.5) = E(C) = 250(1) + 100(2.5) = $500.

Comparing the variances (with X and Y independent),
    Var(A) = 500²(0.5²) = 62,500,
    Var(B) = 200²(1²) = 40,000,
    Var(C) = 250²(0.5²) + 100²(1²) = 25,625.
Portfolio C, where the investment is split between the two companies, has the lowest variance; therefore, it is the least risky. It is better to invest in portfolio C. This supports one of the basic principles in finance: to minimize the risk, diversify the portfolio.

Example 3.9.
Suppose now that the individual stock returns X and Y are no longer independent. If the correlation coefficient is ρ = 0.4, how will it change the results of the previous example? What if they are negatively correlated with ρ = −0.2?

Solution:
Only the volatility of portfolio C changes due to the correlation coefficient.

For ρ = 0.4, Cov(X, Y) = ρ Std(X) Std(Y) = 0.4(0.5)(1) = 0.2, so
    Var(C) = 250²(0.5²) + 100²(1²) + 2(250)(100)(0.2) = 35,625.
The risk of portfolio C increases due to the positive correlation of the two stocks. When X and Y are positively correlated, low values of X are likely to accompany low values of Y; therefore, the probability of an overall low return is higher, increasing the risk of the portfolio. Nevertheless, the diversified portfolio C is still optimal.

Now if ρ = −0.2, Cov(X, Y) = −0.2(0.5)(1) = −0.1, so
    Var(C) = 250²(0.5²) + 100²(1²) + 2(250)(100)(−0.1) = 20,625.
Negative correlation means that low values of X are likely to be compensated by high values of Y, and vice versa. Thus, the risk is reduced. 
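The variance comparisons of Examples 3.8 and 3.9 follow directly from the rule Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab Cov(X, Y). A minimal sketch, assuming the figures given above (Std(X) = 0.5, Std(Y) = 1); the function name `portfolio_var` is ours, not from the notes:

```python
# Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y),
# with Cov(X, Y) = rho * Std(X) * Std(Y)  (covariance property 4 above).
def portfolio_var(a, b, sx=0.5, sy=1.0, rho=0.0):
    cov = rho * sx * sy
    return a**2 * sx**2 + b**2 * sy**2 + 2 * a * b * cov

var_A = portfolio_var(500, 0)                  # all $10,000 in XX -> 62,500
var_B = portfolio_var(0, 200)                  # all $10,000 in YY -> 40,000
var_C = portfolio_var(250, 100)                # split, independent -> 25,625
var_C_pos = portfolio_var(250, 100, rho=0.4)   # positively correlated -> 35,625
var_C_neg = portfolio_var(250, 100, rho=-0.2)  # negatively correlated -> 20,625
```

Note that the split portfolio stays the least risky for all three correlation values, matching the conclusion in the text.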
Diversified portfolios consisting of negatively correlated components are the least risky.

Example 3.10.
So, after all, with $10,000 to invest, what is the optimal portfolio consisting of shares of XX and YY, given their correlation coefficient of ρ = −0.2?

Suppose t dollars are invested into XX and (10,000 − t) dollars into YY, with the resulting profit C_t. This amounts to t/20 shares of XX and (10,000 − t)/50 shares of YY. Plans A and B correspond to t = 10,000 and t = 0.

Solution:
The profit is
    C_t = (t/20) X + ((10,000 − t)/50) Y,
with variance
    Var(C_t) = (t/20)²(0.5²) + ((10,000 − t)/50)²(1²) + 2(t/20)((10,000 − t)/50)(−0.1).
(The figure showing this variance as a function of t is omitted.) Setting the derivative with respect to t equal to zero, the minimum of this variance is found at t = 4081.63.

Thus, for the optimal portfolio, we should invest $4,081.63 into XX and the remaining $5,918.37 into YY. Then we achieve the smallest possible risk (variance) of about 19,592 squared dollars: variance, as we know, is measured in squared units.

3.7. Families of Discrete Distributions – The Bernoulli Distribution

A random variable with two possible values, 0 and 1, is called a Bernoulli variable; its distribution is the Bernoulli distribution. Any experiment in which only two outcomes (mutually exclusive and exhaustive) are possible, say, "success" or "failure", is called a Bernoulli trial.

Let p denote P(success) and q denote P(failure) on each trial. The probability mass function of a discrete random variable X that follows a Bernoulli distribution is given by
    P(X = x) = p^x q^(1−x)  for x = 0, 1,
where q = 1 − p.

Mean and Variance
Suppose X ~ Bernoulli(p). Then μ = p and σ² = pq.

Example:
    Trial            "Success"   "Failure"
    1. Toss a coin   H           T
    2. Roll a die    Six         1, 2, 3, 4, 5
    3. Fire a rifle  Hit         Missed

3.8. Families of Discrete Distributions – The Binomial Distribution

A variable described as the number of successes in a sequence of independent Bernoulli trials has the Binomial distribution. Its parameters are n, the number of trials, and p, the probability of success.

Example:
    Trial            "Success"   "Failure"       Binomial Experiment   Binomial Random Variable X
1. 
Toss a coin    H           T               Toss 5 times          # of H obtained
    2. Roll a die    Six         1, 2, 3, 4, 5   Roll 10 times         # of sixes obtained
    3. Fire a rifle  Hit         Missed          Fire 7 times          # of hits obtained

Suppose a discrete random variable X follows a Binomial distribution. Then the probability of exactly x successes in n trials is given by
    P(X = x) = C(n, x) p^x q^(n−x)  for x = 0, 1, 2, …, n,
where q = 1 − p and C(n, x) = n!/(x!(n − x)!).

Mean and Variance
Suppose X ~ Bin(n, p). Then μ = np and σ² = npq.

Note.
Suppose X ~ Bin(n, p) and Y = n − X. Then Y ~ Bin(n, 1 − p).

Example 3.11.
As part of a business strategy, a randomly selected 20% of new internet service subscribers receive a special promotion from the provider. A group of 10 neighbors signs up for the service. What is the probability that at least 4 of them get a special promotion?

Example 3.12.
Suppose that X ~ Bin(n = 20, p = 0.3). Find
(a) P(X = 6)
(b) P(4 ≤ X ≤ 12)
(c) P(X ≥ 2)
(d) E(X) and Var(X)

Example 3.13.
Suppose X ~ Bin(n = 10, p = 0.8) and Y = 10 − X, so Y ~ Bin(n = 10, p = 0.2). Find
(a) P(X ≥ 7)
(b) P(Y ≤ 3)

3.9. Families of Discrete Distributions – The Geometric Distribution

The number of Bernoulli trials needed to get the first success has the Geometric distribution. Geometric random variables can take any integer value from 1 to infinity, because one needs at least 1 trial to have the first success, and the number of trials needed is not limited by any specific number. The only parameter is p, the probability of "success".

The probability mass function of a random variable X following a Geometric distribution is given by
    P(X = x) = p q^(x−1)  for x = 1, 2, …,
where q = 1 − p.

Mean and Variance
Suppose a discrete random variable X follows a Geometric distribution, X ~ Geo(p). Then μ = 1/p and σ² = q/p².

An additional important property of the geometric distribution is that it is memoryless: it does not remember how many times you have tried before when you try the next time. 
In mathematical notation:
    P(X > n + k | X > n) = P(X > k).

Example 3.14.
Some biology students were checking eye color in a large number of fruit flies. For an individual fly, suppose that the probability of white eyes is 1/4 and the probability of red eyes is 3/4, and that we may treat these observations as independent Bernoulli trials. Find the probability that the first fly with white eyes is the fourth fly.

3.10. Families of Discrete Distributions – The Negative Binomial Distribution

In a sequence of independent Bernoulli trials, the number of trials needed to obtain k successes has the Negative Binomial distribution. It has two parameters: k, the number of successes, and p, the probability of success.

The Negative Binomial probability mass function is
    P(X = x) = P(k − 1 successes in the first x − 1 trials, and the x-th trial results in a success).
Hence,
    P(X = x) = C(x − 1, k − 1) p^k q^(x−k)  for x = k, k + 1, …,
where q = 1 − p. This formula accounts for the probability of k successes, the remaining x − k failures, and the number of outcomes, i.e. sequences with the k-th success coming on the x-th trial. Note that with k = 1 it becomes Geometric.

Mean and Variance
Suppose a discrete random variable X follows a Negative Binomial distribution, X ~ NB(k, p). Then μ = k/p and σ² = kq/p².

Example 3.15.
Suppose that during practice a basketball player can make a free throw 80% of the time. Furthermore, assume that a sequence of free-throw shooting can be thought of as independent Bernoulli trials. Let X equal the minimum number of free throws that this player must attempt to make a total of 10 shots. Then X ~ NB(k = 10, p = 0.8), and the pmf of X is
    P(X = x) = C(x − 1, 9) (0.8)^10 (0.2)^(x−10)  for 
x = 10, 11, 12, ….

The mean, variance and standard deviation of X are, respectively,
    μ = 10/0.8 = 12.5,   σ² = 10(0.2)/0.8² = 3.125,   σ = 1.7678.
And we have, for example,
    P(X = 12) = C(11, 9) (0.8)^10 (0.2)² = 0.2362.

Example 3.16.
In a recent production, 5% of certain electronic components are defective. We need to find 7 non-defective components for our 7 new computers. Components are tested until 7 non-defective ones are found. What is the probability that more than 10 components will have to be tested?

3.11. Families of Discrete Distributions – The Poisson Distribution

The Poisson distribution is a discrete probability distribution that applies to occurrences of some event over a specified interval. The random variable X is the number of occurrences of the event in an interval. The interval can be time, distance, area, or some similar unit.

The following are examples of discrete random variables to which the Poisson probability distribution can be applied.
1. The number of telemarketing phone calls received by a household during a given day.
2. The number of mistakes typed on a given page.
3. The number of customers entering a grocery store during a one-hour interval.

Poisson Probability Distribution
If X follows a Poisson distribution over an interval, then
    P(X = x) = e^(−λ) λ^x / x!  for x = 0, 1, 2, …,
where λ is the mean number of occurrences in that interval.

Mean and Variance
Suppose X ~ Poi(λ). Then μ = σ² = λ.

Example 3.17.
A washing machine in a laundry shop breaks down an average of three times per month. Find the probability that during the next month this machine will have
(a) exactly two breakdowns,
(b) at most one breakdown,
(c) at least one breakdown.

Note.
The intervals for λ and X must be equal. If they are not, the mean should be redefined to make them equal. 
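The pmf formulas of the Binomial, Geometric, Negative Binomial and Poisson families above can be cross-checked numerically. A minimal sketch (the helper names are ours) that reproduces the numbers of Examples 3.14, 3.15 and 3.17:

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    # P(X = x) = C(n, x) p^x q^(n-x), x = 0, 1, ..., n
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def geom_pmf(x, p):
    # P(X = x) = p q^(x-1), x = 1, 2, ...
    return p * (1 - p) ** (x - 1)

def nbinom_pmf(x, k, p):
    # P(X = x) = C(x-1, k-1) p^k q^(x-k), x = k, k+1, ...
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)

def poisson_pmf(x, lam):
    # P(X = x) = e^(-lam) lam^x / x!
    return exp(-lam) * lam**x / factorial(x)

p_six = binom_pmf(6, 20, 0.3)       # Example 3.12(a): ~0.1916
p_fourth = geom_pmf(4, 0.25)        # Example 3.14: (3/4)^3 (1/4) ~ 0.1055
p_twelve = nbinom_pmf(12, 10, 0.8)  # Example 3.15: ~0.2362
p_two = poisson_pmf(2, 3)           # Example 3.17(a): ~0.2240
```

With k = 1, `nbinom_pmf` reduces to `geom_pmf`, matching the note in Section 3.10.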
The following examples show how to redefine λ and X to make them equal.

Example 3.18.
Customers of an internet service provider initiate new accounts at the average rate of 10 accounts per day. What is the probability that
(a) more than 8 new accounts will be initiated today?
(b) more than 16 accounts will be initiated within 2 days?

Example 3.19.
On average, two new accounts are opened per day at Bank ABC.
(a) Find the probability that on a given day, the number of new accounts opened at this bank will be at most 3.
(b) Determine the average number of new accounts opened over the past 5 working days at Bank ABC.

3.12. Families of Discrete Distributions – The Poisson Approximation of the Binomial Distribution

The Poisson distribution can be used effectively to approximate Binomial probabilities when the number of trials n is large and the probability of success p is small. Such an approximation is adequate, say, for n ≥ 30 and p ≤ 0.05, and it becomes more accurate for larger n.

Suppose X ~ Bin(n, p). If n ≥ 30 and p ≤ 0.05, then with λ = np,
    X ≈ Poi(np).

When p is large (p ≥ 0.95), the Poisson approximation is applicable too. The probability of failure q = 1 − p is small in this case. Then we can approximate the number of failures, which is also Binomial, by a Poisson distribution.

Example 3.20.
Ninety-seven percent of electronic messages are transmitted with no error. What is the probability that out of 200 messages, at least 195 will be transmitted correctly?
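Example 3.20 can be tackled by counting failures, as the section suggests: with error probability q = 0.03 per message and n = 200 messages, the number of errors is Bin(200, 0.03), approximately Poi(λ = nq = 6), and "at least 195 correct" is the same event as "at most 5 errors". A minimal sketch comparing the exact Binomial answer with the Poisson approximation:

```python
from math import comb, exp, factorial

# Example 3.20: 200 messages, 3% error rate per message.
n, q = 200, 0.03
lam = n * q  # lambda = 6 expected errors

# Exact: number of errors ~ Bin(200, 0.03); at least 195 correct <=> at most 5 errors
exact = sum(comb(n, k) * q**k * (1 - q) ** (n - k) for k in range(6))

# Poisson approximation: number of errors ~ Poi(6)
approx = sum(exp(-lam) * lam**k / factorial(k) for k in range(6))

print(round(exact, 4), round(approx, 4))
```

The two answers agree to about two decimal places, illustrating that the approximation is adequate here since n = 200 ≥ 30 and q = 0.03 ≤ 0.05.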