Inferential Statistics for Data Science

Question 1 of 50

Select the right answer.

With the help of inferential statistics, we can:

Select one of the following:

Make conclusions about the population from a sample.
Conclude whether a selected sample is statistically significant with respect to the whole population.
Compare two models to find which one is more statistically significant.
Perform feature selection, i.e., check whether adding or removing a variable helps improve the model.
Perform hypothesis testing.
All of the above.

Explanation

Question 2 of 50

Standard Error is the amount of variation in the _________ data. It is related to the Standard Deviation as σ/√n, where n is the _________ size.

Select one of the following:

Sample
Population

Explanation
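
The relation σ/√n from the question can be checked with a few lines of Python. This is a minimal sketch: the function name `standard_error` and the sample values are hypothetical, chosen only for illustration.

```python
import math
import statistics

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical sample, invented for illustration.
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]
s = statistics.stdev(sample)           # sample standard deviation (n - 1 denominator)
se = standard_error(s, len(sample))    # estimated standard error of the mean
print(f"s = {s:.3f}, SE = {se:.3f}")
```

Because n sits under a square root, quadrupling n only halves the standard error.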

Question 3 of 50

A Sampling Distribution is a probability distribution of a statistic (Mean/Median/Mode) obtained through a large number of samples drawn from a specific population.

Select one of the following:

True
False

Explanation

Question 4 of 50

A Sampling Distribution behaves much like a normal curve and has some interesting properties, such as:

Select one of the following:

The shape of the Sampling Distribution does not reveal anything about the shape of the population.
Sampling Distribution helps to estimate the population statistic.
Both.

Explanation

Question 5 of 50

Central Limit Theorem states that:

When plotting a sampling distribution of means, the mean of the sample means will be equal to the population mean, and the sampling distribution will approach a normal distribution with standard deviation equal to σ/√n, where σ is the standard deviation of the population and n is the sample size.

Select one of the following:

False
True

Explanation
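
A small simulation can illustrate the behaviour described above. This is a sketch under stated assumptions: the population is taken to be exponential with rate 1 (right-skewed, mean 1, σ = 1), and the sample size and number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, so mean = 1 and sigma = 1.
POP_MEAN, POP_SIGMA = 1.0, 1.0
n = 40               # size of each sample
num_samples = 5000   # number of samples drawn

# Mean of each sample; together these form the sampling distribution of the mean.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
expected_se = POP_SIGMA / n ** 0.5

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POP_MEAN})")
print(f"std of sample means:  {spread_of_means:.3f} (sigma/sqrt(n) = {expected_se:.3f})")
```

The two printed pairs should be close to each other even though the population itself is far from normal.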

Question 6 of 50

The greater the sample size, the lower the standard error and the greater the accuracy in determining the population mean from the sample mean.

Select one of the following:

False
True

Explanation

Question 7 of 50

No matter the shape of the population distribution, be it bi-modal, right-skewed, etc., the shape of the Sampling Distribution will remain the same (a normal curve).

Select one of the following:

True
False

Explanation

Question 8 of 50

For a sampling distribution:

The number of samples has to be sufficient (generally more than 50) to satisfactorily achieve a normal curve. We also have to keep the sample size fixed, since any change in sample size will change the shape of the sampling distribution and it will no longer be bell-shaped.

Select one of the following:

False
True

Explanation

Question 9 of 50

As we increase the sample size, the sampling distribution squeezes from both sides giving a better estimate of the population statistic since it lies somewhere in the middle of the sampling distribution (generally).

Select one of the following:

False
True

Explanation

Question 10 of 50

The confidence interval is a type of interval estimate from the ___________ distribution which gives a range of values in which the population statistic may lie.

Select one of the following:

Sampling
Population

Explanation

Question 11 of 50

The margin of error is a statistic expressing the amount of random sampling error in the results of a survey.

Select one of the following:

True
False

Explanation

Question 12 of 50

The Margin of Error is ________ the width of the Confidence Interval.

Select one of the following:

1/2
1/4th

Explanation
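
The link between the margin of error and the interval width can be seen by building an interval by hand. This is a minimal sketch that assumes a known population standard deviation (a z-based interval); the helper name `z_confidence_interval` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) for a z-based interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Hypothetical numbers, chosen only for illustration.
lower, upper, margin = z_confidence_interval(sample_mean=40.0, sigma=5.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f}), margin of error: {margin:.2f}")
print(f"interval width: {upper - lower:.2f}")
```

The interval is the sample mean plus or minus the margin, so the printed width can be compared directly with the margin of error.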

Question 13 of 50

Which of the following points are true for confidence intervals?

Select one of the following:

Confidence Intervals can be built with different degrees of confidence to suit a user's needs, such as 70%, 90%, etc.
The greater the sample size, the narrower the Confidence Interval.
There are different confidence intervals for different sample means. For example, a sample mean of 40 will have a different confidence interval from a sample mean of 45.
A 95% Confidence Interval does not mean that the probability of the population mean lying in a given interval is 95%. Instead, it means that 95% of the interval estimates will contain the population statistic.
All of the above.

Explanation

Question 14 of 50

Hypothesis testing lets us identify a ________ statistic to be checked against a ________ statistic or the statistic of another sample, to study any intervention, etc.

Select one of the following:

Sample, Population
Population, Sample

Explanation

Question 15 of 50

Null hypothesis is a type of hypothesis in which we assume that sample observations are not by chance. They are affected by some non-random situation. It is denoted by H1 or Ha.

Select one of the following:

True
False

Explanation

Question 16 of 50

Alternate Hypothesis is a type of hypothesis in which we assume that the sample observations are purely by chance. It is denoted by H0.

Select one of the following:

True
False

Explanation

Question 17 of 50

Hypothesis Testing is done on different levels of confidence and makes use of z-score to calculate the probability.

Select one of the following:

False
True

Explanation

Question 18 of 50

For a 95% Confidence Interval, anything above the z-threshold for 95% would reject the null hypothesis.

Select one of the following:

False
True

Explanation

Question 19 of 50

Write down the steps to hypothesis testing.

Select one of the following:

Write your answer down.
Check it later, after the quiz.

Explanation

Question 20 of 50

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is ________.

Select one of the following:

True
False

Explanation

Question 21 of 50

p-value is the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct.

Select one of the following:

True
False

Explanation

Question 22 of 50

A low enough p-value is grounds for rejecting the null hypothesis. We reject the null hypothesis if the p-value is less than the significance level.

Select one of the following:

False
True

Explanation
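
The comparison between a p-value and the significance level can be sketched for a one-sample z-test. The function name `z_test_p_value` and all numbers are hypothetical; a two-sided test with a known population standard deviation is assumed.

```python
import math
from statistics import NormalDist

def z_test_p_value(sample_mean, pop_mean, sigma, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    z = (sample_mean - pop_mean) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # probability of a result at least this extreme under H0
    return z, p

alpha = 0.05  # significance level, an assumed choice
z, p = z_test_p_value(sample_mean=52.0, pop_mean=50.0, sigma=6.0, n=36)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p:.4f} -> {decision}")
```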

Question 23 of 50

Type-1 error: A Type-1 error is the case when we fail to reject the null hypothesis although it is actually false. The probability of a Type-1 error is called beta (β).

Select one of the following:

False
True

Explanation

Question 24 of 50

Type-2 error: A Type-2 error is the case when we reject the null hypothesis although it was actually true. The probability of a Type-2 error is called the significance level, alpha (α).

Select one of the following:

True
False

Explanation

Question 25 of 50

For Type 1 and Type 2 errors:

α = P(Null hypothesis rejected | Null hypothesis is true)

β = P(Null hypothesis accepted | Null hypothesis is false)

Select one of the following:

True
False

Explanation

Question 26 of 50

The power of a test is defined as

Power = 1 - P(Type-2 error) = 1 - β

The lower the Type-2 error, the greater the power of the hypothesis test.

Select one of the following:

True
False

Explanation
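
The relation between β and power can be worked through for one concrete case. This is a minimal sketch assuming a one-sided (upper-tail) z-test with known σ; the function name `z_test_power` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)          # rejection threshold under H0
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))    # true effect in standard-error units
    beta = std_normal.cdf(z_crit - shift)           # P(fail to reject H0 | H0 is false)
    return 1 - beta

# Power for a few assumed sample sizes.
for n in (10, 30, 100):
    print(n, round(z_test_power(mu0=50, mu1=52, sigma=6, n=n), 3))
```

In this setup β shrinks as n grows, so the printed power rises with the sample size.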

Question 27 of 50

For a Z-test:

1. A Z-test is mainly used when the data is normally distributed.
2. We find the Z-statistic of the sample means and calculate the z-score.
3. Z-test is mainly used when the population mean and standard deviation are given.

Select one of the following:

True
False

Explanation

Question 28 of 50

T-tests are similar to z-scores, the only difference being that instead of the Population Standard Deviation, we use the Sample Standard Deviation.

Select one of the following:

True
False

Explanation

Question 29 of 50

Z-tests are statistical calculations that can be used to compare a population mean to a sample mean.

T-tests are calculations used to test a hypothesis, but they are most useful when we need to determine if there is a statistically significant difference between two independent sample groups.

Select one of the following:

True
False

Explanation

Question 30 of 50

The Degree of Freedom is the number of __________ that are free to take more than one arbitrary value.

Select one of the following:

Variables
Samples

Explanation

Question 31 of 50

Select the true statement.

Select one of the following:

1. The greater the difference between the sample mean and the population mean, the greater the chance of rejecting the Null Hypothesis.
2. The greater the sample size, the greater the chance of rejecting the Null Hypothesis.
Both

Explanation

Question 32 of 50

A one-sample t-test compares the mean of _________ data to a known value.

Select one of the following:

Sample
Population

Explanation

Question 33 of 50

Which of the following points are true for the One-Sample T-test?

Select one of the following:

Determine whether the mean of a group differs from the specified value.
Calculate a range of values that are likely to include the population mean.
We can run a one-sample T-test when we do not have the population S.D. or we have a sample of size less than 30.
All of them.

Explanation
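
The one-sample t statistic can be computed directly from its definition. This is a sketch only: the function name `one_sample_t` and the data are hypothetical, and turning t into a p-value needs the t-distribution, which the Python standard library does not provide.

```python
import math
import statistics

def one_sample_t(sample, known_value):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation replaces the unknown sigma
    t = (mean - known_value) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements compared against an assumed known value of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.0, 5.3]
t, df = one_sample_t(sample, known_value=5.0)
print(f"t = {t:.3f}, df = {df}")
```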

Question 34 of 50

We use a two-sample T-test when we want to evaluate whether the means of two independent samples differ.

Select one of the following:

False
True

Explanation

Question 35 of 50

Two-sample T-test is used to:

Select one of the following:

Determine whether the means of two independent groups differ.
Calculate a range of values that is likely to include the difference between the population means.
Both

Explanation
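
A two-sample t statistic for independent groups can be sketched as follows. The quiz does not say which variant is meant; this example assumes Welch's unequal-variance form, and the function name `welch_t` and the data are hypothetical.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's two-sample t-test."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    se2 = va / na + vb / nb  # squared standard error of the difference in means
    t = (statistics.fmean(group_a) - statistics.fmean(group_b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical independent groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A confidence interval for the difference between the means is built from the same standard error, `math.sqrt(se2)`.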

Question 36 of 50

Points to be noted for the two-sample T-test are:

1. The groups to be tested should be ________.
2. The groups' distributions should not be highly ________.

Select one of the following:

Independent, Skewed
Dependent, Normal

Explanation

Question 37 of 50

An Independent Samples t-test compares the means for ________ different groups.
Samples are ________ of each other.

Select one of the following:

Two, Independent
Same, Dependent

Explanation

Question 38 of 50

A Paired Samples t-test compares means from the ________ group at different times.
Samples are ________ on each other.

Select one of the following:

Same, Dependent
Two, Independent

Explanation
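
A paired t statistic reduces to a one-sample test on the per-subject differences. This is a minimal sketch; the function name `paired_t` and the before/after values are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test."""
    diffs = [a - b for a, b in zip(after, before)]  # one difference per subject
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements taken on the same four subjects at two times.
before = [10.0, 12.0, 9.0, 11.0]
after = [11.0, 14.0, 10.0, 14.0]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```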

Question 39 of 50

ANOVA is used to determine whether there are any statistically significant differences between the means of ________ independent (unrelated) groups.

Select one of the following:

One
Two
Three or more

Explanation

Question 40 of 50

A one-way ANOVA has ________ independent variable, while a two-way ANOVA has ________.

Select one of the following:

One, Two
Two, One

Explanation

Question 41 of 50

Write down the steps to perform ANOVA.

Select one of the following:

Write down the answers
Check them later

Explanation
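
One common way to lay out the computation for a one-way ANOVA is: find the grand mean, split the variation into between-group and within-group sums of squares, divide each by its degrees of freedom, and take the ratio as the F statistic. The sketch below follows that outline; the function name `one_way_anova_f` and the data are hypothetical, and the final comparison of F against a critical value is left out.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    k, n_total = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data for three independent groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within})")
```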

Question 42 of 50

Practical applications of ANOVA in modeling are:

Select one of the following:

Identifying whether a categorical variable is relevant to a continuous variable.
Identifying whether a treatment was effective for the model or not.
Both.

Explanation

Question 43 of 50

The Chi-Square Test determines whether there is an association between _______ variables (i.e., whether the variables are independent or related).

Select one of the following:

Categorical
Continuous

Explanation
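
The chi-square statistic for a test of independence compares observed counts with the counts expected if the two variables were unrelated. This is a minimal sketch; the function name `chi_square_independence` and the contingency table are hypothetical, and the lookup of a p-value from the chi-square distribution is omitted.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical 2x2 table: rows and columns are categories of two variables.
table = [[10, 20], [20, 10]]
chi2, df = chi_square_independence(table)
print(f"chi-square = {chi2:.3f}, df = {df}")
```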

Question 44 of 50

Goodness of fit: It compares two categorical variables to find whether they are related to each other or not.

Select one of the following:

True
False

Explanation

Question 45 of 50

Test of Independence: It determines whether sample data of categorical variables matches the population or not.

Select one of the following:

True
False

Explanation

Question 46 of 50

Regression analysis is a form of predictive modelling technique which investigates the relationship between a ________ (target) variable and ________ variable(s) (predictor).

Select one of the following:

Dependent, Independent
Independent, Dependent

Explanation

Question 47 of 50

The regression sum of squares describes how well a regression model represents the modeled data.
A higher regression sum of squares indicates that the model does not fit the data well.

Select one of the following:

True
False

Explanation

Question 48 of 50

The residual sum of squares (RSS) is a statistical measure of the amount of _________ in a data set that is not explained by a regression model.

Select one of the following:

Mean
Variance

Explanation

Question 49 of 50

Coefficient of Determination (R-Square): It represents the strength of correlation between two variables.

Select one of the following:

True
False

Explanation
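
The sums of squares, R-Square, and the correlation coefficient can be computed side by side for a simple linear regression. This is a sketch with hypothetical helper names (`fit_line`, `sums_of_squares`, `pearson_r`) and invented data; it assumes one predictor and an ordinary least-squares fit.

```python
import math
import statistics

def fit_line(x, y):
    """Least-squares slope and intercept for one predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

def sums_of_squares(x, y):
    """Return (total, regression, residual) sums of squares for the fitted line."""
    slope, intercept = fit_line(x, y)
    my = statistics.fmean(y)
    preds = [intercept + slope * xi for xi in x]
    ss_total = sum((yi - my) ** 2 for yi in y)
    ss_reg = sum((pi - my) ** 2 for pi in preds)          # variation explained by the model
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))  # variation left unexplained
    return ss_total, ss_reg, ss_res

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
ss_total, ss_reg, ss_res = sums_of_squares(x, y)
r_squared = 1 - ss_res / ss_total
r = pearson_r(x, y)
print(f"SST = {ss_total:.2f}, SSR = {ss_reg:.2f}, RSS = {ss_res:.2f}")
print(f"R^2 = {r_squared:.2f}, r = {r:.3f}, r^2 = {r * r:.2f}")
```

R-Square is the share of the total variation that the model explains. With a single predictor it equals the square of the correlation coefficient, so it carries no sign.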

Question 50 of 50

Correlation Coefficients are used to measure how strong a relationship is between two variables.

Select one of the following:

True
False

Created by Vishakha Achmare about 4 years ago

A basic quiz on Inferential Statistics.
