Density plot looks normal, QQ plot does not, Shapiro test significant

The following QQ plot has too many points off the line, yet the density plot looks normal. The Shapiro test gives a p-value < 2.2e-16, which says this is not a normal distribution, but I've read that the Shapiro test should not be trusted when I have about 1000 data points. Should I conclude that this distribution is normal?

enter image description here

asked 8 hours ago

AnaHochma

New contributor

Clear asymmetry suggests something is up. Did you calculate the skewness? A nominal rule of thumb is that if Pearson skewness $\ge$ 0.1, then you have to take corrective action, e.g., performing statistics on the log() of your measurements rather than on the measurements directly. Also, are there additional factors (metadata, etc.) that you can use to subselect your data? This may also be a mixture of models, so you might have two or three normal distributions all sitting close together, with the second and third small enough not to create an obviously multimodal histogram.
– Peter Leopold
7 hours ago

@Peter Where does that rule of thumb come from? It's not generally applicable, so it would be of interest to know its limitations and assumptions.
– whuber♦
7 hours ago

@PeterLeopold "Pearson skewness" is not uniquely defined. Pearson himself put most emphasis on measuring skewness relative to the mode, which had a major role in his system of distributions. But he did also use a dimensionless ratio based on third and second moments around the mean. And yet again (mean $-$ median) / SD appears in his work. But regardless I wouldn't regard skewness of about 0.1 on any measure I've encountered as requiring transformation. I would always want to see the data, however.
– Nick Cox
7 hours ago
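
To make the point about competing definitions concrete, here is a minimal Python sketch of two of the measures mentioned above: the ratio based on third and second moments around the mean, and (mean - median) / SD. The function names and the sample values are illustrative assumptions, not taken from the thread, and population (divide-by-n) moments are assumed.

```python
import statistics


def moment_skewness(xs):
    """Third central moment divided by the 1.5 power of the second central moment."""
    n = len(xs)
    mean = statistics.fmean(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def median_skewness(xs):
    """(mean - median) / SD, using the population standard deviation."""
    return (statistics.fmean(xs) - statistics.median(xs)) / statistics.pstdev(xs)


# Hypothetical right-skewed sample, for illustration only.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]
print(moment_skewness(sample), median_skewness(sample))
```

The two functions generally return different numbers for the same data, which is why a threshold such as 0.1 means little unless the measure is named.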

@Peter The problem here is more about kurtosis than skewness, and it's not clear that taking the log is justified, even if the data were skewed. It depends on what the OP is going to use the data for.
– Peter Flom♦
7 hours ago
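
As a rough illustration of how a sample can be symmetric and still depart from normality through its tails, the Python sketch below computes moment-based skewness and excess kurtosis. The helper names and the data are hypothetical, chosen only to show the idea; they are not the OP's data.

```python
import statistics


def central_moment(xs, k):
    """k-th central moment with a divide-by-n convention."""
    mean = statistics.fmean(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)


def skewness(xs):
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


# Hypothetical symmetric sample with heavy tails, for illustration only.
sample = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]
print(skewness(sample))         # symmetric, so the skewness is zero
print(excess_kurtosis(sample))  # positive: heavier tails than a normal
```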

I would not trust a scale for happiness [NB] with such results!
– Nick Cox
7 hours ago

normal-distribution


1 Answer

First, the density plot does not really look normal. It's symmetric, but the shape is wrong. I suggest generating a normal distribution with the same mean and variance as yours and then overlaying that density on the one you've got. I am fairly sure you will see a mismatch.
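
One way to sketch this comparison without a plotting library is to bin the data and set the observed share in each bin next to the share a normal distribution with the same mean and variance would put there. The Python below is a minimal sketch under stated assumptions: the function name, the number of bins, and the simulated heavy-tailed sample are all hypothetical, and the data must not be constant.

```python
import random
from statistics import NormalDist, fmean, pstdev


def compare_with_normal(xs, n_bins=8):
    """Return (bin_start, bin_end, observed_share, normal_share) per bin."""
    fitted = NormalDist(fmean(xs), pstdev(xs))  # same mean and variance as the data
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins                  # assumes hi > lo
    counts = [0] * n_bins
    for x in xs:
        # Clamp so that the maximum falls in the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    rows = []
    for i, count in enumerate(counts):
        start, end = lo + i * width, lo + (i + 1) * width
        rows.append((start, end, count / len(xs), fitted.cdf(end) - fitted.cdf(start)))
    return rows


# Hypothetical symmetric, heavy-tailed sample (difference of two exponentials).
rng = random.Random(0)
sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(1000)]
for start, end, observed, expected in compare_with_normal(sample):
    print(f"[{start:6.2f}, {end:6.2f})  observed={observed:.3f}  normal={expected:.3f}")
```

Bins where the observed and normal shares differ noticeably, typically the centre and the tails for this kind of shape, are where an overlaid density plot would show the mismatch.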

Second, a quantile normal plot is often a better clue to nonnormality.
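
For readers who want to see what such a plot is built from, here is a minimal Python sketch that pairs each sorted observation with the corresponding quantile of a normal distribution fitted by the sample mean and SD. The function name and the plotting positions (i + 0.5) / n are assumptions made for illustration; other conventions exist, and the data must not be constant.

```python
from statistics import NormalDist, fmean, pstdev


def normal_qq_points(xs):
    """Pair each sorted observation with the matching quantile of a fitted normal."""
    n = len(xs)
    fitted = NormalDist(fmean(xs), pstdev(xs))
    ordered = sorted(xs)
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


# Hypothetical sample with two extreme values, for illustration only.
sample = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]
for theoretical, observed in normal_qq_points(sample):
    print(f"{theoretical:7.2f}  {observed:7.2f}")
```

Because the reference normal uses the sample's own mean and SD, points from normal data should lie close to the line y = x; systematic departures in the tails are the signal to look for.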

Third, and probably most importantly, why are you concerned about the normality of this variable? What are you going to do with the variable?

answered 7 hours ago

Peter Flom♦


Thanks for all the comments and answers. @Peter Flom, I have measures of "Happyness" for two groups of people. I plotted them against time, and it looks like one group gets higher values, so I'm trying to compare the groups statistically. I ran the Shapiro test for both groups and got p-values << 0.05, so I don't know whether to use a t-test or Wilcoxon.
– AnaHochma
4 hours ago

I'd go with Wilcoxon. Or maybe a bootstrap.
– Peter Flom♦
4 hours ago
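
The thread does not say how either option would be carried out, so the following Python is only a sketch under assumptions: it treats the two groups as independent samples, reads "Wilcoxon" as the rank-sum (Mann-Whitney) test with a tie-corrected normal approximation and no continuity correction, and uses a percentile bootstrap for the difference in means. All names, defaults, and data are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test; returns (U for group a, p-value)."""
    pooled = sorted((x, g) for g, group in enumerate((a, b)) for x in group)
    n1, n2, n = len(a), len(b), len(a) + len(b)
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2   # average of the 1-based ranks i+1 .. j
        t = j - i                   # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / var_u ** 0.5   # assumes the values are not all equal
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


def bootstrap_mean_diff_ci(a, b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = sorted(
        fmean(rng.choices(a, k=len(a))) - fmean(rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical scores for two groups, for illustration only.
group_a = [3, 4, 4, 5, 5, 6, 6, 7]
group_b = [5, 6, 6, 7, 7, 8, 8, 9]
print(rank_sum_test(group_a, group_b))
print(bootstrap_mean_diff_ci(group_a, group_b))
```

If the same people were measured repeatedly over time, the observations are not independent and neither function applies as written; that design question is separate from the choice between the two methods.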

Ah, two distributions! Thanks for confirming what the data was hinting strongly at.
– Peter Leopold
4 hours ago

Thanks! So I'll go with Wilcoxon too :) Is the main reason that the Shapiro test gave p << 0.05?
– AnaHochma
4 hours ago
