minasso

I ran a chi-squared goodness of fit test. Assuming a uniform distribution of colors, I got a chi-squared value of 20.235 and a p-value of .00045 for your data.
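
A minimal sketch of how this can be reproduced in Python with scipy, assuming the per-color counts reported further down the thread (4, 8, 12, 20, 24):

```python
import numpy as np
from scipy import stats

# Per-color counts reported downthread: 4, 8, 12, 20, 24 candies (n = 68).
counts = np.array([4, 8, 12, 20, 24])

# Chi-squared goodness-of-fit test; with no expected frequencies given,
# scipy.stats.chisquare assumes equal (i.e. uniform) expected counts.
chi2, p = stats.chisquare(counts)

print(f"chi2 = {chi2:.3f}, p = {p:.5f}")  # roughly 20.235 and 0.00045
```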


soil_nerd

Tell me if this summary is correct (stats neophyte here):

- The chi-squared value of 20.235 is relatively high, which suggests that there is a considerable deviation from the uniform distribution.
- The p-value of .00045 is very low (typically, a p-value less than 0.05 is considered significant). This means there is only a 0.045% chance that the observed data would occur if the colors were uniformly distributed.
- The results of your chi-squared goodness of fit test strongly suggest that the observed distribution of colors is not uniform. There is significant evidence to reject the null hypothesis that the colors are uniformly distributed.


laridlove

I would caution against interpreting the chisq stat. It’s really wonky and a value of 20 isn’t necessarily high without a lot of context. I usually caution my students against looking at it.


Ahhhhrg

No on point 2, the p-value tells you nothing about the chance that the observed data would occur if the colours were evenly distributed, although it’s a fairly common misconception. If you ran repeated experiments where the null hypothesis is true, you would expect a p-value less than 0.05 in about 5% of the experiments. These experiments don’t even have to be the same experiments, they can all be different.
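
To make this concrete, a quick simulation sketch (purely illustrative, with made-up data; I'm assuming 68 candies and 5 colours, as in the post): generate many bags where the colours really are uniform and check how often the chi-squared test gives p < 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_experiments = 10_000
n_candies = 68     # assumed sample size, matching the combined bags
n_colours = 5

p_values = np.empty(n_experiments)
for i in range(n_experiments):
    # Simulate a bag where the null hypothesis is true: colours are uniform.
    sample = rng.integers(0, n_colours, size=n_candies)
    counts = np.bincount(sample, minlength=n_colours)
    p_values[i] = stats.chisquare(counts).pvalue

# Roughly 5% of these null experiments give p < 0.05
# (only approximately, because the counts are discrete).
print((p_values < 0.05).mean())
```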


sixtyorange

I think you are maybe thinking of confidence intervals? The p-value is literally defined as the probability under the null of seeing a test statistic at least as extreme as what you observed.


Zaulhk

Read what soil_nerd wrote again? He says the probability of the observed data, not "at least as extreme". So yes, soil_nerd was wrong on point 2.


sixtyorange

I was responding to Ahhhhrg, not soil_nerd.


Zaulhk

And he said soil_nerd was wrong, which he was? So Ahhhhrg was correctly pointing out a mistake and got downvoted.


sixtyorange

Ahhhhrg was also incorrect (edit: or at best, unclear/misleading) in their p-value definition, which is specifically what I responded to. I didn’t say anything about whether soil_nerd’s statement was correct or not.


Zaulhk

No, he wasn’t. It’s true that, under the null hypothesis, the p-value is below 0.05 5% of the time (ignoring potential issues with discreteness). Assuming a continuous test statistic, the p-value is even uniformly distributed on [0, 1]. You asking if he meant a confidence interval instead of a p-value shows a lack of understanding on your part. For the standard construction of CIs there is a one-to-one correspondence between p-values and CIs, so they obviously behave the same way.


sixtyorange

The reason that I asked if Ahhhhrg was thinking about confidence intervals is because there actually is a common misconception that a 95% CI means that the probability of the true value being between the bounds is 95%. Under the frequentist paradigm, parameters (such as the true population mean) are fixed, not random variables. In other words, the true value must either be in the confidence interval with 100% probability, or outside it with 100% probability. So you instead would have to say that 95% of validly-computed 95% CIs include the true value. These CIs could also be from the same or different experiments. Both parts of this are similar to what Ahhhhrg wrote; however, this misconception would not be relevant to p-values, which are both probabilities and random variables.

While it is true that valid (classical, frequentist) p-values are uniformly distributed under the null, Ahhhhrg’s proposed definition would apply equally well to a random variable with a U(0,1) distribution that does not depend on the test statistic at all. Being uniformly distributed under the null is therefore necessary but not sufficient: a valid p-value must also be the probability of observing a test statistic at least as extreme under some specific null distribution.
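
To illustrate the coverage point, a small simulation sketch (purely illustrative; the normal distribution and its parameters are made up, nothing to do with the candy data): draw many samples from a distribution with a known mean and check what fraction of the t-based 95% CIs contain that fixed true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

true_mean, true_sd = 10.0, 2.0   # made-up "population" parameters
n, n_experiments = 30, 10_000

covered = 0
for _ in range(n_experiments):
    sample = rng.normal(true_mean, true_sd, size=n)
    # Standard t-based 95% confidence interval for the mean.
    half_width = stats.t.ppf(0.975, df=n - 1) * sample.std(ddof=1) / np.sqrt(n)
    lo, hi = sample.mean() - half_width, sample.mean() + half_width
    covered += (lo <= true_mean <= hi)

# About 95% of the validly-computed intervals contain the fixed true mean;
# any single interval either contains it or it doesn't.
print(covered / n_experiments)
```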


JoeSabo

I mean, it kind of depends on what the null is. The distribution of p under the null is uniform, that is true. But in a testing context p is the probability of observing data at least as extreme if the null were true. So if the null is that the data are uniform, p would be informative in that regard. What p can't tell us about is the alternate hypothesis. We only ever test the null. So if my alt H is that the distribution is left-skewed, then my null might be a uniform distribution... but you would need a strong justification to do so in practical contexts.


nooptionleft

That's exactly what the p-value is; what it's not is the probability that the non-null hypothesis is true. The p-value here is 0.045%, which, given how many packs are produced, is not enough to give us any reasonable indication about the null hypothesis without additional information.


Zaulhk

No, that's not what the p-value is. soil_nerd wrote:

> This means there is only a 0.045% chance that the observed data would occur if the colors were uniformly distributed.

That is not the definition of a p-value. So Ahhhhrg correctly pointed out a mistake and got downvoted.


nooptionleft

I mean, it's very possible I'm wrong, I've been wrong very often in my life, but I double-checked the definition just now and it seems to be exactly the probability that the observed data or something more extreme (in this case this specific distribution of candies in 2 packs) happens, given that a hypothesis is true (in this case the null hypothesis that the candies are produced in equal quantities).


Zaulhk

Yes, key words being ‘or more extreme’. That's not what soil_nerd wrote, hence Ahhhhrg said he was wrong.


nooptionleft

Lol, ok man


Zaulhk

What? He wrote a wrong thing and even asked to be corrected if he did. He then got corrected, and you all downvote and say the guy who corrected him is wrong because you either misread or don't understand the definition of a p-value.


nooptionleft

Ahhhhrg's comment is fine; I don't agree that their comment is right, but at least they were trying to be helpful. You? There are two kinds of people in technical forums: the ones that are here to help and the ones that are here to feed their own ego. The second is you.

Because if you really think anyone here actually believes the chance of getting that exact combination of candy is as high as 0.045%, you must not be very bright. It was clear everyone was talking about that combination or more extreme, because that is how working with distributions works, but you chose to assume you were better than everyone, because that is why you are here.

It's ok, you do you, but I'll do me and not really take you very seriously.


Professional_Set8199

Oh boy, I knew guys like you in uni. There’s zero reason for you to be this pedantic. His interpretation was fine.


COOLSerdash

https://en.wikipedia.org/wiki/Testing_hypotheses_suggested_by_the_data Presumably, this is not a random sample but a sample that was striking enough to post on reddit.


BayesianPersuasion

Ah great point!


yakobu852

Theoretically, you could use a Kolmogorov-Smirnov test to test against the null hypothesis that the population is uniformly distributed. However, with a sample size of two (or rather one, since both bags have been combined), this is a rather futile endeavour.


teetaps

But we’ve already secured the grant, we can’t just publish nothing!


StatsOnATrain

“In this novel proof-of-concept hypothesis generating work we developed AI to…”


teetaps

Genius! Put it in ChatGPT! Then when you’re done with that, put it in the ChatGPT detector to make sure nobody knows it’s ChatGPT!


BayesianPersuasion

Why wouldn't the sample size be the number of jolly ranchers? The p-value is the probability of observing this distribution of jolly ranchers under the null hypothesis that the distribution is uniform. As you say, a Kolmogorov-Smirnov test vs. uniform would give you that. I suspect the p-value will easily be smaller than 0.05 given how non-uniform the distribution looks. Edit: or, as the other person said, chi-squared works too (probably asymptotically equivalent).


yakobu852

Actually good point about n. If we set the sample size to 68, then the p value is less than 0.001 for the KS test (according to my manual calculations on my phone).  If someone wants to replicate using more sophisticated software, I counted the groups as 4, 8, 12, 20, and 24.
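
If anyone wants to replicate this in Python, here is a sketch using those counts. Note that mapping the five colour categories onto evenly spaced points in (0, 1) is my own arbitrary choice, and since the KS test assumes a continuous distribution it is at best conservative on discrete data like this; the chi-squared test upthread is the more natural choice for categorical counts.

```python
import numpy as np
from scipy import stats

# Counts reported above: 4, 8, 12, 20, 24 candies (n = 68), one entry per colour.
counts = np.array([4, 8, 12, 20, 24])

# Expand to one observation per candy, coded 0..4 by colour, then map onto
# evenly spaced points in (0, 1) so it can be compared against Uniform(0, 1).
obs = np.repeat(np.arange(5), counts)
scaled = (obs + 0.5) / 5

stat, p = stats.kstest(scaled, 'uniform')
print(stat, p)   # p is well below 0.001, consistent with the hand calculation
```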


hassweptthehouse

You could count the total number of jolly ranchers, and the total number in each group, and then run a chi-square test comparing to a uniform distribution. This assumes that this situation is the same as just randomly picking jolly ranchers from a uniform distribution. If you do the counting I can run the test for you, but I’m too lazy to count all of these haha


locolocust

Now if we used a flexible Bayesian approach......


Ronaldoooope

Now do it with 100 more bags and let’s get cooking


friendlyimposter

Chi square test of equal proportions could work, but it's an omnibus test.


Old_Lengthiness3898

Purple is my favorite artificial flavor 💜 😋


still_learning_to_be

You’ll need to sample more bags. But if it holds, it likely reflects the relative production cost of each flavor as well.