A statistical model and a question about inference

Consider a normal distribution with unknown mean μ and unit variance, but with a twist: the sample space has a gap in it between -1 and 3. Accounting for the gap, and writing φ and Φ for the standard normal PDF and CDF, the resulting probability density becomes:

p\left(x;\mu\right)=\begin{cases} \frac{\phi\left(x-\mu\right)}{1-\Phi\left(3-\mu\right)+\Phi\left(-1-\mu\right)}, & x\in\left(-\infty,-1\right]\cup\left[3,\infty\right),\\ 0, & x\in\left(-1,3\right). \end{cases}

In this model μ is not quite a location parameter. When μ is far from the gap, the density is effectively a normal centered at μ, but when μ is close to the gap the shape is distorted. The density becomes a half-normal when μ sits at a gap boundary, and then something like an extra-shallow exponential (log-quadratic instead of log-linear like an actual exponential) as μ moves toward the center of the gap. At μ = 1 the probability mass flips from one side of the gap to the other. Here’s a little web app in which you can play around with this statistical model (don’t neglect the play button under the slider on the right-hand side).
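To make the behavior described above concrete, here is a minimal Python sketch of the density and of the share of probability mass sitting to the left of the gap. The function names (`normalizer`, `density`, `mass_left`) are my own labels for illustration, not anything from the web app, and only the standard library is assumed.

```python
import math

GAP_LO, GAP_HI = -1.0, 3.0  # the gap (-1, 3) in the sample space


def phi(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def normalizer(mu):
    """Phi(-1 - mu) + 1 - Phi(3 - mu): the N(mu, 1) mass outside the gap."""
    return Phi(GAP_LO - mu) + Phi(mu - GAP_HI)


def density(x, mu):
    """p(x; mu): zero inside the gap, renormalized normal outside it."""
    if GAP_LO < x < GAP_HI:
        return 0.0
    return phi(x - mu) / normalizer(mu)


def mass_left(mu):
    """Probability that X <= -1, i.e. the mass on the left side of the gap."""
    return Phi(GAP_LO - mu) / normalizer(mu)


if __name__ == "__main__":
    for mu in (-4.0, -1.0, 0.0, 1.0, 2.0, 3.0, 6.0):
        print(f"mu = {mu:5.1f}   P(X <= -1) = {mass_left(mu):.4f}")
```

Running the loop shows the left-hand mass moving from nearly all of the probability to nearly none of it as μ crosses the gap, with an even split at μ = 1, the midpoint of the gap.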

Now the question. I ask my readers to report their gut reaction in the comments, in addition to any more considered conclusions.

Suppose μ is unknown and the data is a single observation x. Consider two scenarios:

  1. x = -1 (the left boundary)
  2. x = 3 (the right boundary)
For the sake of concreteness suppose our interest is in μ ≤ 0 vs. μ > 0. Should it make a difference to our inference whether we’re in scenario 1 or scenario 2?
6 comments
  1. I’m not sure exactly what your question means. “Should it make a difference” in the outcome, or in the model? Obviously the data makes a difference in the outcome, so I guess you’re asking whether we should use a different model when we’re in case 1 vs. case 2?

    It seems like you’ve specified a generating process, so I’d guess that maybe my answer is “no”: we just use that generating process. Of course we need a prior, but that shouldn’t depend on the data. However, we may have fairly complicated prior knowledge about mu, and you haven’t said anything there. Suppose, for example, we know that mu has to be positive, logically speaking. There are many alternative scenarios: we might know mu is an integer, or whatever.

    I will say that this reminds me of a problem I did some work on a while back, where we observe observations only from the right tail of a distribution, and we don’t know exactly where the cutoff is.

    Here is the post from a while back: http://models.street-artists.org/2016/04/05/estimating-parameters-from-truncated-samples/

    • By “make a difference” I mean to ask, should we arrive at the same inferential-result-bearing-on-“μ > 0”, (e.g., a p-value or a posterior probability or a SEV function evaluation or anything of that sort) in the two scenarios or should the two scenarios be distinct in their conclusions about μ > 0? I’m definitely not suggesting that there are two models (I only wrote down one PDF, after all), nor am I suggesting that the observation is censored or truncated.

      • Ah, so in my terminology you *were* asking about the outcome of the inference. And so, yes my gut instinct is that it should make a difference what data we get, whether we get x=-1 or x=3 should probably change our inference about mu. On the other hand, not having done the math, perhaps you’ve set it up in such a way that there’s some symmetry and so the Bayesian results are somehow the same? Is it supposed to be a trick question?

      • It’s not supposed to be a trick question — it’s supposed to be a very simple model. The trick, insofar as there is one, is that the symmetry between μ and x that exists in the usual normal model has been broken.

  2. Right, so I think the posterior distribution for mu should be p(x | mu) p(mu) / Z, and p(x | mu) tends to have these spiky shapes as shown in your online app, so I’d imagine values of mu that put the spikes high near x are to be preferred. If you give p(mu) something like a Gaussian, normal(0,10), what does the posterior look like under x=-1 and x=3? It should be tractable in closed form, right?

    • The unnormalized density is tractable (just treat the PDF I gave as a function of μ for fixed x). If you want the CDF you’ll have to use numerical integration; I like the adaptive Lobatto quadrature function (“quadl”) in the pracma package in R.
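For readers who want to try the computation discussed in this thread, here is a minimal Python sketch. It treats the PDF as a function of μ for fixed x, multiplies by a normal prior, and integrates numerically. Several things here are assumptions of mine: the prior normal(0,10) is read as mean 0 and standard deviation 10, plain composite Simpson's rule on [-50, 50] stands in for the adaptive Lobatto quadrature of `quadl`, and the function names are purely illustrative.

```python
import math


def phi(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def likelihood(mu, x):
    """p(x; mu) viewed as a function of mu; x must lie outside the gap (-1, 3)."""
    return phi(x - mu) / (Phi(-1.0 - mu) + Phi(mu - 3.0))


def unnormalized_posterior(mu, x, prior_sd=10.0):
    """Likelihood times an unnormalized N(0, prior_sd^2) prior (assumed prior)."""
    return likelihood(mu, x) * math.exp(-0.5 * (mu / prior_sd) ** 2)


def simpson(f, a, b, n=4000):
    """Composite Simpson's rule with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def posterior_prob_positive(x, lo=-50.0, hi=50.0):
    """Posterior probability that mu > 0 given a single observation x."""
    f = lambda mu: unnormalized_posterior(mu, x)
    below = simpson(f, lo, 0.0)
    above = simpson(f, 0.0, hi)
    return above / (below + above)


if __name__ == "__main__":
    for x in (-1.0, 3.0):
        print(f"x = {x:4.1f}   P(mu > 0 | x) = {posterior_prob_positive(x):.4f}")
```

The integration limits are wide enough that the integrand is numerically zero at both ends for these two observations; a different prior or a different x may call for different limits or an adaptive integrator.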
