# Sample complexity for learning Boltzmann Distribution parameters

I am trying to work out how many samples I would need to estimate the parameters of a Boltzmann distribution to a desired precision.

Suppose that there are N possible states of the world, with the probability of state i being observed equal to

$$Pr(i | \theta) = \frac{e^{-\theta_i}}{ \sum_{k=1}^N e^{-\theta_k}}.$$

I don't know what the values of $\{\theta_i\}_{i=1}^N$ are, but I can sample independent observations from the set $\{1,...,N\}$ of possible states of the world, distributed according to the distribution $Pr(i | \theta)$ given above.

Let's say I draw $m$ observations $X_1,...,X_m \in \{1,...,N\}$, and define the maximum likelihood estimator

$$\widehat{\theta} = \arg \max_{\theta'} \sum_{j=1}^m \log Pr(X_j | \theta')$$
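To make the setup concrete, here is a minimal Python sketch of the sampling and estimation steps. It relies on two facts that are not stated above: shifting every $\theta_i$ by the same constant leaves $Pr(i | \theta)$ unchanged, so $\theta$ can only be recovered up to that shift, and for this parameterization the maximum likelihood estimate reproduces the empirical frequencies, which gives $\widehat{\theta}_i = -\log(n_i/m)$ where $n_i$ counts how often state $i$ was observed. The function names (`boltzmann_probs`, `sample_states`, `mle_theta`) and the example values are hypothetical, chosen only for illustration, and states are indexed from 0 rather than 1.

```python
import math
import random
from collections import Counter


def boltzmann_probs(theta):
    """Pr(i | theta) = exp(-theta_i) / sum_k exp(-theta_k)."""
    # Subtracting min(theta) does not change the distribution and avoids overflow.
    lo = min(theta)
    weights = [math.exp(-(t - lo)) for t in theta]
    z = sum(weights)
    return [w / z for w in weights]


def sample_states(theta, m, rng):
    """Draw m independent states (0-indexed) from Pr(. | theta)."""
    probs = boltzmann_probs(theta)
    return rng.choices(range(len(theta)), weights=probs, k=m)


def mle_theta(samples, n_states):
    """Closed-form MLE, normalized so that sum_k exp(-theta_hat_k) = 1.

    States that never appear in the sample get theta_hat = +inf,
    i.e. an estimated probability of zero.
    """
    counts = Counter(samples)
    m = len(samples)
    return [
        -math.log(counts[i] / m) if counts[i] > 0 else math.inf
        for i in range(n_states)
    ]


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, assumption for reproducibility
    theta = [0.0, 1.0, 2.0]         # hypothetical true parameters
    # Express the truth in the same normalization as the estimate.
    target = [-math.log(p) for p in boltzmann_probs(theta)]
    for m in (100, 10_000, 1_000_000):
        est = mle_theta(sample_states(theta, m, rng), len(theta))
        err = max(abs(e - t) for e, t in zip(est, target))
        print(m, err)
```

Running the script prints the largest coordinate-wise error for a few sample sizes, which gives a rough empirical feel for how the error shrinks with $m$ before attempting a formal bound. Note that the least likely state dominates the error, since its count $n_i$ is the smallest.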

As $m \to \infty$, $\widehat{\theta}$ converges to $\theta$. However, I'm not sure how many samples $m$ I would need to ensure that the estimate $\widehat{\theta}$ is probably approximately correct. That is, how large does $m$ need to b