Role of ACVF and ACF in prediction
Let $\{X_t\}_t$ be a stationary time series. Recall that
The autocovariance function (ACVF) is $\gamma_{x}(h) = \mathrm{Cov}(X_t, X_{t + h})$, $\forall h \in \mathbb{Z}$.
The autocorrelation function (ACF) is $\rho_{x}(h) = \frac{\gamma_x(h)}{\gamma_x(0)}$.
Given observed data $X_1, \cdots, X_n$, we want to find the best prediction of $X_{n + h}$ based on $X_n$. In the $L^2$ sense (that is, minimizing the mean squared error), the best prediction is the conditional expectation $\mathbb{E}\left[ X_{n + h} | X_n \right] = p(X_n)$. This means
$$ \mathbb{E}\left[ (X_{n + h} - p(X_n))^2 \right] = \min_{f: \mathbb{E}[f^2(X_n)] < \infty} \mathbb{E}\left[ (X_{n + h} - f(X_n))^2 \right]. $$
A simpler problem is to find the best linear prediction. This means, find $\hat{a}$ and $\hat{b}$ such that
$$ \mathbb{E}\left[ (X_{n+h} - \hat{a} - \hat{b}X_n)^2 \right] = \min_{(a, b)\in \mathbb{R}^2} \mathbb{E}\left[ (X_{n+h} - a - bX_n)^2 \right]. $$
Let $\psi(a, b) = \mathbb{E}\left[ (X_{n+h} - a - bX_n)^2 \right]$. Setting both partial derivatives of $\psi$ to zero gives
$$\frac{\partial \psi}{\partial a} = -2 \mathbb{E}\left[ (X_{n + h} - a - bX_n) \right] = 0$$
$$\frac{\partial \psi}{\partial b} = -2 \mathbb{E}\left[ X_n(X_{n + h} - a - bX_n) \right] = 0.$$
Expanding the expectations, these two conditions are equivalent to the system
$$ \begin{cases} a + b\mu = \mu & \text{(1)}\\ a\mu + b \mathbb{E}[X_n^2] = \gamma_x(h) + \mu^2 & \text{(2)}\\ \end{cases} $$
where we used that
$$\mathbb{E}[X_t] = \mu$$
$$\mathbb{E}[X_t^2] = \mathrm{Var}[X_t] + \mu^2 = \gamma_x(0) + \mu^2$$
$$\mathbb{E}[X_t X_{t + h}] = \mathrm{Cov}(X_t, X_{t+h}) + \mu^2 = \gamma_x(h) + \mu^2$$
Substituting $\mathbb{E}[X_n^2] = \gamma_x(0) + \mu^2$ into (2) and then using (1) in the form $a\mu + b\mu^2 = \mu^2$, we get
$$ \begin{align} & a\mu + b \mathbb{E}[X_n^2] = \gamma_x(h) + \mu^2 \\ \Leftrightarrow &a\mu + b \mu^2 + b \gamma_x(0) = \gamma_x(h) + \mu^2 \\ \Leftrightarrow &\mu^2 + b \gamma_x(0) = \gamma_x(h) + \mu^2 \\ \Leftrightarrow &b \gamma_x(0) = \gamma_x(h)\\ \Leftrightarrow &\hat{b} = \frac{\gamma_x(h)}{\gamma_x(0)} = \rho_x(h)\\ \end{align} $$
Plugging $\hat{b}$ into (1) gives the value of $\hat{a}$:
$$ \begin{align} & a + b \mu = \mu \\ \Leftrightarrow & a + \rho_x(h) \mu = \mu \\ \Leftrightarrow & \hat{a} = \mu (1 - \rho_x(h)). \end{align} $$
As a result, the best linear prediction is
$$ \begin{align} &\ell (X_n) = \mu (1 - \rho_x(h)) + \rho_x(h)X_n \\ \Leftrightarrow &\ell (X_n) = \mu + \rho_x(h) (X_n - \mu).\\ \end{align} $$
The prediction error is
$$ \begin{align} \mathbb{E} \left[ \left( X_{n+h} - \mu - \rho_x(h) (X_n - \mu) \right)^2 \right] &= \mathbb{E} \left[ \left( X_{n+h} - \mu \right)^2 \right] \\ &\quad - 2 \rho_x(h) \mathbb{E} \left[ \left( X_{n+h} - \mu \right) \left( X_{n} - \mu \right) \right] \\ &\quad + \rho_x^2(h) \mathbb{E} \left[ \left( X_{n} - \mu \right)^2 \right] \\ &= \gamma_x(0) - 2 \gamma_x(0) \rho_x^2(h) + \rho_x^2(h) \gamma_x(0) \\ &= \gamma_x(0) - \gamma_x(0) \rho_x^2(h) \\ &= \gamma_x(0) \left(1 - \rho_x^2(h)\right) \end{align} $$
The prediction error is almost 0 if $| \rho_x(h) | \approx 1$.
Moreover, it is maximal if $\rho_x(h) = 0$ in which case $\ell(X_n) = \mu$.
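As a numerical sanity check of the derivation above, the following sketch solves the normal equations (1) and (2) directly and compares the result with the closed forms $\hat{b} = \rho_x(h)$, $\hat{a} = \mu(1 - \rho_x(h))$ and the error $\gamma_x(0)(1 - \rho_x^2(h))$. The function name `best_linear_predictor` and the numerical values of $\mu$, $\gamma_x(0)$, $\gamma_x(h)$ are illustrative assumptions, not part of the notes.

```python
import numpy as np

def best_linear_predictor(mu, gamma0, gamma_h):
    """Solve the normal equations (1)-(2) for (a, b) and return (a, b, mse)."""
    # (1): a + b*mu = mu
    # (2): a*mu + b*(gamma0 + mu^2) = gamma_h + mu^2
    lhs = np.array([[1.0, mu], [mu, gamma0 + mu**2]])
    rhs = np.array([mu, gamma_h + mu**2])
    a, b = np.linalg.solve(lhs, rhs)
    # E[(X_{n+h} - a - b X_n)^2] = Var(X_{n+h} - b X_n) + (E[X_{n+h} - a - b X_n])^2
    mse = gamma0 - 2 * b * gamma_h + b**2 * gamma0 + (mu - a - b * mu) ** 2
    return a, b, mse

# Hypothetical values chosen only for illustration.
mu, gamma0, gamma_h = 2.0, 4.0, 3.0
a, b, mse = best_linear_predictor(mu, gamma0, gamma_h)
rho = gamma_h / gamma0

print("a =", a, " closed form:", mu * (1 - rho))
print("b =", b, " closed form:", rho)
print("mse =", mse, " closed form:", gamma0 * (1 - rho**2))
```

The linear system has determinant $\gamma_x(0)$, so it is solvable whenever the series is not degenerate.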
A characterizing condition for ACVF
Recall that the ACVF of some stationary time series $\{X_t\}_t$, $\gamma_x$ satisfies the following properties
1. $\gamma_x(0) \ge 0$ ($\gamma_x(0) = \mathrm{Var}[X_t], \forall t$)
2. $| \gamma_x(h) | \le \gamma_x(0), \forall h \in \mathbb{Z}$ (Cauchy-Schwarz inequality)
3. $\gamma_x(-h) = \gamma_x(h), \forall h \in \mathbb{Z}$
Question: Suppose we are given a function on $\mathbb{Z}$ which satisfies the above properties. How can we tell that this function is ACVF of some stationary time series?
Definition: A real valued function $\kappa$ defined on $\mathbb{Z}$ is said to be nonnegative definite if $\forall n \ge 1$ and $a = (a_1, a_2, \cdots, a_n)^T \in \mathbb{R}^n$ we have that $\sum_{1 \le i,j \le n} a_i a_j \kappa (i - j) \ge 0$. In other words, $a^T \kappa_n a \ge 0$ with $\kappa_n = (\kappa (i - j))_{1 \le i, j \le n}$.
Theorem: A real valued function $\kappa$ defined on $\mathbb{Z}$ is the ACVF of some stationary time series if and only if
$$ \begin{cases} \kappa(-h) = \kappa(h), \forall h \in \mathbb{Z} \\ \kappa\text{ is nonnegative definite} \end{cases} $$
Proof (necessity): Suppose that $\kappa$ is the ACVF of some stationary time series $\{X_t\}_t$. Then $\kappa$ has to be even, as indicated in property 3.
Let $n \ge 1$ and $a = (a_1, a_2, \cdots, a_n)^T \in \mathbb{R}^n$. Also, let $\tilde{X}_n = (X_1, X_2, \cdots, X_n)^T$. Then we have
$$ \begin{align} \mathrm{Var}[a^T \tilde{X}_n] &= \sum_{1 \le i,j \le n} a_i a_j \mathrm{Cov}(X_i, X_j) \\ &= \sum_{1 \le i,j \le n} a_i a_j \kappa (i - j) \end{align} $$
Since a variance is nonnegative, this shows that
$$\forall n \ge 1, \forall a = (a_1, a_2, \cdots, a_n)^T \in \mathbb{R}^n: \quad \sum_{1 \le i,j \le n} a_i a_j \kappa (i - j) \ge 0,$$
meaning that $\kappa$ is nonnegative definite.
Remark: The proof of sufficiency is based on showing that there exists a Gaussian time series $\{X_t\}_t$ whose ACVF is $\kappa$.
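The definition of nonnegative definiteness can be explored numerically by building the matrices $\kappa_n$ and looking at their smallest eigenvalue. The sketch below is only a finite check: it can refute nonnegative definiteness by finding some $n$ with a negative eigenvalue, but passing for $n \le n_{\max}$ proves nothing about larger $n$. The helper names, the tolerance, and the test function `tridiagonal_kappa` are assumptions made for illustration; $\kappa$ is assumed to be even so that $\kappa_n$ is symmetric.

```python
import numpy as np

def kappa_matrix(kappa, n):
    """Build kappa_n = (kappa(i - j)) for 1 <= i, j <= n."""
    idx = np.arange(1, n + 1)
    lags = idx[:, None] - idx[None, :]
    # otypes=[float] avoids integer truncation if kappa returns an int first.
    return np.vectorize(kappa, otypes=[float])(lags)

def is_nonneg_definite_up_to(kappa, n_max, tol=1e-9):
    """Return False if some kappa_n with n <= n_max has an eigenvalue below -tol."""
    for n in range(1, n_max + 1):
        # eigvalsh assumes a symmetric matrix, i.e. an even kappa.
        if np.linalg.eigvalsh(kappa_matrix(kappa, n)).min() < -tol:
            return False
    return True

def tridiagonal_kappa(rho):
    """kappa(0) = 1, kappa(+-1) = rho, kappa(h) = 0 otherwise."""
    return lambda h: 1.0 if h == 0 else (rho if abs(h) == 1 else 0.0)

print(is_nonneg_definite_up_to(tridiagonal_kappa(0.4), 8))  # no violation found
print(is_nonneg_definite_up_to(tridiagonal_kappa(0.8), 8))  # violation found
```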
Example: Fix $w_0 \in \mathbb{R}$ and consider the function $\kappa(h) = \cos (w_0 h)$, $h\in \mathbb{Z}$. Since the cosine is even, $\kappa$ is even. Let $n \ge 1$ and $a = (a_1, a_2, \cdots, a_n)^T \in \mathbb{R}^n$. We have
$$ \begin{align} \sum_{1 \le i,j \le n} a_i a_j \kappa(i - j) &= \sum_{1 \le i,j \le n} a_i a_j \cos(w_0(i - j)) \\ &= \sum_{1 \le i,j \le n} a_i a_j \cos(w_0 i) \cos(w_0 j) + \sum_{1 \le i,j \le n} a_i a_j \sin(w_0 i) \sin(w_0 j) \\ &= \left( \sum_{i = 1}^n a_i \cos(w_0 i) \right)^2 + \left( \sum_{j = 1}^n a_j \sin(w_0 j) \right)^2 \ge 0 \end{align} $$
By the previous theorem, $\kappa$ is the ACVF of some stationary time series.
A concrete example is $X_t = A \cos(w_0 t) + B \sin(w_0 t)$ with $A$ and $B$ random variables such that $\mathbb{E}[A] = \mathbb{E}[B] = 0$, $\mathrm{Var}(A) = \mathrm{Var}(B) = 1$, and $\mathrm{Cov}(A, B) = 0$. Then
$\mathrm{Var}(X_t) = \mathrm{Var}(A) \cos^2(w_0 t) + 0 + \mathrm{Var}(B) \sin^2(w_0 t) = 1 < \infty$
$\mu_x(t) = \mathbb{E}[X_t] = 0, \forall t$
$\begin{align} \gamma_x(t, t+h) &= \mathrm{Cov} \left( A \cos(w_0 t) + B \sin(w_0 t), A \cos(w_0 (t + h)) + B \sin(w_0 (t + h)) \right) \\ &= \mathrm{Var}(A) \cos(w_0 t) \cos(w_0 (t+h)) + 0 + 0 + \mathrm{Var}(B) \sin(w_0 t) \sin(w_0 (t + h))\\ &= \cos(w_0 t) \cos(w_0 (t+h)) + \sin(w_0 t) \sin(w_0 (t+h))\\ &= \cos(w_0 h) = \kappa(h) \end{align}$
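The sum-of-squares identity used for $\kappa(h) = \cos(w_0 h)$ can be checked numerically. This is a small sketch with an arbitrarily chosen $w_0$, dimension $n$, and random vector $a$ (all three are assumptions for illustration); it compares the quadratic form $a^T \kappa_n a$ with $\left(\sum_i a_i \cos(w_0 i)\right)^2 + \left(\sum_i a_i \sin(w_0 i)\right)^2$.

```python
import numpy as np

w0 = 0.7                      # illustrative frequency
n = 6                         # illustrative dimension
rng = np.random.default_rng(0)
a = rng.normal(size=n)        # arbitrary coefficient vector

idx = np.arange(1, n + 1)
# kappa_n[i, j] = cos(w0 * (i - j))
kappa_n = np.cos(w0 * (idx[:, None] - idx[None, :]))

quad_form = a @ kappa_n @ a
sum_of_squares = (a @ np.cos(w0 * idx)) ** 2 + (a @ np.sin(w0 * idx)) ** 2

print(quad_form, sum_of_squares)
```

The two printed numbers should agree up to floating-point rounding, and both are nonnegative by construction.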
Consider the function
$$ \kappa(h) = \begin{cases} 1 & h = 0\\ \rho & |h| = 1\\ 0 & \text{otherwise} \end{cases} $$
where $\rho \in [-1, 1]$.
We will show that $\kappa$ is the ACVF of some stationary time series if and only if $|\rho| \le \frac{1}{2}$.
Consider the first order moving average $MA(1)$ process, where $\{X_t\}_t$ is defined as $X_t = Z_t + \theta Z_{t - 1}$, $\forall t \in \mathbb{Z}$ with $\theta \in \mathbb{R}$ and $\{Z_t\}_t \sim WN(0, \sigma^2)$ with $\sigma > 0$.
We know that $\{X_t\}_t$ is stationary with
$$ \gamma_x(h) = \begin{cases} \sigma^2 (1+\theta^2) & h = 0\\ \sigma^2 \theta & |h| = 1\\ 0 & \text{otherwise} \end{cases}. $$
Suppose that $|\rho| \le \frac{1}{2}$, then $\gamma_x = \kappa$ on $\mathbb{Z}$ if and only if $\begin{cases} \sigma^2(1+\theta^2) = 1 & (1) \\ \sigma^2\theta = \rho & (2) \end{cases}$.
If $\rho = 0$, then $\theta = 0$ and $\sigma = 1$. In this case, $\kappa$ is the ACVF of $\{Z_t\}_t \sim WN(0, 1)$.
If $\rho \neq 0$, then $\frac{(1)}{(2)}$ gives that $\frac{1 + \theta^2}{\theta}=\frac{1}{\rho} \Leftrightarrow \theta^2 - \frac{\theta}{\rho} + 1 = 0$. As a result, $\Delta = \frac{1}{\rho^2} - 4 = \frac{1 - 4\rho^2}{\rho^2} \ge 0$.
The equations admit the solutions
$$ \theta_1 = \frac{1}{2} \left( \frac{1}{\rho} - \frac{\sqrt{1 - 4\rho^2}}{|\rho|} \right), \quad \sigma_1^2 = \frac{\rho}{\theta_1} $$
$$ \theta_2 = \frac{1}{2} \left( \frac{1}{\rho} + \frac{\sqrt{1 - 4\rho^2}}{|\rho|} \right), \quad \sigma_2^2 = \frac{\rho}{\theta_2} $$
In summary, if $|\rho| \le \frac{1}{2}$, then $\kappa$ is the ACVF of either $\{Z_t\}_t \sim WN(0, 1)$ (when $\rho = 0$) or of an $MA(1)$ process with parameters $(\theta_1, \sigma_1^2)$ or $(\theta_2, \sigma_2^2)$ (when $\rho \neq 0$).
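The two $MA(1)$ parametrizations can be computed and checked against equations (1) and (2), $\sigma^2(1+\theta^2) = 1$ and $\sigma^2\theta = \rho$. The sketch below follows the quadratic $\theta^2 - \theta/\rho + 1 = 0$ from the text; the function name `ma1_parameters` is a hypothetical helper.

```python
import math

def ma1_parameters(rho):
    """Return the list of (theta, sigma2) pairs whose MA(1) ACVF equals kappa."""
    if abs(rho) > 0.5:
        raise ValueError("no real solution when |rho| > 1/2")
    if rho == 0:
        return [(0.0, 1.0)]          # white noise WN(0, 1)
    root = math.sqrt(1 - 4 * rho**2) / abs(rho)
    solutions = []
    for sign in (-1, 1):             # theta_1 uses '-', theta_2 uses '+'
        theta = 0.5 * (1 / rho + sign * root)
        sigma2 = rho / theta
        solutions.append((theta, sigma2))
    return solutions

for theta, sigma2 in ma1_parameters(0.4):
    # Both printed values should reproduce kappa(0) = 1 and kappa(1) = rho.
    print(theta, sigma2, sigma2 * (1 + theta**2), sigma2 * theta)
```

Since the product of the two roots equals 1, $\theta_2 = 1/\theta_1$; for $|\rho| = \frac{1}{2}$ the two solutions coincide.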
Suppose that $|\rho| > \frac{1}{2}$, then $\Delta = \frac{1}{\rho^2} - 4 < 0$ in which case the equation $\theta^2 - \frac{\theta}{\rho} + 1 = 0$ has no solution in $\mathbb{R}$.
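Therefore, there is no $MA(1)$ whose ACVF is $\kappa$. In fact, no stationary time series has ACVF $\kappa$: in each of the two cases below we exhibit a vector $a$ such that $a^T \kappa_n a < 0$.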
Case $\rho > \frac{1}{2}$
Let $r\ge 1$ be an integer (to be determined), let $n = 2r$, and consider the vector $a = \begin{pmatrix} 1 & -1 & 1 & \cdots & 1 & -1 \\ \end{pmatrix}^T \in \mathbb{R}^{2r}$ and $\kappa_n = (\kappa(i - j))_{1\le i, j \le n}$. Then we have
$$ \begin{align} a^T \kappa a &= \begin{pmatrix} 1 & -1 & 1 & \cdots & 1 & -1 \\ \end{pmatrix} \begin{pmatrix} 1 & \rho & 0 & 0 & \cdots & 0 \\ \rho & 1 & \rho & 0 & \cdots & 0 \\ 0 & \rho & 1 & \rho & \cdots & 0 \\ 0 & 0 & \rho & 1 & \rho & 0 \\ \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \rho & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \\ 1 \\ \vdots \\ 1 \\ -1 \end{pmatrix}\\ &= \begin{pmatrix} 1 & -1 & 1 & \cdots & 1 & -1 \\ \end{pmatrix} \begin{pmatrix} 1 - \rho \\ 2 \rho - 1 \\ -(2\rho - 1) \\ \vdots \\ -(2\rho - 1) \\ -(1-\rho) \end{pmatrix}\\ &= 2(1-\rho) + (2r - 2) (1 - 2\rho)\\ &= 2(1 - \rho + (r - 1)(1 - 2\rho))\\ &= 2(1 - \rho + r (1 - 2\rho) - 1 + 2 \rho)\\ &= 2(\rho + r (1 - 2\rho)) < 0\\ &\Leftrightarrow \rho < r (2 \rho - 1)\\ &\Leftrightarrow r > \frac{\rho}{2 \rho - 1} \end{align} $$
If $r = \lfloor \frac{\rho}{2 \rho - 1} \rfloor + 1$ and $n = 2r$, then the vector $a \in \mathbb{R}^n$ above satisfies $a^T \kappa_n a < 0$, which means that $\kappa$ is not nonnegative definite and therefore cannot be the ACVF of any stationary time series.
Case $\rho < - \frac{1}{2}$
Let $n \ge 1$ (to be determined) and consider $a = \begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix}^T \in \mathbb{R}^n$, we have
$$ \begin{align} a^T\kappa a &= \begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix} \begin{pmatrix} 1+\rho \\ 1+2\rho \\ \vdots \\ 1+2\rho \\ 1+\rho \end{pmatrix} \\ &= 2(1 + \rho) + (n - 2)(1 + 2\rho) \\ &= 2(1 + \rho) + n(1 + 2\rho) - 2(1 + 2\rho) \\ &= n(1 + 2\rho) - 2\rho < 0 \\ &\Leftrightarrow n > \frac{2\rho}{2\rho + 1} \end{align} $$
Again, for $n = \lfloor \frac{2\rho}{2\rho + 1} \rfloor + 1$, the vector $a = \begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix}^T \in \mathbb{R}^n$ satisfies $a^T\kappa_n a < 0$, which means that $\kappa$ is not nonnegative definite and therefore cannot be the ACVF of any stationary time series.
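The two cases can be illustrated by constructing the vector $a$ explicitly and evaluating the quadratic form. This is a sketch under the choices made in the text: alternating signs with $r > \frac{\rho}{2\rho - 1}$ for $\rho > \frac{1}{2}$, and the all-ones vector with $n > \frac{2\rho}{2\rho + 1}$ for $\rho < -\frac{1}{2}$. The function name `negative_witness` is hypothetical, and the floor is computed in floating point, which is an assumption that may need care when the ratio is very close to an integer.

```python
import math
import numpy as np

def negative_witness(rho):
    """Return (a, a^T kappa_n a) with a negative quadratic form, for |rho| > 1/2."""
    if rho > 0.5:
        r = math.floor(rho / (2 * rho - 1)) + 1
        n = 2 * r
        a = np.array([(-1.0) ** i for i in range(n)])   # (1, -1, 1, ..., -1)
    elif rho < -0.5:
        n = math.floor(2 * rho / (2 * rho + 1)) + 1
        a = np.ones(n)                                   # (1, 1, ..., 1)
    else:
        raise ValueError("kappa is nonnegative definite when |rho| <= 1/2")
    idx = np.arange(n)
    lags = np.abs(idx[:, None] - idx[None, :])
    # Tridiagonal kappa_n: 1 on the diagonal, rho next to it, 0 elsewhere.
    kappa_n = np.where(lags == 0, 1.0, np.where(lags == 1, rho, 0.0))
    return a, float(a @ kappa_n @ a)

for rho in (0.75, -0.75):
    a, value = negative_witness(rho)
    print(rho, len(a), value)
```

For $\rho = 0.75$ the closed form $2(\rho + r(1 - 2\rho))$ with $r = 2$ gives $-0.5$, and for $\rho = -0.75$ the closed form $n(1 + 2\rho) - 2\rho$ with $n = 4$ also gives $-0.5$.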
Stationarity and q-dependence
Recall that $\{X_t\}_t$ is strictly stationary if $\forall n \ge 1$ and $h \in \mathbb{Z}$, we have that $(X_1, X_2, \cdots, X_n) \stackrel{d}{=} (X_{1+h}, X_{2+h}, \cdots, X_{n+h})$.
Proposition: Let $\{X_t\}_t$ be a strictly stationary time series, then
1. The random variables $X_t$, $t \in \mathbb{Z}$ are identically distributed.
2. For $t$ and $h$ in $\mathbb{Z}$, $(X_1, X_{1+h}) \stackrel{d}{=} (X_{t}, X_{t+h})$.
3. $\{X_t\}_t$ is also weakly stationary provided that the second moment is finite, $\mathbb{E}[X_t^2] < \infty$.
We take $n = 1$ in the definition of strict stationarity to conclude that $X_1 \stackrel{d}{=} X_{1+h}, \forall h \in \mathbb{Z}$. As $\{ 1 + h : h \in \mathbb{Z} \} = \mathbb{Z}$, we have
$$X_1 \stackrel{d}{=} X_t, \forall t \in \mathbb{Z}.$$
We want to prove that $(X_1, X_{1+h}) \stackrel{d}{=} (X_{t}, X_{t+h})$. For $h = 0$, $(X_1, X_{1+h}) \stackrel{d}{=} (X_t, X_{t+h})$ as a consequence of 1.
Let $h \neq 0$ and assume without loss of generality that $h \ge 1$. Take $n \ge h + 1$. By the definition of strict stationarity, we have $\forall k \in \mathbb{Z}$
$$(X_1, X_2, \cdots, X_{1+h}, \cdots, X_n) \stackrel{d}{=} (X_{1+k}, \cdots, X_{1+h+k}, \cdots, X_{n+k}).$$
For $k = t - 1$, this implies
$$(X_1, X_2, \cdots, X_{1+h}, \cdots, X_n) \stackrel{d}{=} (X_{t}, \cdots, X_{t+h}, \cdots, X_{n+t-1}).$$
Keeping only the first and the $(1+h)$-th coordinates on both sides gives $(X_1, X_{1 + h}) \stackrel{d}{=} (X_t, X_{t+h})$. This is a particular case of the more general result
$$(X_r, X_s) \stackrel{d}{=} (X_{t+r-1}, X_{t+s-1}), \forall 1 \le r < s \le n.$$
If $\mathbb{E}[X_t^2] < \infty$, then $\mathbb{E}[|X_t|] \le \sqrt{\mathbb{E}[X_t^2]} < \infty$ (Jensen's inequality), hence, $\mathbb{E}[X_t]$ is finite.
By 1, $\mathbb{E}[X_t] = \mathbb{E}[X_1], \forall t$, which means $\mu_x(t) = \mu_x(1), \forall t$.
By the Cauchy-Schwarz inequality,
$$|\mathrm{Cov}(X_t, X_{t+h})| \le \sqrt{\mathrm{Var}(X_t)} \sqrt{\mathrm{Var}(X_{t+h})} < \infty$$
so $\gamma_x(t, t+h)$ is finite.
Now, 2 gives us that
$$\mathrm{Cov}(X_t, X_{t+h}) = \mathrm{Cov}(X_1, X_{1+h})$$
that is,
$$\gamma_x(t, t+h) = \gamma_x(1, 1+h)$$
which depends only on $h$.
We can now conclude $\{X_t\}_t$ is (weakly) stationary.
Remark: An IID sequence of random variables $\{X_t\}_t$ is strictly stationary. Indeed, for $n \ge 1$, $h \in \mathbb{Z}$ and $x_1, x_2, \cdots, x_n \in \mathbb{R}$, we have
$$ \begin{align} \mathbb{P}(X_{1+h} \le x_1, \cdots, X_{n+h} \le x_n) &= \prod_{i=1}^n \mathbb{P}(X_{i+h} \le x_i)~~~~~\text{using independence} \\ &= \prod_{i=1}^n F(x_i)~~~~~\text{where $F$ is the common CDF}\\ &= \prod_{i=1}^n \mathbb{P}(X_i \le x_i)\\ &= \mathbb{P}(X_1 \le x_1, \cdots, X_n \le x_n)~~~~~\text{using independence} \end{align} $$
which shows that $(X_{1+h}, \cdots, X_{n+h}) \stackrel{d}{=} (X_1, \cdots, X_n)$.
How do we construct a strictly stationary time series?
Let $g: \mathbb{R}^{q+1} \rightarrow \mathbb{R}$ be a measurable function for some integer $q \ge 0$ (for example, any continuous function works). Consider an IID time series $\{Z_t\}_t$ and, for $t \in \mathbb{Z}$, define $X_t = g(Z_t, Z_{t-1}, \cdots, Z_{t-q})$.
Claim: The time series $\{X_t\}_t$ is strictly stationary.
Proof: Let $n \ge 1$ and $h \in \mathbb{Z}$, we can have
$$ \begin{align} (X_{1+h}, \cdots, X_{n+h}) &= (g(Z_{1+h}, \cdots, Z_{1+h-q}), g(Z_{2+h}, \cdots, Z_{2+h-q}), \cdots, g(Z_{n+h}, \cdots, Z_{n+h-q}))\\ &= \psi(Z_{1+h-q}, Z_{2+h-q}, \cdots, Z_{n+h}) \end{align}$$
where $\psi: \mathbb{R}^{n+q} \rightarrow \mathbb{R}^n$ is also measurable.
For example, take $q = 1$ and $g(x_1, x_2) = x_1 + x_2$, so that
$$X_t = g(Z_t, Z_{t-1}) = Z_{t-1} + Z_t.$$
Take $n = 3$ as an example
$$ \begin{align} (X_{1+h}, X_{2+h}, X_{3+h}) &= (Z_h + Z_{h+1}, Z_{h+1} + Z_{h+2}, Z_{h+2} + Z_{h+3})\\ &= \psi(Z_h, Z_{h+1}, Z_{h+2}, Z_{h+3}) \end{align} $$
where $\psi(x_1, x_2, x_3, x_4) = (x_1+x_2, x_2+x_3, x_3+x_4)$.
Recall that
$$(X_{1+h}, \cdots, X_{n+h}) = \psi(Z_{1+h-q}, Z_{2+h-q}, \cdots, Z_{h+n})$$
Since $\{Z_t\}_t$ is an IID time series, it holds that
$$(Z_{1+h-q}, Z_{2+h-q}, \cdots, Z_{n+h}) \stackrel{d}{=} (Z_{1-q}, Z_{2-q}, \cdots, Z_n).$$
This implies that
$$ \begin{align} \psi(Z_{1+h-q}, Z_{2+h-q}, \cdots, Z_{n+h}) \stackrel{d}{=} \psi(Z_{1-q}, Z_{2-q}, \cdots, Z_n) &= (g(Z_1, \cdots, Z_{1-q}), g(Z_2, \cdots, Z_{2-q}), \cdots, g(Z_n, \cdots, Z_{n-q}))\\ &= (X_1, X_2, \cdots, X_n) \end{align} $$
which proves that
$$(X_{1+h}, \cdots, X_{n+h}) \stackrel{d}{=} (X_1, X_2, \cdots, X_n).$$
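The construction $X_t = g(Z_t, Z_{t-1}, \cdots, Z_{t-q})$ amounts to applying $g$ over a sliding window of the IID sequence. The sketch below is one way to do this on a finite sample; the helper name `filtered_series`, the standard normal choice for $Z_t$, and the sample size are assumptions for illustration only.

```python
import numpy as np

def filtered_series(z, g, q):
    """Return X_t = g(Z_t, Z_{t-1}, ..., Z_{t-q}) for every t with a full window."""
    z = np.asarray(z, dtype=float)
    # z[t-q : t+1] is (Z_{t-q}, ..., Z_t); reversing gives (Z_t, ..., Z_{t-q}).
    return np.array([g(*z[t - q:t + 1][::-1]) for t in range(q, len(z))])

rng = np.random.default_rng(1)
z = rng.normal(size=10)                       # IID sample (assumed N(0, 1))

# The example from the text: q = 1 and g(x1, x2) = x1 + x2.
x = filtered_series(z, lambda z_t, z_prev: z_t + z_prev, q=1)
print(x)
```

With a sample of length $N$, only $N - q$ values of $X_t$ are available, because the first $q$ windows are incomplete.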
Let $g: \mathbb{R}^{q+1} \rightarrow \mathbb{R}$ be a measurable function for some integer $q \ge 0$. Also, let $X_t = g(Z_t, \cdots, Z_{t-q})$ with $\{Z_t\}_t$ an IID time series.
For $t$ and $s$ such that $|s - t| > q$ we have that
$$\{t-q, \cdots, t\} \cap \{s-q, \cdots, s\} = \emptyset.$$
Indeed, if the intersection were not empty, then $\exists i,j \in \{0, \cdots, q\}$ such that $t - i = s - j \Rightarrow |s - t| = |j - i| \le q$, which contradicts the fact that $|s - t| > q$.
Hence, for $t$ and $s$ such that $|s - t| > q$, we have that $(Z_{t-q}, \cdots, Z_t)$ and $(Z_{s-q}, \cdots, Z_s)$ are independent. This gives that $g(Z_t, \cdots, Z_{t-q})$ and $g(Z_s, \cdots, Z_{s-q})$ are independent, that is, $X_t$ and $X_s$ are independent.
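A time series with this property, namely that $X_t$ and $X_s$ are independent whenever $|s - t| > q$, is said to be q-dependent. In particular, $\{X_t\}_t$ is q-dependent.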
The related notion here is q-correlation. A stationary time series is said to be q-correlated if $\gamma_x(h) = 0$, $\forall |h| > q$.
q-dependence $\Rightarrow$ q-correlation provided that $\mathbb{E}[X_t^2] < \infty$.
An IID sequence is 0-dependent.
A $WN(0, \sigma^2)$ sequence is 0-correlated.
An important example of a q-correlated time series is the Moving Average ($MA$) process of order q.
Definition: A time series $\{X_t\}_t$ is called a Moving Average ($MA$) process of order $q$ if $X_t$ has the representation $X_t = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}$, where $q \ge 0$ is an integer, $\theta_1, \cdots, \theta_q \in \mathbb{R}$, and $\{Z_t\}_t \sim WN(0, \sigma^2)$ for some $\sigma > 0$. We write
$$\{X_t\}_t \sim MA(q).$$
Let us check that $\{X_t\}_t$ is stationary and q-correlated. Since the $Z_t$ are uncorrelated with mean zero, we have
$$\mathrm{Var}(X_t) = \mathrm{Var}(Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}) = \sigma^2 \left( 1 + \sum_{i = 1}^q \theta_i^2 \right) < \infty$$
$$\mu_x(t) = 0, \forall t \in \mathbb{Z}$$
$$ \begin{align} \gamma_x(t, t+h) &= \mathrm{Cov}\left( \sum_{i=0}^q \theta_i Z_{t-i} , \sum_{j=0}^q \theta_j Z_{t+h-j} \right) \\ &= \sigma^2 \sum_{0\le i,j \le q} \theta_i \theta_j \mathbb{1} \{j - i = h\} \end{align} $$
where $\theta_0 = 1$. This depends only on $h$, so $\{X_t\}_t$ is stationary.
If $|h| > q$, then $\{(i, j): 0 \le i,j \le q, j-i = h\} = \emptyset \Rightarrow \gamma_x(h) = 0$, so $\{X_t\}_t$ is q-correlated.
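The double sum for the $MA(q)$ ACVF reduces to $\gamma_x(h) = \sigma^2 \sum_{i=0}^{q-|h|} \theta_i \theta_{i+|h|}$ with the convention $\theta_0 = 1$, and to $0$ for $|h| > q$. A minimal sketch of that formula follows; the function name `ma_acvf` and the parameter values in the usage lines are illustrative assumptions.

```python
import numpy as np

def ma_acvf(theta, sigma2, h):
    """ACVF at lag h of X_t = Z_t + theta_1 Z_{t-1} + ... + theta_q Z_{t-q}."""
    # Prepend theta_0 = 1 so the sum runs over i = 0, ..., q.
    coeffs = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    q = len(coeffs) - 1
    h = abs(h)                       # the ACVF is even
    if h > q:
        return 0.0                   # q-correlation: no pair (i, j) with j - i = h
    return float(sigma2 * np.sum(coeffs[:q + 1 - h] * coeffs[h:]))

# MA(1) with theta = 0.5 and sigma^2 = 1 (illustrative values).
print([ma_acvf([0.5], 1.0, h) for h in range(-2, 3)])
```

For $q = 1$ this reproduces $\sigma^2(1 + \theta^2)$ at lag $0$ and $\sigma^2\theta$ at lags $\pm 1$.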
Remark: Conversely, it can be shown that a stationary q-correlated time series $\{X_t\}_t$ with mean zero is an $MA(q)$ process.