In two previous posts I've used a trick from ODEs to help solve some SDEs with multiplicative and additive noise. But, there's a more systematic way to have gone about finding the general solutions.
To solve the SDE
$$ dX_t = \mu(X_t,t)dt + \sigma(X_t,t)dW_t $$
we need a function $X_t = f(W_t,t)$ such that $df$ satisfies the above. Since we're using the Itô sense here, that comes to
$$ df = \left(\frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial w^2}\right)dt + \frac{\partial f}{\partial w} dW_t $$
For $X_t = f(W_t,t)$ to solve the SDE, we then require that
$$ \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial w^2} = \mu(X_t,t) = \mu(f(W_t,t),t) $$
and
$$ \frac{\partial f}{\partial w} = \sigma(X_t,t) =\sigma(f(W_t,t),t) $$
Notice that the latter equation is a PDE of $f$ but only in terms of $w$ differentials. We can use this to turn the former condition into a PDE only in terms of time differentials. In particular,
$$ \frac{\partial^2 f}{\partial w^2} = \frac{\partial}{\partial w}\sigma(f,t) = \frac{\partial \sigma}{\partial f}\frac{\partial f}{\partial w} = \frac{\partial \sigma}{\partial f}\sigma(f,t) $$
Plugging this into the first PDE, we get
$$ \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial \sigma}{\partial f}\sigma = \mu(f,t) $$
So there is the system that one would need to solve in order to solve the SDE. Let's now apply this formulation to the general linear SDE that we looked at before.
$$ dX_t = (a_1(t)X_t + a_2(t))dt + (b_1(t)X_t + b_2(t))dW_t $$
The two equations to solve are
$$ \frac{\partial f}{\partial t} + \frac{1}{2}b_1(t)(b_1(t)f(w,t)+b_2(t)) = a_1(t)f(w,t) + a_2 $$
$$ \frac{\partial f}{\partial w} = b_1(t)f + b_2(t) $$
The general solutions to the homogeneous first equation is then
$$ f^h = g(w)e^{\int_0^t a_1(s) - \frac{1}{2}b_1^2(s) ds} $$
To find $g(w)$ we utilize the second equation in homogeneous form
$$ f^h = f_0e^{\int_0^t a_1(s) - \frac{1}{2}b_1^2(s) ds + \int_0^tb_1(s)dW_s} $$
Now, multiplying the original equation by $f^h$ we get
$$ \frac{\partial f^hf}{\partial t} = (a_2 - \frac{1}{2}b_1b_2)f^h $$
The RHS are all known functions, so we may integrate and rearrange to get
$$ f = (f^h)^{-1}\left(\int_0^t (a_2(s) - \frac{1}{2}b_1(s)b_2(s))f^hds + j(w)\right) $$
If we evaluate this for $t=0$, we find that $j(w)$ represents the initial condition of the system, which can either be a deterministic value (some constant) or a distribution.