In this tutorial we will try to fit a single line of a spectrum using a very general fitting approach. The approach presented here should not be used in actual research; it is only meant as a demonstration!
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
First we define the path to our FITS file.
path = "seminar7/"
name = "151130.fits"
We can open the file using the open function of astropy.io.fits.
FITS = fits.open(path+name)
HEAD = FITS[0].header
We can display the entire header by simply evaluating it.
HEAD
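Note that evaluating HEAD on its own only renders in a notebook; in a plain Python script the header has to be printed explicitly:
### print the full header outside a notebook
print(repr(HEAD))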
We can extract individual keywords of the header by simply indexing it.
Central_Wavelength = FITS[0].header["CENTWAVE"]
print(Central_Wavelength)
We can also extract the data as follows. Conveniently, astropy.io.fits directly returns multi-dimensional datasets as numpy arrays. In this case the data consist of a one-dimensional spectrum.
DATA = FITS[0].data
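A quick sanity check never hurts; a one-dimensional spectrum should have a single axis:
### shape and data type of the extracted array
print(DATA.shape, DATA.dtype)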
We can quickly visualise the data:
### plot the spectrum in three panels, one third each
plt.figure(figsize = (24,2))
plt.plot(DATA[:DATA.shape[0]//3], ".")
plt.xlim(0,DATA.shape[0]//3)
plt.show()
plt.figure(figsize = (24,2))
plt.plot(DATA[DATA.shape[0]//3:DATA.shape[0]*2//3], ".")
plt.xlim(0,DATA.shape[0]//3)
plt.show()
plt.figure(figsize = (24,2))
plt.plot(DATA[DATA.shape[0]*2//3:], ".")
plt.xlim(0,DATA.shape[0]//3)
plt.show()
We can compute the signal-to-noise ratio (S/N) by dividing the mean by the standard deviation as follows:
SN = DATA.astype("float").mean()/DATA.astype("float").std()
print(SN)
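Note that taking mean and standard deviation over the full spectrum folds the lines themselves into the "noise". A more common estimate, sketched here with a placeholder index range that has to be chosen by eye from the plots above, uses a line-free continuum window:
### S/N from a line-free continuum window; the index range is a
### placeholder and should be picked by eye from the plots above
cont = DATA[100:400].astype("float")
print(cont.mean()/cont.std())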
In our case, we want to focus on the H-alpha line, which is the strongest line in the spectrum. We first have to locate the line and then try to fit it and determine its shape, following Bouvier et al. (2003).
Let's first try to find the line manually by identifying it in the plot above.
lim0, lim1 = 1206, 1234
DATAY = DATA[lim0:lim1]
DATAX = np.arange(DATA.shape[0])[lim0:lim1]
plt.figure(figsize = (12,8))
plt.plot(np.arange(DATA.shape[0]), DATA , ".")
plt.plot(DATAX , DATAY, ".")
plt.xlim(1180,1290)
plt.show()
Now we have extracted the data which we subjectively associate with the H-alpha line from the dataset.
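Instead of reading the limits off the plot, the line could also be located automatically. A minimal sketch, assuming that H-alpha is the global maximum of the spectrum and using a hypothetical half-window of 14 pixels:
### locate the strongest peak and build a window around it
peak = int(np.argmax(DATA))
half_window = 14  ### half-width in pixels, chosen by eye
print(peak - half_window, peak + half_window)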
Before we go on, let's generate a few simple data points to fit. First, we define the Gaussian function:
$f(x)=Ae^{-\frac{(x-b)^2}{2\sigma^2}}+C$
def gauss_f(x, A=1., b=0., sigma=1., C=0.):
    ### Gaussian with amplitude A, centre b, width sigma and constant offset C
    y = A*np.exp(-(x-b)**2/(2.*sigma**2))+C
    return y
Next we define a one-dimensional space X on which we will generate Gaussian profiles Y for a variety of parameters.
X = np.linspace(-10,10,100)
Y = gauss_f(X)
plt.plot(X,Y)
plt.show()
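For later interpretation it is useful to remember how the width parameter $\sigma$ relates to the full width at half maximum (FWHM), the quantity usually quoted for spectral lines:
$\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma$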
Next, we can generate a 3D array of possible parameter permutations. The size of the array is set by the number of parameters and the number of variations per parameter, since we only vary one parameter per sub-figure.
### each column of a holds the values one parameter will take
a = (np.linspace(0.00001,1.,4)*np.ones((4,4)).T).T
### Z[i,j] holds the full parameter set of curve j in sub-figure i;
### only parameter i deviates from the default value of 1
Z = np.ones((a.shape[0],a.shape[1],4))
for i in range(Z.shape[2]):
    Z[i,:,i] = a[:,i]
def plot_samples(X, Z, nrow=2, ncol=2):
    lab = ["A","b",r"$\sigma$","C"]
    fig, axs = plt.subplots(nrows=nrow, ncols=ncol, figsize = (ncol*6,nrow*3), frameon = False, sharex=True, sharey = True)
    axes = axs.reshape(-1)
    for i in range(nrow*ncol):
        for j in range(Z.shape[1]):
            axes[i].plot(X, gauss_f(X,*Z[i,j]), "-", label = str(lab[i])+" = "+str(round(Z[i,j,i],2)), alpha = 0.5)
        axes[i].set_xlim([X[0],X[-1]])
        axes[i].grid()
        axes[i].legend()
    fig.subplots_adjust(hspace=0.06)
    fig.subplots_adjust(wspace=0.06)
    plt.show()
plot_samples(X,Z, nrow=2, ncol=2)
Next, we will use the curve_fit function of scipy.optimize to produce a simple fit.
from scipy.optimize import curve_fit
The first return value, here called variable, contains the optimal parameter values, i.e. those for which the sum of the squared residuals of gauss_f(X, *variable) - Y is minimized.
The second return value, error, is the estimated covariance matrix of variable. Its diagonal provides the variance of each parameter estimate; one-standard-deviation errors on the parameters follow from perr = np.sqrt(np.diag(error)).
variable, error = curve_fit(gauss_f, X,Y)
variable
Yv = gauss_f(X,*variable)
def simple_plot(X, Y, Yv):
    plt.figure(figsize = (12,4))
    ### plot data
    plt.plot(X, Y, "o", alpha = 0.7, label="Data")
    ### plot fit
    plt.plot(X, Yv, "--", alpha = 0.7, label="Fit")
    plt.legend()
    plt.show()
simple_plot(X,Y,Yv)
print "A , b, sigma"
print variable
Next, we will introduce some noise into Y and plot the result.
YN = Y+(np.random.rand(Y.shape[0])-0.5)/5.
variable, error = curve_fit(gauss_f, X,YN)
YNv = gauss_f(X,*variable)
simple_plot(X,YN,YNv)
print "A , b, sigma"
print variable
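As described above, one-standard-deviation errors on the fitted parameters follow from the diagonal of the covariance matrix. For the noisy fit they are small but non-zero:
### one-standard-deviation uncertainties of the fitted parameters
perr = np.sqrt(np.diag(error))
print(perr)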
From here we can create more complicated functions.
def gauss_2(x, A1=1., b1=0.1, sigma1=0.2, A2=-0.6, b2=0.13, sigma2=0.04, C = 0.1):
    ### sum of two Gaussians that share a common constant offset C
    y = A1*np.exp(-(x-b1)**2/(2.*sigma1**2))+A2*np.exp(-(x-b2)**2/(2.*sigma2**2))+C
    return y
X1 = np.linspace(-2,2,1000)
Y1 = gauss_2(X1)
plt.figure(figsize = (12,4))
plt.plot(X1,Y1)
plt.show()
Let's try to fit the function above to our data, the data being [DATAX, DATAY].
variable, error = curve_fit(gauss_2, DATAX,DATAY,
p0 = [4., 1220., 3., 1.1, 1232., 3., 1.0],
bounds = ([1. , 1200., 0.01 , -20. , 1200., 0.01 , -5],
[5., 1240., 100., 2. , 1280., 100., 5]),
method= "dogbox",
maxfev=100000)
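The covariance matrix returned alongside the parameters again gives a rough idea of how well each of the seven parameters is constrained:
### rough one-sigma uncertainties of the seven fitted parameters
perr = np.sqrt(np.diag(error))
print(np.round(variable, 2))
print(np.round(perr, 2))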
### defines the space in which we generate the fit function
lenDATAX = (DATAX[-1]-DATAX[0])/2.
XN = np.linspace(DATAX[0]-lenDATAX, DATAX[-1]+lenDATAX, 100 )
### computes a curve based on the generated fitting parameters.
YNv = gauss_2(XN,*variable)
### generates the two individual components of our fitting function
### (via gauss_f; passing a parameter slice to gauss_2 would leave the
### remaining parameters at their non-zero defaults)
G1 = gauss_f(XN, variable[0], variable[1], variable[2], variable[6])
G2 = gauss_f(XN, variable[3], variable[4], variable[5], variable[6])
### plotting
plt.figure(figsize = (12,8))
plt.plot(XN , G1 ,
"--k",
alpha = 0.5,
label = "Gauss component 1 "+str(np.round(variable[:3],2) ) )
plt.plot(XN , G2 ,
"--k", alpha = 0.5,
label = "Gauss component 2 "+str(np.round(variable[3:],4) ) )
plt.plot(DATAX , DATAY, ".", label = "DATA")
plt.plot(XN , YNv , "-", label = "Combined Gauss fit")
plt.legend(loc = "lower left")
#plt.xlim(1180,1290)
plt.show()
From the plot we can identify the shortcomings of our analysis (a residual check is sketched after the list):
- not enough data points
- S/N too low
- insufficient complexity of our self-generated fitting function.
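One simple way to quantify these points is to inspect the residuals of the combined fit; a minimal sketch:
### residuals of the combined fit, evaluated at the data points
res = DATAY - gauss_2(DATAX, *variable)
print("RMS of residuals:", res.std())
plt.figure(figsize = (12,3))
plt.plot(DATAX, res, ".")
plt.axhline(0., color = "k", alpha = 0.5)
plt.show()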
We will further analyse the spectrum next week, using some higher-level functions. In preparation for next week, you can check out the following links: