# typical python libraries import pandas as pd import matplotlib.pyplot as plt import numpy as np import pickle # vorbin package by M. Capellari from vorbin.voronoi_2d_binning import * # hoki sedfitting subpackage import hoki.sedfitting as hsed plt.style.use('../tuto.mplstyle')
Vornoi Binning and retrieveing SEDs of the binned data¶
In this jupyter notebook we will work with a cropped data cube of NGC 4993 (reduced by Joe Lyman) that is centered(-ish) on the galaxy and extends to comfortably contain the galaxy and the transient AT 2017gfo (top left in the image below).
In this notebook we will mostly use the voronoi binning algorithm of M. Cappellari (see website) and a few utilities I wrote and put in
hoki.sedfitting such as
mpdaf was used to create the cropped data cube. It is a pretty commonly used IFU data utility package. For further information see their documentation.
Loading the cropped data cube¶
First let’s take a look at the data cube:
with open('data/cropped_NGC4993.pkl', 'rb') as pickle_file: cube = pickle.load(pickle_file)
plt.figure() ima = cube.sum(axis=0) # to make the image we sum over the wavelength axis to get the total flux ima.plot(scale='arcsinh', colorbar='v') # this scaling with arcsinh was copied from mpdaf tutorials - it works
<matplotlib.image.AxesImage at 0x7fae1be6e7f0>
okay let’s take a look at what’s in this
cube object. First checking the shape is quite informative:
(3681, 150, 150)
so axis 0 is the wavelength, which is why we summed over axis=0 earlier, and then each side of the image is 150 pixels. The original images from MUSE are much larger but I cropped it to make it easier to upload and download (and also to not fill the RAM when working on the data!)
Now let’s look at one of the headers, the
XTENSION= 'IMAGE ' / Image extension BITPIX = -64 / array data type NAXIS = 3 / number of array dimensions NAXIS1 = 150 NAXIS2 = 150 NAXIS3 = 3681 PCOUNT = 0 / number of parameters GCOUNT = 1 / number of groups WCSAXES = 3 / Number of coordinate axes CRPIX1 = 53.754912632071 / Pixel coordinate of reference point CRPIX2 = 128.27776222996 / Pixel coordinate of reference point CUNIT1 = 'deg' / Units of coordinate increment and value CUNIT2 = 'deg' / Units of coordinate increment and value CTYPE1 = 'RA---TAN' / Right ascension, gnomonic projection CTYPE2 = 'DEC--TAN' / Declination, gnomonic projection CRVAL1 = 197.450333 / [deg] Coordinate value at reference point CRVAL2 = -23.38117 / [deg] Coordinate value at reference point LONPOLE = 180.0 / [deg] Native longitude of celestial pole LATPOLE = -23.38117 / [deg] Native latitude of celestial pole CSYER1 = 1.3680091695E-05 / [deg] Systematic error in coordinate CSYER2 = 7.77995869722E-06 / [deg] Systematic error in coordinate MJDREF = 0.0 / [d] MJD of fiducial time RADESYS = 'FK5' / Equatorial coordinate system EQUINOX = 2000.0 / [yr] Equinox of equatorial coordinates CD1_1 = -5.5555555555556E-05 CD1_2 = 0.0 CD2_1 = 0.0 CD2_2 = 5.5555555555556E-05 CRVAL3 = 4749.61767578125 / Coordinate value at reference point CRPIX3 = 1.0 / Pixel coordinate of reference point CUNIT3 = 'Angstrom' / Units of coordinate increment and value CTYPE3 = 'AWAV ' / Coordinate type code CD3_3 = 1.25 CD1_3 = 0.0 CD2_3 = 0.0 CD3_1 = 0.0 CD3_2 = 0.0 EXTNAME = 'DATA ' / This extension contains data values HDUCLASS= 'ESO ' / class name (ESO format) HDUDOC = 'DICD ' / document with class description HDUVERS = 'DICD version 6' / version number (according to spec v2.5.1) HDUCLAS1= 'IMAGE ' / Image data format HDUCLAS2= 'DATA ' / this extension contains the data itself ERRDATA = 'STAT ' / pointer to the variance extension OBJECT = 'SGRB (DATA)' BUNIT = '10**-20 Angstrom-1 cm-2 erg s-1' / data unit type COMMENT These data have been ZAPped! ZAPVERS = '1.1.dev workflow' / ZAP version ZAPZLVL = 'extSVD ' / ZAP zero level correction ZAPCLEAN= T / ZAP NaN cleaning performed for calculation ZAPCFTYP= 'weight ' / ZAP continuum filter type ZAPCFWID= 20 / ZAP continuum filter size ZAPNSEG = 11 / Number of segments used for ZAP SVD ZAPSEG0 = '0:520 ' / spectrum segment (pixels) ZAPNEV0 = 1 / number of eigenvals/spectra used ZAPSEG1 = '521:880 ' / spectrum segment (pixels) ZAPNEV1 = 6 / number of eigenvals/spectra used ZAPSEG2 = '881:1352' / spectrum segment (pixels) ZAPNEV2 = 16 / number of eigenvals/spectra used ZAPSEG3 = '1353:1600' / spectrum segment (pixels) ZAPNEV3 = 2 / number of eigenvals/spectra used ZAPSEG4 = '1601:1960' / spectrum segment (pixels) ZAPNEV4 = 1 / number of eigenvals/spectra used ZAPSEG5 = '1961:2360' / spectrum segment (pixels) ZAPNEV5 = 1 / number of eigenvals/spectra used ZAPSEG6 = '2361:2812' / spectrum segment (pixels) ZAPNEV6 = 11 / number of eigenvals/spectra used ZAPSEG7 = '2813:3081' / spectrum segment (pixels) ZAPNEV7 = 6 / number of eigenvals/spectra used ZAPSEG8 = '3082:3185' / spectrum segment (pixels) ZAPNEV8 = 1 / number of eigenvals/spectra used ZAPSEG9 = '3186:3620' / spectrum segment (pixels) ZAPNEV9 = 12 / number of eigenvals/spectra used ZAPSEG10= '3621:3680' / spectrum segment (pixels) ZAPNEV10= 4 / number of eigenvals/spectra used HISTORY Image was compressed by CFITSIO using scaled integer quantization: HISTORY q = 4.000000 / quantized level scaling parameter HISTORY 'SUBTRACTIVE_DITHER_1' / Pixel Quantization Algorithm CHECKSUM= 'bODodMAobMAobMAo' / HDU checksum updated 2021-07-23T23:34:48 DATASUM = '1413973769' / data unit checksum updated 2021-07-23T23:34:48
CRVAL3 is the first wavelength bin and
CD3_3 is the \(\Delta \lambda\) = 1.25 in this case. We’re going to need this in a minute.
In this particular situation Voronoi binning is creating tiles over the image in such a way that we homogenise the signal-to-noise-ratio across our bins. At the moment our bins are just pixels, and obviously the pixels on the outside of the image have much lower SNR than the ones in the middle of the galaxy.
What we’re going to do now is use the
vorbin algorithm to create tiles of adjacent pixels which, when summed together give use a spectrum with SNR close to our target SNR (and you get to pick what that target is).
Book keeping - creating our wavelength array¶
# The first thing we need to do is create the wavelength scale. We've only got a starting # wavelength, a Delta lambda and a number of bins in the spectrum (3861 - see above) # The code below creates the wavelength array # minimum and maximum wavelength from the header wl_min = cube.data_header['CRVAL3'] wl_max = cube.data_header['CRVAL3']+cube.data_header['CD3_3']*cube. data.shape # creating the wavelength array WL = np.arange(cube.data_header['CRVAL3'], wl_max, cube.data_header['CD3_3'])
# just to check our array has indeed got the right size WL.shape
Book keeping - creating the pixel coordinate, signal and noise arrays¶
# First we need to make lists of pixel Coordinates # note this will be stupid big with large data cubes # in that case find a smartter way to make these arrays # ... X is a 1D list of length 150**2 X =  for x in np.arange(0, cube.shape): X+=[x]*cube.shape x = np.array(X) # ... Y is a 1D list of length 150**2 Y= for y in np.arange(0,cube.shape): Y+=list(np.arange(0,cube.shape)) y = np.array(Y)
# Now we store the signal (spectrum) in its own array signal = ima.data.data.reshape(-1,1) # this reshape(-1,1) turns the 2D array in a 1D array.
# And now we store the errors in their own array noise = np.sqrt(ima.var.data.reshape(-1,1))
# pick a target signal to noise ratio - I picked 40 TARGET_SNR = 40
Calculating SNR in a line-free region¶
To compare SNR from spectrum to spectrum we need to chose an area of the spectrum that is devoid of strong lines across our whole data set and then calculate the SNR in that range.
# range of wavelength between which we calculate the SNR wlmin_snr = 5590 wlmax_snr= 5680 # Finding the indices corresponding to the min and max wavelengths of our range to calc SNR i_min = (np.abs(WL-wlmin_snr)).argmin() i_max = (np.abs(WL-wlmax_snr)).argmin()+1 # Calculating the SNR in that range signal = np.mean(cube.data[i_min:i_max,:,:], axis=0).reshape(-1,1) # again making it 1D with the reshape trick noise = np.sqrt(np.mean(cube.var[i_min:i_max,:,:], axis=0)).reshape(-1,1)
# Summoning vorbin!!! binNum, xNode, yNode, xBar, yBar, sn, nPixels, scale = voronoi_2d_binning(x, y, signal, noise, TARGET_SNR, plot=1, quiet=1, pixelsize=1, )
Binning the spectra¶
Now that we have voronoi bins and we know which pixel belongs in which bin, we can put our spectra together to caculate the integrated spectra of each bin. Let’s do this!
# first I create a dataframe that tells us which bin each (x,y) location belongs to bin_keys=pd.DataFrame(np.array([vorbin.X, vorbin.Y, vorbin.binNum]).T, columns=['X','Y','binNum']) bin_keys.head()
# we'll need to know the number of wavelength bins wl_len = cube.shape wl_len
Let’s bin the SEDs!¶
# list of spectra. By the end of the loop there should be binNum spectra corresponding to each vorbin spectra= # we loop over the number of voronoi bins for i in range(bin_keys.binNum.max()+1): # Y, X coordinates of all the pixels in the ith voronoi bin (binNum) coords_i=bin_keys[bin_keys.binNum==i][['X','Y']].values # instanciate a np array of the size of the SED filled with zeros spectrum = np.zeros(wl_len) # for each pixel in this voronoi bin... for coord in coords_i: # ..add up all the fluxes spectrum+=cube.data.data[:,coord,coord] # ... then we add this spectrum to our list of spectra spectra.append(spectrum) # turn the list into an array spectra=np.array(spectra) #### QUALITY CONTROL # want to make sure the spectra don't have negative values or nan values spectra = np.where(spectra>0.0, spectra, 1e-15) # NO NEGATIVES spectra = np.where(spectra!=np.nan, spectra, 1e-15) # NO NAN
<ipython-input-18-ed04e47ac8b7>:26: RuntimeWarning: invalid value encountered in greater spectra = np.where(spectra>0.0, spectra, 1e-15) # NO NEGATIVES
# finally I like to stack the wavelength array and the spectra # so i=0 is the wavelength # and every row after that is a spectrum spectra=np.vstack([WL,spectra])
repeat for the standard deviations!¶
stds= # list of standard deviations for i in range(bin_keys.binNum.max()+1): # Y, X coordinates of all the pixels in the ith voronoi bin (binNum) coords_i=bin_keys[bin_keys.binNum==i][['X','Y']].values # instanciate a np array of the size of the spectrum filled with zeros std_i = np.zeros(wl_len) for coord in coords_i: # for each pixel in this bin... std_i+=cube.var.data[:,coord,coord] #add the variances - same as adding the std in quadrature # then we square root the whole thing... std_i = np.sqrt(std_i) # ... before adding it to our list ... stds.append(std_i) # ... and finally we stack our noise spectra together with the wavelengths as the frist row spectra_errs= np.vstack([WL,stds]) #### QUALITY CONTROL spectra_errs = np.where(spectra_errs!=np.nan, noises, 1e15) spectra_errs = np.where(spectra_errs>0.0, noises, 1e15)
np.save('data/vorbinned_spectra.npy', spectra) np.save('data/vorbinned_specerr.npy', spectra_errs)
To see how good a job we did in homogeneising the SNR we can plot it with a colour map. Here what we’re going to do is take the minimum of the SED/NOISE array for each bin:
min_snrs =  # list of the minimum SNRs for spec, std in zip(spectra, spectra_errs): snr = spec/std # calculate SNR for each bin min_snrs.append(min(snr)) # append the minimum min_snrs = np.array(min_snrs) # make it an array
# At this point we have to flip x and y in the array so things plot the right way around # not sure if I messed up earlier or if its a "feature" pd.DataFrame(np.array([y,x, sn[binNum], binNum]).T, columns=['x', 'y', 'sn', 'binNum']).to_csv('data/voronoi_bins_NEW.txt', index=False)
fig, ax = plt.subplots(1,1) # note that here we have to flip the y and x axes to have the plot come out right # that is why we took that weird step above before saving the results. # in future analysis when loading that data frame x and y will be the right way around sn_color = hsed.plot_voronoi(y,x, sn[binNum], pixelsize=1, ax=ax, cmap='rainbow') fig.colorbar(sn_color) ax.set_title('SNR')
Text(0.5, 1.0, 'SNR')
Our SNR map looks decent! obviously the center glows well above our threshold value because galaxies are bright. We notice a sort of ring of blue (lower SNR) around the bright region ad we transition to larger bins: that is because at this point the algorithm is between a rock and a hard place: the pixel itself is on the cusp of meeting the SNR threshold but combining two of them makes it crazy high compared to what we ask! That transitional phase between unbinned and binned pixels is expected and you might see it in your own images when you perform this exercise!
SED fitting integrated spectrum with Kvn
SED fitting all the bins with LordCommander