{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Model Data Compiler\n", "---\n", "\n", "Download all the Jupyter notebooks from: https://github.com/HeloiseS/hoki/tree/master/tutorials\n", "\n", "# Initial Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from hoki.constants import MODELS_PATH, OUTPUTS_PATH, DEFAULT_BPASS_VERSION\n", "from time import time\n", "from hoki.data_compilers import ModelDataCompiler" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction\n", "\n", "The BPASS outputs contain a lot of information that is pretty much ready for analysis and can be quickly loaded through `hoki` (e.g. HR diagrams, star counts, etc.).\n", "\n", "But if you're looking to match your observation to specific stellar models present in BPASS, say a binary system with a primary ZAMS mass of 9 $M_{\\odot}$ and absolute F814W magnitude = X, then you will need to explore the large library of BPASS stellar models to get those evolution tracks.\n", "\n", "The best way to go about that is to compile all the relevant data (i.e. the IMFs and metallicities you want, for single or binary models) into one or multiple DataFrames that can then be saved into binary files.\n", "**Loading data from binary files is much faster**, and it means you won't have to compile your data from text files multiple times. We'll go over searching through the DataFrames in the notebook called \"Model Search\". Here we focus on using the `ModelDataCompiler`.\n", "\n", "The class `ModelDataCompiler` is pretty much a pipeline: given the relevant parameters (which we'll see in a minute), it will locate the BPASS input files, read them, then fetch the BPASS stellar models one by one and combine all of this information in one single DataFrame. 
It can then be pickled using existing `pandas` functionality.\n", "\n", "Here is a visual summary:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download the Data\n", "\n", "To run `ModelDataCompiler` you need two things:\n", "- An input file with a name like `input_bpass_z020_bin_imf135_300`, which is found in the BPASS output files (folder names like `bpass_v2.2.1_imf135_300`)\n", "- 50 GB of text files containing the BPASS stellar model library, parameters, etc. The most recent ones are in `bpass-v2.2-newmodels.tar.gz` in the [Google Drive](https://drive.google.com/drive/folders/1BS2w9hpdaJeul6-YtZum--F4gxWIPYXl).\n", "\n", "The input files are, in essence, the recipe, and the 50 GB of text files are the ingredients. We need to read the input file to know which stars to put in, in what quantity (IMF) and at what time (if there are mergers, etc.). That is all done by the pipeline ;)\n", "\n", "### 4 Easy Steps\n", "Then follow these steps to run your `ModelDataCompiler` pipeline:\n", "- 1) Create a list of desired metallicities\n", "- 2) Create a list of desired \"dummy\" array columns\n", "- 3) Ensure the paths to the \"model outputs\" (which contain the inputs) and to the BPASS stellar models are correct\n", "- 4) Run `ModelDataCompiler`\n", "\n", "# Running the pipeline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Step 1\n", "# List the metallicities that you want to see in your DataFrame (same format as all BPASS metallicities)\n", "metallicity_list = ['z020']" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Step 2\n", "# List the columns you want (TODO: the full list of available columns needs to go on a webpage or Read the Docs)\n", "cols = ['age', 'M1', 'f814w']" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"*************************************************\n", "******* YOUR DATA IS BEING COMPILED ******\n", "*************************************************\n", "\n", "\n", "This may take a while ;)\n", "Go get yourself a cup of tea, sit back and relax\n", "I'm working for you boo!\n", "\n", "NOTE: The progress bar doesn't move smoothly - it might accelerate or slow down - it'dc perfectly normal :D\n", " |███████████████████████████████████████████████████████████████████████████████████████████████████-| 99.99% \n", "\n", "\n", "*************************************************\n", "******* JOB DONE! HAPPY SCIENCING! ******\n", "*************************************************\n", "This took 6.3 minutes\n" ] } ], "source": [ "# Steps 3 and 4\n", "# Pass the paths and run the ModelDataCompiler pipeline\n", "start = time()\n", "myfirstcompiler = ModelDataCompiler(z_list=metallicity_list,\n", "                                    columns=cols,\n", "                                    # Note: The following are default parameters written explicitly for the\n", "                                    # purposes of the tutorial\n", "                                    binary=True, single=False,\n", "                                    models_path=MODELS_PATH, input_files_path=OUTPUTS_PATH,\n", "                                    bpass_version=DEFAULT_BPASS_VERSION, verbose=True)\n", "\n", "print(f\"This took {round((time()-start)/60,2)} minutes\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `models_path`, `input_files_path`, `bpass_version`\n", "\n", "- `models_path` is the ABSOLUTE PATH to the top folder of the BPASS stellar models.\n", "\n", "- `input_files_path` is the ABSOLUTE PATH to the **folder** containing the input files (with names like `input_bpass_z020_bin_imf135_300`). `ModelDataCompiler` will find the right input files based on the other parameters you provided.\n", "\n", "- `bpass_version` is a **str** that indicates which BPASS version your stellar models are: valid options are `v221` and `v222`. Unless you **know** that you have `v222`, you're probably using `v221` and you can just use the `DEFAULT_BPASS_VERSION` (see below). 
\n", "\n", "\n", "### `MODELS_PATH`, `OUTPUTS_PATH`, `DEFAULT_BPASS_VERSION`\n", "\n", "All of these are `hoki` constants (they are in capital letters to make them stand out). Here is what they do and how you can update them if you want:\n", "\n", "---\n", "\n", "**`MODELS_PATH`**\n", "\n", "This is the location of the top folder containing the BPASS stellar models (the orange folder in the cartoon above). Mine is set to:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'/home/fste075/BPASS_hoki_dev/bpass-v2.2-newmodels/'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "MODELS_PATH" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This location is held in `hoki.data.settings.yaml` and can be changed by calling:\n", "\n", "`hoki.constants.set_models_path(path=[absolute path to the models])`\n", "\n", "Note that if you do this you will have to reload your Jupyter notebook for it to work. Alternatively, just set the parameter `models_path` in `ModelDataCompiler` to the right path.\n", "\n", "---\n", "**`OUTPUTS_PATH`**\n", "\n", "Same concept, but this is the default absolute path to the BPASS outputs, which contain HRDs, stellar numbers, ionizing flux information, etc., **including the input files**. 
In my case I haven't moved the input files out of the outputs folder, which is why I'm using this default.\n", "\n", "Mine is set to:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'/home/fste075/BPASS_hoki_dev/bpass_v2.2.1_imf135_300/'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "OUTPUTS_PATH" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This location is also held in `hoki.data.settings.yaml` and can be changed by calling:\n", "\n", "`hoki.constants.set_outputs_path(path=[absolute path to the outputs])`\n", "\n", "Note that if you do this you will have to reload your Jupyter notebook for it to work.\n", "\n", "---\n", "\n", "**`DEFAULT_BPASS_VERSION`**\n", "\n", "This is also found in the `settings.yaml` file and is (for now) set to `v221` by default. Unless you know that you have `v222`, don't touch it. If you do want to change it though, just use `hoki.constants.set_default_bpass_version([vXYZ])`\n", "\n", "---\n", "\n", "Anyway, back to the data...\n", "\n", "\n", "# Accessing the data\n", "\n", "That's easy! Just do:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | age | \n", "M1 | \n", "f814w | \n", "filenames | \n", "model_imf | \n", "types | \n", "mixed_imf | \n", "mixed_age | \n", "initial_BH | \n", "initial_P | \n", "z | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0.000000e+00 | \n", "65.00000 | \n", "-5.621502 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
1 | \n", "1.635020e+03 | \n", "64.99532 | \n", "-5.585636 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2 | \n", "2.163294e+03 | \n", "64.99379 | \n", "-5.569154 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
3 | \n", "2.611867e+03 | \n", "64.99247 | \n", "-5.552240 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
4 | \n", "3.002335e+03 | \n", "64.99132 | \n", "-5.535085 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
2674010 | \n", "9.833045e+10 | \n", "0.42280 | \n", "15.582510 | \n", "NEWSINMODS/z020/sneplot-z020-0.6 | \n", "43.7266 | \n", "3 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2674011 | \n", "9.865972e+10 | \n", "0.42280 | \n", "15.638750 | \n", "NEWSINMODS/z020/sneplot-z020-0.6 | \n", "43.7266 | \n", "3 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2674012 | \n", "9.899617e+10 | \n", "0.42280 | \n", "15.696310 | \n", "NEWSINMODS/z020/sneplot-z020-0.6 | \n", "43.7266 | \n", "3 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2674013 | \n", "9.933917e+10 | \n", "0.42280 | \n", "15.755130 | \n", "NEWSINMODS/z020/sneplot-z020-0.6 | \n", "43.7266 | \n", "3 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2674014 | \n", "9.968715e+10 | \n", "0.42280 | \n", "15.815090 | \n", "NEWSINMODS/z020/sneplot-z020-0.6 | \n", "43.7266 | \n", "3 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2674015 rows × 11 columns
\n", "\n", " | age | \n", "M1 | \n", "f814w | \n", "filenames | \n", "model_imf | \n", "types | \n", "mixed_imf | \n", "mixed_age | \n", "initial_BH | \n", "initial_P | \n", "z | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0.000 | \n", "65.00000 | \n", "-5.621502 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
1 | \n", "1635.020 | \n", "64.99532 | \n", "-5.585636 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
2 | \n", "2163.294 | \n", "64.99379 | \n", "-5.569154 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
3 | \n", "2611.867 | \n", "64.99247 | \n", "-5.552240 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "
4 | \n", "3002.335 | \n", "64.99132 | \n", "-5.535085 | \n", "NEWSINMODS/z020/sneplot-z020-65 | \n", "0.0778658 | \n", "-1 | \n", "0 | \n", "0 | \n", "NaN | \n", "NaN | \n", "020 | \n", "