22
33This repository provides a *minimal* template for research projects in Python.
44It includes a basic structure for organizing code, results, and documentation,
5- as well as good practices for software engineering such as version control,
6- testing, and continuous integration. The template is designed to be lightweight
7- such that it can be easily adapted to different research workflows.
5+ and demonstrates how to incorporate good practices in software engineering such
6+ as testing and continuous integration. The template is designed to be
7+ lightweight such that it can be easily adapted to your specific needs.
88
99**Motivation:** Research projects often grow organically: Experiments
1010accumulate, code gets copied across notebooks, and results become difficult to
@@ -23,16 +23,19 @@ questions or suggestions for improvement, please open an issue on GitHub.
2323This template implements a minimal set of practices that help keep research
2424code organized and reliable.
2525
26- - **Version Control:** Version control allows us to track the history of a
26+ - **Version Control:** Version control allows you to track the history of a
2727 project. Every change to the codebase is recorded and can be inspected or
28- reverted later. This template uses *git* for version control. It allows you
29- to track the evolution of your code, revert to earlier versions, and
30- collaborate with others.
31-
28+ reverted later. It also facilitates collaboration, as multiple people can
29+ work on different branches and merge their changes together. This
30+ repository is a GitHub template, which means that when you create a new
31+ repository based on this template, it will already be set up with *git* for
32+ version control.
33+
3234- **Testing:** Automated tests help ensure that your code behaves as expected.
3335 For example, if you know the expected output of a function for certain
3436 inputs, you can write a test that verifies this behavior automatically.
35- This template uses the `pytest` package for running tests.
37+ This template uses the [`pytest`
38+ package](https://docs.pytest.org/en/stable/) for running tests.
3639
3740- **Continuous Integration:** Whenever code changes are made, it is useful to
3841 automatically check whether everything still works. This process is called
@@ -43,17 +46,19 @@ code organized and reliable.
4346 ensures that all required dependencies are *explicitly* listed in the
4447 `project_template_env.yml` file, which is important for reproducibility.
4548
46- - **Clean Code:** Writing clean and readable code is always beneficial. Some
49+ - **Clean Code:** Writing clean and readable code is always advisable. Some
4750 useful principles include:
4851
4952 - Use meaningful names for variables, functions, etc.
50- - Avoid duplicating code.
53+ - Avoid duplicating code. If you find yourself copying and pasting code,
54+ the alarm bells should be ringing in your head. It will often make sense
55+ to extract the common code into a reusable function or module.
5156 - Prefer simple solutions over complex ones.
5257 - Each code unit should only perform a single, clearly defined task (this
5358 is also known as the *Single Responsibility Principle*).
5459 - Write docstrings describing what your code does and what the inputs and
5560 outputs are. There are different conventions for docstrings that you can
56- use, e.g. *Google* or *NumPy* style. But even a simple one-line
61+ adopt, e.g. *Google* or *NumPy* style. But even a simple one-line
5762 description is better than no docstring at all.
5863
5964 These small habits make your code easier to understand for collaborators
@@ -66,35 +71,40 @@ code organized and reliable.
6671The repository is organized into four main folders.
6772
6873- **`source`**: The `source` folder contains *reusable code* that may be used
69- across multiple experiments. Code in this folder should therefore be
70- written in Python modules (not notebooks) and should include basic
71- documentation and tests. These files live in `source/our_library`. This
72- folder can be installed as a local Python package via `pip install -e .`.
73- After installation, functions can be imported in experiments like this:
74- `from our_library.some_functions import add_10`.
74+ across *multiple* experiments. Since we want to be able to import this
75+ code, it should be organized in Python files (not Jupyter notebooks). Also,
76+ since this code is meant to be reused, it should include basic
77+ documentation and tests. All files live in `source/our_library`. This
78+ folder can be installed as a local Python package via `pip install -e .`
79+ (details in section 3). After installation, functions can be imported in
80+ experiments like this: `from our_library.some_functions import add_10`.
7581
7682 The `source` folder also contains a `tests` directory for automated
7783 testing. A useful convention is to mirror the structure of the library and
78- the tests.
84+ the tests. For instance, if you have a file
85+ `source/our_library/some_functions.py`, you should implement the
86+ corresponding tests in `source/tests/test_some_functions.py`.
7987
8088- **`experiments`:** The `experiments` folder contains the *actual research
8189 experiments*. Each experiment has its own numbered folder, e.g.
8290 `experiments/01_first_experiment`. Experiments can be implemented using
8391 Jupyter notebooks or Python scripts. Separating reusable code (`source`)
8492 from experiment-specific code (`experiments`) helps keep projects
85- organized.
86-
87- A good rule of thumb: If you find yourself copying code between
93+ organized. A good rule of thumb: If you find yourself copying code between
8896 experiments, it should probably be moved to `source`.
8995
90- - **`results`:** The `results` folder stores outputs generated by experiments.
91- Its structure mirrors the `experiments` folder. Typical outputs include
92- plots, tables, trained models, or intermediate data. Because result files
93- can be large, the contents of the `results` folder are excluded from git
94- tracking by default. However, if your files are small, you can explicitly
95- track a subfolder by adding an exception such as
96- `!results/01_first_experiment/` to the `.gitignore` file. An alternative
97- for large files is to use Git LFS.
96+ - **`results`:** The `results` folder stores outputs generated by our
97+ experiments. Its structure mirrors the `experiments` folder. So, if you
98+ have an experiment in `experiments/01_first_experiment`, the corresponding
99+ results should be stored in `results/01_first_experiment`. This makes it
100+ easy to keep track of which results belong to which experiment.
101+
102+ Typical outputs include plots, tables, trained models, or intermediate
103+ data. Because result files can be large, the contents of the `results`
104+ folder are excluded from git tracking by default. However, if your files
105+ are small, you can explicitly track a subfolder by adding an exception such
106+ as `!results/01_first_experiment/` to the `.gitignore` file. An alternative
107+ for large files is to use [Git LFS](https://git-lfs.com/).
98108
99109- **`documentation`:** The `documentation` folder contains project
100110 documentation. Currently it includes a minimal LaTeX paper template in
@@ -117,8 +127,8 @@ installed on your local machine. To clone repositories via SSH, you should also
117127set up SSH keys for your GitHub account (cloning via HTTPS is also possible).
118128In addition, you need `conda` installed to create and manage the Python
119129environment used in this project. If you want to compile the LaTeX paper, you
120- will also need a LaTeX distribution such as *TeX Live* installed on your
121- machine.
130+ will also need a LaTeX distribution such as [*TeX
131+ Live*](https://www.tug.org/texlive/) installed on your machine.
122132
123133If this is the first time you are using this template, it is recommended to
124134work through the following steps:
@@ -169,27 +179,30 @@ work through the following steps:
169179 `our_library` folder; and `results` mirrors `experiments`. This is a useful
170180 convention that helps keep things organized.
171181
172- 4. **Run the Tests:** Take a look at the tests in `source/tests`. Note that
173- they use `@pytest.mark.parametrize`, which is a convenient way to run the
174- same test with multiple inputs. Run `pytest .` from the `source` directory
175- to see the automated tests in action. Feel free to extend `our_library` and
176- add additional tests.
177-
178- 5. **Trigger the GitHub Workflow:** Take a look at the GitHub workflow in
179- `.github/workflows`. This workflow is triggered automatically whenever you
180- push changes to GitHub (see the section beginning with `on`). It creates a
181- fresh environment, installs the dependencies, and runs the tests. You can
182- make a small change to the code, push it to GitHub, and then check the
183- results in the "Actions" tab of your GitHub repository.
182+ 4. **Run the Tests:** Take a look at `source/our_library/some_functions.py` and
183+ the corresponding tests in `source/tests/test_some_functions.py`. Note that
184+ the tests use `@pytest.mark.parametrize`, which is a convenient way to run
185+ the same test with multiple inputs. Run `pytest .` from the `source`
186+ directory to see the automated tests in action. Feel free to extend
187+ `our_library` and add additional tests.
188+
189+ 5. **Trigger the GitHub Workflow:** Take a look at the GitHub workflow
190+ `run_pytest.yml` in `.github/workflows`. This workflow is triggered
191+ automatically whenever you push changes to GitHub (see the section
192+ beginning with `on`). It creates a fresh environment, installs the
193+ dependencies, and runs the tests. You can make a small change to the code
194+ (e.g. by adding another test case to `test_some_functions.py`), push it to
195+ GitHub, and then check the results in the *Actions* tab of your GitHub
196+ repository.
184197
1851986. **Run the Experiments:** Take a look at the experiments in the `experiments`
186199 folder. Note how reusable code from `source/our_library` is imported and
187- how results are written to the `results` folder. Run the experiments to see
188- this in action.
200+ how results are written to the `results` folder. Run the first experiment
201+ to see this in action.
189202
1902037. **Compile the LaTeX Paper:** Try compiling the LaTeX paper in
191204 `documentation/paper`. The template demonstrates how figures from the
192205 `results` folder can be included in the paper.
193206
1942078. **Adapt and Extend the Template:** Adapt and extend the template as needed
195- for your own research project.
208+ for your own research project.
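
As a rough illustration of step 4, a parametrized test in the style of `source/tests/test_some_functions.py` might look like the sketch below. The diff does not show the real file contents, so the body of `add_10` (assumed here to add 10 to its argument) and the chosen test cases are assumptions. The function is defined inline only to keep the sketch self-contained; in the template it would instead be imported via `from our_library.some_functions import add_10`.

```python
import pytest


def add_10(x):
    """Return the input increased by 10 (assumed behavior, for illustration only)."""
    return x + 10


# Each tuple is one test case: pytest runs test_add_10 once per (value, expected) pair.
@pytest.mark.parametrize(
    ("value", "expected"),
    [(0, 10), (5, 15), (-10, 0)],
)
def test_add_10(value, expected):
    """Check that add_10 returns the expected result for the given input."""
    assert add_10(value) == expected
```

Saved as a `test_*.py` file, this would be collected by `pytest .` and reported as three separate test cases, one per parameter tuple.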