Generating Configuration Files

This section details how to generate the configuration files needed for preparing Otter autograders. Note that this step can be accomplished automatically if you’re using Otter Assign.

To use Otter to autograde an assignment, you must first generate a zip file that Otter will use to create a Docker image with which to grade submissions. Otter’s command line utility otter generate allows instructors to create this zip file from their machines.

Before Using Otter Generate

Before using Otter Generate, you should already have written tests for the assignment and collected extra requirements into a requirements.txt file (see here). (Note: the default requirements can be overwritten by your requirements by passing the --overwrite-requirements flag.)

Directory Structure

For the rest of this page, assume that we have the following directory structure:

├── data.csv
├── hidden-tests
│   ├──
│   └──  # etc.
├── hw00-sol.ipynb
├── hw00.ipynb
├── requirements.txt
├── tests
│   ├──
│   └──  # etc.

Also assume that the working directory is hw00-dev.


The general usage of otter generate is to create a zip file at some output directory (-o flag, default ./) which you will then use to create the grading image. Otter Generate has a few optional flags, described in the CLI Reference.

If you do not specify -t or -o, then the defaults will be used. If you do not specify -r, Otter looks in the working directory for requirements.txt and automatically adds it if found; if it is not found, then it is assumed there are no additional requirements. If you do not specify -e, Otter will use the default environment.yml. There is also an optional positional argument that goes at the end of the command, files, that is a list of any files that are required for the notebook to execute (e.g. data files, Python scripts). To autograde an R assignment, pass the -l r flag to indicate that the language of the assignment is R.

The simplest usage in our example would be

otter generate

This would create a zip file with the tests in ./tests and no extra requirements or files. If we needed data.csv in the notebook, our call would instead become

otter generate data.csv

Note that if we needed the requirements in requirements.txt, our call wouldn’t change, since Otter automatically found ./requirements.txt.

Now let’s say that we maintained two different directories of tests: tests with public versions of tests and hidden-tests with hidden versions. Because I want to grade with the hidden tests, my call then becomes

otter generate -t hidden-tests data.csv

Now let’s say that I need some functions defined in; then I would add this to the last part of my Otter Generate call:

otter generate -t hidden-tests data.csv

If this was instead an R assignment, I would run

otter generate -t hidden-tests -l r data.csv
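The calls above can also be assembled programmatically, for example from a course build script. The following is a minimal sketch, not part of Otter: build_generate_command is a hypothetical helper, and it uses only the -t and -l flags and the positional files argument described above.

```python
import shlex

def build_generate_command(tests_dir=None, lang=None, files=()):
    """Assemble an `otter generate` argument list (hypothetical helper)."""
    cmd = ["otter", "generate"]
    if tests_dir is not None:
        cmd += ["-t", tests_dir]   # directory of tests to grade with
    if lang is not None:
        cmd += ["-l", lang]        # e.g. "r" for an R assignment
    cmd += list(files)             # support files go last
    return cmd

cmd = build_generate_command(tests_dir="hidden-tests", lang="r", files=["data.csv"])
print(shlex.join(cmd))
```

The resulting list could then be passed to subprocess.run(cmd, check=True) from the hw00-dev directory, assuming otter is installed and on the PATH.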

Grading Configurations

There are several configurable behaviors that Otter supports during grading. Each has default values, but these can be configured by creating an Otter config JSON file and passing the path to this file to the -c flag (./otter_config.json is automatically added if found and -c is unspecified).

The supported keys and their default values are provided in, which is generated from the dictionary printed below for documenting each option.

score_threshold: null             # a score threshold for pass-fail assignments
points_possible: null             # a custom total score for the assignment; if unspecified the sum of question point values is used.
show_stdout: false                # whether to display the autograding process stdout to students on Gradescope
show_hidden: false                # whether to display the results of hidden tests to students on Gradescope
show_all_public: false            # whether to display all test results if all tests are public tests
seed: null                        # a random seed for intercell seeding
seed_variable: null               # a variable name to override with the seed
grade_from_log: false             # whether to re-assemble the student's environment from the log rather than by re-executing their submission
serialized_variables: {}          # a mapping of variable names to type strings for validating a deserialized student environment
pdf: false                        # whether to generate a PDF of the notebook when not using Gradescope auto-upload
token: null                       # a Gradescope token for uploading a PDF of the notebook
course_id: null                   # a Gradescope course ID for uploading a PDF of the notebook
assignment_id: null               # a Gradescope assignment ID for uploading a PDF of the notebook
filtering: false                  # whether the generated PDF should have cells filtered out
pagebreaks: false                 # whether the generated PDF should have pagebreaks between filtered sections
debug: false                      # whether to run the autograder in debug mode (without ignoring errors)
autograder_dir: /autograder       # the directory in which autograding is taking place
lang: python                      # the language of the assignment; one of {'python', 'r'}
miniconda_path: /root/miniconda3  # the path to the miniconda install directory
plugins: []                       # a list of plugin names and configuration details for grading
logo: true                        # whether to print the Otter logo to stdout
print_summary: true               # whether to print the grading summary
print_score: true                 # whether to print out the submission score in the grading summary
zips: false                       # whether zip files are being graded
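As a sketch of how such a config file might be produced, the snippet below writes a few of the keys listed above to a file named otter_config.json. The chosen values are illustrative only, write_config is a hypothetical helper, and a temporary directory stands in for the assignment directory.

```python
import json
import tempfile
from pathlib import Path

# Illustrative overrides; any key that is left out keeps its default.
config = {
    "seed": 42,
    "show_stdout": True,
    "points_possible": 3,
}

def write_config(directory, config):
    """Write config as JSON to <directory>/otter_config.json (hypothetical helper)."""
    path = Path(directory) / "otter_config.json"
    path.write_text(json.dumps(config, indent=4) + "\n")
    return path

with tempfile.TemporaryDirectory() as tmp:
    path = write_config(tmp, config)
    loaded = json.loads(path.read_text())  # read it back to confirm it is valid JSON
    print(path.name, loaded)
```

In practice the file would be written to the working directory, where it is picked up automatically, or to any other path passed with -c.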

Grading with Environments

Otter can grade assignments using saved environments in the log in the Gradescope container. This behavior is not supported for R assignments. This works by deserializing the environment stored in each check entry of Otter’s log and grading against it. The notebook is parsed and only its import statements are executed. For more information about saving and using environments, see Logging.

To configure this behavior, two things are required:

  • the use of the grade_from_log key in your config JSON file

  • providing students with an Otter configuration file that has save_environments set to true

This will tell Otter to shelve the global environment each time a student calls Notebook.check (pruning the environments of old calls each time it is called on the same question). When the assignment is exported using Notebook.export, the log file (at ./.OTTER_LOG) is also exported with the global environments. These environments are loaded in the Gradescope container and are then used for grading. Because one environment is saved for each check call, each question is graded using the global environment at the time it was checked, which avoids variable name collisions. Note that any requirements needed for execution need to be installed in the Gradescope container, because Otter’s shelving mechanism does not store module objects.
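The idea of one saved environment per checked question can be illustrated with a toy model. This is only a sketch under stated assumptions, not Otter's implementation: snapshot, check, and log are hypothetical names, and the standard pickle module stands in for whatever serialization Otter uses.

```python
import math
import pickle
import types

def snapshot(env):
    """Pickle every non-module value in env that can be pickled."""
    saved = {}
    for name, value in env.items():
        if isinstance(value, types.ModuleType):
            continue  # modules are not stored, so they must exist at grading time
        try:
            saved[name] = pickle.dumps(value)
        except Exception:
            continue  # skip anything else that cannot be serialized
    return saved

log = {}  # question name -> most recent snapshot

def check(question, env):
    log[question] = snapshot(env)  # a newer check replaces the older one

check("q1", {"math": math, "x": 1})
check("q2", {"math": math, "x": 5})
check("q1", {"math": math, "x": 2})  # prunes the first q1 snapshot

restored = {q: {k: pickle.loads(v) for k, v in s.items()} for q, s in log.items()}
print(restored)  # {'q1': {'x': 2}, 'q2': {'x': 5}}
```

Each question keeps its own value of x, which is why a later reassignment of the same variable name does not affect how an earlier question is graded.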

Autosubmission of Notebook PDFs

Otter Generate allows instructors to automatically generate PDFs of students’ notebooks and upload these as submissions to a separate Gradescope assignment. This requires a Gradescope token, for which you will be prompted to enter your Gradescope account credentials. Otter Generate also needs the course ID and assignment ID of the assignment to which PDFs should be submitted. This information can be gathered from the assignment URL on Gradescope:{COURSE ID}/assignments/{ASSIGNMENT ID}

To configure this behavior, set the course_id and assignment_id configurations in your config JSON file. When Otter Generate is run, you will be prompted to enter your Gradescope email and password. Alternatively, you can provide these via the command-line with the --username and --password flags, respectively:

otter generate --username --password thisisnotasecurepassword

Currently, this action supports HTML comment filtering with pagebreaks, but these can be disabled with the filtering and pagebreaks keys of your config.
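As a small convenience, the two IDs can be pulled out of an assignment URL with a regular expression that matches the {COURSE ID}/assignments/{ASSIGNMENT ID} pattern. This is a sketch: parse_gradescope_ids is a hypothetical helper, the sample string and its numbers are made up, and returning the IDs as strings is an assumption.

```python
import re

def parse_gradescope_ids(url):
    """Extract course and assignment IDs from an assignment URL (hypothetical helper)."""
    match = re.search(r"(\d+)/assignments/(\d+)", url)
    if match is None:
        raise ValueError(f"no course/assignment IDs found in {url!r}")
    return {"course_id": match.group(1), "assignment_id": match.group(2)}

# Made-up IDs, shown only to illustrate the pattern.
ids = parse_gradescope_ids("12345/assignments/67890")
print(ids)  # {'course_id': '12345', 'assignment_id': '67890'}
```

The resulting dictionary could be merged into the config JSON file alongside any other keys.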

Pass/Fail Thresholds

The configuration generator supports providing a pass/fail threshold. A threshold is passed as a float between 0 and 1 such that if a student receives at least that percentage of points, they will receive full points as their grade and 0 points otherwise.

The threshold is specified with the score_threshold key:

    "score_threshold": 0.25

For example, if a student passes a 2- and 1- point test but fails a 4-point test (a 43%) on a 25% threshold, they will get all 7 points. If they only pass the 1-point test (a 14%), they will get 0 points.
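The arithmetic in this example can be checked with a short sketch. apply_threshold is a hypothetical function written for illustration and is not part of Otter.

```python
def apply_threshold(earned, possible, threshold):
    """Return full credit if earned/possible meets the threshold, else 0."""
    if possible <= 0:
        raise ValueError("possible must be positive")
    return possible if earned / possible >= threshold else 0

possible = 2 + 1 + 4  # three tests worth 2, 1, and 4 points

print(apply_threshold(2 + 1, possible, 0.25))  # 3/7 is about 43%, so 7
print(apply_threshold(1, possible, 0.25))      # 1/7 is about 14%, so 0
```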

Overriding Points Possible

By default, the number of points possible on Gradescope is the sum of the point values of each test. This value can be overridden, however, to some other value using the points_possible key, which accepts an integer. Then the number of points awarded will be the provided points value scaled by the percentage of points awarded by the autograder.

For example, if a student passes a 2- and 1- point test but fails a 4-point test, they will receive (2 + 1) / (2 + 1 + 4) * 2 = 0.8571 points out of a possible 2 when points_possible is set to 2.

As an example, the configuration below scales the number of points to 3:

    "points_possible": 3
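The scaling described in this section amounts to a single multiplication. The sketch below reproduces the worked example with points_possible set to 2; scale_points is a hypothetical function, not part of Otter.

```python
def scale_points(earned, possible, points_possible):
    """Scale the fraction of points earned to a custom total."""
    return earned / possible * points_possible

score = scale_points(2 + 1, 2 + 1 + 4, 2)
print(round(score, 4))  # 0.8571
```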

Intercell Seeding

The autograder supports intercell seeding with the use of the seed key. This behavior is not supported for Rmd and R script assignments, but is supported for R Jupyter notebooks. Passing it an integer will cause the autograder to seed NumPy and Python’s random library or call set.seed in R between every pair of code cells. This is useful for writing deterministic hidden tests. More information about Otter seeding can be found here. As an example, you can set an intercell seed of 42 with

    "seed": 42
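To see why reseeding between cells makes hidden tests deterministic, consider the toy model below. It is an illustration of the idea only, not Otter's implementation: run_cells is a hypothetical function, each "cell" is a plain Python callable, and only the standard library random module is seeded.

```python
import random

def run_cells(cells, seed):
    """Run each cell in order, reseeding the RNG before every cell."""
    results = []
    for cell in cells:
        random.seed(seed)  # reset the RNG state before the cell runs
        results.append(cell())
    return results

cells = [
    lambda: random.randint(0, 100),               # a cell the student may edit
    lambda: [random.random() for _ in range(3)],  # the cell being tested
]

with_first_cell = run_cells(cells, seed=42)
without_first_cell = run_cells([cells[1]], seed=42)

# The tested cell sees the same random numbers in both runs.
print(with_first_cell[1] == without_first_cell[0])  # True
```

Because the generator is reset before every cell, the random draws in one cell do not depend on how many draws earlier cells made.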

Showing Autograder Results

The generator allows instructors to specify whether or not the stdout of the grading process (anything printed to the console by the grader or the notebook) is shown to students. The stdout includes a summary of the student’s test results, including the points earned and possible of public and hidden tests, as well as the visibility of tests as indicated by test["hidden"].

This behavior is turned off by default and can be turned on by setting show_stdout to true.

    "show_stdout": true

If show_stdout is passed, the stdout will be made available to students only after grades are published on Gradescope. The same can be done for hidden test outputs using the show_hidden key.

The Grading on Gradescope section details more about how output on Gradescope is formatted. Note that this behavior has no effect on any platform besides Gradescope.


To use plugins during grading, list the plugins and their configurations in the plugins key of your config. If a plugin requires no configurations, it should be listed as a string. If it does, it should be listed as a dictionary with a single key, the plugin name, that maps to a subdictionary of configurations.

    "plugins": [
        {"somepackage.SomeOtherOtterPlugin": {
            "some_config_key": "some_config_value"
        }}
    ]

For more information about Plugins, see here.
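The two accepted shapes of a plugin entry (a string, or a single-key dictionary) can be checked before the config is written. This is a minimal sketch: plugin_name is a hypothetical helper, and somepackage.PluginWithoutConfig is a made-up name used only to show the string form.

```python
def plugin_name(entry):
    """Return the plugin name for a plugins-list entry, or raise if malformed."""
    if isinstance(entry, str):
        return entry  # plugin with no configurations
    if isinstance(entry, dict) and len(entry) == 1:
        name, options = next(iter(entry.items()))
        if isinstance(options, dict):
            return name  # plugin name mapped to its configurations
    raise ValueError(f"malformed plugin entry: {entry!r}")

plugins = [
    "somepackage.PluginWithoutConfig",  # made-up name, string form
    {"somepackage.SomeOtherOtterPlugin": {"some_config_key": "some_config_value"}},
]

names = [plugin_name(entry) for entry in plugins]
print(names)
```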

Generating with Otter Assign

Otter Assign comes with an option to generate this zip file automatically when the distribution notebooks are created via the generate key of the assignment metadata. See Creating Assignments for more details.