This article explains what is needed for machine learning (ML) questions to be graded automatically on HackerEarth and how the evaluation process works.
Prerequisites for automatic evaluation
- Candidate's results.csv file
- Test case (ground truth values of the test data)
- Checker file
Generating the results.csv file for a HackerEarth assessment
When candidates start the assessment, they do the following to generate the results.csv file on their local machine:
- Download the dataset (both the train.csv and test.csv files).
- Build the required model.
- Train the model on the data in the train.csv file.
- Once the model is trained, feed the data from the test.csv file into it to generate the results.csv file, which contains the predictions (as shown in the sketch below).
- Upload this results.csv file to the HackerEarth ML platform.

Important: The candidate does not have access to the ground truth values (that is, the correct or "true" answers) of the test.csv file.
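A minimal sketch of this workflow follows, assuming a regression task with an id column and a target column named target; the column names and the RandomForestRegressor model are illustrative assumptions, not something HackerEarth prescribes.

```python
# Sketch of the candidate-side workflow (column names and model are assumptions).
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

train = pd.read_csv("train.csv")  # features plus ground truth labels
test = pd.read_csv("test.csv")    # features only, no ground truth

X_train = train.drop(columns=["id", "target"])
y_train = train["target"]

model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)

# Predict on the test features and write the submission file.
predictions = model.predict(test.drop(columns=["id"]))
pd.DataFrame({"id": test["id"], "target": predictions}).to_csv(
    "results.csv", index=False
)
```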
How is the candidate's submission evaluated?
Every ML question in the HackerEarth library has a checker file. This file allows the HackerEarth platform to automatically evaluate a candidate's submission (the results.csv file) and generate a score. Specifics about the checker file are as follows:
- The checker file is a piece of code that contains the following:
  - Checkpoints, which verify that the submission is in the required .csv format and that it covers all data points in the test set
  - The evaluation metric (for example, r2_score)
When a candidate uploads their results.csv file to the platform, the checker file does the following:
- Compares the uploaded file with the ground truth values, which are added to the platform as a test case.
- Generates a score based on the evaluation metric defined in the checker file (see the sketch below).
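The following is a minimal sketch of what such a checker might look like, assuming an id key column, a target prediction column, r2_score as the metric, and a max_score scaling parameter. All of these vary per question and are assumptions here, not HackerEarth's actual checker code.

```python
# Sketch of a checker file (column names, metric, and scoring are assumptions).
import pandas as pd
from sklearn.metrics import r2_score

def evaluate(submission_path, ground_truth_path, max_score=100):
    submission = pd.read_csv(submission_path)  # candidate's results.csv
    truth = pd.read_csv(ground_truth_path)     # test case stored on the platform

    # Checkpoint 1: the required columns are present.
    if not {"id", "target"}.issubset(submission.columns):
        return 0
    # Checkpoint 2: every data point in the test set has a prediction.
    if set(submission["id"]) != set(truth["id"]):
        return 0

    # Align rows by id, then apply the evaluation metric.
    merged = truth.merge(submission, on="id", suffixes=("_true", "_pred"))
    metric = r2_score(merged["target_true"], merged["target_pred"])
    # Scale the metric (clipped at 0) to the question's maximum score.
    return max(metric, 0) * max_score
```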
When you want to change the score for a machine learning question, update it both on the UI and in the checker file; otherwise, submissions are not evaluated correctly.
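In the sketch above, for example, the maximum score appears as the hypothetical max_score parameter; changing the score on the UI without updating the corresponding value in the checker file would leave the two out of sync.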