Creating a custom Data Science question
- Log in to HackerEarth Assessment using admin credentials.
- Click Add question in the Overview section.
- Click Create a new question.
- Click Data Science in the Coding section.
Template
The template contains the following sections:
S. no. | Section name | Description |
1 | Description | This section describes the problem that you want to create. You are required to add details about the problem statement, difficulty level, etc. |
2 | Data & test cases | This section contains the datasets, test cases, etc. |
3 | Languages | This section contains the languages that you can enable for a candidate to solve the question. |
4 | Editorial | This section can contain the approach to solve the problem. This is optional. |
The Description section
Creating a problem statement
The question-creation template is displayed on your screen with the following fields:
- Problem Name
- Problem Statement
- Difficulty level
- Maximum Score
- Tags
1. Add the name or title of your question in the Problem Name field.
2. In the Problem Statement field, add the problem statement that you want the candidate to solve. A good problem statement has the following:
- Problem statement: The task that a candidate must solve.
- Data description: A description of the dataset that is provided to solve the problem.
- Submission criteria: The formats and other criteria that a candidate must follow while making a submission.
- Evaluation criteria: The scoring metric that is used for evaluating a submission.
3. Set the complexity of your question from the Difficulty level list.
4. Add the required tags in the Tags field.
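To make the Submission criteria and Evaluation criteria items more concrete, the sketch below scores a submission against the expected output. It is only an illustration under stated assumptions: the column names id and target, the choice of accuracy as the metric, and the scaling to a maximum score of 100 are hypothetical and do not represent HackerEarth's checker format.

```python
import csv
import io


def read_predictions(csv_text, id_col="id", target_col="target"):
    """Parse CSV text into an {id: label} dict. Column names are hypothetical."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_col]: row[target_col] for row in reader}


def accuracy(expected, submitted):
    """Fraction of expected ids whose submitted label matches exactly.

    Ids that are missing from the submission count as wrong.
    """
    if not expected:
        raise ValueError("expected output is empty")
    correct = sum(
        1 for key, label in expected.items() if submitted.get(key) == label
    )
    return correct / len(expected)


def scaled_score(expected_csv, submitted_csv, max_score=100):
    """Scale accuracy to the question's maximum score (assumed to be 100)."""
    expected = read_predictions(expected_csv)
    submitted = read_predictions(submitted_csv)
    return round(max_score * accuracy(expected, submitted), 2)
```

Stating the metric and the expected columns this explicitly in the problem statement lets candidates check their submission format before they submit.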
The Data & test cases section
Adding dataset and test cases
S. no. | Label name | Description |
1 | Sample Data Set | A sample data set is a subset of the full data set. |
2 | Expected Output | The sample expected output file contains the true values for the sample test data that is present in the sample data set. |
3 | Full Data Set | The full data set is the complete data set that is used to train and test the model that solves the problem given in the problem statement. |
4 | Checker | A checker automatically evaluates a submission and generates a score for it. The submission contains the candidate's predictions for the test data. Click Add Checker to display the fields for uploading the required files. To learn about partial scoring and multiple test cases, refer to this link. Once you have uploaded the required files, click Upload. |
5 | Time Limit | The evaluation of your submission file during COMPILE & TEST and SUBMIT is limited to a specific amount of time. If your code exceeds the specified limit, you will see the time limit exceeded (TLE) error on the screen. Note: The maximum time limit is 300 seconds. The time limit is multiplied by a language-specific time factor. For example, in Python, this factor is 5. If you have set the time limit as 300 seconds, then the total time is 300 * 5 = 1500 seconds, that is, 25 minutes. Similarly, in R, the factor is 1.5. |
6 | Memory Limit | The evaluation of your submission file during COMPILE & TEST and SUBMIT is limited to a specific amount of memory. If your code exceeds the specified limit, you will see the memory limit exceeded (MLE) error on the screen. Note: The maximum memory limit is 3072 MB. |
7 | Code Snippet | Code snippets are boilerplate code that the candidate can use as a reference to solve the question. You can add instructions (in comments format) or provide comments to import required libraries. |
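The time-factor arithmetic from the Time Limit row can be sketched as follows. The function and constant names are hypothetical, and only the two factors stated in the table (5 for Python, 1.5 for R) are included. Factors for other languages are not given here, so the sketch rejects them instead of guessing.

```python
MAX_TIME_LIMIT_SECONDS = 300  # maximum time limit stated in the table

# Time factors stated in the Time Limit row; no other languages are listed.
TIME_FACTORS = {"Python": 5, "R": 1.5}


def effective_time_limit(time_limit_seconds, language):
    """Return the total evaluation time in seconds for a configured limit."""
    if not 0 < time_limit_seconds <= MAX_TIME_LIMIT_SECONDS:
        raise ValueError(
            f"time limit must be between 1 and {MAX_TIME_LIMIT_SECONDS} seconds"
        )
    if language not in TIME_FACTORS:
        raise ValueError(f"no time factor is known for {language!r}")
    return time_limit_seconds * TIME_FACTORS[language]


python_total = effective_time_limit(300, "Python")  # 1500 seconds, 25 minutes
r_total = effective_time_limit(300, "R")            # 450.0 seconds, 7.5 minutes
```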
Notes
- The data set must be uploaded as a .zip file. The maximum file size is 30 MB.
- The dataset folder should contain the following files:
  - train.csv: Data set that candidates use to train their models
  - test.csv: Data set that candidates use to predict an outcome
  - sample_submission.csv: Format that candidates should follow to create their submission file
- The test data, train data, and sample submission must be .csv files.
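Before uploading, the points in these notes can be checked locally. The following is a minimal sketch, not a HackerEarth tool: the function name is hypothetical, and it assumes that 30 MB means 30 * 1024 * 1024 bytes and that the three files may sit inside a folder in the archive.

```python
import os
import zipfile

MAX_ZIP_SIZE_BYTES = 30 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
REQUIRED_FILES = {"train.csv", "test.csv", "sample_submission.csv"}


def check_dataset_zip(path):
    """Return a list of problems found; an empty list means all checks passed."""
    if not zipfile.is_zipfile(path):
        return [f"{path} is not a .zip archive"]

    problems = []
    if os.path.getsize(path) > MAX_ZIP_SIZE_BYTES:
        problems.append("archive is larger than 30 MB")

    with zipfile.ZipFile(path) as archive:
        # Compare file names only, so files inside a folder are also found.
        names = {
            name.rsplit("/", 1)[-1]
            for name in archive.namelist()
            if not name.endswith("/")
        }

    for name in sorted(REQUIRED_FILES - names):
        problems.append(f"missing {name}")
    return problems
```

Calling check_dataset_zip("dataset.zip") on a well-formed archive returns an empty list. Any other result names what to fix before the upload.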
Adding code snippets
Code snippets are boilerplate code that the candidate can use as a reference to solve the question. You can add instructions as comments, or add comments that tell candidates which libraries to import. You can allow code snippets in the following languages:
- Python
- Python 3
- R
Example
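A code snippet for Python 3 might look like the following. This is a hypothetical sketch, not the snippet used on the platform: the column names id and target are placeholders, the baseline model is only there to make the flow runnable, and the submission layout is assumed to match sample_submission.csv.

```python
# Starter code for candidates (hypothetical example).
# train.csv, test.csv, and sample_submission.csv come from the dataset folder.
# The column names "id" and "target" are placeholders for illustration.
import csv
from collections import Counter

# Import the libraries you need for modelling here, for example:
# import pandas as pd


def load_rows(path):
    """Read a .csv file into a list of dictionaries, one per row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def predict(train_rows, test_rows):
    """Replace this baseline with your own model.

    The baseline predicts the most frequent training label for every test row.
    """
    labels = Counter(row["target"] for row in train_rows)
    most_common_label = labels.most_common(1)[0][0]
    return [{"id": row["id"], "target": most_common_label} for row in test_rows]


def write_submission(predictions, path="submission.csv"):
    """Write predictions with the assumed id and target columns."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "target"])
        writer.writeheader()
        writer.writerows(predictions)


# Typical flow once the dataset files are in the working directory:
# train = load_rows("train.csv")
# test = load_rows("test.csv")
# write_submission(predict(train, test))
```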
The Editorial section
Adding the approach to solve a problem
You can add content that helps a candidate solve the question. This can include the approach, directions, or steps to solve the problem.
Click Publish to successfully create a Data Science question in a test.
To try the question and see how a candidate solves it and makes submissions on HackerEarth’s platform, refer to this article.