Datasets

Create a Dataset

To create a dataset, go to your "Dataset" page and click the blue "+New Dataset" button in the top-right corner of the page. From there, follow the steps below for Text, Image, Audio, or Video datasets.

Text

1. Choose whether to upload your data as row-based or document-based. In a row-based dataset, each paragraph, row, or item becomes a task. In a document-based (doc-based) dataset, each document becomes a task.

2. Select one or more files from your computer: either drag and drop them or browse your computer to upload.

  • Row-based datasets: the maximum file size is 5MB and 10,000 rows.
  • Doc-based datasets: up to 10 documents of 1MB each.

3. Configure the dataset: set the purpose of each column.

  • Row Header: select this if the first row of the file is a header.
  • Column classification:
    • To be annotated: the columns containing the text to annotate. Depending on the type of project, choose one column (for example, for text classification) or two columns (for example, for text similarity).
    • Unique Identifier: select this if a column holds an ID for every task.
    • Task title: select this if each task has a title.
    • With annotations: if your dataset is already annotated, select the columns with the annotations. You can use either one column per label or a single column with all the labels for each task separated by " | ". For more information, see Datasets > Annotated Dataset.
    • Row-based Tags: select this if each task has specific tags.
    • Contextual info: a column for any additional specifications a task needs.
  • "Ignore columns?": checked by default. Columns not assigned to any of the sections above are ignored.

4. Name and language

  • Name of the dataset: enter a new name or choose one from the dropdown of previous datasets or projects
  • Language: select from the dropdown
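
Before uploading, it can help to prepare and check a row-based file locally. The following is a minimal sketch, not part of the product: the column names (`id`, `title`, `text`, `labels`) and both helper functions are hypothetical, and it assumes that 1MB means 1024 × 1024 bytes and that the header row does not count toward the 10,000-row limit. It writes a CSV with a header row, joins each task's labels into a single column using " | ", and reports anything that exceeds the limits listed in step 2.

```python
import csv
import io

# Limits quoted above for row-based text datasets.
MAX_BYTES = 5 * 1024 * 1024   # 5MB, assuming 1MB = 1024 * 1024 bytes
MAX_ROWS = 10_000             # assumed to exclude the header row
LABEL_SEPARATOR = " | "       # separator for the single-column annotation layout


def build_row_based_csv(tasks):
    """Serialize tasks to CSV text with a header row.

    Each task is a dict with the hypothetical keys id, title, text, labels.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "title", "text", "labels"])
    for task in tasks:
        writer.writerow([
            task["id"],
            task["title"],
            task["text"],
            LABEL_SEPARATOR.join(task["labels"]),
        ])
    return buffer.getvalue()


def check_row_based_limits(csv_text):
    """Return a list of problems; an empty list means the file is within limits."""
    problems = []
    size = len(csv_text.encode("utf-8"))
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_BYTES}")
    rows = list(csv.reader(io.StringIO(csv_text)))
    data_rows = max(len(rows) - 1, 0)  # do not count the header row
    if data_rows > MAX_ROWS:
        problems.append(f"file has {data_rows} rows, limit is {MAX_ROWS}")
    return problems


tasks = [
    {"id": "t1", "title": "Review 1", "text": "Great battery life.",
     "labels": ["positive", "battery"]},
    {"id": "t2", "title": "Review 2", "text": "Screen cracked in a week.",
     "labels": ["negative"]},
]
csv_text = build_row_based_csv(tasks)
print(csv_text)
print(check_row_based_limits(csv_text))
```

With a file like this, you would select "Row Header" in step 3 and assign `text` to "To be annotated", `id` to "Unique Identifier", `title` to "Task title", and `labels` to "With annotations".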

Image

1. Upload up to 20 images by dragging and dropping them or browsing your files. Accepted formats are JPG, JPEG, and PNG, and the maximum file size is 2MB.


2. Name and language

  • Name of the dataset: enter a new name or choose one from the dropdown of previous datasets or projects
  • Language: select from the dropdown



Audio

1. Upload up to 20 audio files by dragging and dropping them or browsing your files. Accepted formats are MP3 and WAV, and the maximum file size is 2MB.


2. Name and language

  • Name of the dataset: enter a new name or choose one from the dropdown of previous datasets or projects
  • Language: select from the dropdown

Video

1. Upload up to 20 videos by dragging and dropping them or browsing your files. The only accepted format is MP4, and the maximum file size is 2MB.


2. Name and language

  • Name of the dataset: enter a new name or choose one from the dropdown of previous datasets or projects
  • Language: select from the dropdown
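
The Image, Audio, and Video uploads share the same limits (up to 20 files, 2MB maximum file size) and differ only in the accepted formats, so a batch can be checked locally before uploading. This is a rough sketch under stated assumptions: `check_media_upload` is a hypothetical helper rather than a product API, 2MB is taken to mean 2 × 1024 × 1024 bytes, and extensions are compared case-insensitively, which the upload form may or may not do.

```python
from pathlib import Path

# Limits and formats quoted in the Image, Audio, and Video steps above.
MAX_FILES = 20
MAX_BYTES = 2 * 1024 * 1024  # 2MB, assuming 1MB = 1024 * 1024 bytes
ALLOWED_EXTENSIONS = {
    "image": {".jpg", ".jpeg", ".png"},
    "audio": {".mp3", ".wav"},
    "video": {".mp4"},
}


def check_media_upload(kind, files):
    """Check a batch of (filename, size_in_bytes) pairs for one dataset type.

    Returns a list of problems; an empty list means the batch is within limits.
    """
    allowed = ALLOWED_EXTENSIONS[kind]
    problems = []
    if len(files) > MAX_FILES:
        problems.append(f"{len(files)} files selected, limit is {MAX_FILES}")
    for name, size in files:
        # Assumption: extension matching ignores case (".PNG" equals ".png").
        ext = Path(name).suffix.lower()
        if ext not in allowed:
            problems.append(f"{name}: extension {ext or '(none)'} is not accepted for {kind}")
        if size > MAX_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the {MAX_BYTES}-byte limit")
    return problems


batch = [("cat.PNG", 120_000), ("scan.tiff", 3_000_000)]
for problem in check_media_upload("image", batch):
    print(problem)
```

To check real files, the sizes can be read with `Path(name).stat().st_size` before calling the helper.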