8 results · AI-generated index
huggingface.io
tool

Large Scale NLP Corpus Dataset for Machine Learning

The Hugging Face Datasets library provides access to a wide range of large-scale NLP corpora for machine learning, including datasets for text classification, sentiment analysis, and language modeling.

ucsd.edu
article

Natural Language Processing (NLP) Datasets

The University of California, San Diego provides a list of NLP datasets, including large-scale corpora for machine learning, such as the Common Crawl dataset and the Wikipedia dataset.

ldc.upenn.edu
research

Linguistic Data Consortium (LDC) Datasets

The Linguistic Data Consortium (LDC) at the University of Pennsylvania provides a wide range of linguistic datasets, including large-scale NLP corpora for machine learning research and development.

google.com
official

Google's Natural Language Processing Dataset

Google's Natural Language Processing dataset is a large-scale corpus of text data that can be used for machine learning research and development, including tasks such as text classification and sentiment analysis.

towardsdatascience.com
article

NLP Datasets for Machine Learning

This article provides an overview of popular NLP datasets for machine learning, including large-scale corpora such as the Stanford Natural Language Inference (SNLI) dataset and the Multi-Genre Natural Language Inference (MultiNLI) dataset.

arxiv.org
research

Large-Scale NLP Corpus Dataset for Machine Learning Research

This research paper presents a large-scale NLP corpus for machine learning research, including over 100,000 text samples for tasks such as text classification and sentiment analysis.

kaggle.com
tool

NLP Dataset Repository

The Kaggle NLP Dataset Repository provides a wide range of NLP datasets, including large-scale corpora for machine learning, such as the IMDB dataset and the 20 Newsgroups dataset.

nist.gov
official

National Institute of Standards and Technology (NIST) NLP Dataset

The National Institute of Standards and Technology (NIST) provides large-scale NLP corpora for machine learning research and development, including datasets for tasks such as text classification and information retrieval.