60k Stack Overflow Questions with Quality Rating

  • by user1
  • 28 February, 2022

Questions from 2016-2020 classified into three categories based on their quality

License: Data files © Original Authors

Tags: text data, nlp, text mining

This is a dataset containing 60,000 Stack Overflow questions from 2016-2020. Questions are classified into three categories:

  1. HQ: High-quality posts without a single edit.
  2. LQ_EDIT: Low-quality posts with a negative score and multiple community edits that nevertheless remain open after those edits.
  3. LQ_CLOSE: Low-quality posts that were closed by the community without a single edit.


  • Questions are sorted according to Question Id.
  • Question body is in HTML format.
  • All dates are in UTC format.
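
The notes above suggest two preprocessing steps before any text mining: stripping HTML from the question body and treating timestamps as UTC. The following is a minimal sketch using only the Python standard library. The column names (Id, Body, CreationDate, Y), the timestamp format, and the two sample rows are assumptions made for illustration and are not confirmed by this listing; adjust them to match the actual files.

```python
import csv
import io
from datetime import datetime, timezone
from html.parser import HTMLParser

# Hypothetical two-row sample. Column names and date format are assumptions.
SAMPLE_CSV = '''Id,Body,CreationDate,Y
101,"<p>How do I <b>sort</b> a list?</p>",2016-01-01 00:21:59,HQ
205,"<p>code not working</p>",2016-01-02 10:05:00,LQ_CLOSE
'''

# The three quality categories described in the listing.
LABELS = {"HQ", "LQ_EDIT", "LQ_CLOSE"}


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the plain text of an HTML question body."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def parse_utc(stamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a naive timestamp string and mark it as UTC (format is assumed)."""
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def load_questions(handle):
    """Read rows from a CSV file object into cleaned dictionaries."""
    rows = []
    for row in csv.DictReader(handle):
        if row["Y"] not in LABELS:
            raise ValueError(f"unexpected label: {row['Y']!r}")
        rows.append({
            "id": int(row["Id"]),
            "text": html_to_text(row["Body"]),
            "created": parse_utc(row["CreationDate"]),
            "label": row["Y"],
        })
    return rows


questions = load_questions(io.StringIO(SAMPLE_CSV))
```

To use this on a downloaded file, pass an open file object (for example `open(path, newline="", encoding="utf-8")`) to `load_questions` in place of the in-memory sample.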



Size: 21543 KB
Price: Free
Author: Moore
Data source: kaggle.com