March Dataset Spotlight

It's time for our March dataset spotlight here at DoltHub. For new folks, Dolt is a SQL database with Git-like versioning, and DoltHub is a place on the internet to share Dolt databases. This monthly feature keeps you updated on Data Bounties and popular Dolt databases.

Bounties

We are excited to continue updating you about our progress on Data Bounties. We have one active data bounty with one finishing imminently. We will be launching another in the next week or so. We completed one bounty in March.

Logos

Link: dolthub/logo-2k-extended
Bounty: $10,000
End Date: April 9, 2021

We wanted to build a database of corporate logos for use in machine learning applications. The ML community has been clamouring for more open datasets, and we are doing our part. So far, we've accepted 18 PRs and collected 424,461 logos.

US College Course Catalogs

Link: dolthub/national-course-catalog-us (private)
Bounty: $10,000
End Date: March 18, 2021

We wanted to build a database of US college course catalogs for a partner. Unlike previous bounties, the data generated by this bounty was made private once the bounty completed. We collected catalogs from 65 schools, spanning more than 20 years.

Popular Datasets

The five most viewed DoltHub datasets for the month of March:

  1. dolthub/corona-virus
  2. dolthub/national-course-catalog-us
  3. dolthub/hospital-price-transparency
  4. dolthub/logo-2k-extended
  5. dolthub/ip-to-country

Conclusion

That's it for this month. Interested in participating in data bounties? Come say hello on our Discord and be a part of our data community.
