It's that time for our April dataset spotlight here at DoltHub. For new folks, Dolt is a SQL database with git-like versioning and DoltHub is a place on the internet to share Dolt databases. This monthly feature keeps you updated on Data Bounties and popular Dolt databases.
We are excited to continue updating you about our progress on Data Bounties. We have one active data bounty. We completed one bounty in April.
End Date: April 9, 2021
We wanted to build a database of corporate logos to use in machine learning applications. There has been some clamouring in the ML community for more open datasets. This is our attempt to help.
The logo bounty was a success. The bounty participants collected 3.4M logo URLs. The top contestant earned over $6,000.
Hospital Price Transparency V2
End Date: May 26, 2021
Our first hospital bounty was inspiring, collecting over 72M prices. This time we corrected some schema errors, imported the old data, and asked bounty participants to have at it. Participants can either add new hospitals or go through the ones we already have and update them with the proper code descriptions. We changed the schema so that code descriptions are now stored per hospital, rather than assuming each code has the same description across all hospitals. We think we got the schema right this time.
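To make the schema change above concrete, here is a minimal sketch of what "code descriptions per hospital" could look like. It is not the actual bounty schema: the table name `hospital_codes`, its columns, the sample rows, and the helper functions are all hypothetical, and it uses Python's built-in `sqlite3` module rather than Dolt so it can run anywhere.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical per-hospital code table."""
    conn = sqlite3.connect(":memory:")
    # Keying on (hospital_id, code) lets each hospital carry its own
    # description for the same code. Table and column names are assumptions.
    conn.execute(
        """
        CREATE TABLE hospital_codes (
            hospital_id TEXT NOT NULL,
            code        TEXT NOT NULL,
            description TEXT NOT NULL,
            PRIMARY KEY (hospital_id, code)
        )
        """
    )
    # Invented sample rows: two hospitals describe the same code differently.
    conn.executemany(
        "INSERT INTO hospital_codes VALUES (?, ?, ?)",
        [
            ("hospital_a", "A100", "Office visit"),
            ("hospital_b", "A100", "Outpatient visit, level 1"),
        ],
    )
    return conn


def descriptions_for(conn: sqlite3.Connection, code: str) -> list:
    """Return (hospital_id, description) pairs for one code, sorted by hospital."""
    rows = conn.execute(
        "SELECT hospital_id, description FROM hospital_codes "
        "WHERE code = ? ORDER BY hospital_id",
        (code,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for hospital_id, description in descriptions_for(demo, "A100"):
        print(hospital_id, description)
```

With a single shared description per code, one of the two rows above would have to overwrite the other. The composite key keeps both.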
The five most viewed DoltHub datasets for the month of April:
The Police Data Accessibility Project (PDAP) has been very active on DoltHub and on our Discord. They are collecting data from approximately 18,000 police organizations in the US. It's a very cool initiative. Expect to see more from them in this space.
That's it for this month. Interested in participating in data bounties? Come say hello on our Discord and be a part of our data community.