Announcing the $10,000 chargemaster URLs bounty

DoltHub is building a database of hospitals and their chargemasters for Payless.Health. A complete list of URLs will help them build out their comprehensive search engine of hospital prices, spanning all 7,000+ hospitals in the US.

Background

In 2019, the Trump administration issued an executive order requiring hospitals to publish their rates on their websites. DoltHub has already collected prices from around 1,835 of them, out of an original list of 5,000 (even three years later, not all hospitals have fully complied).

This time OneFact wants to focus on collecting as many chargemaster URLs as possible. Afterwards, they'll take the data shaping and formatting into their own hands.

OneFact has managed to extend the original list of 5,000 to over 7,000 hospitals through a combination of FOIA requests and independent research. This data is available for free in the seeded bounty database.

The Bounty

To see where we're starting from, check out the paylesshealth database, which has been pre-seeded with chargemaster URLs from one of our earlier bounties.

There is only one table, with six columns: the ccn (CMS certification number), the hospital name, the state code, the homepage URL, and the direct and indirect chargemaster URLs.

The direct chargemaster URL should get as close as possible to downloading the chargemaster directly. The indirect URL is the page that contains the direct URL, if it exists.

+---------------------------+---------------+------+-----+---------+-------+
| Field                     | Type          | Null | Key | Default | Extra |
+---------------------------+---------------+------+-----+---------+-------+
| name                      | varchar(100)  | YES  |     | NULL    |       |
| ccn                       | char(10)      | NO   |     | NULL    |       |
| state_code                | char(2)       | NO   |     | NULL    |       |
| chargemaster_direct_url   | varchar(2048) | YES  |     | NULL    |       |
| homepage_url              | varchar(2048) | YES  |     | NULL    |       |
| chargemaster_indirect_url | varchar(2048) | YES  |     | NULL    |       |
+---------------------------+---------------+------+-----+---------+-------+
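To make the schema concrete, here is a minimal Python sketch that mirrors the six columns and checks a row against the NOT NULL and length limits listed above. The names `HospitalRow` and `schema_problems` are hypothetical helpers for illustration, not part of the bounty tooling, and the sample values are made up.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# Maximum lengths copied from the schema listing above.
MAX_LEN = {
    "name": 100,
    "ccn": 10,
    "state_code": 2,
    "chargemaster_direct_url": 2048,
    "homepage_url": 2048,
    "chargemaster_indirect_url": 2048,
}
# Columns marked NOT NULL in the schema listing.
REQUIRED = {"ccn", "state_code"}


@dataclass
class HospitalRow:
    """One row of the table (hypothetical helper, not bounty tooling)."""
    ccn: Optional[str]
    state_code: Optional[str]
    name: Optional[str] = None
    homepage_url: Optional[str] = None
    chargemaster_direct_url: Optional[str] = None
    chargemaster_indirect_url: Optional[str] = None


def schema_problems(row: HospitalRow) -> List[str]:
    """Return a list of schema violations; an empty list means the row fits."""
    problems = []
    for f in fields(row):
        value = getattr(row, f.name)
        if value is None:
            if f.name in REQUIRED:
                problems.append(f"{f.name} must not be NULL")
        elif len(value) > MAX_LEN[f.name]:
            problems.append(f"{f.name} exceeds {MAX_LEN[f.name]} characters")
    return problems


# Made-up example values, for illustration only.
row = HospitalRow(ccn="010001", state_code="AL", name="Example Hospital")
print(schema_problems(row))
```

A check like this only covers the shape of a row. Whether a URL is live and belongs to the right hospital still has to be verified separately.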

The conditions for acceptance are that the URLs must be live and linked to the right hospital.
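One way to pre-check the "live" condition before submitting is sketched below, using only the Python standard library. This is an assumption about what "live" means (the URL answers with a 2xx status after redirects), not the bounty's official check, and `is_live` is a hypothetical helper name. Matching a URL to the right hospital still needs a human look.

```python
import urllib.error
import urllib.request


def is_live(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True if the URL answers a HEAD request with a 2xx status.

    Assumption: "live" means a successful HTTP response. The `opener`
    parameter exists only so the network call can be swapped out in tests.
    """
    try:
        # Building the request raises ValueError for malformed URLs.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, ValueError, OSError):
        # HTTPError (4xx/5xx) is a subclass of URLError, so it lands here too.
        return False
```

Some servers reject HEAD requests even though the file downloads fine in a browser, so a False result is a prompt to check by hand rather than proof that the link is dead.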

If two URLs point to the same page or file, the shorter one will be accepted (excluding the http:// prefix).
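The length rule can be sketched in a few lines of Python. Two assumptions go beyond the text above: https:// is ignored the same way as http://, and a tie goes to the first URL given. `strip_scheme` and `preferred_url` are hypothetical names, and the example URLs are made up.

```python
import re


def strip_scheme(url: str) -> str:
    """Drop a leading http:// or https:// so it does not count toward length."""
    return re.sub(r"^https?://", "", url, flags=re.IGNORECASE)


def preferred_url(a: str, b: str) -> str:
    """Return whichever URL is shorter once the scheme prefix is ignored.

    On a tie, min() keeps the first argument.
    """
    return min(a, b, key=lambda u: len(strip_scheme(u)))


# Made-up URLs assumed to point at the same file.
print(preferred_url(
    "https://example.org/cdm.csv",
    "http://example.org/files/cdm.csv",
))
```

This only compares lengths. Deciding that two URLs really do point to the same page or file is a separate step.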

If one URL is more "canonical" it can be accepted as a replacement. A link to a chargemaster file that comes from the hospital page directly is preferred over one that was found in a Google search.

Payout

The bounty will run in five parts of one week each. Each week, $2,000 will be paid out to the winners, for a total of $10,000.

About OneFact

To learn more about OneFact, see the bounty README.

OneFact is a 501(c)(3) whose mission is to change global health care using open source principles, machine learning, and natural language processing. All their work is open source. Their team of artificial intelligence experts includes Dr. Jaan Altosaar, Maxim Zaslavsky, Rohan Bansal, and Brad Windsor.

If you have skills in ML, data analytics, data journalism or UI/UX design, and if you'd like to help out, you can contact OneFact at hello@onefact.org.

Any questions about getting started? You can hop on our Discord chat or drop me a line at alec@dolthub.com.
