');mask-image:url('data:image/svg+xml;charset=utf-8, ');width:16px}.markdown-body details,.markdown-body figcaption,.markdown-body figure{display:block}.markdown-body summary{display:list-item}.markdown-body [hidden]{display:none!important}.markdown-body a{background-color:transparent;color:#0969da;-webkit-text-decoration:none;text-decoration:none}.markdown-body abbr[title]{border-bottom:none;-webkit-text-decoration:underline dotted;text-decoration:underline;text-decoration:underline dotted}.markdown-body b,.markdown-body strong{font-weight:600}.markdown-body dfn{font-style:italic}.markdown-body h1{border-bottom:1px solid rgba(209,217,224,.702);font-size:2em;font-weight:600;margin:.67em 0;padding-bottom:.3em}.markdown-body mark{background-color:#fff8c5;color:#1f2328}.markdown-body small{font-size:90%}.markdown-body sub,.markdown-body sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline}.markdown-body sub{bottom:-.25em}.markdown-body sup{top:-.5em}.markdown-body img{border-style:none;box-sizing:content-box;max-width:100%}.markdown-body code,.markdown-body kbd,.markdown-body pre,.markdown-body samp{font-family:monospace;font-size:1em}.markdown-body figure{margin:1em 2.5rem}.markdown-body hr{background:transparent;background-color:#d1d9e0;border:0;box-sizing:content-box;height:.25em;margin:1.5rem 0;overflow:hidden;padding:0}.markdown-body input{font:inherit;font-family:inherit;font-size:inherit;line-height:inherit;margin:0;overflow:visible}.markdown-body [type=button],.markdown-body [type=reset],.markdown-body [type=submit]{-webkit-appearance:button;-moz-appearance:button;appearance:button}.markdown-body [type=checkbox],.markdown-body [type=radio]{box-sizing:border-box;padding:0}.markdown-body [type=number]::-webkit-inner-spin-button,.markdown-body [type=number]::-webkit-outer-spin-button{height:auto}.markdown-body [type=search]::-webkit-search-cancel-button,.markdown-body [type=search]::-webkit-search-decoration{-webkit-appearance:none;appearance:none}.markdown-body ::-webkit-input-placeholder{color:inherit;opacity:.54}.markdown-body ::-webkit-file-upload-button{-webkit-appearance:button;appearance:button;font:inherit}.markdown-body a:hover{-webkit-text-decoration:underline;text-decoration:underline}.markdown-body ::-moz-placeholder{color:#59636e;opacity:1}.markdown-body ::placeholder{color:#59636e;opacity:1}.markdown-body hr:after,.markdown-body hr:before{content:"";display:table}.markdown-body hr:after{clear:both}.markdown-body table{font-feature-settings:"tnum";border-collapse:collapse;border-spacing:0;display:block;font-variant:tabular-nums;max-width:100%;overflow:auto;width:-moz-max-content;width:max-content}.markdown-body td,.markdown-body th{padding:0}.markdown-body details summary{cursor:pointer}.markdown-body [role=button]:focus,.markdown-body a:focus,.markdown-body input[type=checkbox]:focus,.markdown-body input[type=radio]:focus{box-shadow:none;outline:2px solid #0969da;outline-offset:-2px}.markdown-body [role=button]:focus:not(:focus-visible),.markdown-body a:focus:not(:focus-visible),.markdown-body input[type=checkbox]:focus:not(:focus-visible),.markdown-body input[type=radio]:focus:not(:focus-visible){outline:1px solid transparent}.markdown-body [role=button]:focus-visible,.markdown-body a:focus-visible,.markdown-body input[type=checkbox]:focus-visible,.markdown-body input[type=radio]:focus-visible{box-shadow:none;outline:2px solid #0969da;outline-offset:-2px}.markdown-body a:not([class]):focus,.markdown-body a:not([class]):focus-visible,.markdown-body input[type=checkbox]:focus,.markdown-body input[type=checkbox]:focus-visible,.markdown-body input[type=radio]:focus,.markdown-body input[type=radio]:focus-visible{outline-offset:0}.markdown-body kbd{background-color:#f6f8fa;border:1px solid rgba(209,217,224,.702);border-radius:6px;box-shadow:inset 0 -1px 0 rgba(209,217,224,.702);color:#1f2328;display:inline-block;font:11px ui-monospace,SFMono-Regular,SF Mono,Menlo,Consolas,Liberation Mono,monospace;line-height:10px;padding:.25rem;vertical-align:middle}.markdown-body h1,.markdown-body h2,.markdown-body h3,.markdown-body h4,.markdown-body h5,.markdown-body h6{font-weight:600;line-height:1.25;margin-bottom:1rem;margin-top:1.5rem}.markdown-body h2{border-bottom:1px solid rgba(209,217,224,.702);font-size:1.5em;font-weight:600;padding-bottom:.3em}.markdown-body h3{font-size:1.25em;font-weight:600}.markdown-body h4{font-size:1em;font-weight:600}.markdown-body h5{font-size:.875em;font-weight:600}.markdown-body h6{color:#59636e;font-size:.85em;font-weight:600}.markdown-body p{margin-bottom:10px;margin-top:0}.markdown-body blockquote{border-left:.25em solid #d1d9e0;color:#59636e;margin:0;padding:0 1em}.markdown-body ol,.markdown-body ul{margin-bottom:0;margin-top:0;padding-left:2em}.markdown-body ol ol,.markdown-body ul ol{list-style-type:lower-roman}.markdown-body ol ol ol,.markdown-body ol ul ol,.markdown-body ul ol ol,.markdown-body ul ul ol{list-style-type:lower-alpha}.markdown-body dd{margin-left:0}.markdown-body code,.markdown-body pre,.markdown-body samp,.markdown-body tt{font-family:ui-monospace,SFMono-Regular,SF Mono,Menlo,Consolas,Liberation Mono,monospace;font-size:12px}.markdown-body pre{word-wrap:normal;margin-bottom:0;margin-top:0}.markdown-body .octicon{fill:currentColor;display:inline-block;overflow:visible!important;vertical-align:text-bottom}.markdown-body input::-webkit-inner-spin-button,.markdown-body input::-webkit-outer-spin-button{-webkit-appearance:none;appearance:none;margin:0}.markdown-body .mr-2{margin-right:.5rem!important}.markdown-body:after,.markdown-body:before{content:"";display:table}.markdown-body:after{clear:both}.markdown-body>:first-child{margin-top:0!important}.markdown-body>:last-child{margin-bottom:0!important}.markdown-body a:not([href]){color:inherit;-webkit-text-decoration:none;text-decoration:none}.markdown-body .absent{color:#d1242f}.markdown-body .anchor{float:left;line-height:1;margin-left:-20px;padding-right:.25rem}.markdown-body .anchor:focus{outline:none}.markdown-body blockquote,.markdown-body details,.markdown-body dl,.markdown-body ol,.markdown-body p,.markdown-body pre,.markdown-body table,.markdown-body ul{margin-bottom:1rem;margin-top:0}.markdown-body blockquote>:first-child{margin-top:0}.markdown-body blockquote>:last-child{margin-bottom:0}.markdown-body h1 .octicon-link,.markdown-body h2 .octicon-link,.markdown-body h3 .octicon-link,.markdown-body h4 .octicon-link,.markdown-body h5 .octicon-link,.markdown-body h6 .octicon-link{color:#1f2328;vertical-align:middle;visibility:hidden}.markdown-body h1:hover .anchor,.markdown-body h2:hover .anchor,.markdown-body h3:hover .anchor,.markdown-body h4:hover .anchor,.markdown-body h5:hover .anchor,.markdown-body h6:hover .anchor{-webkit-text-decoration:none;text-decoration:none}.markdown-body h1:hover .anchor .octicon-link,.markdown-body h2:hover .anchor .octicon-link,.markdown-body h3:hover .anchor .octicon-link,.markdown-body h4:hover .anchor .octicon-link,.markdown-body h5:hover .anchor .octicon-link,.markdown-body h6:hover .anchor .octicon-link{visibility:visible}.markdown-body h1 code,.markdown-body h1 tt,.markdown-body h2 code,.markdown-body h2 tt,.markdown-body h3 code,.markdown-body h3 tt,.markdown-body h4 code,.markdown-body h4 tt,.markdown-body h5 code,.markdown-body h5 tt,.markdown-body h6 code,.markdown-body h6 tt{font-size:inherit;padding:0 .2em}.markdown-body summary h1,.markdown-body summary h2,.markdown-body summary h3,.markdown-body summary h4,.markdown-body summary h5,.markdown-body summary h6{display:inline-block}.markdown-body summary h1 .anchor,.markdown-body summary h2 .anchor,.markdown-body summary h3 .anchor,.markdown-body summary h4 .anchor,.markdown-body summary h5 .anchor,.markdown-body summary h6 .anchor{margin-left:-40px}.markdown-body summary h1,.markdown-body summary h2{border-bottom:0;padding-bottom:0}.markdown-body ol.no-list,.markdown-body ul.no-list{list-style-type:none;padding:0}.markdown-body ol[type="a s"]{list-style-type:lower-alpha}.markdown-body ol[type="A s"]{list-style-type:upper-alpha}.markdown-body ol[type="i s"]{list-style-type:lower-roman}.markdown-body ol[type="I s"]{list-style-type:upper-roman}.markdown-body div>ol:not([type]),.markdown-body ol[type="1"]{list-style-type:decimal}.markdown-body ol ol,.markdown-body ol ul,.markdown-body ul ol,.markdown-body ul ul{margin-bottom:0;margin-top:0}.markdown-body li>p{margin-top:1rem}.markdown-body li+li{margin-top:.25em}.markdown-body dl{padding:0}.markdown-body dl dt{font-size:1em;font-style:italic;font-weight:600;margin-top:1rem;padding:0}.markdown-body dl dd{margin-bottom:1rem;padding:0 1rem}.markdown-body table th{font-weight:600}.markdown-body table td,.markdown-body table th{border:1px solid #d1d9e0;padding:6px 13px}.markdown-body table td>:last-child{margin-bottom:0}.markdown-body table tr{background-color:#fff;border-top:1px solid rgba(209,217,224,.702)}.markdown-body table tr:nth-child(2n){background-color:#f6f8fa}.markdown-body table img{background-color:transparent}.markdown-body img[align=right]{padding-left:20px}.markdown-body img[align=left]{padding-right:20px}.markdown-body .emoji{background-color:transparent;max-width:none;vertical-align:text-top}.markdown-body span.frame{display:block;overflow:hidden}.markdown-body span.frame>span{border:1px solid #d1d9e0;display:block;float:left;margin:13px 0 0;overflow:hidden;padding:7px;width:auto}.markdown-body span.frame span img{display:block;float:left}.markdown-body span.frame span span{clear:both;color:#1f2328;display:block;padding:5px 0 0}.markdown-body span.align-center{clear:both;display:block;overflow:hidden}.markdown-body span.align-center>span{display:block;margin:13px auto 0;overflow:hidden;text-align:center}.markdown-body span.align-center span img{margin:0 auto;text-align:center}.markdown-body span.align-right{clear:both;display:block;overflow:hidden}.markdown-body span.align-right>span{display:block;margin:13px 0 0;overflow:hidden;text-align:right}.markdown-body span.align-right span img{margin:0;text-align:right}.markdown-body span.float-left{display:block;float:left;margin-right:13px;overflow:hidden}.markdown-body span.float-left span{margin:13px 0 0}.markdown-body span.float-right{display:block;float:right;margin-left:13px;overflow:hidden}.markdown-body span.float-right>span{display:block;margin:13px auto 0;overflow:hidden;text-align:right}.markdown-body code,.markdown-body tt{background-color:rgba(129,139,152,.122);border-radius:6px;font-size:85%;margin:0;padding:.2em .4em;white-space:break-spaces}.markdown-body code br,.markdown-body tt br{display:none}.markdown-body del code{text-decoration:inherit}.markdown-body samp{font-size:85%}.markdown-body pre code{font-size:100%}.markdown-body pre>code{background:transparent;border:0;margin:0;padding:0;white-space:pre;word-break:normal}.markdown-body .highlight{margin-bottom:1rem}.markdown-body .highlight pre{margin-bottom:0;word-break:normal}.markdown-body .highlight pre,.markdown-body pre{background-color:#f6f8fa;border-radius:6px;color:#1f2328;font-size:85%;line-height:1.45;overflow:auto;padding:1rem}.markdown-body pre code,.markdown-body pre tt{word-wrap:normal;background-color:transparent;border:0;display:inline;line-height:inherit;margin:0;max-width:auto;overflow:visible;padding:0}.markdown-body .csv-data td,.markdown-body .csv-data th{font-size:12px;line-height:1;overflow:hidden;padding:5px;text-align:left;white-space:nowrap}.markdown-body .csv-data .blob-num{background:#fff;border:0;padding:10px .5rem 9px;text-align:right}.markdown-body .csv-data tr{border-top:0}.markdown-body .csv-data th{background:#f6f8fa;border-top:0;font-weight:600}.markdown-body [data-footnote-ref]:before{content:"["}.markdown-body [data-footnote-ref]:after{content:"]"}.markdown-body .footnotes{border-top:1px solid #d1d9e0;color:#59636e;font-size:12px}.markdown-body .footnotes ol{padding-left:1rem}.markdown-body .footnotes ol ul{display:inline-block;margin-top:1rem;padding-left:1rem}.markdown-body .footnotes li{position:relative}.markdown-body .footnotes li:target:before{border:2px solid #0969da;border-radius:6px;bottom:-.5rem;content:"";left:-1.5rem;pointer-events:none;position:absolute;right:-.5rem;top:-.5rem}.markdown-body .footnotes li:target{color:#1f2328}.markdown-body .footnotes .data-footnote-backref g-emoji{font-family:monospace}.markdown-body body:has(:modal){padding-right:var(--dialog-scrollgutter)!important}.markdown-body .pl-c{color:#59636e}.markdown-body .pl-c1,.markdown-body .pl-s .pl-v{color:#0550ae}.markdown-body .pl-e,.markdown-body .pl-en{color:#6639ba}.markdown-body .pl-s .pl-s1,.markdown-body .pl-smi{color:#1f2328}.markdown-body .pl-ent{color:#0550ae}.markdown-body .pl-k{color:#cf222e}.markdown-body .pl-pds,.markdown-body .pl-s,.markdown-body .pl-s .pl-pse .pl-s1,.markdown-body .pl-sr,.markdown-body .pl-sr .pl-cce,.markdown-body .pl-sr .pl-sra,.markdown-body .pl-sr .pl-sre{color:#0a3069}.markdown-body .pl-smw,.markdown-body .pl-v{color:#953800}.markdown-body .pl-bu{color:#82071e}.markdown-body .pl-ii{background-color:#82071e;color:#f6f8fa}.markdown-body .pl-c2{background-color:#cf222e;color:#f6f8fa}.markdown-body .pl-sr .pl-cce{color:#116329;font-weight:700}.markdown-body .pl-ml{color:#3b2300}.markdown-body .pl-mh,.markdown-body .pl-mh .pl-en,.markdown-body .pl-ms{color:#0550ae;font-weight:700}.markdown-body .pl-mi{color:#1f2328;font-style:italic}.markdown-body .pl-mb{color:#1f2328;font-weight:700}.markdown-body .pl-md{background-color:#ffebe9;color:#82071e}.markdown-body .pl-mi1{background-color:#dafbe1;color:#116329}.markdown-body .pl-mc{background-color:#ffd8b5;color:#953800}.markdown-body .pl-mi2{background-color:#0550ae;color:#d1d9e0}.markdown-body .pl-mdr{color:#8250df;font-weight:700}.markdown-body .pl-ba{color:#59636e}.markdown-body .pl-sg{color:#818b98}.markdown-body .pl-corl{color:#0a3069;-webkit-text-decoration:underline;text-decoration:underline}.markdown-body [role=button]:focus:not(:focus-visible),.markdown-body [role=tabpanel][tabindex="0"]:focus:not(:focus-visible),.markdown-body a:focus:not(:focus-visible),.markdown-body button:focus:not(:focus-visible),.markdown-body summary:focus:not(:focus-visible){box-shadow:none;outline:none}.markdown-body [tabindex="0"]:focus:not(:focus-visible),.markdown-body details-dialog:focus:not(:focus-visible){outline:none}.markdown-body g-emoji{display:inline-block;font-family:Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;font-size:1em;font-style:normal!important;font-weight:400;line-height:1;min-width:1ch;vertical-align:-.075em}.markdown-body g-emoji img{height:1em;width:1em}.markdown-body .task-list-item{list-style-type:none}.markdown-body .task-list-item label{font-weight:400}.markdown-body .task-list-item.enabled label{cursor:pointer}.markdown-body .task-list-item+.task-list-item{margin-top:.25rem}.markdown-body .task-list-item .handle{display:none}.markdown-body .task-list-item-checkbox{margin:0 .2em .25em -1.4em;vertical-align:middle}[dir=rtl] .markdown-body ol .task-list-item-checkbox,[dir=rtl] .markdown-body ul .task-list-item-checkbox{margin:0 -1.6em .25em .2em}.markdown-body .contains-task-list:focus-within .task-list-item-convert-container,.markdown-body .contains-task-list:hover .task-list-item-convert-container{clip:auto;display:block;height:24px;overflow:visible;width:auto}.markdown-body ::-webkit-calendar-picker-indicator{filter:invert(50%)}.markdown-body .markdown-alert{border-left:.25em solid #d1d9e0;color:inherit;margin-bottom:1rem;padding:.5rem 1rem}.markdown-body .markdown-alert>:first-child{margin-top:0}.markdown-body .markdown-alert>:last-child{margin-bottom:0}.markdown-body .markdown-alert .markdown-alert-title{align-items:center;display:flex;font-weight:500;line-height:1}.markdown-body .markdown-alert.markdown-alert-note{border-left-color:#0969da}.markdown-body .markdown-alert.markdown-alert-note .markdown-alert-title{color:#0969da}.markdown-body .markdown-alert.markdown-alert-important{border-left-color:#8250df}.markdown-body .markdown-alert.markdown-alert-important .markdown-alert-title{color:#8250df}.markdown-body .markdown-alert.markdown-alert-warning{border-left-color:#9a6700}.markdown-body .markdown-alert.markdown-alert-warning .markdown-alert-title{color:#9a6700}.markdown-body .markdown-alert.markdown-alert-tip{border-left-color:#1a7f37}.markdown-body .markdown-alert.markdown-alert-tip .markdown-alert-title{color:#1a7f37}.markdown-body .markdown-alert.markdown-alert-caution{border-left-color:#cf222e}.markdown-body .markdown-alert.markdown-alert-caution .markdown-alert-title{color:#d1242f}.markdown-body>:first-child>.heading-element:first-child{margin-top:0!important}.markdown-body .highlight pre:has(+.zeroclipboard-container){min-height:52px}
Announcing the $10,000 Menus Data Bounty | DoltHub Blog If you aren’t yet familiar with Dolt and DoltHub, Dolt is a MySQL database that can branch, diff and merge. DoltHub hosts those databases on the web. We are currently running data bounties , a string of collaborative database projects in which participants get paid to contribute data.
Today we are launching a $10,000 data bounty to pay contributors to build the only free, open source menus database on the internet.
Background
In 2010, Congress enacted the federal menu labeling law * . As of May 7, 2018, the FDA requires restaurants with over twenty locations to provide menu labeling for consumers. Enforcement of this law began May 7, 2019. Legal reasons aside, restaurants and businesses also use menu data to set competitive offerings and prices. Whether it concerns analysts of law, nutrition or restaurant strategy, this is data we think should be open source and free for all to inspect, contribute to and use.
Schema
We toyed with this schema for a few days. We wondered if adding pricing data was increasing the scope of the bounty too much. We ultimately couldn't pick between prices and nutrition, and seeing as nutritional data and prices are often listed in tandem on the internet, it made sense to allow contestants to scrape all nutritional information and prices at once. That is, however, not a requirement. We will accept submissions that contain menu items with nutritional information or prices.
The menu_items
table has a 3-part composite primary key: (name
, restaurant_name
and identifier
). The identifier
is where we will store regional data, where applicable. The database README.md has more explicit information on how to build the primary key.
In short, we're either using NATIONAL
as the identifier, or comma delineating regional information, for example Santa Monica, CA
or NULL, CA
. For food service delivery partners, we will insert menu items uniquely by also attaching the service provider name, such as Postmates, NATIONAL
, Postmates, NULL, CA
or Postmates, Santa Monica, CA
. Using the identifier in this format means we can parse out useful regional information later. At the same time, we are allowing inserts of unique menu items including those that vary in price across location or direct and third party menus.
menus> describe menu_items;
+-----------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+-------+
| name | varchar(255) | NO | PRI | | |
| restaurant_name | varchar(100) | NO | PRI | | |
| identifier | varchar(255) | NO | PRI | | |
| calories | int | YES | | NULL | |
| fat_g | decimal(6,2) | YES | | NULL | |
| carbohydrates_g | int | YES | | NULL | |
| protein_g | int | YES | | NULL | |
| sodium_mg | int | YES | | NULL | |
| price_usd | decimal(10,2) | YES | | | |
| cholesterol_mg | int | YES | | NULL | |
| fiber_g | int | YES | | NULL | |
| sugars_g | int | YES | | NULL | |
+-----------------+---------------+------+-----+---------+-------+
Here is an example of McDonald’s data, which we used to seed the database:
menus> select * from menu_items limit 5;
+-----------------------------------------+-----------------+------------+----------+-------+-----------------+-----------+-----------+-----------+----------------+---------+----------+
| name | restaurant_name | identifier | calories | fat_g | carbohydrates_g | protein_g | sodium_mg | price_usd | cholesterol_mg | fiber_g | sugars_g |
+-----------------------------------------+-----------------+------------+----------+-------+-----------------+-----------+-----------+-----------+----------------+---------+----------+
| APPLE SLICES | MCDONALD'S | NATIONAL | 15 | 0.00 | 4 | 0 | NULL | NULL | 0 | 0 | 3 |
| BACON BUFFALO RANCH MCCHICKEN | MCDONALD'S | NATIONAL | 430 | 21.00 | 41 | 20 | 850 | 1.00 | 50 | 2 | 6 |
| BACON CHEDDAR MCCHICKEN | MCDONALD'S | NATIONAL | 480 | 24.00 | 43 | 22 | 650 | 1.00 | 65 | 2 | 6 |
| BACON CLUBHOUSE BURGER | MCDONALD'S | NATIONAL | 720 | 40.00 | 51 | 39 | 1470 | 4.49 | 115 | 4 | 14 |
| BACON CLUBHOUSE CRISPY CHICKEN SANDWICH | MCDONALD'S | NATIONAL | 750 | 38.00 | 65 | 36 | NULL | NULL | 90 | 4 | 16 |
+-----------------------------------------+-----------------+------------+----------+-------+-----------------+-----------+-----------+-----------+----------------+---------+----------+
Data Validation
We are excited to release a public GitHub repository intended to mirror and validate the DoltHub menus
database. This GitHub repository contains a data validation script that users will run locally to identify issues and make any necessary corrections to submissions. This will also be used by the reviewer to ensure submissions pass minimum formatting requirements. Although GitHub contributions are not a paid part of the bounty, contestants are welcome to contribute to the data validation script by submitting a PR on GitHub.
The validation script and edge cases will continue to evolve over the course of the bounty as we discover more edge cases, creating a cycle of improvement between our scripts and submissions. We imagine scripts like this forming the basis for continuous integration on DoltHub, and have started to map out what we need to build an MVP for DoltHub CI. So far we think that includes a combination of webhooks for DoltHub pull requests, and GitHub actions.
To run the script, first clone the GitHub repository of the same name: dolthub/menus . Install python3
if you don't already have it, as well as Dolt . Then locate menus/validation/validate_menu_items.py
and fill in the relative path to your Dolt directory:
relative_path_to_dolt_directory = "FILL ME IN"
db = doltcli.Dolt(relative_path_to_dolt_directory)
Finally, cd
into menus/validation
and run the script:
python3 validate_menu_items.py
Users will be prompted to run SQL queries to make corrections if any issues are found.
Next steps
The menus bounty will be running for six weeks, until August 18, 2021. Get started scraping menu data, fork the database and submit a pull request to start earning money! If you need guidance or inspiration to start web scraping, pick a restaurant from this list of restaurants that we scraped in the research phases of this bounty. We've also open sourced the web scrapers we used to acquire McDonald's data.
Conclusion
Stay tuned for more product development regarding continuous integration testing. Come chat with us on Discord if you find data validation interesting. Our #data-bounties channel would love to meet you.
JOIN THE DATA EVOLUTION
Get started with Dolt