and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/about.html b/about.html
index cf6211ca..2a055352 100644
--- a/about.html
+++ b/about.html
@@ -2779,7 +2779,7 @@
Reuse
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/blog.html b/blog.html
index ad3d945f..d3c65114 100644
--- a/blog.html
+++ b/blog.html
@@ -3096,7 +3096,7 @@
Reuse
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/charts.html b/charts.html
index 33484dad..db86bda5 100644
--- a/charts.html
+++ b/charts.html
@@ -2917,7 +2917,7 @@
Reuse
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/da-faq.html b/da-faq.html
index b6ed5a12..51c66dc5 100644
--- a/da-faq.html
+++ b/da-faq.html
@@ -2720,7 +2720,7 @@
If you have questions about the CS track, contact Prof. Tim Wood
@@ -2855,7 +2855,7 @@
Reuse
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/good-eats.html b/good-eats.html
index a62d7df7..035efc2d 100644
--- a/good-eats.html
+++ b/good-eats.html
@@ -2734,7 +2734,7 @@
Reuse
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/index.html b/index.html
index 46f8a105..ebbe2404 100644
--- a/index.html
+++ b/index.html
@@ -643,7 +643,7 @@
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2021-03-25-customizing-distill-with-htmltools-and-css/index.html b/posts/2021-03-25-customizing-distill-with-htmltools-and-css/index.html
index 8687de82..0c05e8a3 100644
--- a/posts/2021-03-25-customizing-distill-with-htmltools-and-css/index.html
+++ b/posts/2021-03-25-customizing-distill-with-htmltools-and-css/index.html
@@ -3115,7 +3115,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2021-05-04-expand-your-productivity-with-espanso/index.html b/posts/2021-05-04-expand-your-productivity-with-espanso/index.html
index 9af7e1cc..422dabd5 100644
--- a/posts/2021-05-04-expand-your-productivity-with-espanso/index.html
+++ b/posts/2021-05-04-expand-your-productivity-with-espanso/index.html
@@ -3326,7 +3326,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2021-07-03-in-search-of-keyboard-nirvana/index.html b/posts/2021-07-03-in-search-of-keyboard-nirvana/index.html
index 03e74506..7ca7f33a 100644
--- a/posts/2021-07-03-in-search-of-keyboard-nirvana/index.html
+++ b/posts/2021-07-03-in-search-of-keyboard-nirvana/index.html
@@ -3059,7 +3059,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2021-09-18-choice-based-conjoint-surveys-in-r-with-formr/index.html b/posts/2021-09-18-choice-based-conjoint-surveys-in-r-with-formr/index.html
index 186af856..f67a47e3 100644
--- a/posts/2021-09-18-choice-based-conjoint-surveys-in-r-with-formr/index.html
+++ b/posts/2021-09-18-choice-based-conjoint-surveys-in-r-with-formr/index.html
@@ -2965,7 +2965,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2021-10-19-monte-carlo-bridge-game/index.html b/posts/2021-10-19-monte-carlo-bridge-game/index.html
index 08f971bf..a8bb886a 100644
--- a/posts/2021-10-19-monte-carlo-bridge-game/index.html
+++ b/posts/2021-10-19-monte-carlo-bridge-game/index.html
@@ -3569,7 +3569,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2022-06-28-introducing-renderthis/index.html b/posts/2022-06-28-introducing-renderthis/index.html
index 581e3267..ea6baff9 100644
--- a/posts/2022-06-28-introducing-renderthis/index.html
+++ b/posts/2022-06-28-introducing-renderthis/index.html
@@ -3108,7 +3108,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2022-12-26-why-chess-is-better-than-golf/index.html b/posts/2022-12-26-why-chess-is-better-than-golf/index.html
index fefb7950..2d89b1f4 100644
--- a/posts/2022-12-26-why-chess-is-better-than-golf/index.html
+++ b/posts/2022-12-26-why-chess-is-better-than-golf/index.html
@@ -2974,7 +2974,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2023-02-28-parameterized-pdfs-with-quarto/index.html b/posts/2023-02-28-parameterized-pdfs-with-quarto/index.html
index 34e04216..91b836eb 100644
--- a/posts/2023-02-28-parameterized-pdfs-with-quarto/index.html
+++ b/posts/2023-02-28-parameterized-pdfs-with-quarto/index.html
@@ -2944,7 +2944,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/2023-04-06-markdown-surveys/index.html b/posts/2023-04-06-markdown-surveys/index.html
index 3bd1a3b8..167a1f5b 100644
--- a/posts/2023-04-06-markdown-surveys/index.html
+++ b/posts/2023-04-06-markdown-surveys/index.html
@@ -3014,7 +3014,7 @@
Citation
,
and the distill package
- Last updated on February 15,
+ Last updated on February 16,
2024
diff --git a/posts/posts.json b/posts/posts.json
index cfe966e4..f6b642ab 100644
--- a/posts/posts.json
+++ b/posts/posts.json
@@ -18,7 +18,7 @@
],
"contents": "\n\nContents\nUser interface\nA quick prototype\nPlatforms\nPlease help!🙏\nUpdate 1 (2023-06-15)\n\n\n\nI do a lot of research using surveys. My current platform of choice is formr, a flexible platform for making surveys using (see my blog post on how to use it to implement a choice-based conjoint survey). While formr is powerful in terms of it’s flexibility, it has a bit of a learning curve and isn’t the easiest tool for novices. And alternative platforms like Qualtrics, Google forms, etc. have their own issues, one of which is simply that they use a WYSIWYG interface, which makes it difficult to collaborate, version control, reproduce, etc.\nI want a markdown-based survey framework. Users should be able to draft plain text / markdown / RMarkdown files that can be compiled into a web-based survey. My inspiration is packages like {xaringan}, which compiles RMarkdown files into HTML presentation slides using remark.js.\nThis framework doesn’t yet exist (or at least I am not award of one that does), but I am confident we could relatively quickly build a working prototype. I’ve even got a name for it: {surveydown}\nSo this post is a call for help. I am laying out my goals for such a framework and asking if anyone out there wants to help take a crack at building it.\nUser interface\nMy prototype for a markdown-based survey framework would have the following features:\n1. All survey content is defined in plain text files (e.g. Rmd, yml, etc.)\n2. All survey questions are defined in a _questions.yml file\nHere’s an example:\n\nage:\n type: numeric\n required: true\n text: \"What's your age?\"\n option: 25\n\ngender:\n type: mc\n required: true\n text: \"Which best describes your gender?\"\n option:\n - female: \"Female\"\n - male: \"Male\"\n - opt_out: \"Prefer not to say\"\n\nSome fields in this file might be:\ntype: Defines the type of question (e.g. numeric = a numeric entry box, mc = multiple choice, etc.).\nrequired: Respondent must answer the question to continue in the survey (defaults to false).\ntext: The question text.\noption: The set of choices for the question.\nThe names used for each question would be used as the column names in the resulting data file once respondents have completed the survey. For example, since I have two questions called age and gender, the resulting data might look something like this:\n\n#> # A tibble: 3 × 3\n#> respondent age gender \n#> \n#> 1 1 25 female \n#> 2 2 30 male \n#> 3 3 32 opt_out\n\nHere I am storing the option values (e.g. \"opt_out\") instead of the option labels (e.g. \"Prefer not to say\") in each cell, though this could be optional.\n3. All content displayed in the survey is defined in plain text / Rmd files\nFor example, a basic survey might have the following files:\nwelcome.Rmd: Basic welcome page.\nscreener.Rmd: Contains screen questions to filter out only elligible respondents.\nother.Rmd: Other questions for those who got through the screener.\nend.Rmd: Final survey page.\nThe welcome.Rmd file might have the following content:\n\n# Welcome!\n\n---\n\nThis is a survey!\n\nHere the --- symbol would indicate a page break in the survey, similar to how slide breaks work in {xaringan}.\n4. Survey questions are inserted with a simple interface, e.g. {{ }}\nFor example, a double curly bracket could be used to insert the age and gender questions, like this:\n\n{{ question age }}\n\n{{ question gender }}\n\nThis allows the survey designer the ability to separately handle the survey questions from all other content. 
That way they don’t have to dig through their Rmd files to edit the question labels or response options - they can just edit the _questions.yml file and everything else in the survey remains unchanged.\n5. Survey control logic is defined in a _survey.yml file\nIn this file, the user should be able to control the sequencing of the survey content, such as skipping questions depending on a specific response in a screener question. An example might be:\n\nsurvey:\n welcome.Rmd\n screener.Rmd\n skip:\n condition: age > 40\n destination: end_screen.Rmd\n other.Rmd\n end.Rmd\n stop\n end_screen.Rmd\n stop\n\nThe logic in this example would show the respondent the content in welcome.Rmd and screener.Rmd, and then it would evaluate the response to the age question (assuming it was shown in screener.Rmd). In this survey, the respondent is sent to the end_screen.Rmd page if their age is greater than 40, otherwise they would continue on to the other.Rmd and end.Rmd pages. A word like stop would be a special word that stops the survey at that point.\n6. Ability to run code in the survey\nThe ability to run code during a survey is perhaps the most promising aspect of this framework. The formr platform can do exactly this, and it is the inspiration for why I feel this feature must be built into {surveydown} one way or another.\nHow to implement this is an open challenge. There are a number of ways to do it, and the exact implementation will probably depend on other aspects of how the package is built, but one idea is to leverage something like child documents. I can imagine that users could make child Rmd files that contain code chunks that define aspects of questions, and then reference them in the _questions.yml file, something like this:\n\napples:\n type: mc\n text: \"What's your favorite apple?\"\n child: /child/apples.Rmd\n\nWhat exactly would go in the /child/apples.Rmd file is yet to be defined, but it might require code chunks that contain {surveydown} functions for defining aspects of the question, such as the text, options, etc. This obviously has not yet been well thought out, but I mention it here as it is an important concept to keep in mind as other aspects of the framework are constructed.\nA quick prototype\nRather than start completely from scratch, I decided to build on the {shinysurveys} package (by Jonathan Trattner and Lucy D’Agostino McGowan) for a very quick prototype of this framework. My prototype is not at all complete - most of the features I have listed are not implemented. All it really does is allow the user to define questions in a _questions.yml file, and then run a Shiny app with those questions in series (so no Rmd files, page breaks, etc.). 
The code for this prototype can be found in the repo I set up for this project: github.com/jhelvy/surveydown\nThe _questions.yml file contains this:\n\nage:\n type: numeric\n required: true\n text: \"What's your age?\"\n option: 25\n\ngender:\n type: mc\n required: true\n text: \"Which best describes your gender?\"\n option:\n - female: \"Female\"\n - male: \"Male\"\n - opt_out: \"Prefer not to say\"\n - self_desc: \"Prefer to self describe\"\n\ngender_self_describe:\n type: text\n text: \"Which best describes your gender?\"\n dependence: gender\n dependence_value: \"Prefer to self describe\"\n\neducation:\n type: select\n text: \"What is the highest level of education you have attained?\"\n option:\n - hs_no: \"Did not attend high school\"\n - hs_some: \"Some high school\"\n - hs_grad: \"High school graduate\"\n - college_some: \"Some college\"\n - college_grad: \"College\"\n - grad: \"Graduate Work\"\n - no_response: \"Prefer not to say\"\n\nrexper:\n type: mc\n text: \"Have you ever learned to program in R?\"\n option:\n - yes: \"Yes\"\n - no: \"No\"\n\nWhen run, the basic example makes a Shiny survey that looks like this:\n\n\n\n\n\nGif from https://vsgif.com/gif/2859497\nPlatforms\nOkay, I’ve got a very basic survey working in Shiny, but it’s far from complete. It doesn’t even have a database backend or anything - just a UI built from a _questions.yml file. And Shiny may not be the best platform to build this framework in - it’s just the first thing I found that I could quickly implement without having to learn too much. There are other options, and that’s where I would really like to hear from others about the best direction to go next.\nAt the highest level, I believe the goal should be to develop the framework as an R or Python package. Since I prefer R and am more familiar with similar packages like {xaringan}, my starting point would be to develop an R package that uses R code to convert the text files defined by a user into the code for an online survey.\nThis framework could, though, use a number of different underlying platforms to implement the final survey. So in some ways we have to work backwards - find a platform we like that makes good surveys and supports an easy database backend for storing responses, then build the R (or Python) package that converts the markdown text files into the code for that survey.\nShiny?\nShiny is a convenient package for building this framework as it already comes with all of the widgets needed for most types of survey questions. It’s also something that many R users (and a lot of academics who might make these surveys) already know. It has some simple and relatively streamlined approaches to hosting too, such as shinyapps.io. Finally, the ability to run code in the survey should also be relatively straightforward in Shiny.\nThe downside to working with Shiny is that it isn’t really designed for collecting data as much as it is for displaying data. That said, one approach I have used in the past to store and manage data for a Shiny app is to hook it into a Google sheet. While it’s a little hacky, it might actually work okay for this context because surveys usually don’t require enormous databases - often a few thousand rows at most and maybe tens to hundreds of columns (quite easily something a Google sheet could handle). I mean, Google uses sheets for Google Forms, so why not? Airtable is another similar option here.\nSo Shiny is an option, and we could build upon / modify the {shinysurveys} package as a starting point. 
But a key question would be about the best way to integrate a database backend for managing responses.\nSurveyJS?\nI haven’t used it, but SurveyJS looks really nice. I imagine it wouldn’t be too difficult to write some R functions that convert markdown inputs into a SurveyJS survey. It looks like it may be more flexible than Shiny, and it seems to have lovely database backend integrations, but there would be a much larger learning curve to build it (at least for me…maybe a JS expert out there wants to take the lead?). It’s certainly doable. After all, {xaringan} uses remark.js, and Quarto presentations use revealjs…maybe {surveydown} should use SurveyJS?\nOne challenge it poses though is the ability to run code in the background. I don’t yet see a clear way to do something like run code chunks, so this feature may need to be cut to make it work.\nQualtrics? SurveyMonkey?\nThe survey gorillas like Qualtrics and SurveyMonkey are solid survey providers with loads of other features. Why reinvent the wheel? So another option is to build a package that translates markdown files into something that is uploadable to one of these sites. I am not a big user of them, so I don’t even know if this is possible, but if it is then it would solve a lot of issues.\nSo in this case a {surveydown} package would really be a markdown-based framework to design and preview the survey UI. Once the user is happy with it, they would upload it to Qualtrics, SurveyMonkey, etc. and field their survey.\nAn obvious downside is that it’s not open source (though I don’t think the entire toolchain needs to be for this to still be useful). Also, again the ability to run code in the survey may be lost, so that too is a considerable downside to this approach.\nPlease help!🙏\nIf you like this idea and want to contribute, I started a repo here: https://github.com/jhelvy/surveydown\nI include my basic Shiny example in the shiny-example folder. I’m not committed to using Shiny as the underlying platform, though I can at least envision how it might work.\nIf you have any thoughts on this, please do leave comments here or open an issue. I would love to see this framework built, and I think it could really improve and simplify the survey design process for many researchers.\nThanks for reading, and hopefully we’ll get this built one day!\nUpdate 1 (2023-06-15)\nAfter playing around with different versions of a UI, I’ve converged on a rather different approach to defining survey content. When using the UI originally described in this post, everything just felt too disaggregated. It lacked the cohesiveness that I love when, for example, making a presentation using {xaringan}. I missed having everything in a single RMarkdown file where the UI was much closer to literate programming.\nSo I’ve come up with a simpler UI. In this UI, you have a single survey.Rmd file (might also consider a .qmd Quarto file) where the survey theme and control logic are defined in a YAML at the top and all of the survey content is defined in the body. An example YAML might look like this:\n---\nname: \"surveydown demo\"\nauthor: \"John Paul Helveston\"\noutput:\n surveydown::survey:\n css:\n - default\n lib_dir: libs\n control:\n skip:\n condition: color == \"Blue\"\n destination: end_screenout\n stop: end\n stop: end_screenout\n---\nHere the surveydown::survey: section is where we could define all global options to the survey. I’ve added a control section to define flow logic. 
I could also imagine including a link to a database, etc.\nThen the body of the survey would feel just like a normal RMarkdown file. You could add any general markdown you want and insert questions using a code chunk with a question() function, something like this:\n```{r}\nquestion(\n name = 'color',\n type = 'mc',\n required = TRUE,\n label = \"Do you want to take the red pill or the blue pill?\",\n option = c('Red', 'Blue')\n)\n```\nThe survey could be rendered into a view-only mode with a simple function, like surveydown::render_survey(file = 'survey.Rmd'). Then to make the survey go live, it could be hosted on shinyapps.io (or a different server) with another function, maybe:\n\n\nsurveydown::host_survey(\n folder = 'survey',\n data_url = 'path_to_googlesheet',\n api_key = 'api_key'\n)\n\n\nThis UI is much simpler overall and allows for more complexity in how questions are defined. For example, you can write R code to define the options in a question, or you could read in a set of options from an external file.\nObviously a lot of work needs to be done to make something like this a reality, but it at least provides a goal.\n\n\n\n",
"preview": "posts/2023-04-06-markdown-surveys/logo.png",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {},
"preview_width": 2998,
"preview_height": 1979
@@ -41,7 +41,7 @@
],
"contents": "\n\nContents\nParameter basics\nAdding parameters in .qmd files\nPassing parameters while rendering\nIterative rendering\n\nExamples\nSome challenges\nPassing data frames as parameters\nRendering to a different directory\n\nWrap up\n\n\n\nQuarto is an open-source scientific and technical publishing system built on Pandoc. Many view it as the “next generation” of RMarkdown, though it is more general in that is supports mutliple languages and is not R-specific.\nOne of the most common use cases for me is creating parameterized reports. A parameterized report is one that uses a .qmd file with parameters that can be passed to it while rendering, enabling the ability to create different versions of the output report based on the parameters.\n\nBTW, Meghan Hall has a great post on this topic and goes into much more detail on how to customize outputs—take a look!\n\nAs a professor, one way I use parameterized reports is in providing my student’s unique feedback for their assignments. Using parameters like student_name, grade, and feedback, I am able to use a single .qmd file and then render a unique report for each student where those parameters are replaced with the appropriate information for each student.\nRMarkdown and Quarto use almost identical interfaces for rendering parameterized reports, so most of this post applies directly to RMarkdown. But since Quarto is more general and newer, I’m going to focus on Quarto for this post.\nOne more side note—these examples only apply when using knitr as the rendering engine (you can also use parameters with Jupyter, which uses a different syntax).\nParameter basics\nTo render a parameterized output in Quarto, you have to follow two steps:\nAdd parameters to your .qmd file\nPass the parameter values while rendering.\nAdding parameters in .qmd files\nIn the YAML, you can define any parameters you want using params. For example, if I wanted to make a report with the parameter name that you will replace with a person’s name when rendering, you would add this to the YAML:\n---\nparams:\n name: \"John\"\n---\nThe value \"John\" is the default value for the parameter, which will be used if no parameter is passed. This parameter can now be used anywhere in the .qmd file using params$name, which will be replaced with whatever the parameter value is. Note that in a code chunk you can just use params$name directly, but if you want to use it in-line (e.g. in a sentence) you have use an in-line R command, like so:\n\n\n`r params$name`\n\n\nYou can include as many parameters as you want, just add them to params. For example, here is how you would add a parameter for name and grade:\n---\nparams:\n name: \"John\"\n grade: \"100%\"\n---\nOne nice feature about using parameters is that you an preview the output with the default values, that way you can make sure everything looks the way you want before creating different versions of the document.\nI tend to save these files as something like “template.qmd”, since it is a template that I will use to render to multiple different versions.\nPassing parameters while rendering\nOnce you have a “template.qmd” file ready with parameters in place, you can pass new parameters to it while rendering. If you prefer to work in the terminal, you can pass parameters in the quarto render command, e.g.:\nquarto render template.qmd -P name:'Paul' -P grade:'98%'\nIf you’re more comfortable working in R than the terminal (like me), you can use the {quarto} R package to render the .qmd file. 
The main function is quarto::quarto_render(), which takes an input argument for the path to the “template.qmd” file. To pass parameters, you use the execute_params argument, which must be a list of parameters. For example, to render the same output as in the terminal example above, you would use:\n\n\nquarto::quarto_render(\n input = \"template.qmd\",\n execute_params = list(\n name = \"Paul\",\n grade = \"98%\"\n )\n)\n\n\nIterative rendering\nI tend to have more than one set of parameters I need to pass to my “template.qmd” file (e.g. I need a report for every student in my class). In these cases, I use the quarto::quarto_render() command inside a loop.\nFor example, imagine that I had a “grades.csv” file with the columns name and grade for each student in my class. I could read in that data file and then iteratively render the “template.qmd” file for each student. Here I have to be careful to make sure I also provide an output_file argument so that each report has a unique name. My code would look something like this:\n\n\ndf <- readr::read_csv(\"grades.csv\")\n\nfor (i in 1:nrow(df)) {\n student <- df[i, ] # Each row is a unique student\n quarto::quarto_render(\n input = \"template.qmd\",\n output_file = paste0(\"feedback-\", student$name, \".pdf\"),\n execute_params = list(\n name = student$name,\n grade = student$grade\n )\n )\n}\n\n\nIf I ran this code, I would end up with a lot of PDF files in my directory, each with the name “feedback-{name}.pdf”, where “{name}” is replaced with each student’s name (e.g. “feedback-John.pdf”).\n\nAside for the {purrr} people: Yes I know there are other ways to iterate, but for this specific purpose I prefer loops as I find it easier for passing parameters (especially if there are multiple parameters).\n\nExamples\nFor a recent GW Coders meetup (which you can watch here), I demonstrated how to use parameterized Quarto files with two simple examples: Grades and Wedding Cards. The code for those demos is available at https://github.com/jhelvy/quarto-pdf-demo.\nThe grades example is similar to the example I have used thus far in this post for creating unique reports for several students. The Wedding Cards example demonstrates how I could use two different templates and render the appropriate one depending on a condition (in this case these are “thank you” cards that contain a different message depending on whether the gift was money or not).\nIn each example, I have a “template.qmd” file that defines the content of the parameterized output PDF, and a “make_pdfs.R” file that contains the R code to iteratively render each PDF. I encourage you to download the files and play with them yourself to see how each example works. They are by no means the only (or even best) way to do this, but they provide a working starting point to build upon.\nSome challenges\nIn the demo repo, I have included a third example called “data-frames” that demonstrates some fixes for two challenges I have run into when rendering parameterized reports in Quarto. Those are:\nPassing a data frame object as a parameter.\nRendering the output to a different directory.\nIt is worth mentioning that neither of these is an issue when using RMarkdown. They may be addressed more elegantly in the future, but for now here are my workaround solutions.\nPassing data frames as parameters\nSince Quarto is a separate program from R, it doesn’t know what a data frame is, so if you pass a data frame object as a parameter in execute_params, it will convert it to a list. 
This issue was posted in the Posit Community forum here.\nAfter posting about the issue in the Fediverse, both Mickaël Canouil and Garrick Aden-Buie suggested using the {jsonlite} package to serialize the data frame to pass it as a parameter and then un-serialize it back to a data frame inside the .qmd file. Turns out this worked perfectly!\nThe specific functions I use to handle the job are jsonlite::toJSON() and jsonlite::fromJSON(). In the quarto::quarto_render() command, I have to serialize the data frame inside the parameter list like so:\n\n\nquarto::quarto_render(\n input = \"template.qmd\",\n execute_params = list(\n df = jsonlite::toJSON(df), # Serialize the data frame\n month = month\n )\n)\n\n\nThen inside my “template.qmd” file I un-serialize it back to a data frame inside a code chunk with the following line:\n\n\ndf <- jsonlite::fromJSON(params$df)\n\n\nFrom there on I can use the df object anywhere in my “template.qmd” file as a data frame. The reason this isn’t an issue when using RMarkdown is that RMarkdown runs inside R, so it “knows” what a data frame is throughout the whole process.\nIn the “data-frames” example, I create monthly summary tables of flight departure and arrival delays by airline using the {nycflights13} package.\nIn this specific example, an easier approach would be to simply pass the “month” as a parameter to the “template.qmd” file and then compute the summary table there (this is in fact my recommended approach if possible). But that requires that the data be accessible from outside the “template.qmd” file (e.g. saved to disk), and that the summary calculations be relatively fast. If, for example, reading in and summarizing the data is computationally expensive, then it may be easier to do what I have done in this example, which is first read in and summarize all the data, then pass along the summary data frame to the “template.qmd” file as a serialized data frame.\nRendering to a different directory\nUnfortunately, at least at the moment it appears that quarto::quarto_render() is not capable of rendering an output file to any location other than the current directory. I noted this in the quarto-cli discussion forums here. The best solution for now seems to be to simply render the output and then copy it over to a desired output directory.\nIn practice, this is a bit cumbersome as there are a number of different conditions to consider that make the copy-pasting not so simple, so my solution was to write my own custom function that works as a wrapper around quarto::quarto_render() and allows the user to provide an optional output_dir for where the output file will be moved post-rendering.\nI have put this function inside my personal R package {jph}, which you can install if you wish to use it yourself. I named the function quarto_render_move(), which renders and then optionally moves the file to a desired location. The function source is available here.\nIn practice, it works as a drop-in replacement for quarto::quarto_render(). Here is an example:\n\n\njph::quarto_render_move(\n input = \"template.qmd\",\n output_file = \"feedback-student.pdf\",\n output_dir = \"output_folder\",\n execute_params = list(\n name = \"Paul\",\n grade = \"98%\"\n )\n)\n\n\nUsing this code, the output file would be placed inside a folder called “output_folder”.\nWrap up\nQuarto is still quite new, and the user base is still growing. 
Without a doubt, I expect that most current Quarto users are coming from RMarkdown, which has for years just seemed like total wizardry with how seamlessly it works.\nComing from RMarkdown myself, I find that Quarto has a lot of very nice features that definitely build on the best of what RMarkdown has had to offer. But it’s not perfect, and the fact that it is totally separate from R (i.e. it’s not an R package) has meant giving up some of the conveniences I have enjoyed, like passing data frames around with reckless abandon. Hopefully the tricks posted here will work for you too if you try to use them. However your Quarto journey goes, let me know with a comment!\nCheers,\nJP\n\n\n\n",
"preview": "posts/2023-02-28-parameterized-pdfs-with-quarto/quarto.png",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {},
"preview_width": 308,
"preview_height": 182
@@ -62,7 +62,7 @@
],
"contents": "\n\n\n\nImage source: https://theawesomer.com/golf-chess-set/132440/\nIf you’ve been paying attention to AI progress in 2022, you almost\ncertainly have heard of ChatGPT, an AI engine made by Open AI, that can generate remarkably\ndetailed text and code to prompts. Like many, I wanted to test it out to\nsee what it could come up with. This post summarizes the results of that\nexperiment.\nThis initial post is based on the very first publicly available chat\nengine, ChatGPT, which is based on GPT3. In the future, I plan to update\nthis post with responses from increasingly performant chat AIs to the\nexact same prompts.\nRound 1\nMy starting prompt for this post was quite simple:\n\nWrite a blog post arguing why chess is a much better game for people\nto adopt than golf.\n\nI chose this prompt for several reasons. First, it asks for an\nargumentative blog post, and I wanted to see how well the chat AI would\ndo at coming up with some arguments for this topic. Second, I wanted to\nsee what the AI would write on a topic that I haven’t seen someone else\nwrite about before. I tried to think of something relatively obscure,\nbut specific enough that there might be some clear, logical\narguments.\nBut most importantly, I wanted to see what specific arguments the AI\nwould use. There are some rather obvious social and environmental\naspects to this topic, such as the difference in resource requirements\nto play each game (golf, for example, requires a lot more land). Both\ngames are also social activities, and both can be played across just\nabout all ages. There are also mental and health benefits to each game.\nI was curious what arguments the AI would use with no further\nprompting.\nHere is what it produced based on the first prompt:\n\nChess and golf are both popular leisure activities that require\nstrategy, skill, and mental focus. However, there are several reasons\nwhy chess is a much better game for people to adopt than golf.\nFirst, chess is a much more accessible and affordable game than golf.\nWhile golf requires expensive equipment and exclusive membership to a\nclub or course, all you need to play chess is a chessboard and pieces,\nwhich can be purchased for a relatively low cost. Additionally, chess\ncan be played anywhere, whether it’s at home, in a park, or online,\nwhile golf requires a dedicated course.\nSecond, chess is a more inclusive game than golf. Golf has a\nreputation for being a primarily wealthy and white-dominated sport,\nwhile chess has a more diverse player base and is more accessible to\npeople of different socio-economic backgrounds.\nThird, chess has a range of benefits for the mind and body. Studies\nhave shown that playing chess can improve problem-solving skills,\ncritical thinking, and memory, as well as reduce the risk of developing\nconditions such as dementia and Alzheimer’s disease. Golf, on the other\nhand, has fewer cognitive benefits and requires more physical exertion,\nmaking it less suitable for those with mobility issues or\ndisabilities.\nFinally, chess is a more universal and timeless game than golf. While\ngolf is primarily popular in certain parts of the world and has a\nsomewhat limited appeal, chess is played and enjoyed by people all over\nthe globe, and its popularity has endured for centuries.\nIn conclusion, chess is a superior game to golf for several reasons.\nIt is more accessible, inclusive, beneficial for the mind, and universal\nin its appeal. 
If you’re looking for a strategic and mentally\nstimulating activity, chess is the clear choice.\n\nOverall, not too shabby for a single-line prompt. But the post itself\nwas rather simple and robotic. The writing structure is about as basic\nas you could get, and the arguments are all pretty obvious. For a grade\nschool writing assignment, this would pass, but it’s nothing I would be\nexcited about reading.\nThat said, I found it impressive that I didn’t even need to prompt it\nabout including social aspects around this topic. The response\nhighlighted the exclusivity of golf, and it also mentioned something\nabout health benefits from chess.\nBut the biggest flaw I saw was the lack of any references to some of\nthe claims in the post. In particular, the AI wrote a sentence stating\nthat “studies have shown that playing chess can…reduce the risk of\ndeveloping conditions such as dementia and Alzheimer’s disease.” Any\npost that includes a line like that better have some references to those\n“studies.” So I updated my prompt asking explicitly for references. I\nalso asked the AI to include details about the environmental impacts of\ngolf as that is a well-known issue with the game.\nRound 2\nMy second prompt read:\n\nI want you to write me a blog post. This post should include\ntechnical details to support any arguments provided to it. The topic of\nthe post is to argue why chess is a much better game for people to adopt\nthan golf. Include details about the environmental damages from golf as\nwell as information about the social dynamic of golf, such as networking\nbenefits.\n\nHere is the response I got:\n\nCertainly, here is a blog post arguing why chess is a better game\nthan golf:\nWhy Chess is a Better Game than Golf\nGolf is a popular leisure activity enjoyed by millions of people\naround the world, but it is not without its drawbacks. In contrast,\nchess is a game that offers numerous benefits and is a superior choice\nfor those looking for a challenging and rewarding pastime.\nOne of the main criticisms of golf is its environmental impact,\nparticularly with regard to land use. Golf courses require a significant\namount of land to be developed and maintained, and this land is often\ntaken from natural habitats. According to a study published in the\nJournal of Environmental Management, the average golf course in the\nUnited States occupies approximately 120 acres (48.5 hectares) of land\n(Karecha, Jorgensen, & Madsen, 2002). This land is often not\navailable for other uses, such as agriculture or housing, leading to a\nloss of ecosystem services and biodiversity (Karecha et al., 2002). In\naddition, the chemicals used to keep the greens and fairways in pristine\ncondition can have negative impacts on local ecosystems (Karecha et al.,\n2002). In contrast, chess requires no physical space or resources beyond\na board and pieces, making it a more environmentally friendly\noption.\nGolf can also be a divisive and exclusive activity. Golf courses and\nclub memberships are often expensive, making them out of reach for many\npeople. This exclusivity can create a social hierarchy and foster a\nculture of elitism. In contrast, chess is a game that can be enjoyed by\npeople of all ages, backgrounds, and skill levels. It requires no\nspecial equipment or facilities, making it accessible to everyone.\nBut while golf may be traditionally seen as a game for social\nnetworking, chess can offer similar opportunities for social interaction\nand networking. 
Many chess clubs and organizations exist worldwide,\nproviding a platform for people with a shared interest in the game to\ncome together and socialize. These clubs often host tournaments and\nother events, providing opportunities for players to compete and meet\nnew people. In addition, the internet has made it possible for chess\nplayers to connect and play with others online, further expanding the\npotential for social interaction and networking.\nPerhaps the most compelling reason to choose chess over golf is the\ncognitive benefits of the game. Chess is a highly strategic and\nchallenging game that requires deep concentration and problem-solving\nskills. It has been shown to improve brain function, including memory,\ncritical thinking, and decision-making skills (Gauthier, Tardif, &\nBelleville, 2010). In contrast, golf is often viewed as a leisurely\nactivity that does not offer the same level of mental stimulation.\nIn conclusion, chess is a superior game to golf for a variety of\nreasons. It is environmentally friendly, particularly with regard to\nland use and chemical use, accessible to all, and offers significant\ncognitive benefits. It can also provide opportunities for social\ninteraction and networking. If you’re looking for a challenging and\nrewarding pastime, consider giving chess a try. You might be surprised\nat how much you enjoy it.\nReferences:\nGauthier, S., Tardif, C., & Belleville, S. (2010). The benefits\nof chess for children’s cognitive and social development: A systematic\nreview. Frontiers in Psychology, 1, 186. https://doi.org/10.3389/fpsyg.2010.00186\nKarecha, J., Jorgensen, B. E., & Madsen, J. (2002). The\nenvironmental impacts of golf courses: A review. Journal of\nEnvironmental Management, 66(3), 319-334. https://doi.org/10.1006/jema.2002.0490\n\nThese additions felt a lot more impactful than the original post in\nterms of writing a persuasive argument. I appreciated the inclusion of\n“facts” like the amount of land used for an average golf course, as well\nas references to support those “facts.”\nThe problem is these “facts” are fake. The references included look\nreal, but they are both completely fabricated. The DOI links included\nare actually real DOIs, but they link to totally different and unrelated\npapers. So after reading this, I actually have no idea whether the\naverage golf course in the U.S. is 120 acres, or whether golf courses\nlead to a “loss of ecosystem services and biodiversity.” The only way I\ncan think to find out about whether these statements are true is to use\nan actual search engine like Google to search for real sources.\nI found this outcome deeply troubling. The overall writing was much\nbetter in this second round with a more detailed prompt, but the\narguments are based on potentially false results with fabricated\nreferences.\nWhere this goes next is going to get really interesting. Will the\nnext chat AI be able to include references to real research papers? Will\nthe arguments become more nuanced? Will the output include more\nquantitative arguments, such as quantifying the reduction in\nbiodiversity from golf courses (if that’s even true)? Given how quickly\nthese models are developing I would not be surprised if the answer to\nall of the above is “yes”. Time will tell, and I will update this post\nwhen the next chat AI is made publicly available.\n\n\n\n",
"preview": "posts/2022-12-26-why-chess-is-better-than-golf/chess-golf.png",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {},
"preview_width": 640,
"preview_height": 599
@@ -85,7 +85,7 @@
],
"contents": "\n\nContents\nWhat can it do?\nIn the\nbeginning, there was pagedown::chrome_print()\nBuild all the things!\nEnter stage left:\ngadenbuie\nOutside influences\nChanging the name\nGoing to CRAN!\nWhat’s next?\n\n\n\n\nThis post introduces the {renderthis}\n package (originally named\n{xaringanBuilder}). The package contains functions for rendering\nxaringan slides to different formats. Rather than use this post to demo\nall the package features, I decided to write a brief overview of it’s\nprimary functionality and then focus the rest of this post on the\npackage’s development history. The image above shows the different\nversions of the package hex sticker, the first one being in the running\nfor the ugliest hex sticker of all time.\nWhat can it do?\nThe {renderthis} package contains functions for rendering xaringan\nslides (Quarto support coming!) to different formats, including html,\npdf, png, gif, pptx, and mp4, as well as a ‘social’ output, a png of the\nfirst slide re-sized for sharing on social media.\nThe package follows the following render hierarchy:\nRmd\n |\n |--> social (png)\n |\n |--> html\n |\n |--> pdf\n |\n |--> png\n |\n |--> gif\n |\n |--> mp4\n |\n |--> pptx\nAll of the package functions follow a common pattern:\nAll functions start with to_*() to render slides to a\ndesired format (e.g., to_pdf()).\nAll functions have a required from argument which\nshould be set to the full or local path to the input file.\nAll functions have an optional to argument. If\nprovided, it can be a full or local path to the output file, and it must\nend in an appropriate extension (e.g. slides.gif for\nto_gif()). If it is not provided, the output file name will\nbe determined based on the from argument.\nWhile it is designed for rendering {xaringan} slides, other html\nslides are also supported. For example, if you have a Quarto\npresentation rendered as an html file, you can use any of the\nto_*() functions to render those slides to a pdf, png, gif,\npptx, or mp4 file.\nThat’s the gist of what the package does, and the package documentation\nhas many more details about the package features. Now, follow me back to\n2020 where this journey began.\nIn the\nbeginning, there was pagedown::chrome_print()\nThe {renderthis} package began in the fall of 2020 as a solution to a\nsimple problem: converting {xaringan} slides to a pdf.\nI love {xaringan} slides, but the html format is clunky for sharing,\nespecially as an email attachment, so I wanted a way to convert them to\na pdf. Before making the package, my solution was to simply use\npagedown::chrome_print() to “print” my rendered html file\nof {xaringan} slides to a pdf. For the most part, it worked just fine,\nand it still does! If all you need is a quick pdf of some simple slides,\nthen pagedown::chrome_print() is probably sufficient.\nBut being the lazy programmer I am, I wanted a single\nfunction that rendered the html and pdf files from the Rmd source file.\nSo I wrote a quick function I called build_pdf()\nthat was essentially a wrapper around\npagedown::chrome_print(). It checked if the input file was\nan Rmd file, and if so it first rendered the html and then rendered the\npdf. 
It was a pretty simple solution, and I found it easier to remember\nthe function build_pdf() than chrome_print().\nWith that function written, I decided to name the package\n{xaringanBuilder} as an homage to Garrick Aden-Buie’s incredible {xaringanExtra}\npackage.\nBuild all the things!\n\n\n\n\nImage\nsource\nWith build_pdf() working, I realized that I could use\nthe rendered pdf to convert my slides into other format types without\ntoo much more effort. For example, I wrote a\nbuild_thumbnail() function (a wrapper around\npagedown::chrome_print(format = \"png\")) to convert the\ntitle slide of the pdf deck to a png image. This was handy for making\nthumbnail images for YouTube recordings of my class\nlectures (hence the name build_thumbnail()).\nBut then I realized I could also render all the slides to\npng images, and with the help of the {magick}\npackage I could use magick::image_join() and\nmagick::image_animate() to compile those pngs into an\nanimated gif of my whole presentation! This provided a handy way to\nshare my slides, like posting them on Twitter.\nEnter stage left: gadenbuie\nJust a couple months after starting the package, Garrick wrote an awesome\nblog post on using the {chromote} package to\nprint slides to a pdf in a way that allows incremental animations to be\nrendered on separate slides as well as separate slides for each panel in\na panelset\n(the pagedown::chrome_print() solution would only render\nthese as a single slide).\n\n\n\nJust posted: Do you have complicated {xaringan}\n#rstats\nslides or use panelsets from {xaringanExtra}? Do you need a PDF version\nof your slides? I've got a little function for\nyou:https://t.co/UXwW2KtvOw\n\n— Garrick Aden-Buie (@grrrck)\nJanuary\n25, 2021\n\n\nAfter connecting\non Twitter, I quickly added Garrick as a contributor and we started\nworking on integrating this function into {xaringanBuilder}.\n\nFolks, at this point I need to tell you about a little-known package\ndevelopment secret: If you want your package to be awesome, ask\nGarrick to join the project.\n\nI cannot emphasize enough just how much better this package became\nonce Garrick started working on it. Garrick has an excellent eye for\nmaking packages user-friendly, and he is an absolute Git / GitHub wizard\n(read on to the bit about changing the package name to see some of\nthis). Here are a few things he quickly added:\nHe made lots of quick changes to make all the functions more robust\nfor different use cases.\nHe integrated the {cli}\npackage to make beautiful status messages so the user can see the\nprogress of the output being built.\nHe introduced the {fs}\npackage to simplify how file paths were being handled (which were\ngetting rather tricky with all the different output types\ninvolved).\nHe introduced multiple tests and integrated it with GitHub Actions\nso that the tests would run and the {pkgdown} site would render on every\ncommit.\nWithout a doubt, Garrick’s additions transformed the package from a\nhandful of convenient functions to a feature-packed, user-friendly tool\nfor rendering slides.\nOutside influences\nAt this stage, the package had already become quite a useful tool,\nand many users were filing issues catching little bugs here and there.\nThe package had also grabbed the attention of several people who helped\nadd more features. 
Bryan\nShalloway added the initial\nsupport for rendering slides into a PowerPoint deck, and Gina Reynolds’s {flipbookr} package\nprovided some inspiration for adding support for rendering slides to mp4\nvideos.\nMany other people contributed in small ways to the package over the\nfollowing year. Some found bugs and posted issues, others sent pull\nrequests with fixes. Over the summer of 2021 Garrick did an overhaul of how\npaths were handled in the package, after which the package remained\nrelatively stable for about a year.\nChanging the name\nIn the spring of 2022, Quarto\nstarted to catch fire. Alison Hill’s “We\ndon’t talk about Quarto” post (among others) inspired me to check\nout Quarto\npresentations, which are quite similar to {xaringan} slides.\n\n\n\nWoah, okay\n#rstats\nworld I may have to just make the .Rmd -> .qmd switch…@quarto_dev is looking\namazing. Exhibit A: Figures and tables in the margins! 🤯\npic.twitter.com/BDEHvxhd6e\n\n— John Paul Helveston (@JohnHelveston)\nMarch\n30, 2022\n\n\nThe revelation of Quarto presentations made me realize that the\n{xaringanBuilder} package could actually be used to render a wider\nvariety of inputs other than just {xaringan} slides. In fact, I myself\nhad already used it for other purposes, such as converting the first\npage of a pdf paper I wrote into a png image. This prompted me to\nconsider changing the\npackage name.\nAfter a little back\nand forth, Garrick and I decided to call it {renderthis} as an\nhomage to the phenomenal {usethis} package. We also wanted\nto maintain the strategy of having every function begin with the same\nword (previously build_*()), and we chose\nto_*() so that the package and function names would work\ntogether to form a kind of sentence, like\nrenderthis::to_pdf().\nOnce we came up with a new name, implementing that name was daunting.\nI really had no idea where to begin, so I just created an entirely new\nrepo under jhelvy/renderthis and started editing the source\ncode accordingly. This proved to be a useful exercise as it allowed us\nto see how the new package would look and feel without touching the\nolder one. In retrospect, I probably should have just opened up a new\nbranch under the {xaringanBuilder} repo to do this, but the nice thing\nabout working from a totally new repo is that I could also preview the\ndocumentation with full links, etc. Once we were happy with the way it\nlooked, we actually went back to the {xaringanBuilder} repo and changed\nthe name there.\nThis is where things got a little tricky. Since I already had a repo\nunder jhelvy/renderthis, I couldn’t just change the name of\njhelvy/xaringanBuilder to it. So I first changed\njhelvy/renderthis to jhelvy/renderthistemp,\nthen changed jhelvy/xaringanBuilder to\njhelvy/renderthis. GitHub automagically handles redirects\nso that any user that navigates to https://github.com/jhelvy/xaringanBuilder gets\nredirected to https://github.com/jhelvy/renderthis.\nOnce again, shout out to Garrick for using some serious GitHub\nwizardry here. He was able to salvage most of my edits in the\njhelvy/renderthistemp repo and merge them into the new\njhelvy/renderthis repo (I still have no idea how he managed\nthis). Garrick guided the way through the messy process of changing a\npackage name without losing any of the commit history on GitHub. 
With a\nfew more small changes to clean up the package documentation and pkgdown\nsite, {renderthis} was ready to be introduced to the world.\nGoing to CRAN!\nThe {chromote} and\n{webshot2} packages\n(which {renderthis} depends on) were both conveniently released on CRAN\nin May of 2022. With these additions, we now had every dependency on\nCRAN, so that gave us the motivation to get {renderthis} on CRAN as\nwell.\nThis too was a bit tricky given what this package does. In\nparticular, the function examples we wrote had code that if executed\nwould error, e.g., renderthis::to_pdf(\"slides.Rmd\"). Even\nthough we wrapped these in \\dontrun{}, CRAN still ran these\nexamples on the first submission. After a little more back and forth on\nhow to address this, Garrick came up with a handy solution by making a\nnew with_example()\nfunction which allows the examples to be run in a temp directory, which\nwe use in each function example.\nWith that fix, version 0.1.0 of {renderthis} was published\non CRAN 🎉!\nWhat’s next?\nOne of the complaints I had early on with {xaringan} is that there\nwas no way to print out “handouts”. I can’t stand Powerpoint, but the\nability to print out a pdf with small thumbnails of each slide with\nspace for notes was a convenient feature I used often, and to date\nnothing like it exists for the {xaringan} ecosystem.\nFor this reason, we are working on a to_handout()\nfunction that will be introduced in the next version of {renderthis}.\nBelow is an example of a set of demo slides.\n\n\nWe’re also planning on supporting rendering Quarto slides, which\nagain was one of the motivations to change the package name to a more\ngeneral one.\nIf you made it to the end of this post, thanks for reading! And if\nyou give {renderthis} a shot and have feedback or thoughts on it, please\nlet us know by filing an issue.\nAnd if you love the package and just want to let us know about that,\ndrop us a note on the “praise” issue\n😊.\n\n\n\n",
"preview": "posts/2022-06-28-introducing-renderthis/images/logo.png",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {},
"preview_width": 637,
"preview_height": 345
@@ -107,7 +107,7 @@
],
"contents": "\n\nContents\nDefining the simulation\nEstimating\nthe expected number of survivors\nChoose wisely\nHow many steps to\ninclude?\nUpdate 1 - including\ntime\nUpdate 2\nUpdate 3\nAppendix\n\n\n\n\n\n\nIf you had a Netflix account in 2021, chances are you’ve at least\nheard of Squid\nGame – a South Korean survival drama in which players play a series\nof children’s games for the chance to win a large sum of money, with the\npenalty of death if they lose. The show is like a mixture of Richard\nConnell’s The Most\nDangerous Game, the Hunger Games,\nand the horror stories of Edgar Allan\nPoe. It quickly became the #1 most most-watched show after just one\nweek of being released.\n\nSPOILER ALERT: I mention a few details in this post that might be\nspoilers if you haven’t yet watched the show.\n\nIn Episode 7, the remaining 16 players play a game where they must\ncross a glass bridge with 18 steps. At each step, the lead player must\nchoose between two glass panes – one made of tempered glass strong\nenough to support two players, and the other of regular glass that will\nshatter if stepped on, in which case the player will plummet to their\ndeath. You can watch the scene here.\nWhen I watched the episode, I immediately started trying to calculate\nthe expected number of survivors for the game. While seemingly\nstraightforward, the problems is a bit more complex when you realize\nthat each person behind the leader learns information from the leader’s\nchoices. That is, if the lead player chooses the “not safe” pane, that\nplayer dies, but every remaining player now knows which glass pane is\n“safe” at that step.\nRather than try and work the math, I decided to take a Monte Carlo\napproach – just simulate a bunch of trials of the game, then count up\nhow many players “survived” in each trial.\nDefining the simulation\nThe first step I took in trying to simulate the game is to simplify\nthe problem. The actual game has two glass panes (one “safe” and one\n“not safe”) at each step in the bridge. This same situation can be\nmodeled with a series of single steps where the lead player has\na 50% chance of surviving at each step. If the lead player steps on a\n“not safe” pane, we mark that player as dead and replace that pane with\na “safe” one, allowing all remaining players to safely cross it.\nTo run a simulation, I first had to create the players, which I\nmodeled using a simple data.table of 1s and 0s for “alive” and “dead”.\nEveryone starts out alive.\n\n\n# Create an initial data.table of players\nlibrary(data.table)\n\nnum_players <- 16\nplayers <- data.table(player = seq(num_players), alive = 1)\nplayers\n\n\n#> player alive\n#> 1: 1 1\n#> 2: 2 1\n#> 3: 3 1\n#> 4: 4 1\n#> 5: 5 1\n#> 6: 6 1\n#> 7: 7 1\n#> 8: 8 1\n#> 9: 9 1\n#> 10: 10 1\n#> 11: 11 1\n#> 12: 12 1\n#> 13: 13 1\n#> 14: 14 1\n#> 15: 15 1\n#> 16: 16 1\n\nTo simulate a single game, I created a run_game()\nfunction that iterates through a specified number of bridge steps, each\nwith a 50% chance of being “safe”. 
It continues through each\nstep until all “safe” panes are known, at which point the remaining\n“alive” players are assumed to safely finish crossing the\nbridge (I am assuming that all players have sufficient time to cross the\nbridge).\n\n\n# Define a function for simulating one game\nrun_game <- function(players, num_steps) {\n lead_player <- 1\n for (step in seq(num_steps)) {\n # 50% chance that the pane is safe\n if (sample(c(TRUE, FALSE), 1)) {\n # The pane is safe, keep going!\n next\n } else {\n # The pane broke 💀\n # Before continuing, check if any players are still alive\n if (sum(players$alive) == 0) { return(players$alive) }\n # The lead player died\n players$alive[lead_player] <- 0\n lead_player <- lead_player + 1\n }\n }\n return(players$alive)\n}\n\n\n\nWith that, let’s run one game of 18 steps and 16 players (same as in\nthe show):\n\n\n# Run one iteration of the game\nsingle_game <- run_game(players, num_steps = 18)\nsingle_game\n\n\n#> [1] 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1\n\nSo after one game, we had 4 survivors. But of course if we want to\nget an estimate of the expected number of survivors, we need to\nrun lots of trials of the game (a lot of bits are about to die).\nEstimating the\nexpected number of survivors\nI ran my trials inside a data.table, a simulation strategy I picked\nup from Grant McDermott and\nhis wonderful post on efficient\nsimulations in R.\n\n\n# Run the main trials\n\n# Repeat the \"players\" data.table for each trial\nnum_trials <- 10000\nsims <- players[rep(seq(num_players), num_trials),]\n\n# Keep track of the trial number\nsims[, trial := rep(seq(num_trials), each = num_players)]\n\n# Now run the simulation for each trial\nsims[, alive := run_game(.SD, num_steps = 18), by = trial]\n\n\n\nIn addition to being very fast, one really nice thing about running\nthe trials using {data.table} is that you get all of the results back in\na nicely-formatted data.table, which you can then use to compute all\nsorts of statistics.\nFor example, if you wanted to see the outcome of an individual trial,\njust filter for it:\n\n\n# View one trial outcome\nsims[trial == 42]\n\n\n#> player alive trial\n#> 1: 1 0 42\n#> 2: 2 0 42\n#> 3: 3 0 42\n#> 4: 4 0 42\n#> 5: 5 0 42\n#> 6: 6 0 42\n#> 7: 7 0 42\n#> 8: 8 0 42\n#> 9: 9 0 42\n#> 10: 10 1 42\n#> 11: 11 1 42\n#> 12: 12 1 42\n#> 13: 13 1 42\n#> 14: 14 1 42\n#> 15: 15 1 42\n#> 16: 16 1 42\n\nOf course, we can also use the trials to compute the expected number\nof survivors from the game. Since the outcomes are discrete, I take\nthe median number of survivors across all trials to reflect the expected\nnumber of survivors:\n\n\n# Compute the expected number of survivors across all trials\ntrials_counts <- sims[, .(count = sum(alive)), by = trial]\nexpected_survivors <- median(trials_counts$count)\nexpected_survivors\n\n\n#> [1] 7\n\nThe simulation suggests that we should expect 7 players to survive\nwhen running a game with 16 players and 18 steps. 
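This matches a quick\nback-of-the-envelope check: each step kills exactly one player with\nprobability 0.5, so (ignoring the rare games in which more than 16 panes\nwould break) the expected number of deaths is 18 / 2 = 9, leaving an\nexpected 16 - 9 = 7 survivors:\n\n\n# Analytical check: expected survivors = players - steps / 2\nnum_players - 18 / 2\n\n\n#> [1] 7\n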
When we look at the\ndistribution of trial outcomes, we can also see that there is a pretty\ngood chance of having 6 or 8 survivors in any one round of the game:\n\n\n# Plot the distribution of survivors across all trials\nlibrary(ggplot2)\n\ntrial_count_summary <- as.data.frame(table(trials_counts$count))\nnames(trial_count_summary) <- c(\"n_survivors\", \"count\")\n\nggplot(trial_count_summary) +\n geom_col(aes(x = n_survivors, y = count), width = 0.1) + \n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Outcomes of 10,000 trials\",\n x = \"Number of survivors\", \n y = \"Count\"\n )\n\n\n\n\nChoose wisely\nWhile the expected number of survivors from these trials is higher\nthan the actual number in the TV series (only 3 survive), keep in mind\nthat these trials reflect ideal conditions in which all the players act\nstrictly rationally. In the show, multiple players die as a result of\nthe selfish or revengeful actions of other players, not strictly because\nof the breaking glass panes.\nBut perhaps the more important question from the perspective of the\nplayers is:\n\n“Which number should I choose?”\n\nObviously choosing a larger number closer to the back of the line\ngives the player a higher probability of survival because the players in\nfront identify all the “safe” glass panes. But how likely would you be\nto survive if you chose a number near the middle (which most players in\nthe TV series chose first)?\nSince we have all the results of every trial outcome for every player\nin a convenient data.table, computing the probability of surviving for\neach player is straightforward:\n\n\n# Plot the probability of survival based on the player order number\nsurvival_summary <- sims[, .(p_survive = sum(alive) / num_trials), by = player]\n\nggplot(survival_summary) +\n geom_point(aes(x = player, y = p_survive)) + \n scale_y_continuous(labels = scales::percent) +\n scale_x_continuous(breaks = seq(16)) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Probability of survival based on player order number\",\n x = \"Player order number\", \n y = \"Probability\"\n )\n\n\n\n\nSo even though the expected number of survivors is 7 out of 16\nplayers, those survivors are overwhelmingly more likely to be the\nplayers who chose the highest order numbers.\nHow many steps to include?\nOne final question I had was what if the game designers chose to\ninclude more steps in the bridge? Because this was not supposed to be\nthe last game, the designers needed to make sure at least\ntwo players survived, otherwise the games would simply\nend.\nWe can see from the previous simulations that with 16 players, the\nchoice of 18 steps already puts a small (but non-zero) probability on\nhaving less than two survivors:\n\n\n# Probability of having less than two survivors\nsum(trials_counts$count < 2) / num_trials\n\n\n#> [1] 0.0036\n\nOf course this is probably an under-estimate of the true probability\nof having less than two survivors because the players are human beings\nwho tend to make irrational choices when playing a deadly game (as we\nsaw play out in the TV series version of the game).\nStill, what if the game designers were more risk seeking? How many\nsteps could they include if they were willing to live with, for example,\na 5% chance of having less than 2 players remaining at the end of the\nbridge game?\nTo answer that, we have to re-run our simulations, but with an\nincreasing number of steps. 
To keep things simple, I run 1,000\niterations of the game over an increasing number of steps from 10 to\n30:\n\n\n# Re-run the trials using an increasing number of bridge steps (10 to 30)\n\n# Repeat the \"players\" data.table for each trial\nmin_num_steps <- 10\nmax_num_steps <- 30\nstep_trials <- seq(min_num_steps, max_num_steps, by = 1)\nnum_step_trials <- length(step_trials)\nnum_trials <- 1000\nstep_sims <- players[rep(seq(num_players), num_trials*num_step_trials),]\n\n# Keep track of the trial number and the step number\nstep_sims[, trial := rep(seq(num_trials*num_step_trials), each = num_players)]\nstep_sims[, steps := rep(step_trials, each = num_players*num_trials)]\n\n# Now run the simulation for each trial\nstep_sims[, alive := run_game(.SD, num_steps = unique(steps)), by = trial]\n\n\n\nNow I can compute the probability of having less than two survivors\nfor each step size:\n\n\n# Compute probability of having less than two survivors for each step size\nstep_counts <- step_sims[, .(count = sum(alive)), by = c(\"trial\", \"steps\")]\nstep_summary <- step_counts[, .(p_under2 = sum(count < 2) / num_trials), by = steps]\n\nggplot(step_summary) +\n geom_point(aes(x = steps, y = p_under2)) + \n scale_y_continuous(labels = scales::percent) +\n scale_x_continuous(breaks = step_trials) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Probability of having less than two remaining players\",\n x = \"Number of bridge steps\", \n y = \"Probability\"\n )\n\n\n\n\nI found it interesting that 18 steps seems to be the point at which\nthe probability of having less than two remaining players begins to\nincrease from essentially 0. With just a few more steps, the probability\nrapidly increases:\n\n\nstep_summary[steps %in% seq(18, 23)]\n\n\n#> steps p_under2\n#> 1: 18 0.005\n#> 2: 19 0.008\n#> 3: 20 0.024\n#> 4: 21 0.040\n#> 5: 22 0.057\n#> 6: 23 0.107\n\nOkay, that’s it! I’m sure I made an error somewhere - let me\nknow if you find one, and hopefully this was an interesting example of\nhow to conduct some relatively efficient Monte Carlo simulations in R\nwith the help of the {data.table} package.\nUpdate 1 - including time\nUpdated October 22, 2021\nMany people on Twitter\nand Reddit\nhad some great suggestions about this post. In particular, @evalparse had a brilliant,\none-line solution for simulating the number of survivors in one\ngame:\n
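The reason such a compact solution is possible: each step kills exactly\none player with probability 0.5, so the total number of deaths in a game\nfollows a binomial distribution. A one-liner in that spirit (my own\nsketch, not necessarily the tweet’s exact code) simulates the survivor\ncounts for all 10,000 games at once:\n\n\n# Survivors per game: deaths ~ Binomial(18, 0.5), floored at 0 survivors\nsurvivors <- pmax(0, 16 - rbinom(10000, 18, 0.5))\n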
One critique that was mentioned multiple times was that my\nsimulations left out a critical element in the game:\ntime. In the show, the players had just 16 minutes to\ncross the bridge, otherwise all of the glass panes shatter and anyone\nleft behind dies. I originally omitted time for simplicity, but looking\nback at it now, this is a pretty crucial element in the game because it\nmeans that the latter order numbers aren’t necessarily so safe after all\n– the further back in line you are, the greater your risk of running\nout of time.\nSo, here’s one attempt at including time in the simulation.\nMy immediate thought was to assume a distribution for the amount of\ntime delay it takes for a player to take a step and then sample from\nthat distribution for each player in each trial of the game. The key\nquestion is what should that distribution look\nlike?\nObviously time needs to be positive, so I can’t use distributions\nthat could have negative numbers. But time also can’t go on forever – as\nwe saw in the show, the people in the back of the line start to get\nimpatient if the lead player takes too long and might “encourage” the\nleader to make a choice.\nSince I didn’t have any real-world data to inform my decision (I sure\nhope there are no real-world data on this!), I decided to use the data\nfrom the show itself. I re-watched the scene and timed how long each\nlead player took to take a step. Time in the show isn’t quite preserved\nbecause the camera cuts to other scenes, there’s slow motion, etc., but\nmany of the steps could be timed, so here’s what I measured:\n\n\n# Lead player step times in seconds\nseconds <- c(36, 20, 25, 49, 66, 22, 112, 10, 115, 184, 144, 18, 22, 36, 28, 15, 10)\nsummary(seconds)\n\n\n#> Min. 1st Qu. Median Mean 3rd Qu. Max. \n#> 10.00 20.00 28.00 53.65 66.00 184.00\n\nMost lead players moved relatively quickly, with a median of just 28\nseconds. But a few players took longer. In the show, player 9 (the bully\nguy) took quite a long time, and he might have taken longer if he wasn’t\nforced to move by the woman he scorned.\nGiven these considerations, I chose to model the step time with a\nlog-normal distribution because it prevents negative times and it has a\nlong tail for those one or two players who might take a long time to\nmove. Here are the parameters that fit these data, using the lovely\nfitdistr() function from the {MASS} package:\n\n\nfit <- MASS::fitdistr(seconds, \"log-normal\")\nfit\n\n\n#> meanlog sdlog \n#> 3.5781872 0.8824273 \n#> (0.2140201) (0.1513350)\n\nLet’s do a quick visual check of the fit:\n\n\nmeanlog <- fit$estimate[1]\nsdlog <- fit$estimate[2]\nggplot() + \n geom_histogram(aes(x = seconds, y = ..density..), \n binwidth = 10, fill = \"grey60\") +\n geom_density(aes(x = rlnorm(1000, meanlog, sdlog)), color = \"red\") +\n scale_x_continuous(limits = c(0, 200)) +\n theme_bw() + \n labs(\n title = \"Log-normal fit of time delay for each lead player step\",\n x = \"Time (seconds)\",\n y = \"Density\"\n )\n\n\n\n\nIs it a good fit? Eh, sort of?\nDo I have time to do a better job? Not really.\nOne issue with the log-normal though is that you can occasionally get\na really high number out on that long tail, which I would think\nis highly unlikely because at some point the next player is just going\nto push you off the bridge. So I censored my log-normal fit at 200\nseconds. 
So, with this in mind, I created a new\nrun_game_timed() function that includes a time delay for\neach step, sampling from a censored log-normal distribution.\n\n\n# Modified function for simulating one game with time\nrun_game_timed <- function(players, num_steps, max_seconds, meanlog, sdlog) {\n lead_player <- 1\n cum_time <- 0\n for (step in seq(num_steps)) {\n # Add a step delay, censored at 200 seconds\n cum_time <- cum_time + min(rlnorm(1, meanlog, sdlog), 200)\n # If time runs out, everyone left on the bridge dies\n if (cum_time >= max_seconds) { return(players$alive*0) }\n if (sample(c(TRUE, FALSE), 1)) { next }\n if (sum(players$alive) == 0) { return(players$alive) }\n players$alive[lead_player] <- 0\n lead_player <- lead_player + 1\n }\n return(players$alive)\n}\n\n\n\nNow I can re-run all the trials and compute a new expected number of\nsurvivors:\n\n\n# Re-run trials\nnum_trials <- 10000\nsims_timed <- players[rep(seq(num_players), num_trials),]\nsims_timed[, trial := rep(seq(num_trials), each = num_players)]\nsims_timed[, alive := run_game_timed(\n .SD, num_steps = 18, max_seconds = 16*60, meanlog, sdlog), by = trial]\n\n# Compute the new expected number of survivors across all trials\ntrials_counts_timed <- sims_timed[, .(count = sum(alive)), by = trial]\nexpected_survivors <- median(trials_counts_timed$count)\nexpected_survivors\n\n\n#> [1] 5\n\nAs expected, the expected number of survivors decreased\nand is now closer to the actual number of survivors in the show\n(3).\nBut how about the distribution? Well, by introducing time, all we’ve\nreally done is modify the original distribution to be “zero-inflated”\n(which is super common in data sets you run across in the\nwild):\n\n\n# Plot the distribution of survivors across all trials\ntrial_count_timed_summary <- as.data.frame(table(trials_counts_timed$count))\nnames(trial_count_timed_summary) <- c(\"n_survivors\", \"count\")\nggplot(trial_count_timed_summary) +\n geom_col(aes(x = n_survivors, y = count), width = 0.1) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Outcomes of 10,000 trials\",\n x = \"Number of survivors\",\n y = \"Count\"\n )\n\n\n\n\nOf course, this also impacts the likelihood of survival for each\nindividual player. As might be expected, zero-inflating the distribution\nof survivors simply reduces the probability of survival for all\nplayers, with the higher-number players still ranking the most likely to\nsurvive:\n\n\n# Plot the probability of survival based on the player order number\nsurvival_summary_timed <- sims_timed[, \n .(p_survive = sum(alive) / num_trials), by = player]\nggplot(survival_summary_timed) +\n geom_point(aes(x = player, y = p_survive)) +\n scale_y_continuous(labels = scales::percent) +\n scale_x_continuous(breaks = seq(16)) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Probability of survival based on player order number\",\n x = \"Player order number\",\n y = \"Probability\"\n )\n\n\n\n\nThis, of course, is not quite what might be expected. Instead, I\nwould expect the probability to be a tiny bit lower for the very last\nplayer or two as they could get left behind. But I chose to ignore this\nfeature as the lead player delay seemed to be a much bigger factor in\ndetermining survival. 
Plus, if you have an issue with how “realistic” my\nmodeling is, I would encourage you to remember the context of this\nentire exercise (if you think anything about this is realistic,\nplease see a psychiatrist).\nUpdate 2\nUpdated December 21, 2021\nLukas Gröninger wrote a similar\npost on his blog where he also provides the mathematical\nsolution to the expected number of survivors - check it out!\nUpdate 3\nUpdated May 9, 2022\nHenk Tijms, Professor\nEmeritus at VU University Amsterdam, wrote a fantastic\narticle introducing the concept of Markov Chains. In the article, he\nuses the Squid Game bridge scene as an example (along with several other\nclassic probability problems, like Egg Russian Roulette) to demonstrate\nthe power of Markov Chains. In his analysis he computes the expected\nnumber of survivors, the probability of survival for each player, and\nthe probability distribution of the number of survivors. The Squid Game\nexample starts on page 7.\nAppendix\nHere is the combined code for this entire simulation:\n\n\n# Rmd settings\nknitr::opts_chunk$set(\n warning = FALSE,\n message = FALSE,\n comment = \"#>\",\n fig.align = \"center\",\n fig.path = \"figs/\",\n fig.retina = 3\n)\nset.seed(5678)\n# Create an initial data.table of players\nlibrary(data.table)\n\nnum_players <- 16\nplayers <- data.table(player = seq(num_players), alive = 1)\nplayers\n# Define a function for simulating one game\nrun_game <- function(players, num_steps) {\n lead_player <- 1\n for (step in seq(num_steps)) {\n # 50% chance that the pane is safe\n if (sample(c(TRUE, FALSE), 1)) {\n # The pane is safe, keep going!\n next\n } else {\n # The pane broke 💀\n # Before continuing, check if any players are still alive\n if (sum(players$alive) == 0) { return(players$alive) }\n # The lead player died\n players$alive[lead_player] <- 0\n lead_player <- lead_player + 1\n }\n }\n return(players$alive)\n}\n# Run one iteration of the game\nsingle_game <- run_game(players, num_steps = 18)\nsingle_game\n# Run the main trials\n\n# Repeat the \"players\" data.table for each trial\nnum_trials <- 10000\nsims <- players[rep(seq(num_players), num_trials),]\n\n# Keep track of the trial number\nsims[, trial := rep(seq(num_trials), each = num_players)]\n\n# Now run the simulation for each trial\nsims[, alive := run_game(.SD, num_steps = 18), by = trial]\n# View one trial outcome\nsims[trial == 42]\n# Compute the expected number of survivors across all trials\ntrials_counts <- sims[, .(count = sum(alive)), by = trial]\nexpected_survivors <- median(trials_counts$count)\nexpected_survivors\n# Plot the distribution of survivors across all trials\nlibrary(ggplot2)\n\ntrial_count_summary <- as.data.frame(table(trials_counts$count))\nnames(trial_count_summary) <- c(\"n_survivors\", \"count\")\n\nggplot(trial_count_summary) +\n geom_col(aes(x = n_survivors, y = count), width = 0.1) + \n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Outcomes of 10,000 trials\",\n x = \"Number of survivors\", \n y = \"Count\"\n )\n# Plot the probability of survival based on the player order number\nsurvival_summary <- sims[, .(p_survive = sum(alive) / num_trials), by = player]\n\nggplot(survival_summary) +\n geom_point(aes(x = player, y = p_survive)) + \n scale_y_continuous(labels = scales::percent) +\n scale_x_continuous(breaks = seq(16)) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Probability of survival based on player order number\",\n x = \"Player order number\", \n y = 
\"Probability\"\n )\n# Probability of having less than two survivors\nsum(trials_counts$count < 2) / num_trials\n# Re-run the trials using an increasing number of bridge steps (10 to 30)\n\n# Repeat the \"players\" data.table for each trial\nmin_num_steps <- 10\nmax_num_steps <- 30\nstep_trials <- seq(min_num_steps, max_num_steps, by = 1)\nnum_step_trials <- length(step_trials)\nnum_trials <- 1000\nstep_sims <- players[rep(seq(num_players), num_trials*num_step_trials),]\n\n# Keep track of the trial number and the step number\nstep_sims[, trial := rep(seq(num_trials*num_step_trials), each = num_players)]\nstep_sims[, steps := rep(step_trials, each = num_players*num_trials)]\n\n# Now run the simulation for each trial\nstep_sims[, alive := run_game(.SD, num_steps = unique(steps)), by = trial]\n# Compute probability of having less than two survivors for each step size\nstep_counts <- step_sims[, .(count = sum(alive)), by = c(\"trial\", \"steps\")]\nstep_summary <- step_counts[, .(p_under2 = sum(count < 2) / num_trials), by = steps]\n\nggplot(step_summary) +\n geom_point(aes(x = steps, y = p_under2)) + \n scale_y_continuous(labels = scales::percent) +\n scale_x_continuous(breaks = step_trials) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Probability of having less than two remaining players\",\n x = \"Number of bridge steps\", \n y = \"Probability\"\n )\nstep_summary[steps %in% seq(18, 23)]\n# Lead player step times in seconds\nseconds <- c(36, 20, 25, 49, 66, 22, 112, 10, 115, 184, 144, 18, 22, 36, 28, 15, 10)\nsummary(seconds)\nfit <- MASS::fitdistr(seconds, \"log-normal\")\nfit\nmeanlog <- fit$estimate[1]\nsdlog <- fit$estimate[2]\nggplot() + \n geom_histogram(aes(x = seconds, y = ..density..), \n binwidth = 10, fill = \"grey60\") +\n geom_density(aes(x = rlnorm(1000, meanlog, sdlog)), color = \"red\") +\n scale_x_continuous(limits = c(0, 200)) +\n theme_bw() + \n labs(\n title = \"Log-normal fit of time delay for each lead player step\",\n x = \"Time (seconds)\",\n y = \"Density\"\n )\n# Modified function for simulating one game with time\nrun_game_timed <- function(players, num_steps, max_seconds, meanlog, sdlog) {\n lead_player <- 1\n cum_time <- 0\n for (step in seq(num_steps)) {\n cum_time <- cum_time + min(rlnorm(1, meanlog, sdlog), 200)\n if (cum_time >= max_seconds) { return(players$alive*0) }\n if (sample(c(TRUE, FALSE), 1)) { next }\n if (sum(players$alive) == 0) { return(players$alive) }\n players$alive[lead_player] <- 0\n lead_player <- lead_player + 1\n }\n return(players$alive)\n}\n# Re-run trials\nnum_trials <- 10000\nsims_timed <- players[rep(seq(num_players), num_trials),]\nsims_timed[, trial := rep(seq(num_trials), each = num_players)]\nsims_timed[, alive := run_game_timed(\n .SD, num_steps = 18, max_seconds = 16*60, meanlog, sdlog), by = trial]\n\n# Compute the new expected number of survivors across all trials\ntrials_counts_timed <- sims_timed[, .(count = sum(alive)), by = trial]\nexpected_survivors <- median(trials_counts_timed$count)\nexpected_survivors\n# Plot the distribution of survivors across all trials\ntrial_count_timed_summary <- as.data.frame(table(trials_counts_timed$count))\nnames(trial_count_timed_summary) <- c(\"n_survivors\", \"count\")\nggplot(trial_count_timed_summary) +\n geom_col(aes(x = n_survivors, y = count), width = 0.1) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Outcomes of 10,000 trials\",\n x = \"Number of survivors\",\n y = \"Count\"\n )\n# Plot the probability of 
survival based on the player order number\nsurvival_summary_timed <- sims_timed[, \n .(p_survive = sum(alive) / num_trials), by = player]\nggplot(survival_summary_timed) +\n geom_point(aes(x = player, y = p_survive)) +\n scale_y_continuous(labels = scales::percent) +\n scale_x_continuous(breaks = seq(16)) +\n theme_bw() +\n theme(panel.grid.minor = element_blank()) +\n labs(\n title = \"Probability of survival based on player order number\",\n x = \"Player order number\",\n y = \"Probability\"\n )\n\n\n\n\n\n\n",
"preview": "posts/2021-10-19-monte-carlo-bridge-game/squid-game.png",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {},
"preview_width": 2286,
"preview_height": 1326
@@ -130,7 +130,7 @@
],
"contents": "\n\nContents\nOverview\nCreating the surveys\nDefining the choice questions\nImplementing the choice questions\nButtons versus tables\nImplementing the surveys in formr\nTime stamps \nPreview and check\nGetting the data\n\nformr.org is a flexible platform for making surveys using . In this post, I’m going to show you one approach for using formr to create a choice-based conjoint survey (I’m going to assume that you know what conjoint surveys are, but if not take a look at this quick introduction).\nThroughout this post, I will use a demo survey about people’s preferences for apples with three attributes: type, price, and freshness.1\nYou can view the live demo survey here, and all files used to create the survey are on this GitHub repo.\nOverview\nIf you’ve never used formr.org before, the video on this page offers a 5 minute conceptual overview followed by ~40 minute demo covering the basics.\nEvery formr survey is implemented in a spreadsheet. I highly recommend using Google Sheets for this because\nformr has a button on the admin page to quickly re-load your Google Sheet (otherwise you have to upload a new .xlsx file any time you make changes), and\nIt’s easier for collaborating and managing different versions.\nEach “survey” (each Google Sheet) must be loaded into a “Run” to make the survey live. Most runs include multiple surveys chained together to control complex logic, like filtering out a respondent based on their response to a question.\nFor this demo, I have designed the run as a combination of three surveys (links below go to each respective Google Sheet):\nPart 1: Intro and target population screen out\nPart 2: Conjoint choice questions\nPart 3: Demographic and other questions\nDon’t worry about what’s in each sheet just yet - we’ll get to that.\nCreating the surveys\nI find it much easier to design my surveys using .Rmd files (one .Rmd file per survey). I can knit each .Rmd file to a html page to preview the look and feel of my survey without having to use formr at all. This also provides a way to easily print out the whole survey content as a PDF (e.g. open the survey in your browser, then print the page to a PDF). When I’m happy with how things look, I then carefully copy-paste the content over into separate rows in a Google Sheet.\nFor this demo, I designed the questions in each part using the following three .Rmd files in the “survey” folder of the GitHub repo:\n.Rmd file\nGoogle Sheet\np1-intro.Rmd\nappleConjoint_p1\np2-choice-questions.Rmd\nappleConjoint_p2\np3-demos.Rmd\nappleConjoint_p3\nThe links in this table let you compare the .Rmd file with the corresponding Google Sheet. Most general content that I want to display to the respondent goes in the label column of the Google Sheet, and response options to questions go in the choice columns (for part 3, I put the choice options on a separate choices tab). Pay careful attention to the type column - this determines the nature of the row (e.g. note just shows the label column content, mc is a multiple choice question, etc.). The calculate type rows allow me to run code to generate and store objects that can be used across different pages in the survey (these values will also be available in the resulting survey data).\nDefining the choice questions\nThe central component of every conjoint survey is the set of randomized choice questions. To implement these in formr, you first need to define the set of choice questions you want to ask each respondent. 
The data frame of randomized choice questions is saved as the choice_questions.csv file. Once created, you’ll need to host it somewhere on the web so that you can read it into your Google Sheet. For this demo, the file is hosted on the GitHub repo, but you can also upload your choice_questions.csv file inside your Run (see the “Upload Files” button on the left side menu), which will generate a unique url to the file.\nImplementing the choice questions\nI implement the choice questions in part two of my survey (the appleConjoint_p2 Google Sheet). To do this, I use the first few rows of the sheet to read in the choice_questions.csv file and make the following calculations:\nRandomly generate a respondentID by sampling from all possible respID values in the choice questions.\nCreate a new df data frame that includes only the rows for the specific respondentID.\nCreate a df_json object that converts the df data frame to JSON.\nThat last step is a bit of a hack, but it’s necessary because each new page on formr is essentially a new session: every time you start a new page, all your previous objects are gone and all your libraries need to be re-loaded. The only objects you have access to on other pages are items stored in the resulting survey data (using the names assigned in the name column), so we have to serialize the df object into one long JSON string in order to access it on later pages.
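In code, those first few calculate rows amount to something like this (a sketch - the URL below is a placeholder for wherever your choice_questions.csv file is hosted):\n\n\nlibrary(dplyr)\n\n# Read in the full set of choice questions\ndesign <- read.csv(\"https://example.com/choice_questions.csv\")\n\n# 1. Draw a random respondent ID\nrespondentID <- sample(design$respID, 1)\n\n# 2. Keep only that respondent's choice questions\ndf <- design %>% filter(respID == respondentID)\n\n# 3. Serialize to a JSON string so later pages can recover it\ndf_json <- jsonlite::serializeJSON(df)\n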
Once we have everything set up, we can then start defining choice questions. In each choice question row, the first thing I do is define the question’s label and then write a code chunk to create multiple data frames to store the values to display for each alternative. For example, on row 10 of the appleConjoint_p2 Google Sheet, you can see the following code chunk under the question label:\n\n\nlibrary(dplyr)\nalts <- jsonlite::unserializeJSON(df_json) %>% \n filter(qID == 1)\nalt1 <- alts %>% filter(altID == 1)\nalt2 <- alts %>% filter(altID == 2)\nalt3 <- alts %>% filter(altID == 3)\n\n\nIn this chunk, the alts data frame is created by converting the df_json object into a data frame and filtering for all alternatives for the first question. Then the alts data frame is broken into three more data frames (alt1, alt2, and alt3), one per alternative, which are then used to display the information about each alternative. For example, the first alternative is defined using this code:\n\n**Option 1**\n\n\n\n**Type**: `r alt1$type`\n**Price**: $ `r alt1$price` / lb\n**Freshness**: `r alt1$freshness`\n\nI copy this code over to each alternative, adjusting the numbers for alternatives 2 and 3. When rendered in formr, the three options look like this:\n\n\nAnd that’s it! The nice thing about this approach is that the only thing I need to modify in these code chunks for the remaining choice questions is the question number used to define the alts data frame. Other than that, the code for the question label and the alternatives can be reused on the rest of the choice questions.\nButtons versus tables\nIn the example above, the conjoint choice questions are displayed as “buttons” where all the information for each alternative is shown as a button. This works particularly well on mobile phones, where the user may need to scroll vertically to see each option.\nAn alternative is to use a tabular layout where each column represents an alternative and the row names label the attributes. This takes a little manipulation to get right, but the key concept is to use kable() to display the transpose of the alts data frame. I also use the wonderful kableExtra package to modify some of the table styling. If you want to see this version in practice, the survey link is here, and the Google Sheet with the configurations for this is here.\n\n\nlibrary(dplyr)\nlibrary(kableExtra)\nalts <- jsonlite::unserializeJSON(df_json) %>%\n filter(qID == 1) %>% \n mutate(\n price = paste(scales::dollar(price), \"/ lb\"),\n image = paste0('')) %>% \n # Make nicer attribute labels\n select(\n `Option:` = altID, \n ` ` = image,\n `Price:` = price, \n `Type:` = type, \n `Freshness:` = freshness)\nrow.names(alts) <- NULL # Drop row names\n\nkable(t(alts), escape = FALSE) %>% \n kable_styling(\n bootstrap_options = c(\"striped\", \"hover\", \"condensed\"), \n full_width = FALSE, \n position = \"center\"\n )\n\n\n\n\nImplementing the surveys in formr\nYou’ll need to import each Google Sheet into formr to convert it into a survey. Go to your admin page, click on “Create Survey”, then import one of the Google Sheets. This creates one survey. On the left panel you can click “Test Survey” to preview it.\nOnce you have all three surveys loaded into formr, you can then assemble them into a “Run” by clicking on “Runs -> Create New Run”. Give the run a name, then add your surveys to the run by clicking on the corresponding icon. You’ll want to add all three surveys, and then add a stopping point at the end. You can use other logic to control how the user navigates through the survey, such as a “Skip Forward” step to screen respondents out before letting them get to a later part of the survey.\nThe specific logic used in this demo is as follows:\nStart (part 1)\n |\n V\nCheck screen out question --> Screen out non-target respondents\n |\n V\nChoice questions (part 2)\n |\n V\nCheck choice responses --> Screen out respondents that chose \n | all same responses\n V\nFinal demographic and other questions (part 3)\n |\n V\nFinish\nNotice that there are two points where respondents can be screened out of the survey:\nAt the end of part 1, I ask a question to identify if the respondent is part of the target population I am interested in. This allows me to screen people out of the survey early on, before they get too far in, if they’re not who I’m looking for. In this demo, I ask if they prefer the color Red or Blue and screen out people who chose Blue.\nAt the end of part 2, I compute whether the respondent chose the same response for every choice question, which is a good indicator that they were probably just clicking through the survey. I don’t want these respondents in my sample, so I screen them out here.\nHere is a screenshot of the specific run settings:\n\n\n\nTime stamps\nSince your entire survey is designed in R, why not take advantage of that fact to collect more data about your respondents? One thing I always do on my formr surveys is grab the time each respondent spends on every page. This is implemented by running Sys.time() at the top of every new page, which I then use to compute the difference between each time stamp to get the time spent on every page. 
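On the analysis side, the computation is just a diff of consecutive stamps, something like this (with made-up stamps for illustration):\n\n\n# Each page records a Sys.time() stamp; time per page is the difference\n# between consecutive stamps\npage_times <- as.POSIXct(c(\n \"2021-09-18 10:00:05\", \"2021-09-18 10:02:30\", \"2021-09-18 10:06:10\"\n))\ndiff(page_times)\n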
This is useful in general just to be more informed about how your respondents are going through your survey, and particularly useful for examining behavior on the conjoint choice questions.\nPreview and check\nThe link to the survey will be https://your_run_name.formr.org. You can control whether your survey is “live” or not by modifying the “volume” icons; for collecting data, I recommend the setting that allows anyone with the link to access the survey.\nBut before you go live, it’s a good idea to do some quick testing. You can test each survey separately from their respective survey admin pages, and you can also test the entire run from the run admin page (check the left side menu). When testing, you may get an error - don’t panic! The error pages look a little different from the errors you’re used to in R, but if you click through them you can usually find the root cause (the R error message will be buried somewhere on the page). Many times the errors are small typos, which is another reason why I like to initially build my surveys in .Rmd files - when I knit them to html pages, any typos or other small errors are much more easily identified.\nGetting the data\nOnce your survey is live and you start collecting responses, your response data will not be available in the “Run”. Instead, it will be available on each of the three survey pages. You can use the {formr} package to import the data directly in R, or just go to the admin page for each survey and download the data as .csv files. The key piece to remember is that each respondent will be given a unique session variable that you can use to join the three separate data files together.\nWith that in mind, keep an eye out for a follow-on post, coming soon, on how to join and clean the resulting data from this conjoint demo!\n\nYes, people have actually done conjoint surveys on fruit before.↩︎\n",
"preview": "posts/2021-09-18-choice-based-conjoint-surveys-in-r-with-formr/images/example.png",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {},
"preview_width": 640,
"preview_height": 367
@@ -153,7 +153,7 @@
],
"contents": "\n\nContents\nMechanical keyboards\nRemapping your keyboard\nTowards better ergonomics with split keyboards\nBuilding my own split keyboard\nTowards an “End Game”\nFirmware\nTrade offs in going split\nCost\nLearning curve\n\nAlternative layouts?\nTools to build your own keyboard\n\nWhen I entered grad school, I started spending far more time typing on a computer than I had before. After a few months in, I started searching for a better typing experience and ways to reduce the amount of keystrokes needed to get the job done. Now a decade later, I have most definitely taken the red pill and become far more aware of solutions and technologies I never would have imagined back then. This post is a brief summary of my journey in search of keyboard nirvana.\n\n\n\nFigure 1: Me back in 2011 reading about mechanical keyboards.\n\n\n\nMechanical keyboards\nOne of my first roommates in graduate school was a remarkably fast typist and used this loud, clacky mechanical keyboard (I had to wear earplugs or listen to music to get anything done when he was typing). Curiosity got the best of me, and after trying out his keyboard for a few minutes I immediately understood why people use mechanical keyboards. There is something about the combination of aural and tactile feedback you get when typing on a keyboard with mechanical switches that makes the typing experience just…better. But it wasn’t just aesthetically better - I was able to type at a faster speed and with fewer errors when using a mechanical keyboard.\nNeedless to say, I immediately started searching for a mechanical keyboard to replace my crummy old Apple keyboard. Somewhere along my searching, I came across articles about people who “re-mapped” their keyboards so that some key presses would trigger other events. This seemed like a powerful idea for reducing key strokes, and after a bit more searching I discovered Karabiner Elements for mac. This software is invaluable. I made loads of customizations to simplify common operations. I became so dependent on it that I decided to design a custom key cap set for my first mechanical keyboard from WASD Keyboards to match my Karabiner Elements settings\n\n\n\nFigure 2: A rendering of my first custom mechanical keyboard, from WASD Keyboards.\n\n\n\nRemapping your keyboard\nThere are many software options for customizing and re-mapping your keyboard. I use Karabiner Elements for mac, but AutoHotkey is probably the most popular option for Windows users.\nThe use cases for these programs are endless, but I want to highlight one very powerful idea: adding keyboard layers. Every keyboard has at least two layers: the normal layer, and the “shift” layer that converts lower-case letters to upper-case and numbers to symbols. Unfortunately, some of the most useful keys (e.g. navigation arrows) are not conveniently located and require you to move your hand every time you want to use them. So one of the first things I implemented in re-mapping my keyboard was to try and move those navigation arrows back to “home row”:\n\n\n\nFigure 3: Where I wanted to relocate the navigation arrows.\n\n\n\nMy solution was to introduce a “Capslock” layer. This layer works just like the shift layer, except it gets triggered when holding down the otherwise rarely-used Capslock key. 
This layer uses J / I / K / L as arrow keys for navigating the cursor and selecting text, and it re-maps the S / D / F keys into cut, copy, and paste as well as N and H into backspace and delete:\n\n\n\nFigure 4: My capslock layer.\n\n\n\nAs you can see in the demo below, this modification enables me to navigate the cursor and edit code / text without ever having to leave home row:\n\n\nI use Karabiner Elements for lots of other things too, like:\nLaunching applications like Google Chrome or Terminal with a single hotkey.\nOpening Google Chrome to a webpage (e.g. Gmail) with a single hotkey.\nTyping my email send-off (\"Cheers,\\nJP\") with a single hotkey.\nYou can see all of my Karabiner Elements settings here.\nTowards better ergonomics with split keyboards\nTypical keyboard layouts are not that ergonomic, and after several years of using my WASD keyboard I eventually started developing wrist pain. My specific pain was from prolonged ulnar deviation, where the hands are bent outward at the wrist joint. So I started searching for a split keyboard to be able to hold my hands at a more natural position while typing (split keyboards actually fix a lot of ergonomic problems).\n\n\n\nFigure 5: Split keyboards fix a lot of ergonomic problems, like ulnar deviation.\n\n\n\nAfter doing a lot of research on different split boards, I decided to try out the Ergodox EZ, a highly-rated (albeit quite expensive) keyboard. I chose the Ergodox EZ as it had a 30-day return policy (though you have to pay the return shipping), and because the company that makes the Ergodox (ZSA) has some of the most highly-rated customer service of any company ever. I figured if I didn’t like it, I could probably return it without too much hassle.\n\n\n\nFigure 6: The Ergodox EZ.\n\n\n\nOne of the other features that attracted me to the Ergodox is their Oryx configurator software. This is an in-browser web application that enables you to fully customize every single key on the keyboard through a very user-friendly UI. All of a sudden, many of the key features that I loved about Karabiner Elements (such as re-mapping my Capslock key to use navigation arrows on the home row) could be programmed directly into the keyboard itself. This meant I could use the keyboard on any computer and have the same re-mapped keys at the ready.\n\n\n\nFigure 7: Screenshot of the Oryx software.\n\n\n\nThere are a lot of things to like about the Ergodox. First, the Oryx software was a game changer. Being able to directly control what each key does gave me all sorts of new ideas that made my typing incredibly efficient and far more comfortable than anything I had used in the past. But the single biggest improvement was probably the thumb key cluster. Your thumbs are quite strong, but on most keyboards they’re only useful for the space bar and maybe some contorted use of modifier keys. In contrast, the Ergodox has several keys right under each thumb, enabling far more efficient and comfortable typing with your thumbs.\nMy single biggest improvement in using the thumb keys was to swap what used to be my capslock layer trigger to a thumb key. Now instead of holding down the capslock key with my much weaker left pinky finger to trigger a new layer, I could use my much stronger left thumb, freeing up my left pinky finger to actually type keystrokes.\nAltogether, the Ergodox EZ was a major improvement in my typing, both in terms of ergonomics and efficiency. 
My typing speed also increased from ~70 WPM on an Apple keyboard to ~90 WPM on the Ergodox. But after about six months of use, I still found some things weren’t quite right with the board. This review is a great summary of the pros and cons of the Ergodox, but my single biggest complaint was the precise location of the thumb key clusters. Most thumb keys were just too far away to be useful, and even the main thumb keys were a little too far out for my hands.\n\n\n\nFigure 8: Me after about 6 months of using the Ergodox EZ thumb cluster.\n\n\n\nBuilding my own split keyboard\nIf you’re at the point where you’re typing on an Ergodox EZ, it doesn’t take long before you start realizing there’s an entire world out there of DIY keyboard hobbyists obsessed with ergonomics and custom-built keyboards. A short perusal of r/ErgoMechKeyboards and you’ll see a wide range of bizarre keyboards you never would have imagined existed, with names like Iris, Lily58, Kyria, and Corne.\n\n\n\nFigure 9: Me after first discovering r/ErgoMechKeyboards.\n\n\n\nAs an engineering faculty member at George Washington University, I had ready access to many of the tools needed to dive into building my own keyboard (e.g. a soldering iron) via the makerspace in the GW Innovation Center. So I convinced myself that making a keyboard would be a fun excuse to learn some new skills and meet more of the engineering students at GW (this was in 2018, only a few months after I had started at GW). I decided to make a Lily58, mostly because the parts were quite cheap and available at keyhive.xyz.\n\n\n\nFigure 10: Me and my first ever DIY keyboard (a Lily58), assembled (mostly) in the GW Innovation Center.\n\n\n\nThe project of building a Lily58 was a great learning experience, and I am more grateful for the people I met in the GWIC than for what I learned about building keyboards. It turns out building a keyboard draws on quite a lot of engineering skills, including:\nSoldering microelectronics and circuitry.\nCAD design for making case components.\n3D printing and laser cutting those case components.\nProgramming the keyboard firmware.\nOn Friday afternoons, I got to work with some talented GW undergraduates on each of these tasks. Someone even got this very promotional-esque shot of me working with a student soldering some microelectronics:\n\n\n\nFigure 11: A very staged-looking (but actually wasn’t staged) photo of me working with a student in the GWIC.\n\n\n\nI have to say that working with students in the GWIC has been a real highlight of my little keyboard journey. Given that most of what I do and teach as a professor involves the virtual space of data analysis and programming, it was great to get to do some “hands-on” engineering tasks with students outside of the classroom.\nTowards an “End Game”\nIn the world of DIY keyboards, many speak of a mythical “End Game” keyboard - one that is so perfected that you hang up your soldering iron, never to build another keyboard again. Of course, in reality each keyboard project is a step in another direction that almost inevitably leads to yet another keyboard project. 
It’s a similar logic to biking rule #12:\n\n\n\nFigure 12: …where N is the current number of bikes owned.\n\n\n\nIn my case, I have now built no less than four keyboards, so I believe I can officially call building custom keyboards a hobby (problem?).\n\n\n\nFigure 13: Most people’s reaction after I show them my keyboards.\n\n\n\nFor each new keyboard project, I either found something new I wanted to try out, or I found something I didn’t quite like about the last project (or both). For my Lily58, the biggest issue was simply that the build quality wasn’t that high. I used sockets to hold the key switches in place so I could swap out different switches, but whenever I traveled with the board many of the switches would fall out, which got annoying pretty fast.\nSo I decided to build an Iris keyboard – a close cousin to the Lily58 in terms of the key layout. I also decided to keep a detailed record of my build process, as I learned so many tiny details along the way that I figured could help others looking to build their own keyboards (e.g. installing a rotary encoder).\n\n\n\nFigure 14: My second keyboard - an Iris Rev 2.\n\n\n\n\nA rotary what now?\n\nOn most of my DIY boards, I have a rotary encoder on the right hand side, which is like a rotational switch. I use them for a variety of things. The most obvious use might be as a volume knob, but I also have my keyboards configured such that the rotary encoder triggers different behaviors depending on the active layer, including volume control, scrolling up and down, scrolling left and right, and zooming in and out.\n\n\n\nFigure 15: A rotary encoder I mounted on my Iris keyboard.\n\n\n\nMy Iris was a fantastic keyboard. Again, I worked a lot on it in the GWIC, using the 3D printers to print the case and the soldering equipment to assemble many of the components. After some heavy use though, the rotary encoder broke off, and it was essentially irreparable, so I decided to build what seemed like a more robust keyboard – the Kyria.\n\n\n\nFigure 16: My third keyboard - a Kyria Rev 1.\n\n\n\nDesigned by Thomas Baart, the Kyria keyboard has arguably the best assembly documentation of any keyboard kit I’ve seen, and as such it is usually the board I recommend to first-time builders. Thomas put a lot of thought into the design of the board, and in general I would say this is probably the best-quality board I have built (more photos here, and details about the board here). I also learned a bit more about using CAD with this keyboard as I modified the 3D printed case using Fusion 360 (thanks again to GW’s institutional license!). That said, the layout of this board just isn’t quite as comfortable for my hands as the Iris layout. In particular, the column stagger is much more aggressive, and it just doesn’t quite work for me.\nSo that leads me to my latest (and hopefully last?) build: a Wireless Corne (more photos here). The Corne keyboard has a near-identical layout to the Iris, minus the number row. By this point, I had moved to using as many as five different keyboard layers, one of them a numpad on my right hand, so I no longer used the number row anyway. Going wireless was also a very attractive option as I enjoy working in coffee shops and libraries on my laptop, so traveling with a compact and wireless keyboard would be much more convenient. As with my Iris, I kept a detailed build log.\n\n\n\nFigure 17: My (final?) 
keyboard - a Wireless Corne.\n\n\n\nFirmware\nI would be remiss if I didn’t at least mention the incredible open source software that runs all of these DIY keyboards. The main software most people use to power their keyboards is QMK Firmware. The software has become so well developed that there’s even an open source keyboard configurator, similar to the Oryx software for the Ergodox. QMK is packed with loads of features, but perhaps the most important is its highly flexible layer configuration, which enables you to map multiple layers so that your hands rarely ever need to leave the home row position.\nQMK’s wireless cousin is ZMK Firmware, which I am running on my wireless Corne. ZMK is newer and shares many of QMK’s features, though not all of them. It is also designed quite differently, and I would say it has a slightly steeper learning curve compared to QMK. If you’re new to building your own keyboard, I recommend starting with QMK and the GUI configurator.\nTrade-offs in going split\nThere are several drawbacks to adopting a split keyboard, but I want to discuss two common ones:\nCost\nLearning curve\nCost\nPerhaps one of the biggest drawbacks of split keyboards is that they generally cost more than traditional keyboards. Nonetheless, there are lots of good options for under or around $100 USD, such as:\nMicrosoft Sculpt ($60)\nKINESIS Freestyle ($99.00)\nLogitech ERGO K860 ($125.00)\nSome of the more premium split boards can cost quite a lot more, such as the Ergodox EZ at ~$350. And even building your own split keyboard isn’t all that cheap, with the final cost usually ranging from ~$100 to $250 depending on the parts used.\nBut despite the sometimes higher cost, I encourage you to re-frame your thinking about the value you get from your keyboards.\nFirst, if you use your keyboard on a daily basis for your job, you should consider your keyboard as a professional instrument. Professional musicians invest thousands of dollars into their instruments, and likewise professional chefs invest thousands of dollars into their knives. So if you type for a profession, why not spend a few hundred dollars on a professional keyboard?\nSecond, consider your keyboard as a medical device. Billions of people depend on glasses to see well, and a typical pair of glasses can easily cost several hundred dollars. And just as using a correctly-prescribed pair of glasses is important for the health of your eyes, using a more ergonomic keyboard is important for the health of your fingers, hands, wrists, and shoulders (yes, all of those things are affected by how you type).\nSo yes, purchasing an ergonomic keyboard may cost you a little more up front, but it is generally an investment worth making, especially if you are already finding yourself in pain from typing.\nLearning curve\nAdopting any new technology generally comes with a learning curve, and moving to a split keyboard is no different. But how hard it is to adapt to a new keyboard depends on several factors.\nFirst, if you already are proficient at touch-typing (meaning you use all 10 fingers to type and you don’t need to look down at the key legends), adopting a split keyboard probably won’t be too challenging. For me, I found I was a little slower and made more errors for a few days when first using a split board, but after about a week I was back to my otherwise “normal” typing speed. If you are not a proficient touch typist, I recommend getting up to a good typing speed (~60 WPM) before adopting a split keyboard. 
This is because split boards essentially force you to use the appropriate hand for each key, so if you are also learning how to type while using the split keyboard, you may find it extra challenging.\nSecond, the type of split keyboard can matter quite a lot. Some boards, such as the Microsoft Sculpt, have two split halves but are otherwise quite similar to typical keyboards in their key positions. Others, such as the Ergodox EZ, use an “ortholinear” layout, meaning that the keys align into columns with no stagger across rows. These columnar layouts are arguably more ergonomic and comfortable to type on, but they can take a little longer to get used to.\nMy recommendation is to try out a split board of some type and practice using it on a site like keybr.com. Get used to it first and give it at least a week of regular use before judging it. It can take a week or more for it to stop feeling strange, and your hands will thank you in the long run.\nAlternative layouts?\nThe one thing I tried along this journey that I generally do not recommend is learning a new keyboard layout, such as Colemak or Dvorak.\nIn case you didn’t know, the typical QWERTY layout is actually a pretty poor layout in terms of typing efficiency and ergonomics. There are lots of stories about why this layout was originally chosen. Some say it was intentionally designed to be inefficient so that early typewriter arms wouldn’t jam up if someone got to typing too fast. My favorite story is that early typewriter salesmen didn’t know how to type, but the QWERTY layout enabled them to type “typewriter” by using only the upper row, making it easier to remember the sequence.\nWhatever the reason for designing the layout, QWERTY is pretty bad, but we’re all unfortunately pretty much stuck with it as the de facto standard. Being the keyboard enthusiast that I am, I have attempted several times to adopt Colemak, but I have never been able to get my speed anywhere near my QWERTY speed. But I also do not see much benefit to using a different layout, especially relative to the steep learning curve in mastering it. I do not feel any discomfort using QWERTY, and my typing speed on a split keyboard is ~90 words per minute, which is plenty fast enough to be an efficient typist. So, this is just to say that if you came across this post looking for tips on ergonomic typing, I would weigh using a split keyboard much more heavily than the particular key layout on that keyboard. That said, if you’re not that proficient yet at any layout, you might save yourself some trouble and just start with an alternative to QWERTY.\nTools to build your own keyboard\nAt this point in my keyboard journey, I’m starting to wonder how different my life might be today if I had taken the blue pill. Certainly my hands and shoulders would be in much worse shape, but I might also just be happy using a simple old Apple keyboard. Ignorance can sometimes be bliss.\n\n\n\nFigure 18: Me in 2021 with about three too many keyboards in my office.\n\n\n\nNonetheless, should you choose to go down this path, hopefully I can save you a bit of time with some suggested tools. First, you’ll need some pretty standard equipment. Here are my recommendations:\nA cheap soldering iron is all you really need to get the job done. No need to spend a fortune, especially if you won’t use it too often.\nA brass sponge to clean the soldering iron tip is a must. 
A good brass sponge will make up for anything lacking in a cheap iron.\nA cheap solder fan is a good idea, especially if you’re working in small spaces. Solder fumes are pretty toxic, and these little things actually work very well.\nReverse tweezers are a lifesaver for holding tiny parts.\nA silicone mat is really nice so you don’t burn your desk or lose tiny parts.\nNot super necessary, but some sort of “quad hands” tool can be helpful for holding everything still when working.\nIn terms of suppliers, most keyboard parts shops are regional. In the U.S., both Keyhive and Littlekeyboards have a lot to offer.\nIf you want to learn more about and/or compare the many different split keyboards out there, check out this Shiny App I made (source code here). I made it both to help the community of split keyboard users find their next board and to learn and practice a lot about building Shiny apps.\n\n\n\nFigure 19: Screenshot of my splitKbCompare app.\n\n\n\nFinally, if you do choose to adopt any of the solutions I wrote about in this post, please do reach out (Twitter, Reddit) - I’m happy to help however I can!\n\n\n\nFigure 20: Choose wisely.\n\n\n\n\n\n\n",
"preview": "posts/2021-07-03-in-search-of-keyboard-nirvana/images/red_pill.gif",
- "last_modified": "2024-02-15T05:11:40+00:00",
+ "last_modified": "2024-02-16T05:11:40+00:00",
"input_file": {}
},
{
@@ -173,7 +173,7 @@
],
"contents": "\n\nContents\nQuick overview\nPersonal information\nURLs\nCode\nPaste clipboard contents as plain text\nOpen things\nDates\nPackages\nHighlights from my default.yml file\n\n\n\nI recently discovered a wonderfully useful little program called espanso – a “text expander” that I now use on a daily basis. Text expander programs “expand” short text “triggers” into longer / more complex strings of text, like this:\n\n\nThere are lots of these programs out there, such as Typinator, or the creatively-named “textexpander” (macOS even has a build-in expander). But I prefer to use open source software whenever possible (the espanso source code is on GitHub), and especially when using software that tracks the text you type (imagine how easy it would be to grab passwords with this type of software). Plus espanso is free 😄. Of course if you’d like to support the project you can donate here.\nMore than anything, I started (and kept) using espanso because I found it to be remarkably simple to use and easy to customize. So far, it has integrated quite nicely into my day-to-day workflow. In fact, I have become such a fan of espanso that I decided to sit down and write a whole post highlighting some of the ways I use it.\nQuick overview\n\nNote: I’m skipping a lot of details about installation, etc. because it’s all there in the espanso documentation. It’s really simple to get it up and running – check it out.\n\nTo set a text expansion, all you have to do is edit the default.yml file. You can find it’s location with this in the terminal:\n\nespanso path\n\n/Users/jhelvy/Library/Preferences/espanso\nOnce you’ve opened up the default.yml file, you can define an expansion like this:\n\n\n # Simple text replacement\n - trigger: \":test\"\n replace: \"Hello World!\"\n\n\n\nNow anytime I type :test, the text will expand to Hello World!. Pretty cool! 😎\nWhat I really love about this interface is it’s simplicity. With just a few lines you can define a wide range of custom expansions. After using it for just a few weeks, I have already come up with several dozen expansions that I now regularly use.\nI’ll spend the rest of this post highlighting a few use cases, but for reference I also included a copy of most of my settings at the end of this post (I took out some snippets with personal information like the link to my personal Zoom room, etc.)\nPersonal information\nI can’t stand having to type my name, address, email, etc. into online forms. I probably have to do this once a week or so for one reason or another, so coming up with some quick expansions for common personal information is extremely satisfying.\n\nNote: It’s probably a really bad idea to put passwords you use in your espanso settings, so don’t do that!\n\nHere are a few I’ve set up:\n\n\n\nTrigger\nExpansion\n\":jp\"\n\"John Paul Helveston\"\n\":jh\"\n\"John Helveston\"\n\":title\"\n\"Assistant Professor\"\n\":dept\"\n\"Engineering Management and Systems Engineering\"\n\":gw\"\n\"George Washington University\"\n\"@!\"\n\"jph@gwu.edu\"\n\nURLs\nI used to rely on bookmarks for lots of common sites I visit, but I often need to copy-paste the actual url to those sites (e.g. if I’m sending the link to someone else). Before, I had to open up my browser, navigate to the site, then copy the url. Now I can just type a few keys 🎉! I start each of these with \"//\". For example, to get to my GitHub page, I just type //gh:\n\n\nBy far my most used url expansion this year was //zoom to expand to the url to my personal zoom room! 
Code\nThis is where things start getting really helpful. I write lots of common code snippets for a wide variety of use cases, and the ability to automate these globally (i.e. not just in a code IDE) is an incredible help. The time savings don’t come from avoiding the extra keystrokes – they come from avoiding typos! It’s all too easy to miss a tiny symbol or letter in a chunk of code, which then results in several minutes (if not longer) of searching for the bug.\nAs a convention, I start all of my “code” triggers with the first letter of the language, doubled. So html expansions all start with hh, R expansions start with rr, python with pp, etc. I use this rather than starting with the default : symbol because 1) I find it easier to remember, and 2) it doesn’t require the Shift key (so it’s faster).\nHTML\nI don’t write a lot of html, but occasionally I’ll use it to insert an image in an RMarkdown document, and usually I want that image centered. The basic template I usually start with is:\n\n
<center>\n<img src=\"path/to/image.png\">\n\n<\/center>\n\nI use the trigger hhimg (which I remember as “html” + “img”) to insert this snippet. Here’s the code to define this expansion:\n\n\n - trigger: \"hhimg\"\n replace: \"<center>\\n<img src=\\\"$|$\\\">\\n\\n<\/center>\"\n\n\n\nNotice the $|$ symbols inside the quotes following “src”. This sets the location of the cursor once the expansion is complete, allowing me to quickly type in the path to the image I want to insert. Another one of those little bonus features of espanso that is just 🤌 *chef’s kiss*.
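\nAs an aside: for longer, multi-line snippets, you can also write the replacement as a YAML block instead of a single quoted string full of \\n escapes (I use that block style for my rrsetup trigger below). Here’s a sketch of the same hhimg expansion written that way – I stick with the single-line form myself, so treat this one as untested:\n\n\n - trigger: \"hhimg\"\n force_clipboard: true\n replace: |\n <center>\n <img src=\"$|$\">\n\n <\/center>\n\n\n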
\\n\\n<\/center>\"\n\n\n\nNotice the $|$ symbols inside the quotes following “src”. This sets the location of the cursor once the expansion is complete, allowing me to quickly type in the name of the path to the image I want to insert. Another one of those little bonus features about espanso that is just 🤌 *chef’s kiss*.\nR\nSince I use R more than any other language, I’ve come up with a lot of R-specific shortcuts. But before I dive into them, a quick side note:\nOne important modification I learned (thanks to some help on the espanso Reddit community) is that if you’re working in RStudio, the expansions don’t quite work correctly on their own (I believe it might have something to do with RStudio’s autofill feature). To fix it, I had to insert the force_clipboard: true setting into each espanso expansion. For example, in my snippet to read in a CSV file, I inserted this option between the trigger and expansion text like this:\n\n\n - trigger: \"rrcsv\"\n force_clipboard: true\n replace: \"read_csv($|$)\"\n\n\n\nOkay with that in mind, I’ll highlight a few R expansions I use a lot. I’ve organized this based on my common workflow for exploring data:\nAdd a setup chunk (if working in Rmd)\nLoad libraries & data\nWrangle data\nVisualize\nAdd a setup chunk\nI usually include a “setup” chunk in most of my Rmd files, but it’s really hard to remember all the specific details for various settings, so I made a dedicated expansion just for this. With this expansion, I can insert my whole default setup chunk with rrsetup:\n - trigger: \"rrsetup\"\n force_clipboard: true\n replace: |\n ```{r setup, include=FALSE}\n knitr::opts_chunk$set(\n warning = FALSE,\n message = FALSE,\n fig.path = \"figs/\",\n fig.width = 7.252,\n fig.height = 4,\n comment = \"#>\",\n fig.retina = 3\n )\n ```\nLoad libraries & data\n\n\nI use some simple shortcuts to load individual libraries, like rrtidy to expand to library(tidyverse). But (as shown in the above image) I also set up an expansion to load some of my most often used libraries and settings. By typing rrstart, I get this expansion:\n\n\nlibrary(tidyverse)\nlibrary(cowplot)\noptions(dplyr.width = Inf)\n\n\n\nWith my libraries loaded, I then read in my data. I use rrcsv to expand to read_csv($|$), rrex to expand to read_excel($|$), and rrhere to expand to here::here($|$), which I use to define paths to files.\n\nSide note: if you don’t already, you should definitely try using the {here} package to define paths.\n\nWrangle data\n\n\nI usually wrangle my data using tidyverse commands, and I have come up with a few expansions for some common “pipelines”. For example, one common sequence is to do grouped summary operations. I use the rrgs (“gs” for “grouped summary”) trigger to load the code block:\n\n\n group_by() %>%\n summarise()\n\n\n\nLikewise, I use the rrgm trigger to load the code block for a grouped mutate:\n\n\n group_by() %>%\n mutate()\n\n\n\nThere are so many of these you could put together!\nVisualize\n\n\nSince {ggplot2} has such a distinct syntax, I decided to start my ggplot code snippets with gg rather than rr. 
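\nThese follow the same pattern as the rr triggers. For example, ggsxc could be defined like this (a sketch – putting the $|$ cursor marker inside the parentheses is my own embellishment here, to match rrcsv):\n\n\n - trigger: \"ggsxc\"\n force_clipboard: true\n replace: \"scale_x_continuous($|$)\"\n\n\n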
I also use some snippets for common geoms, like ggpoint:\n\n\nggplot() +\n geom_point(aes(x = , y = ))\n\n\n\nand ggcol:\n\n\nggplot() +\n geom_col(aes(x = , y = , fill = ))\n\n\n\nPaste clipboard contents as plain text\n\n\nThis is probably my favorite little hack. Oftentimes, I want to copy-paste a snippet of text from one source to another, but when I go to paste it somewhere else, it retains all of the text formatting from the original source. My previous workaround was to paste it first into a text editor (e.g. Sublime Text), then re-copy-paste it from there with the formatting stripped away. But since espanso has support for working with your “clipboard”, I found I could simply paste the plain text by using the force_clipboard: true setting. I use ppp as the trigger for this:\n\n\n - trigger: \"ppp\"\n force_clipboard: true\n replace: \"{{clipboard}}\"\n vars:\n - name: \"clipboard\"\n type: \"clipboard\"\n\n\n\nOpen things\nYou can run terminal commands using an espanso trigger. I haven’t found too many ways to use this feature, but it is helpful for opening files or folders that may be buried deep down in some folder structure. For example, I use the :espanso trigger to open the default.yml file to edit my espanso settings:\n\n\n - trigger: \":espanso\"\n replace: \"{{output}}\"\n vars:\n - name: output\n type: shell\n params:\n cmd: \"open '/Users/jhelvy/gh/espanso/espanso/default.yml'\"\n\n\n\nDates\nThese come from the espanso documentation, but they’re super useful:\n\n\n - trigger: \":date\"\n force_clipboard: true\n replace: \"{{today}}\"\n vars:\n - name: today\n type: date\n params:\n format: \"%Y-%m-%d\"\n\n - trigger: \":time\"\n force_clipboard: true\n replace: \"{{time}}\"\n vars:\n - name: time\n type: date\n params:\n format: \"%H:%M\"\n\n\n\nPackages\nThere are already a lot of great espanso packages. My favorite so far is the emoji package. It’s super convenient, but I often forget the names of specific emojis, so I made another expansion for the url of an emoji cheat sheet site:\n\n\n - trigger: \"//emo\"\n force_clipboard: true\n replace: \"https://www.webfx.com/tools/emoji-cheat-sheet/\"\n\n\n\n
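Packages get installed from the terminal. If I remember correctly, the command looks something like this (double-check the exact package name on the espanso hub before running it):\n\nespanso install emoji\nespanso restart\n\n\n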
Highlights from my default.yml file\nThanks for reading, and I hope you check out espanso! Here are some of the expansions I have come up with in my default.yml file (for now… again, I removed a few that had some personal information).\n# espanso configuration file\n\n# This is the default configuration file, change it as you like it\n# You can refer to the official documentation:\n# https://espanso.org/docs/\n\n# Matches are the substitution rules, when you type the \"trigger\" string\n# it gets replaced by the \"replace\" string.\n\nmatches:\n\n # Sign offs\n\n - trigger: \"cjo\"\n force_clipboard: true\n replace: \"Cheers,\\nJohn\"\n\n - trigger: \"bjo\"\n force_clipboard: true\n replace: \"Best,\\nJohn\"\n\n - trigger: \"rjo\"\n force_clipboard: true\n replace: \"Regards,\\nJohn\"\n\n - trigger: \"cjp\"\n force_clipboard: true\n replace: \"Cheers,\\nJP\"\n\n - trigger: \"cpr\"\n force_clipboard: true\n replace: \"Cheers,\\nProfessor Helveston\"\n\n # URLs\n\n - trigger: \"//emo\"\n force_clipboard: true\n replace: \"https://www.webfx.com/tools/emoji-cheat-sheet/\"\n\n - trigger: \"//dr\"\n force_clipboard: true\n replace: \"https://drive.google.com/drive/u/0/my-drive\"\n\n - trigger: \"//gh\"\n force_clipboard: true\n replace: \"https://github.com/jhelvy/\"\n\n - trigger: \"//jh\"\n force_clipboard: true\n replace: \"https://jhelvy.com/\"\n\n - trigger: \"//talks\"\n force_clipboard: true\n replace: \"https://jhelvy.com/talks\"\n\n - trigger: \"//pubs\"\n force_clipboard: true\n replace: \"https://jhelvy.com/publications\"\n\n - trigger: \"//gm\"\n force_clipboard: true\n replace: \"https://mail.google.com/mail/u/0/\"\n\n # GW\n\n - trigger: \":title\"\n force_clipboard: true\n replace: \"Assistant Professor\"\n\n - trigger: \":gw\"\n force_clipboard: true\n replace: \"George Washington University\"\n\n - trigger: \":dept\"\n force_clipboard: true\n replace: \"Engineering Management and Systems Engineering\"\n\n # Names\n\n - trigger: \":jp\"\n force_clipboard: true\n replace: \"John Paul Helveston\"\n\n - trigger: \":jh\"\n force_clipboard: true\n replace: \"John Helveston\"\n\n # HTML\n\n - trigger: \"hhimg\"\n force_clipboard: true\n replace: \"