Discussion: Novice Publishing #177

Closed
cwickham opened this issue Aug 8, 2019 · 11 comments
Labels
discussion discussion before a proposal

Comments

@cwickham
Contributor

cwickham commented Aug 8, 2019

I'm looking for a new section to take on and thought about Publishing, but I have a number of questions about this content that might be worth discussing.

These questions arise from my own levels of publishing, which from least effort to most effort consist of:

  1. README.md (generated from README.Rmd) on github
  2. index.html (generated from index.Rmd) in /docs in github repo with GitHub pages turned on. (I.e. I ignore jekyll)
  3. A hugo site using the blogdown package and hosted on netlify

My thoughts are that a novice student should definitely master the first level. But, this doesn't actually require any discussion of HTML...once the novice has seen Rmarkdown and version control (+ github), it's almost a freebie, especially with:

usethis::use_readme_rmd()

I don't use 2. very often, but it is an easy way to get an actual website, rather than a README page on github, so for example, you can include interactive plots etc. For an R novice it's quite nice because you can take what was a locally generated HTML file from Rmarkdown and make it easily visible to the world.

Mostly now, if I need a website, I go route 3. blogdown makes this very easy: I don't have to install anything else (beyond what gets installed with install.packages("blogdown")), I don't have to leave the R console, and I don't have to write any HTML or CSS. I usually have the source in version control and then rely on netlify to re-build on push. Is this too much for a novice? Alison Hill argues in this talk that if you drop the version control/CI part, you can get people up and running with a website in 90 minutes.

What do you think the end point should be for a novice: 1, 2, or 3?

Have I missed another option for publishing that might be more appropriate?

@cwickham cwickham added the discussion discussion before a proposal label Aug 8, 2019
@lwjohnst86
Contributor

lwjohnst86 commented Aug 8, 2019

These are really good questions. While I personally feel setting up websites is a bit much for a novice, I do see the value in walking through it step by step and showing how to get one going... I wish I had something like that when I first tried getting a website going and sorta failed.

While I think blogdown is really nice, there are a lot of assumptions made about the user's knowledge. For myself, I had a look at blogdown, got confused (first time I encountered TOML and had no idea what that was... even though it basically is YAML... but I didn't realize 🤷‍♂️), left it for 9 months, then came back, decided to use it for setting up a workshop website (strong motivation right there!), dug in a bit more, and figured it out enough to get the website up. I'm not a novice and I felt there's a lot to consider when building a website, for instance that file and folder location are super important for some files.

Another option is to use the base rmarkdown method for creating websites. I personally see this as a very simple, low threshold way to get a website up and going, since they will already have encountered generating HTML files from R Markdown. It seems like a nice logical next step to me. My 2c ☺️

@cwickham
Contributor Author

cwickham commented Aug 8, 2019

Agree with your points about blogdown, it seems "very easy" to me, but that's with a history of writing HTML from scratch, using jekyll, using hugo, and then learning blogdown. So, probably not for many and most definitely not for novices. Is it worth having it in the intermediate material?

Ah yes, I'd forgotten about the rmarkdown method!

I'd also forgotten about things like RPubs, perhaps an even easier way to get local documents live on the web.

I think the goal here is for novices to see the whole process of data analysis where the last step is sharing work with others. (I think there is also a thrill for many students in being able to send a link to a parent/SO/friend and saying "look what I made"). From that point of view, I think a minimal approach might look like one of these options:

  • Use a service to get your Rmarkdown files on the web: RPubs.
  • Talk about ways to make a github repo more "visitor" friendly: add a README.md, use the github_document output format in Rmarkdown, and show how to link to other files within a repo.
  • Talk about ways to get a single webpage online, e.g. use Rmarkdown to generate HTML (or md?) and use github pages

I think covering all of these is probably too much, so which should we focus on?

Then there is the question of multi-page sites, I think it might be overkill, but it could be an optional section, covering the base rmarkdown method, along with hosting on github pages.

@lwjohnst86
Contributor

I like the second item (more "visitor friendly") as it will link nicely with the git lesson. +1 to that.

@ChristinaLK
Contributor

Christina needs to investigate publishing via html w/ spyder...

@joelostblom
Contributor

For online publishing with Python, I have not seen any ready-made blogging packages targeted at publishing data stories or literate code with output (except for this outdated hugo jupyter blend). As such, I think the easiest approach is to create an .html file and put it in a folder that will be rendered on the website (or an .md document if the website framework does the conversion to HTML).

I see three main possibilities to publish HTML reports with Python.

  1. Write code and text in .ipynb format using JupyterLab.
  2. Write code and text in .Rmd format using RStudio.
  3. Write code however you want, save the figures and reference them from a markdown document that is written separately in an editor that supports converting it to HTML.

With Spyder, we would rely on option 3 and then a separate editor for the markdown part. In general, I think writing longer reports in markdown that is separate from where the code is run is a great option, as I don't want to rerun all the figure-generating analysis code each time I update the text and rerender the manuscript. For shorter reports, I think options 1 and 2 can be quicker, and it can feel clunky using two editors and two formats instead of one (you could use Spyder to edit existing .md files, but I did not see an option to create new ones and there is no convert/export).
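To make option 3 a bit more concrete, here is a rough, stdlib-only sketch of the idea: the figures are assumed to have been saved already by the analysis code, and the report text is kept separately, so updating the text never reruns the analysis. The function name `build_report` and the file paths are hypothetical, and this writes HTML directly rather than going through a markdown editor or pandoc, which is what you would more likely use in practice.

```python
from html import escape


def build_report(title, paragraphs, figures):
    """Return a standalone HTML page as a string.

    paragraphs: list of plain-text paragraphs.
    figures: list of (path, caption) pairs pointing at image files that
             the analysis code saved earlier (hypothetical paths).
    """
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        "</head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]
    # Text is escaped so that characters like < and & display literally.
    for text in paragraphs:
        parts.append(f"<p>{escape(text)}</p>")
    # Figures are only referenced by path; nothing is recomputed here.
    for path, caption in figures:
        parts.append(
            f'<figure><img src="{escape(path)}" alt="{escape(caption)}">'
            f"<figcaption>{escape(caption)}</figcaption></figure>"
        )
    parts += ["</body>", "</html>"]
    return "\n".join(parts)
```

Writing the returned string to a file such as report.html gives a page that can be opened locally or served from a website folder.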

There are many editors that support rendering markdown to HTML as long as full pandoc compatibility is not required. If full pandoc compatibility is required, e.g. for academic writing, I think RStudio is the best Pandoc GUI around. If anyone is interested in more details on the academic writing part for Python, I pasted some notes below:

Academic editors details
  • I think the key attributes (for novices) are

    • Syntax highlighting support for as much as possible of the pandoc markup.
    • Automatic output generation without resorting to the command line pandoc tool.
    • The ability to search for and automatically insert citation keys (like the Zotero word plugin).
    • Comprehensive documentation for how to use pandoc with the editor.
  • With the criteria above in mind, I browsed through the list of markdown editors in the pandoc repo. In short, the viable options for novices seem to be Brackets, Sublime, Atom, RStudio, and Texts. This is how they compare in terms of the four criteria I listed above (I have not tested them, just had a quick look, there might be errors).

    Name      Syntax  Auto output  Citations  Docs
    Brackets  Plugin  N            Plugin     N
    Sublime   Plugin  Plugin       Plugin     N
    Atom      Plugin  Plugin       Plugin     N
    Texts     Y       ?            Y          Y
    RStudio   Y       Y            ?          Y
    • Out of these, RStudio has by far the most extensive documentation on how to use the editor and how to use it specifically for academic writing. The downside is that it might be seen as a bit of overkill just for markdown writing (but that goes for all the other alternatives except Texts).
    • I could not get an autocompletion pop-up for citations in RStudio. I suspect that it is there and I am just missing how to enable it. Does anyone know how to get it? To be clear, this is what I mean.

@ChristinaLK
Contributor

Thanks @joelostblom . The plan was to use the option you described (creating a single .html file and putting it somewhere where it can be served). I'll explore options 1 and 3. It sounds like the differentiating factor between the two will be how closely we want the report to be tied to the code.
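As a minimal sketch of "putting it somewhere where it can be served", assuming the repository has GitHub Pages turned on and set to serve from the /docs folder (the setup described at the top of this thread), the publishing step can be as small as a file copy. The function name `publish_to_docs` and the paths are hypothetical.

```python
import shutil
from pathlib import Path


def publish_to_docs(report, repo_root):
    """Copy a rendered HTML report to <repo_root>/docs/index.html.

    Assumes GitHub Pages is configured to serve the /docs folder.
    """
    docs = Path(repo_root) / "docs"
    docs.mkdir(parents=True, exist_ok=True)
    target = docs / "index.html"
    shutil.copyfile(report, target)
    # An empty .nojekyll file tells GitHub Pages to skip Jekyll processing
    # and serve the files as they are.
    (docs / ".nojekyll").touch()
    return target
```

After running something like this, the docs folder still has to be committed and pushed before the page appears online.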

@mbonsma
Contributor

mbonsma commented Aug 15, 2019

After some googling (and observing that Spyder plugins for publishing are not likely to be a good option, see #185), my feeling is that introducing Jupyter as a publishing tool could work well.

Pros:

  • Most similar to R Markdown with RStudio
  • Integrates closely with code
  • Learn to use Jupyter in the process
  • All-GUI implementation, but with a smooth pathway to get fancier (i.e. jupytext, pandoc, etc)

Cons:

  • We decided not to teach Jupyter, but now we are?
  • Tricky installation
  • Probably takes longer to learn to use than a text editor
  • Unclear to me how you transfer code from Spyder to Jupyter (in the best-practice world)

@joelostblom
Contributor

Regarding your last con @mbonsma, jupytext is a tool that converts any Python script to a notebook, and it works especially well with .py files that have code chunks. The conversions happen under the hood, and you could even use .Rmd or .py as your notebook file format without ever touching the ipynb (which I believe might alleviate some of the concerns @gvwilson had around git with json for notebooks). I wrote a bit more about it in the literate programming issue. In general, I am not sure how much we want to rely on an extension, but I think it is better than not having the functionality.
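To give a feel for what such a conversion involves, here is a simplified, stdlib-only illustration. It is not jupytext's implementation and covers far fewer cases; it only assumes the common convention where `# %%` starts a code chunk and `# %% [markdown]` starts a commented-out text chunk. The function names are hypothetical.

```python
import json


def percent_script_to_notebook(script):
    """Turn a '# %%' script into a minimal nbformat-4 notebook dict.

    Simplified illustration only; jupytext handles many more cases.
    """
    chunks = []  # list of (cell_type, lines)
    current = None
    for line in script.splitlines():
        if line.startswith("# %%"):
            # A cell marker closes the previous chunk and opens a new one.
            if current is not None:
                chunks.append(current)
            cell_type = "markdown" if "[markdown]" in line else "code"
            current = (cell_type, [])
        else:
            if current is None:
                # Lines before the first marker form an initial code cell.
                current = ("code", [])
            current[1].append(line)
    if current is not None:
        chunks.append(current)

    cells = []
    for cell_type, lines in chunks:
        if cell_type == "markdown":
            # Text chunks are stored as comments in the script.
            lines = [l[2:] if l.startswith("# ") else l.lstrip("#") for l in lines]
        source = "\n".join(lines).strip("\n")
        if not source:
            continue  # skip empty chunks
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            cell["execution_count"] = None
            cell["outputs"] = []
        cells.append(cell)
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def notebook_json(script):
    """Serialize the converted notebook as the JSON text of an .ipynb file."""
    return json.dumps(percent_script_to_notebook(script), indent=1)
```

The script stays the plain-text source of truth, which is the part that helps with version control; the JSON notebook is just a derived view.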

Regarding the first con, we decided to include jupyter as an additional section as an alternative editor (I volunteered to write this during the meeting a long time ago). So expanding into publishing would not require bringing up a completely new tool, just extending the jupyter section.

@mbonsma
Contributor

mbonsma commented Aug 15, 2019

@joelostblom agree that jupytext looks like the way to go, but as you mentioned in the literate programming issue, can it be done without the command line? Is that too advanced? (I just installed jupytext and it broke my jupyter lab, but now it's working again, yay! So excited to use it!)

Good point about the first con.

@joelostblom
Contributor

@mbonsma Yes, you can use it from JupyterLab and from the classic notebook! The extension is bundled with the conda/pip installation, so you should not have to do anything. JupyterLab should rebuild the next time it is opened, and then you can pair the notebook to a script from the command palette.

@lwjohnst86
Contributor

Closing since both publishing chapters are done.


5 participants