As someone who learned to code mostly through R and RStudio in a data science context, and who has since moved on to more “standard” languages and IDEs, I’ve yet to find anything that comes close to RStudio’s flexibility and integration for hacking together data analytics.
VS Code/Python has made some major improvements in the past couple of years, but it’s still very clunky compared to the ease of running R code line by line without having to start up a debug instance. And now with Copilot, the most frustrating parts of R (such as remembering all the Tidyverse syntax) have been abstracted away.
My partner does a lot of biostats in RStudio, and I really think it breeds terrible habits. Instead of organizing code across files, everything is shoved into massive files. Instead of running a file top to bottom, code is run out of order, which makes the organization and flow of a program a complete disaster.
There is something to be said for processing large CSVs once and keeping them in memory while running other parts of the program, as well as having clickable access to all the data frames loaded into memory.
There's nothing about RStudio that encourages big single files or writing huge unstructured scripts. RStudio is a pretty good IDE, and R is a highly expressive functional-first [0] language. R was heavily influenced by Scheme, and has its own powerful metaprogramming [1] system - which is used to great effect in Tidyverse [2] libraries to make APIs that are nicer and more convenient than anything reasonably practical in Python.
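To make the metaprogramming point concrete, here is a minimal sketch (using the built-in `mtcars` data set) of the style of API this enables, plus a toy version of the same trick built on base R's `substitute()`:

```r
# dplyr verbs let you refer to data-frame columns as bare names, because
# they capture their arguments unevaluated and resolve them against the
# data frame's columns rather than the calling environment.
library(dplyr)

mtcars %>%
  filter(cyl == 6) %>%        # `cyl` is not a variable in scope here;
  summarise(avg = mean(mpg))  # it is looked up inside the data frame

# A toy version of the same idea takes only a few lines of base R:
my_filter <- function(df, cond) {
  df[eval(substitute(cond), df), ]  # evaluate the unevaluated condition
}                                   # with the data frame as environment
my_filter(mtcars, cyl == 6)         # bare column name, no quoting needed
```

`my_filter` is a deliberately simplified illustration, not how dplyr is actually implemented - the real thing uses quosures so the condition can also safely reference variables from the caller's environment.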
The problem with a lot of end-user R code is that it is written by statisticians, not programmers. They'd write the same garbage and huge scripts in Python (trust me, I know).
This is the de facto standard way of operating it, as I understand: mostly just hacking at stuff in small chunks until it sort of works, and leaving comments throughout like "run this bit on Tuesdays only".
I recently had to inherit someone's R stuff and I had to learn R and fix it all. It now runs from a makefile repeatably.
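Something like this minimal sketch is what I mean (file names here are hypothetical) - each analysis re-runs only when its script or its inputs change:

```makefile
# Drive R analyses from make for repeatable, dependency-aware runs.
# (Recipe lines must be indented with tabs.)
RSCRIPT := Rscript

.PHONY: all
all: out/summary.csv out/plots.pdf

out/summary.csv: analysis/summary.R data/raw.csv
	mkdir -p out
	$(RSCRIPT) analysis/summary.R

out/plots.pdf: analysis/plots.R out/summary.csv
	mkdir -p out
	$(RSCRIPT) analysis/plots.R
```

If you'd rather stay inside R, the `targets` package provides the same make-style dependency tracking natively.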
> Instead of running a file top-to-bottom, code is run out-of-order which makes the code organization and flow of a program a complete disaster.
That's more a REPL issue than specific to a particular language. It's the tradeoff you make. I write my R programs in Geany and then run the whole thing using Rscript. That gives me a clean environment on every run.
Emacs + ESS? Way more flexible. Maybe less integration because many of the big R package devs work for Posit. RStudio has a lot of superfluous junk in the UI I just don't need or care about.
I've used ESS for the past few years and recently tried using RStudio when I'm on Windows. For my purposes, which is just a little industrial statistics on the side, they are remarkably similar. I feel right at home in either!
I agree - I teach statistics at a university and there is really no alternative to RStudio for working with R. This is especially true considering that the vast majority of folks using R (in my field) have no prior programming experience. Downloading R, then VS Code, then some R plugin, getting them to talk to each other, and only then starting to learn R - it isn't very straightforward. RStudio is also remarkably consistent across operating systems - something to consider when half the students are on Windows and half on macOS...
RStudio Server on a Digital Ocean instance made my life a lot easier. Students fire up a browser, log in, and they're using R with all the packages. It was horrible when students ran R on their own machines back in the old days. Most of the questions I got were tech support rather than related to the material. And these days it has good Python support too.
RStudio is just way better at choosing what code to send (if you only send the line the cursor rests on, you’re gonna have a bad time; VS Code is a bit better than that, but not great). Also, where do your plots get drawn when you use this? RStudio just works in this regard.
It looks like, as far as I can tell, VS Code doesn't support the interactive window for working in R, which was a bit of a surprise to me when I looked it up.
The Python interactive window has pretty much fully replaced my use of Jupyter, since it gives you notebook-style output without the annoyance of the notebook format. My usual workflow is highlighting lines of code and shift-enter to execute (there's also a cell-marker syntax).
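The cell-marker syntax is just `# %%` comments in a plain `.py` file; a minimal sketch (the data here is made up for illustration):

```python
# %% [markdown]
# Each "# %%" line below starts a cell. In VS Code's Python interactive
# window, a cell can be run on its own (Shift+Enter), giving notebook-style
# output from an ordinary script - no .ipynb file involved.

# %% load and summarize some data
import statistics

readings = [12.1, 11.8, 12.4, 12.0]
mean = statistics.mean(readings)

# %% inspect the result (this cell can be re-run independently)
print(f"mean reading: {mean:.2f}")
```

Because it's a plain Python file, the same script also runs top to bottom with `python script.py`, and diffs cleanly in version control.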
I'm surprised by this because it _is_ possible to use R in Jupyter (although I never really liked the experience, RStudio was far superior).
An alternative in the Python world that is definitely worth looking into is the JupyterLab Desktop app, which is a standalone installer that is cross-platform and works great for beginners (no command line needed): https://github.com/jupyterlab/jupyterlab-desktop?tab=readme-...
See my other comment in the main thread with more info.
The killer feature of RStudio for me is RMarkdown.
I composed almost all my homework in grad school using RMarkdown in RStudio. You get LaTeX whenever you need it, code (I usually used R or Julia), and markdown for ordinary text. The kable function renders tables nicely from data frames, and ggplot2 creates beautiful plots.
Mathematica and Jupyter have a few advantages, but overall I'm very happy with RStudio.
RMarkdown in RStudio was the killer feature, until the VSCode R extension matured. Not only does it support RMarkdown, it adds a ton of features RStudio doesn't have and runs a lot faster. https://github.com/REditorSupport/vscode-R/wiki/R-Markdown
For my uses, it replaced RStudio 100% of the time.
It’s really nice to have everything you need in one spot. Plus it’ll run on any OS and is free. I started learning how to program with C++ back in the early 2000s which required Windows and a Visual Studio license and it was still a pain to get stuff done. Whether it’s RStudio or Jupyter there’s really never been a better time to start picking up a language and building something useful. Three cheers for the creators, maintainers and community who support tools like this.
That pricing sheet is for Posit Workbench; RStudio Server[0] can host as many people as you have the compute for, and it's free and open source. It does only support one session per user, but might meet the needs of a small research group.
The closest Python equivalent to RStudio is the JupyterLab Desktop app[1,2], which I highly recommend. I've entirely switched to using it for teaching, and it is a godsend, since it works the same way across platforms (win/mac/linux), installs its own Python interpreter independent of any system Python the student might have, and even comes with NumPy/SciPy/Pandas/Seaborn/statsmodels already installed, which makes it possible for me to skip the `pip ...` or `conda ...` instructions altogether.
Between the standalone desktop app, and the convenience of running JupyterLab in the cloud thanks to https://mybinder.org/ links, there is now a smooth path for beginners getting into stats/ML/data science: (1) read a notebook on GitHub or nbviewer, (2) run notebooks in the cloud via mybinder links, (3) install the JupyterLab Desktop app, (4) learn to install Python+env-manager via the command line. Previously, new learners were forced to jump straight to (4), but now there are logical steps along the way!
It's the same stack (jupyterlab server backend + web frontend) but wrapped as an electron app.
Yeah, for sure - when I use RStudio it seems much more polished, but I guess my attachment to (and comfort with) Python still makes it worthwhile to use JupyterLab rather than switch to RStudio.
RStudio and the R language are a couple of my absolute favorite pieces of software. While I'm a software engineer by trade, every once in a while I need to do some data analysis work and throwing together a notebook in RStudio always makes me feel like I'm using a cheat code. For simple tasks, everything is incredibly seamless, plus coworkers who are unfamiliar with R are usually impressed by how nice ggplot visualizations can look.
I'm about as old school as you can get with preference for CLI and simple text-oriented development environments. I recently picked up R again for a long-term data science project (https://matttproud.com/blog/posts/teaser-weather-temp-repres...) after having not used it since university. In spite of a fair bit of annoyance with the R language (https://matttproud.com/blog/posts/rant-and-r-melt-function.h...), I found RStudio to make the prototyping process with R actually tolerable. Big kudos to Posit and the R community for RStudio.
One thing I would love to see in the R ecosystem: project scaffolding for bulk data generation (e.g., from continuously generated data sets). What's the best way to do this - makefiles, or what? I have a relatively short entrypoint R file that sources other leaf files to run specific analyses, but it makes the software engineer inside of me want to curl up and die.
reshape2 (where `melt` is from) has been deprecated for some time, and for pretty good reasons. Try dplyr and tidyr instead - they are much nicer and more modern. The equivalent of melt would be pivot_longer. For packaging, renv is the usual choice. I wouldn't structure the package as a bunch of scripts with an entrypoint. Just write functions as you would in other languages, and keep any specific analysis script small.
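A minimal sketch of the tidyr replacement (the data frame and column names here are made up for illustration):

```r
# pivot_longer is the tidyr successor to reshape2::melt.
library(tidyr)

wide <- data.frame(id = 1:2, x = c(10, 20), y = c(30, 40))

long <- pivot_longer(wide, cols = c(x, y),
                     names_to = "variable", values_to = "value")
# `long` now has one row per (id, variable) pair,
# much like melt(wide, id.vars = "id") did.
```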
[0] http://adv-r.had.co.nz/Functional-programming.html
[1] https://adv-r.hadley.nz/metaprogramming.html
[2] https://www.tidyverse.org/
That's not really RStudio's fault. It's just how many people use R and how they were taught.
> code is run out-of-order which makes the code organization and flow of a program a complete disaster.
In my experience, with R Markdown, this is untrue. I see Jupyter Notebooks with cells run out of order much more often.
Anyway it could be worse. It could be Minitab.
Just open a .py file, then select the snippet of code you want to run and cmd+enter
It will open a new REPL for you (using your selected interpreter) the first time, and after that all commands are run in that same one.
Yes it does.
See my other comment in the main thread with more info.
Is there a good demo or video you can point to that shows this? I have no experience with R, RStudio, or data science, but you've piqued my interest.
https://www.youtube.com/@safe4democracy/featured
https://posit.co/pricing/individual-products/
If you want an RStudio Server to host a research group of more than 5 people, talk to their sales rep.
Otherwise each person will need to host their own RStudio Server side by side on the same machine.
Jupyter and JupyterHub are the way forward.
Especially if they get multi-kernel notebooks mainlined (read: what Org-Mode has been doing for decades)
[0] https://posit.co/download/rstudio-server/
[1] https://github.com/jupyterlab/jupyterlab-desktop?tab=readme-...
[2] https://blog.jupyter.org/jupyterlab-desktop-app-now-availabl...
https://tidyr.tidyverse.org/