RStudio is now Posit

toddm · on Nov 4, 2022

From Hadley's video blurb, the ringer is when he states "things just work" in the R environment vis-à-vis python (where he tactfully yet implicitly acknowledges the shitshow that is the python library/package/environment management).

Kudos to the R community and supporters for providing a great and useful platform!

martinsmit · on Nov 4, 2022

As a relatively new programmer who entered it through statistics, I've still yet to have a better UX or "it just works" moment than using the tidyverse. Years after moving to Julia, Python, and Rust I still go back to R to do any tabular data work. Speed isn't an issue, I always have data.table, and I'm productive in a way that I could only hope to be while doing non tabular data tasks.

RStudio is the perfect IDE. REPL/command-line + Scripts + Plots. I could not be happier using it and I wish I could get VSCode to be half as good. Julia for VSCode is pretty good, but the Python science tooling goes 100% towards notebook environments which I'm not a huge fan of so the Python Science VScode experience is subpar.

c7b · on Nov 4, 2022

R is great, and so are some of the packages that lead to the tidyverse, but I think the latter was a bit too much. Re-inventing what already worked with new packages, always overloading R syntax in weird ways (looking at you, ggplot2). I've actually found myself moving back to base R for many of the more basic manipulation tasks.

civilized · on Nov 5, 2022

Base R is loved only by those who were unlucky enough to spend years using it when there was no alternative.

c7b · on Nov 5, 2022

Base R is far from perfect, but for many basic manipulation tasks it works just as fine as tidyverse. Maybe not with piping, but that doesn't really save anything if you format it readably.

There's something to be said about code that just works out of the box. I don't see the need to maximize dependence on third-party libraries as long as the gains are purely "ergonomic". Especially when the creators have a somewhat mixed record regarding long-term commitment vs re-inventing their own wheel.

The real selling point of R imho aren't the data science tools anyway - for that we already have the amazing Python ecosystem (which also the RStudio guys have tacitly admitted with their rebranding) - but the pure statistics packages. Especially if you need something more niche, to the point that you'd use any language just to get an implementation of a specific model, you'll find yourself coming back to R more than half the time. It's simply the language of choice where most statisticians publish their code.

civilized · on Nov 5, 2022

R has some superior data science tools. For example, the tabular data packages dplyr and data.table have no adequate parallels in the Python world. There are many also-rans but no real rivals.

_Wintermute · on Nov 5, 2022

Or anyone who has tried to re-run R code that was written more 6 months ago.

civilized · on Nov 5, 2022

I write R full-time using the full suite of tidyverse packages and that's just not an issue these days. Maybe a few years ago.

And anyway, you'll hit the same issue using third-party packages in any language.

lordgroff · on Nov 4, 2022

Data.table has got to be the most underrated library ever. If the problem is not big data, but other libraries struggle, fire up data.table and marvel at the sheer speed of it all. On rare tasks when everything was too slow, I'd drop down to c++. Data table essentially made sure that was pretty much never required.

trts · on Nov 4, 2022

My workflow for the past year has been develop/analyze in Rstudio and port to Python when ready to deploy to production. Notebooks and VSCode still feel cumbersome to me and not designed as an analytics-first solution.

The time needed to re-write a script in another language, and often using different packages, seems more than made up for by the ease of use of Rstudio.

lordgroff · on Nov 4, 2022

I honestly don't understand why we all went this route of "port to Python". I mean we do it too (not my choice) but it really makes no sense to me.

disgruntledphd2 · on Nov 5, 2022

Because software engineers hate R, and nobody really hates Python that much.

modriano · on Nov 4, 2022

I've found the Jupiter lab IDE to be ideal for my DS and EDA workflows, which typically involves having several notebooks, scripts, and terminal windows open in the IDE. I switched from preferring R to python because having a global shared state across everything in RStudio kept switching the working directory or loading a different file than expected, and I just had very little confidence that things I wrote in R would be reproducible a few years on (which appears to have been a correct concern [0]).

[0] https://datacolada.org/100

civilized · on Nov 5, 2022

You can have multiple RStudio instances open for different projects with different states, and you can use renv to manage package versions reproducibly. None of this is different or worse than python, which is well-known to have its own environment difficulties.

bscphil · on Nov 4, 2022

> acknowledges the shitshow that is the python library/package/environment management

I'm puzzled by this and wonder if you can provide some examples. The scientists I know tend to have incredibly disorganized R code, with a bunch of hard-coded paths and a single global environment in their home directory that all their R packages get installed to. Even stuff that seems critically important like reproducible science can be much harder than you'd expect in a lot of fields because questions like "what version of the libraries did you use" has to be answered (if it can be answered at all) by looking at the references in the paper.

Whereas in Python, I don't know how things could be any simpler. Creating an individualized environment for your project is one command. Installing packages that only live inside that environment is one `pip install` away. Most scientific work is not "distributed" in the sense of having users, but if you do ship a product to users, Python gives you the option of either relying on distribution provided packages (my preferred approach most of the time) or shipping a single binary created with something like PyInstaller.

Lyngbakr · on Nov 4, 2022

I've also seen my fair share of garbage R code and I think Gordon Shotwell's comment that "There really are no production languages – only production engineers" speaks to this.[0] A big problem in the scientific community is that scientists aren't trained to write code like production engineers. I don't see it necessarily as being an issue that is endemic to R, though.

Packrat[1] — an RStudio package — can be used to easily avoid the library versioning issues you describe. The problem isn't that the tooling isn't there or that it isn't easy to use. It's that some folks simply don't use it and are perhaps oblivious as to /why/ they should even use it, anyway.

[0] https://shotwell.ca/posts/2019-12-30-why-i-use-r/ [1] https://rstudio.github.io/packrat/

Gimpei · on Nov 4, 2022

Maybe, but I wonder if it is especially easy to produce horrific code in R. For example, I remember trying to refactor an R codebase that made ample use `load`, leading to all these mysterious variables appearing from nowhere.

dash2 · on Nov 4, 2022

packrat is a little old. You want renv, which is the iteration of the same idea and I've found works very simply and nicely.

tetris11 · on Nov 4, 2022

Also R really didn't play well with conda for a while. It seems to be ironed out in recent years, but I remember the issues of previous years where trying to set up a reproducible R environment in conda was an unreliable endeavour.

goosedragons · on Nov 4, 2022

I think that has more to do with the fact that most scientists are not trained programmers! Plus a lot of data analysis work doesn't lend itself to the same style of programming IMO.

While there could be more effort in getting things like library versions out there a lot of journals don't care so there's no pressure on scientists to provide it.

toddm · on Nov 5, 2022

Most of the replies to your query have addressed the big issue, which is that "data scientists" are almost universally mediocre (at best) coders. It's endemic to that job position and is guaranteed to be language-invariant.

One factor that isn't helping generations younger than mine (mid-50's) is the continual evolution of tools that remove the user from all the underlying parts. I recently worked with someone who told me they "only know Databricks on Azure" and "don't know python." Their self-assessment was accurate, and the utility of that individual was essentially zero.

The problem with python is that people like myself - non-engineers, and mostly end users of software - spend an inordinate amount of time dealing with mismatched library dependencies, deprecated features, rolling-back python versions to get a working kernel and so on.

The fact that the business model of at least two companies (Enthought and Anaconda) is predicated on the difficulty of getting a functioning python environment to work in this day and age speaks volumes about the problem.

If we can't get past "which pip?," how can we expect the other stuff to "just work?"

ensan · on Nov 5, 2022

If you’re just an end user and a non-engineer, how can you (1) universally judge the level of programming of data scientists to be (2) mediocre?

lordgroff · on Nov 4, 2022

The SCIENTISTS have very disorganized R code and also on my experience even worse Python code where they learn inheritance and make some absolutely head scratching choices. It makes me weep from time to time.

Here's the thing, programming is a skill. If people think it's the "not important thing" only the result (seen this often in some of my previous positions), you're going to get disasters yeah.

As for package management in R you can use either Renv or conda. Been coding R for a decade and have always pinned down packages and you could do so well before tooling made it simple as pie.

bscphil · on Nov 4, 2022

> As for package management in R you can use either Renv or conda. Been coding R for a decade and have always pinned down packages and you could do so well before tooling made it simple as pie.

Right, I get that - but OP was claiming that package management in Python was a "shitshow". It's interesting that a lot of people are responding to my comment by saying "actually you can make package management in R just as easy as in Python, it's just that R programmers tend not to be professionals." Doesn't that just confirm my belief that Python's package management story is actually pretty good?

bobbruno · on Nov 5, 2022

While I tend to agree with most of the arguments that DS code is usually of low quality, and that DS are not well-trained in good development practices, I wonder if making them better coders is an attainable goal, or even a proper one. My reasons for that questioning: - Data science requires a significant stack of knowledge beyond coding - in fact, to be a useful DS in a company, you already have to learn about maths, business domains, keep up with the latest algorithms, know how to manipulate data, present, run experiments, analyse them, know deep statistics and some others I am probably forgetting. Adding the SW dev skills on top of that and expecting them to become good developers is a tall order, and only a small percentage of the DS community will achieve it. With the level of demand for ML, I don’t know if this will deliver on the market needs - it’s not that it’s not attainable, I think it’s not scalable; - People coming from a SW dev background tend to think DS is the same, just done by people who don’t code well. That is not true: code is the final product of software development, while it is but a tool for reaching the goal of finding a good ML approach for a DS. The consequence here is that SW dev has a much stronger reason for wanting good quality, maintainable code than DS does. When researching for a solution, many iterations of code written by DS will be discarded without ever having to go to production, and I don’t know if the overhead of keeping good tests, structuring the code, making small commits, etc., is justifiable in this scenario - the goal is not to have maintainable code, it is to see if the model+features has potential for solving the problem. - Evolution and maintenance are also a problem, because the structure that’s good for operations doesn’t help the job of research - it’s not common for a DS to work in a pipeline structure (which seems to be the emerging pattern for MLOps), and forcing them to use that structure on all iterations after the first will have significant productivity issues, to the point of putting success at risk;

I don’t have a solution for the points above, and I understand that, once a promising approach has been found, the code starts to matter much more, because Ops will require it to be automated and executed in a reliable way. For now, what I do is to do the research in a very loose way, not caring about good SW practices. When I find something good, I start refactoring the code to meet the Ops expectations. But I’m a CS major with decades of experience in coding and ML - it’s not reasonable to expect the entire DS community to develop the same skills, it takes too long.

Any ideas out there?

werewolf · on Nov 4, 2022

I have had similar experience. For me the most annoying was the work they put in making it difficult (nearly impossible) to use with conda environments.

caseyf7 · on Nov 5, 2022

Before Hadley, the R environment was also a shitshow. His contributions raised the level of the whole community

fbdab103 · on Nov 4, 2022

>...yet implicitly acknowledges the shitshow that is the python library/package/environment management)

(Disclosure, I am a Python programmer who has suffered through the trash packaging situation since forever)

Since when has R been in a position to cast shade on the reproducible environment of another language? Anytime I dip my toe into the R ecosystem, it feels anathema to development practices to find anyone using renv or equivalent to try and vendor dependencies. Enormous pain to try to try and get old R code running again.

sakras · on Nov 4, 2022

I first thought this was an announcement about how R now uses Posits instead of IEEE-754 floats… I wonder if the rebrand will cause any confusion for either party down the line.

layer8 · on Nov 4, 2022

Yeah, I was wondering why they’re renaming to the name of a numerical data type. I posit that the name will cause confusion. :)

kilbuz · on Nov 4, 2022

The rebrand makes a lot of sense, as the interest and support for Python in the DS/ML community keeps growing. I prefer R for data exploration and visualization, but knowing and leveraging both languages seems to be the way forward. Shiny for Python is a very interesting development.

Kudos to RStudio (Posit) for delivering great product over the last decade+ and growing a kind, helpful community!

rossdavidh · on Nov 4, 2022

If a C-level executive says, "Posit is not about pivoting from R to Python...", then most likely, it's about pivoting from R to Python.

kingo55 · on Nov 4, 2022

It's Hadley Wickham of R's tidyverse though, so I'm more inclined to believe the claim.

swyx · on Nov 4, 2022

its a beautiful thing when personal credibility cuts through generic cynicism :)

lordgroff · on Nov 4, 2022

I'd be shocked if they drop R but I'd dance a jig if they can have the Python experience be as good in their IDE. I despise notebooks (i could write an essay), and developing with vs code is still very very clunky in comparison.

nonethewiser · on Nov 5, 2022

Why do you despise notebooks? Optimized for reading over developing/running?

nonethewiser · on Nov 5, 2022

Why do you despise notebooks? Optimized for reading over developing?

forgotpwd16 · on Nov 4, 2022

RStudio the company changes name. RStudio the IDE will remain named as such (for time being at least).

kgwgk · on Nov 4, 2022

It’s not just the company.

Some RStudio products change their name - the “enterprise” offering.

Another RStudio product doest’t - the open-source IDE.

kasperset · on Nov 5, 2022

Have used Rstudio IDE extensively for past 3 years but recently switched to VScode.

VScode feels more refreshing as compared to RStudio. I love the extensions within VScode that allows it more flexibility as compared to RStudio. Also ability to view hex code as colors in the editor itself. Plus the ability to sync settings using GitHub is so convenient when using multiple computers. On the flip side, Rstudio is more convenient for beginners and being very R focused helps to focus on the "Statistics and data munging".

As for the Rstudio as a company, they have supported Python in the past but with the Quarto they went to extend beyond that. I feel Quarto is still work in progress and has more ambitious outlook as compared to RMarkdown. RStudio cloud is a good option when one have to use specific version of R and alleviates the "Reproducibility" issue to some extent. Especially, when someone does not want to deal with Docker or similar platform. I think RStudio cloud is one of my favorite offering from the company.

nojito · on Nov 4, 2022

Quarto is absolutely game changing.

Lyngbakr · on Nov 4, 2022

I'm currently rebuilding my personal website with it and I'm really impressed. As with most RStudio products I've encountered over the years, I find it is intuitive, well documented, and quite powerful. I also think RStudio has played an important role in making the R community quite pleasant and inclusive.

goosedragons · on Nov 4, 2022

Is it? I liked Rmarkdown until I discovered org-mode and org-babel. It does a lot of stuff better like the option to tangle chunks into multiple files which is killer (last time I looked Rmarkdown still lacked that, not sure about Quarto) and the ability to do stuff like make a table in my text and then USE it in R for calculations is amazing for making examples or grabbing some random HTML table off a site and doing something.

tmalsburg2 · on Nov 4, 2022

Agree. Org-babel makes the alternatives look like toys.

jprd · on Nov 5, 2022

"Org-Mode" requires the use of Emacs, though, correct? Can you offer a tool that doesn't involve having to join that cult, I'm asking for a friend who's already a loooooong time member of a completely different cult. One with less RSI, and RMS ;)

tmalsburg2 · on Nov 5, 2022

RMS plays a negligible role in the Emacs ecosystem today. My solution to RSI was vi keybindings as implemented by evil mode.

shepherdjerred · on Nov 4, 2022

I love Quarto! It's so much more pleasant than writing LaTeX, but you still get professional-looking documents with Python graphs that update themselves!

ansgri · on Nov 4, 2022

Yes it is, I’ve looked at various options to publish jupyter notebooks, finally found Quarto, and it’s a full publishing platform with surprisingly decent UX and easy customizability.

mi_lk · on Nov 4, 2022

say more? I'm curious

1980phipsi · on Nov 4, 2022

I don't want to speak for the person above, but I've used it a bit. It's like Jupyter Notebook with an eye towards producing really nice looking documents. It's pretty easy to use, though they are still working on it and it may not be feature-complete yet.

mbreese · on Nov 4, 2022

How different is it from RMarkdown? I thought is was supposed to be quite similar. Because that is really nice to use when preparing analyses for collaborators. I’d like to get back to having more “compiled” style notebooks for things aside from just R.

nojito · on Nov 4, 2022

It’s the successor to rmarkdown. The same team that worked on rmarkdown are working on quarto.

Lev1a · on Nov 4, 2022

I used RStudio in my university stats course, along with the strong recommendation by my professor to get the book "OpenIntro Statistics" (3rd edition back then).

[1]

I really couldn't care less about statistics, which like with many other topics/courses made/makes it incredibly hard for me to concentrate on and actually learn something about it. I could force the knowledge into my brain to be able to recite and use it in practice over and over again, but the moment the exams come around it's all gone from my head. That certainly made university very problematic.

[1] Edit to add: I forgot to say that using RStudio was the only remotely pleasant part of that Stats course and in later courses where some stats work was needed.

ipsum2 · on Nov 4, 2022

https://posit.co/products/open-source/rstudio/ It would be useful to put a screenshot or some details about what RStudio is, the page is not very descriptive.

scottmcdot · on Nov 4, 2022

I don't understand the post. Are they going to offer a Python IDE equivalent to RStudio?

d_sem · on Nov 4, 2022

I've been a huge fan of the RStudio IDE for its Matlab-like look and feel and its support for R. I hope it continues to improve and continue to be a helpful tool for the community.

kelsolaar · on Nov 5, 2022

RStudio-2022.07.2-576 cannot start without R installed by the look of it:

Error reading R script (), system error 2 (No such file or directory); Unable to find libR.dylib in expected locationswithin R Home directory /Library/Frameworks/R.framework/Resources

civilized · on Nov 5, 2022

As Meta demonstrated, it's important to change your name before accomplishing anything.

wodenokoto · on Nov 5, 2022

Yes. Installing R is literally step 1 in the r studio installation instructions.

https://posit.co/download/rstudio-desktop/

pmarreck · on Nov 5, 2022

That name's already taken by the possible successor to the terrible IEEE 754

https://en.wikipedia.org/wiki/Unum_(number_format)#Unum_III

chrisgd · on Nov 5, 2022

Raised a lot of capital from private equity recently too

gtsnexp · on Nov 5, 2022

Still unable to run it on Mac M1 Max. Anyone out there solved this?

c7b · on Nov 4, 2022

So this is not about RStudio the IDE but the company?

rsrsrs86 · on Nov 5, 2022

Geez, I hate PR.