Python overtakes R as the most preferred Analytics & Machine Learning language

Sept. 16, 2017, 9:52 a.m. By: Prakarsh Saxena

R no longer rules the Analytics and Machine Learning kingdom. Python has finally overtaken it as the language most used by programmers and researchers, according to a survey by KDnuggets.

KDnuggets recently ran a poll on its website asking whether people used R, Python with its packages, both, or other languages and tools for Data Science, Analytics and Machine Learning, in 2016 and in 2017. The result surprised few: Python's use among Data Science and ML enthusiasts jumped significantly compared to the previous year, toppling R from the top of the chart.

Did we see this change coming?

The poll was answered by more than 950 voters. It showed that in 2017 the Python user base overtook R as the leading platform for Analytics and related fields. The story was different in 2016, when R had a 42% share of the user base compared to Python's 34%. In 2017 Python's share rose to 41%, while R's dropped to around 36%. Interestingly, the percentage of users who named both languages as their most used tool increased from 8.5% last year to 12% this year. A closer look at the data shows the following:

  • Python users have been more loyal to their language: only 9% of them switched to R or other tools since the previous year, compared to almost 26% of R users who moved away from R. This is a drastic shift, though one that was largely expected.

  • Only 5% of Python users switched to R only, while 10% of R users moved to Python. Among those who used both in 2016, only 49% kept using both languages, while 38% moved to Python and 11% moved to R.

Looking at the poll data going back to 2014, it is clear that this day had to come. Python's user base has been increasing steadily, climbing from a meager 23% in 2014 to 47% in 2017 (this figure also counts half the users who voted for ‘Both the languages’). Meanwhile, the percentage of R users peaked at around 50% in 2015-2016 and has since fallen steadily to the current 41%. The usage of other tools also fell over the years, as expected.
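
To make the 47% figure concrete, here is a minimal Python sketch of the arithmetic, assuming the combined share is simply the "only" share plus half of the "both" share, as the parenthetical above describes. The function name `combined_share` and the dictionary layout are made up for illustration, and the inputs are the rounded percentages quoted in this article, not the raw poll data.

```python
def combined_share(only_pct, both_pct):
    """Share of a language when half of the 'both' voters are credited to it."""
    return only_pct + both_pct / 2


# Rounded percentages as quoted in the article (not the raw poll data).
poll = {
    2016: {"python": 34.0, "r": 42.0, "both": 8.5},
    2017: {"python": 41.0, "r": 36.0, "both": 12.0},
}

for year, shares in poll.items():
    python_total = combined_share(shares["python"], shares["both"])
    r_total = combined_share(shares["r"], shares["both"])
    print(f"{year}: Python {python_total:.2f}%, R {r_total:.2f}%")
```

With these rounded inputs, Python's 2017 share comes out at 47%, matching the figure above. The same calculation gives about 42% for R, slightly off the 41% quoted, presumably because the published percentages are rounded.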

The data was collected from all major parts of the world. The majority of the participation came from US/Canada and European countries, with about 75% of the votes combined. Asia, Latin America, Africa/Middle East, and Australia/NZ made up the remaining 25%.

The future looks good for Python users. With such growth in its use, the developer community will be encouraged to roll out more useful packages and libraries to aid others in the field of Analytics and Data Science. However, it is also believed that R will retain most of its users for a long time to come.

You can have a look at the graphical analysis of the topic here.