Which Programming Language Is Better: R, Scala, or Python?

I use R, Scala, and Python based on which is better-suited for my specific big data use cases. This is my personal view and usage of the languages.

By Tomer Ben David · Feb. 02, 18 · Opinion

I recently answered the above question. I didn't phrase the question, but it's a good starting point. I typically stay away from language debates, but this one really interested me, as I have debated the question with myself a lot. I was researching this specific question because I wanted to know which language to use for my next data project. Here are my personal insights. Please let me know what you think!

Use R as a replacement for a spreadsheet. Together with RStudio, it makes a killer statistics, plotting, and data analytics application. You can take log files, parse them, graph them, pivot them, filter them, and so on, all with great support from RStudio. It’s a killer data analysis language and workspace, and you should consider it as a replacement for your spreadsheet work.

Do you want to grep some lines from a text file? No problem! Just use dateLines <- grep(x = mylog, pattern = "--", value = TRUE). The <- is R's backward-pointing assignment arrow, and the call is easy to write once you know which command you need. Figuring out the correct command is often the hard part; practice and note-taking are key. This requires time, so consider whether you can commit to it. If not, just use R as your spreadsheet from time to time until you get better with it. Save a note or doc with useful R commands. You will find that with a few plotting commands, you can be a small king in its realm. This grep example is only one of a million abilities; RStudio will have you doing analytics like crazy on data.
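For comparison, here is a rough Python counterpart to the R grep call above. It is only a sketch: the helper name grep_lines and the sample contents of mylog are made up for illustration, and the pattern is treated as a regular expression, as in R.

```python
import re


def grep_lines(lines, pattern):
    """Return the lines matching a regular expression,
    similar in spirit to R's grep(..., value = TRUE)."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]


# Hypothetical sample log; in practice the lines might come from
# open(path).read().splitlines().
mylog = [
    "-- 2018-02-02 --",
    "INFO service started",
    "-- 2018-02-03 --",
    "WARN disk almost full",
]

date_lines = grep_lines(mylog, "--")
print(date_lines)
```

The one-liner in R is shorter, which is part of why it works so well for quick, interactive analysis.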

If you have no time for the above, I still highly recommend that you install RStudio, use it from time to time, and get the hang of it. There is nothing like it that I know of that is so good for quick data analysis and statistics. Just give it a shot and try to replace your routine calculations and quick data manipulation tasks with it.

You can also move on and do machine learning in R. It has extremely powerful libraries for this (e.g., rpart, caret, e1071), and by all means, if you and your teams are fluent with it, feel free to use it. But personally, I would only use it for exploration and quick analysis or modeling. I stop there. It can be very quick, but this is when I turn to language #2: Python.

Use Python for small- to medium-sized data processing applications. Python introduced optional type hints in recent releases (they are checked by external tools such as mypy, not by the interpreter), which is awesome. Also, it's an interpreted language, so you have the great benefit of speed of programming. You just write your code and run it. However, the caveat is that you don’t have the amazing compiler and features (the good ones, not the kitchen-sink ones) from Scala. As long as your project is small- to medium-sized, Python is a suitable option.
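To make the point about type hints concrete, here is a small sketch. The function count_levels and the sample lines are hypothetical. The annotations are not enforced when the code runs; a separate checker such as mypy reads them.

```python
from typing import Dict, List


def count_levels(lines: List[str]) -> Dict[str, int]:
    """Count how many lines start with each log level (the first word)."""
    counts: Dict[str, int] = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        level = parts[0]
        counts[level] = counts.get(level, 0) + 1
    return counts


# Hypothetical sample data
sample = ["INFO started", "WARN disk almost full", "INFO stopped"]
print(count_levels(sample))
```

You still get the write-and-run speed of an interpreted language, with the hints acting as checked documentation once a type checker is part of the workflow.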

Python is going to be very helpful as you use NLTK, matplotlib, NumPy, and pandas, and you will have a great time using them. This will take you on the fast route to machine learning, with great examples bundled into the libraries.

I’m not saying you can't do this with R or Scala with great success. I’m just saying that for my personal use, this is the most intuitive way to do what I use it for.

Let's say that I want a quick analysis of a CSV file: I turn to R. If I want a bulletproof fast app to scale quickly, I use Scala. If my project is expected to be big and to involve many developers, I turn to language/framework #3: Java/Scala.

Use Scala or Java for larger robust projects to ease maintenance. While many would argue that Scala is bad for maintenance, I would argue that this is not necessarily the case. Java and Scala, being compiled and (mostly) strongly, statically typed, are great languages for large-scale projects. You have libraries such as Spark and OpenNLP for machine learning and big data. They are robust and they work at scale. It’s true that it will take you longer to code in them than in Python, but the maintenance and onboarding of new data will be easier, at least in my experience.

In Scala, data is modeled with case classes, and you get proper function signatures, proper immutability, and proper separation of concerns.

While the above could be applied in any of these languages, it’s more natural with Scala/Java.
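As an illustration of applying the same ideas outside Scala, here is a minimal Python sketch that approximates a case class with a frozen dataclass. This is an analogy, not Scala code, and the names LogEvent, parse_event, and only_level are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)  # frozen=True makes instances immutable
class LogEvent:
    level: str
    message: str


def parse_event(line: str) -> LogEvent:
    """Parsing lives in one function: split 'LEVEL message' on the first space."""
    level, _, message = line.partition(" ")
    return LogEvent(level=level, message=message)


def only_level(events: List[LogEvent], level: str) -> List[LogEvent]:
    """Filtering lives in another: return a new list, leaving the input untouched."""
    return [event for event in events if event.level == level]


# Hypothetical sample data
events = [parse_event(line) for line in ["INFO started", "WARN disk almost full"]]
warnings = only_level(events, "WARN")

# Like copy() on a Scala case class: a new object with one field changed.
escalated = replace(warnings[0], level="ERROR")
print(warnings, escalated)
```

Assigning to a field of a frozen instance raises an error at runtime, whereas the Scala compiler rejects reassigning a case class field before the program runs. That difference is part of why this style feels more natural in Scala/Java.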

But if you don’t have the time or desire to work with them all, this is what I would do:

  • R: Good for research, plotting, and data analysis.

  • Python: Good for small- or medium-scale projects to build models and analyze data, especially for fast startups or small teams.

  • Scala/Java: Good for robust programming with many developers and teams; they have fewer machine learning utilities than Python and R, but they make up for it with better code maintainability.

It’s a challenge to learn them all. I’m still in this challenge, and it’s a true headache, but in the end, you benefit. If you want only one of them, I would consider the following:

  1. Am I managing a project with many teams and many workers, where stability, not speed, is the top priority? Go with Java/Scala.
  2. Am I managing a few personal projects that require quick results, or quick machine learning for a startup? Go with Python.
  3. Do I just want to hack on data analysis on my laptop and enhance my spreadsheet data analysis and machine learning skills? Go with Python or R.