Programming with R for Official Statistics: Introduction to Programming (2024)

Last updated on 2024-07-09 | Edit this page

Overview

Questions

  • What is programming?
  • What is object oriented programming?
  • How to document code?
  • What is a directory?

What is programming?

Programmers use programming languages to give instructions to theircomputers. In this course, we will learn how to use the open sourcelanguage R to complete common tasks required in the field of officialstatistics. This includes the basics of R, data manipulation, and bestpractices.

There are a few reasons why programming with R is useful for officialstatistics. Data manipulation and analysis with R is:

  • Time-saving: R can complete many computations on a large amountof data that would take a person a long time manually

  • Reproducible: This code can be re-run with other data with smallmodifications and shared with others to be applied to other newpurposes

  • Transparent: When you’ve completed a script using best practices,you should be left with a clear list of instructions to complete thedata analysis in the form of code. This avoids “black boxes” where ananalyst is unsure what they’ve done to the data to get it to it’s finalform

R is an object oriented programming language

Object oriented programming languages use objects as theirmain tools. These objects have classes, which describe theirgeneral properties. For example, in R you might work withnumeric objects, which would contain numbers. You could alsowork with characters, which would be composed of text. We’llexplore classes and data types thoroughly in Episode 3 (Data Types andStructures). We can assign “labels” to these objects, creating avariable and use them interchangeably. We assign objects withan assignment operator. In R, the most commonly used assignment operatoris <-. Try reproducing the example below on your machineby entering the code into the console and hitting the “run” button.

R

# Assign a number to a variablenumber_flowers <- 8# Print the variable's contentsprint(number_flowers)

We can get the value stored within the variable by printing it.

[1] 8

Assigning a new value to a variable breaks the connection with theold value; R forgets that number and applies the variable name to thenew value.

When you assign a value to a variable, R only stores the value, notthe calculation you used to create it. This is an important point ifyou’re used to the way a spreadsheet program automatically updateslinked cells. Let’s look at an example.

# Reassign the variablenumber_flowers <- 7

{: .language-r)

OUTPUT

[1] 7

Variable Naming Conventions

Historically, R programmers have used a variety of conventions fornaming variables. The . > character in R can be a validpart of a variable name; thus the above assignment could have easilybeen weight.kg <- 57.5. This is often confusing to Rnewcomers who have programmed in languages where . has amore significant meaning. Today, most R programmers 1) start variablenames with lower case letters, 2) separate words > in variable nameswith underscores, and 3) use only lowercase letters, underscores, andnumbers in variable names. The Tidyverse Style Guide includes asection on thisand other style considerations.

Documenting Code

Notice that in the above examples, hashtags (#) are usedbefore giving instructions that are intended for you rather than R.Hashtags produce comments, which are handy for leavinginformation about the code that will follow. Commenting as much code aspossible is part of best practices. Always comment your code! You owe itto your colleagues who may see your code (not to mention your futurecoding self).

# Hashtags go before commented code, which is not run# print("This code will not be run")print("Always comment your code!")

OUTPUT

[1] "Always comment your code!"

Directories

A directory is a location on your machine. Say you’d like to open afile that’s located in a folder on your computer. We need to tell Rwhere to look for the file if we expect to find it. Directories areusually listed by referencing nested folders separated by slashes. Thereare small differences due to operating system (OS), so refer todocumentation specific to your OS when learning to work with folderstructures.

For example: /Users/Documents/Learning-R points to afolder called “Learning-R” in a user’s documents folder. Depending onyour IDE (Integrated Development Environment) and setup, you can printyour current directory, known as the working directory. Rautomatically reads and writes files from and to your current workingdirectory.

R

# print current working directory getwd()

OUTPUT

[1] "/Users/Documents/

Before beginning our lessons, please set your working directory tothe folder that we created in the setup section withsetwd(). For example, if your folder is namedLearning-R:

R

# change current working directory setwd("~/Documents/Learning-R")

Key Points

  • Programming makes our work faster, more reproducible, and moretransparent.
  • R is an object oriented programming language
  • Document your code with comments
  • A working directory is the active location on your computer where Rcan read and write files
Programming with R for Official Statistics: Introduction to Programming (2024)

FAQs

Is statistics with R difficult? ›

For those who have a background in other programming languages or have worked previously in the data sciences, it may be easier to learn R than those who are novices to coding or this field. However, some Data Scientists struggle using R due to its numerous GUIs, extensive commands, and inconsistent function names.

Is R programming easy or hard? ›

Learning R can be tough, especially for beginners. Let's explore why many struggle and how to overcome these challenges. R's unique syntax and steep learning curve often surprise new learners. Its complex data structures and error messages can be overwhelming, particularly for those new to programming.

Is R programming harder than Python? ›

Overall, Python's easy-to-read syntax gives it a smoother learning curve. R tends to have a steeper learning curve at the beginning, but once you understand how to use its features, it gets significantly easier. Tip: Once you've learned one programming language, it's typically easier to learn another one.

Is R the easiest programming language? ›

How Does Learning R Programming Compare to Other Languages? Learning R is considered one of the more challenging programming languages to master.

Can I learn R in a week? ›

For learners with programming experience, you can become proficient in R within a couple weeks or less. Brand new programmers may take six weeks to a few months to become comfortable with the R language.

Is R easier than Java? ›

Learning R can be daunting, especially if you're new to programming or statistics. Learning Java is easy because it follows established programming principles and has a structured learning curve.

Is R programming a dying language? ›

In conclusion, the predictions of the death of the R programming language are premature. R continues to demonstrate its expertise, authority, and relevance in the domains of data analysis, statistical computing, data science, and software development.

Which is harder SQL or R? ›

SQL is generally easier to learn for beginners, especially those with no programming background. R has a steeper learning curve but offers more flexibility and depth in data analysis and visualization.

Is R harder than Excel? ›

Most people already learned the basics of Microsoft Excel in school. Once the data has been imported into an Excel sheet, using a point-and-click technique we can easily create basic graphs and charts. R, on the other hand, is a programming language with a steeper learning curve.

Is it better to start with R or Python? ›

If this is your first foray into computer programming, you may find Python code easier to learn and more broadly applicable. However, if you already have some understanding of programming languages or have specific career goals centered on data analysis, R language may be more tailored to your needs.

Is Python replacing R? ›

Many data scientists use both languages, selecting the best tool for each task. R and Python can even be used together - R interfaces like reticulate, rpy2, and plumber allow integrating R scripts into Python applications. So rather than replacing R, Python is complementing it.

Is it worth learning both R and Python? ›

Both languages are well suited for any data science tasks you may think of. The Python vs R debate may suggest that you have to choose either Python or R. While this may be true for newcomers to the discipline, in the long run, you'll likely need to learn both.

Why is R so difficult to learn? ›

R has a reputation of being hard to learn. Some of that is due to the fact that it is radically different from other analytics software. Some is an unavoidable byproduct of its extreme power and flexibility. And, as with any software, some is due to design decisions that, in hindsight, could have been better.

What coding language is R closest to? ›

The language was inspired by the S programming language, with most S programs able to run unaltered in R.

Can I learn R with no programming experience? ›

Though it helps to have basic computer skills and knowledge, you can enroll in a beginner level course to gain the necessary knowledge to use R in your career. You may also be able to succeed in R courses without having much experience in data science.

Is statistics harder than calculus? ›

Some students might find Calculus harder, while others might struggle more with Statistics. It's highly personal, so talk to your teachers and peers to help you make the best decision.

Should I learn statistics before learning R? ›

Key Insights

R is considered challenging to master due to its unique syntax and extensive set of commands, but can be learned with dedication and the right resources. A solid understanding of statistics, data science concepts, and data analytics can make learning R programming easier.

Can you do statistics in R? ›

However, R is a statistical computing language, and many of the functions built into R are designed for statistical purposes. As such, we're going to very quickly go over some statistical terms and a few of the statistical functions built into R.

Is statistics a hard class to pass? ›

AP Statistics may have a reputation as being particularly difficult, but students with successful study habits and a strong mathematical foundation can excel in this course. Students must pass a second-year algebra course and possess solid quantitative reasoning skills to take AP Statistics.

References

Top Articles
Latest Posts
Article information

Author: Dan Stracke

Last Updated:

Views: 5311

Rating: 4.2 / 5 (43 voted)

Reviews: 90% of readers found this page helpful

Author information

Name: Dan Stracke

Birthday: 1992-08-25

Address: 2253 Brown Springs, East Alla, OH 38634-0309

Phone: +398735162064

Job: Investor Government Associate

Hobby: Shopping, LARPing, Scrapbooking, Surfing, Slacklining, Dance, Glassblowing

Introduction: My name is Dan Stracke, I am a homely, gleaming, glamorous, inquisitive, homely, gorgeous, light person who loves writing and wants to share my knowledge and understanding with you.