Chapter 3: Finding your way around

This tutorial is different because it has no explicit challenge at the end. It’s optional: all it does is expose you to some commands that will help you find your way around and make R simpler to use.

Same instructions as before. Type in these commands and you’ll get a feel for what each does.

#scroll the previous commands using the arrow keys
#the above in case you haven’t figured out that R is annoyingly case sensitive
#Installing packages: go to Packages and Data (on Mac OS X) and select the package ggplot2 to be installed. Alternatively it can be installed directly from the command line. The library function loads it into the session.
#R has some inbuilt data sets – let’s explore them
# A tiny glimpse of ggplot2
# geom_histogram() is the current shorthand for the older layer(geom = "bar", stat = "bin") form
ggplot(data = diamonds, mapping = aes(x = price)) + geom_histogram(bins = 30)

#Demo of graphics in R – some fascinating examples of what you can create
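
Several of the comments above have no command under them in this copy. What follows is only a sketch of commands that fit those comments, not the original listing; the choice of the built-in mtcars data set and the name cars_preview are assumptions made for illustration.

```r
# R is case sensitive: mean exists, Mean does not (in a fresh session).
exists("mean")
exists("Mean")

# Built-in data sets: data() lists them, head() shows the first rows.
cars_preview <- head(mtcars, n = 3)
print(cars_preview)
dim(mtcars)   # number of rows and columns

# Installing and loading a package from the command line
# (needs an internet connection, so it is commented out here):
# install.packages("ggplot2")
# library(ggplot2)

# The built-in graphics demo, best run interactively:
# demo(graphics)
```

Uncomment the install and demo lines when you are at an interactive prompt.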


Once done, take the above code, copy and paste it into an editor, and write a comment before each line explaining what it does. Use help() if required.

Concepts introduced: Today we implicitly introduced packages in R. Think of packages as pre-built modules written to give us certain capabilities. ggplot2 is a powerful graphics package which we will look at in depth much later.

Chapter 2 – Let’s get our hands dirty

A couple of notes before I start:

The last tutorial was an introduction. This one, and every one from now on, will try to follow one simple structure: there will be a goal that states what we should have learned by the end. Every subsequent chapter will probably build on what we’ve learned before.

Please type these and don’t just read them, even if it seems like you can understand them, which you mostly will. Resist the temptation to just read. Start typing these commands into your console. You will go far beyond these tutorials if you teach yourself by typing and becoming familiar with the syntax. I can’t stress this enough.

Goal of this chapter:

Create a simple data set of the multiples of 3 from 30 to 120. Explore some basic statistical properties of this data set: mean, variance and standard deviation. You will be exploring basic properties like these a lot if you’re working with data, so they are useful to know.

Open R

Type the following commands:

#Start typing. Do not read! Including these comments.

#This is a way to comment out code
print("Why am I typing this?")
print("Probably to learn something")
print("Let's try printing two sentences now."); print("Here you go.")
cat("This is a way of joining stuff", "here you go")
# c and d are never assigned in this listing; the values below are assumed for illustration
c <- 3; d <- 4
cat("Use it with variables as well", c)
print(c + d)
# Now to create small data sets

#Stop here. Notice how different it is from traditional matrix multiplication.
#Stop here. Look up correlation if you do not know it.
#Now we’re getting closer to the goal
seq(from=1, to=15, by=3)
rep(pi, times=5)
#Stop here. You should be able to solve the goal.
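
The commands for the "small data sets", matrix multiplication and correlation comments do not appear in the listing above. As a rough sketch of what such commands could look like, here are two made-up vectors, a and b (assumed names and values, not the author's), combined element by element and then compared with the matrix product.

```r
# a and b are made-up example vectors, not the original data.
a <- c(1, 2, 3, 4)
b <- c(2, 4, 6, 9)

a + b      # element-wise addition
a * b      # element-wise product: 2 8 18 36
a %*% b    # matrix (inner) product: a 1 x 1 matrix holding 64
cor(a, b)  # correlation between a and b, close to 1 for these values
```

The `*` operator multiplies position by position, while `%*%` is the traditional matrix product, which is the difference the comment asks you to notice.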

Take the above code, copy and paste it into an editor, and write a comment before each line explaining what it does. Use help() if required.
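
If you want something to check your approach against, here is a hedged sketch on a different range (multiples of 5 from 10 to 50), so the goal itself is still yours to solve. The variable name fives is an assumption for illustration.

```r
# Multiples of 5 from 10 to 50 (a different range from the goal).
fives <- seq(from = 10, to = 50, by = 5)

length(fives)  # 9 values
mean(fives)    # 30
var(fives)     # sample variance: 187.5
sd(fives)      # standard deviation, the square root of the variance
```

Note that var() and sd() use the sample (n - 1) formulas, which is worth knowing when you compare against a hand calculation.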

Additional fun –

We introduced the important concept of correlation. Watch this video from the Khan Academy so that you never confuse it with causation.

Feedback? Send it to romymisra [at] or leave it in the comments.


Adding some Zing to R

I’m reading the O’Reilly book R Cookbook, a really good 400-page book with 14 chapters. It gave me an idea: could I condense the book, as I read it, into 14 short DIY-style tutorials, one for each chapter?

How can we make teaching in general more fun so more people will want to learn R and make the learning curve less steep?

This is a new project: I will be publishing these tutorials over the next month. The tutorials will be 15 pages each. Just kidding – I will keep them as short as possible.

The goal of these tutorials is to make R fun, give you an incentive to explore R more and, hopefully, get you to read a more detailed book yourself. That’s where I feel most books fail with me: it takes too much effort to sustain the reader’s interest.

I will be posting twice a week. You can figure out the pace you’re comfortable with.

Chapter 1 – Getting Expectations Straight

Why should you learn R? Statistics and graphics. Not compelling enough?

Let’s try again: Learn R if you want to play with numbers easily and make sense of them. R makes it easy to explore and find meaning in anything from small spreadsheets (your monthly budgets?) to lots of data.

In bigger, and usually intentionally big, words: it’s a tool for statistics, statistical programming and graphics.

These tutorials are not a 101 on statistics; they show how you can use R to do statistical analysis. Using R is not hard when you already understand the concepts, and I will try my best to point you to the right resources to learn some concepts along the way. A stats tool is easy to learn if you know what you want to get out of it. Knowing that is the hard part.

Now let’s figure out if you’re the right audience.

Read these tutorials if a) you’re curious about R, b) you want to learn how to make sense of groups of numbers, or c) you are curious about the data analytics thought process.

Of course, if you’re a pro in R, I would love it if you read them and gave feedback!

The only requirement for these tutorials is that you come with the ability to Google when something doesn’t make sense.

Additional Notes to remember – which are usually forgotten

1. Practice makes perfect. There will be points of frustration. As Bill Withers said, “You can’t get to wonderful without passing through alright.”

2. Most people usually start and never finish. Just an observation – but you can decide which category you want to be in.

3. #1 is worth repeating.

Setup and first commands:

Go to the R website and download R. It’s available for Windows, Mac and UNIX. Choose the version for your system, download it, run the installer and open R.

Once you click on the download link, choose the location closest to you.

Once you open R you will see a blank window with a prompt. This is called the R console.

1. Type


and press enter.

First use of R – can be used as a calculator.
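
The command for step 1 is missing from this copy, but any arithmetic expression will do. A minimal sketch, with numbers chosen purely for illustration:

```r
# Type each line at the prompt and press enter; R prints the result.
2 + 3           # 5
7 * 6           # 42
10 / 4          # 2.5
2^10            # 1024
sqrt(16)        # 4
(30 + 120) / 2  # 75
```

Parentheses work the way they do on paper, so you can build up longer calculations in one line.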

2. Type


To learn more about what this function does –


> help (max)

> ?max
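
The help calls above look up max, so step 2 presumably used that function. The exact command is missing from this copy; as a sketch, with a hypothetical vector named scores:

```r
# max() returns the largest of its arguments.
max(3, 9, 4)             # 9

# It also works on a vector; scores holds made-up values.
scores <- c(12, 47, 31)
max(scores)              # 47
min(scores)              # 12, the companion function
```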

HW for the next lesson:

Open R and use it as a calculator for the next few days. Whatever calculations you need to do – just open R and start typing them directly in the space.

If you need help:

Just some steps you can follow throughout these tutorials when you get stuck –

1. Google

2. Type help(function_name) at the prompt, or ? followed by the function name

3. The R website


Takes you online to the documentation.
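
For reference, here is a hedged sketch of the help tools that ship with base R. Which command the step above originally showed is not recoverable from this copy, so treat help.start() here as one real option rather than the author's choice; the name found is made up for illustration.

```r
# Look up a function you already know the name of.
help(mean)
# ?mean does the same thing with less typing.

# Not sure of the exact name? apropos() lists objects whose names match.
found <- apropos("mean")
print(found)

# args() shows a function's arguments without opening the help page.
args(max)

# help.start() opens the HTML documentation in a browser (run it interactively):
# help.start()
```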

That covers it for today. Next chapter in a few days.



On Starting

Starting is hard. Finishing is harder.

On second thoughts, I got that wrong. Starting can be really easy or hard but finishing is always harder. So difficult it makes starting look like a piece of cake.

I’ve been trying to finish this post for ten days.

I would have taken longer had I not decided to post on October 3rd.

Everyone is encouraged to start. The oft talked about inertia of starting.

It requires something else to cross the finish line.

The ease of starting depends on my enthusiasm, which is usually pretty high and lasts for a few minutes or until I hit the first roadblock. After that I’m not sure what takes me through.

What do I have planned for the blog ahead? I’m not entirely sure. I think I want to make a few graphs (not sure I’m over them yet), work with some data and, for the first time, write. I will be saving some serious data posts for the blog. The pace and mix will all be apparent soon.

Going to figure out the rest while working on this. This is enough structure to start.

It was a good break.