Today’s analysis uses a really fun data set: the nonimmigrant visas issued by the US government over the last 10 years!

A bit more on the data set: it covers the types of visas issued by the US government from 1997 to 2009. The data is segmented by a) type of visa and b) country and continent. A first look tells me the US issues more than 80 types of visas each year to over 100 countries. Wow.

Since there are more than 80 kinds of visas here, I will take a specific look at 2 of them for this blog post: F1 student visas, issued to students wanting to study in the US, and H1B work visas, for people wanting to work in the US.

Questions: These are the two questions I was most curious about.

1. What is the breakdown of the visas in 2009? This gives more context to the data set and helps in exploring it.

2. What does the trend of the ratio of F1/H1 visas look like in the last ten years?

Note: If you’re practicing data analysis in this form, you will realize that asking the best questions is probably the hardest part. The second really hard part is cutting down the # of questions (if you’re as curious as me and insane enough to find almost everything interesting!). The way I approach this is to write down all the questions I want to ask, then pick the two or three most interesting ones and figure out the answers. For instance, in this one I was interested in student visas, work visas, tourist visas (B2) and their impact on the tourism industry, H4 visas (for people who are married to people working here and are not eligible to work in the US; the sacrifices of getting married!) and even asylum visas! But then I just picked the first two and decided to maybe continue on one of the others later. The third hard and sometimes frustrating thing is realizing that you may not find data to answer your questions, so you always need to work within the constraints of the data you have and can find.


1. What is the breakdown of the visas in 2009?

Here goes: I put the 2009 data on a world map so you can see which countries the visas go to. Over a million visas were given out last year across more than 80 categories. Whenever country is a variable in the data, a world map is an almost perfect way to visualize it.

Click on “Click to interact” and you will be able to select a visa class and see which parts of the world get the most visas. You can play with it for a long time; it’s so interesting. I love how, when you select H1, you see that the most visas go to India!
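If you want to reproduce the "select a visa class, see who gets the most" view without the interactive map, here is a minimal sketch in Python. It assumes the data has been flattened into rows of (year, visa class, country, count); the row layout, the function name `top_countries`, and the country names and counts are all placeholders I made up for illustration, not figures from the data set.

```python
from collections import defaultdict

# Hypothetical rows: (year, visa_class, country, visas_issued).
# The counts are placeholders for illustration, not real figures.
rows = [
    (2009, "H1B", "Country A", 500),
    (2009, "H1B", "Country B", 120),
    (2009, "F1", "Country A", 300),
    (2009, "F1", "Country C", 450),
    (2008, "H1B", "Country A", 480),
]


def top_countries(rows, year, visa_class, n=3):
    """Return up to n (country, count) pairs for one visa class and year."""
    totals = defaultdict(int)
    for row_year, row_class, country, issued in rows:
        if row_year == year and row_class == visa_class:
            totals[country] += issued
    # Largest count first; ties are broken alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


print(top_countries(rows, 2009, "H1B"))
```

The same per-country totals are what a map would color by; the ranking just makes the top of the list easy to read.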

2. What does the trend of the  ratio of F1/H1 visas look like in the last ten years?

To answer this question, I had to really clean up the data set (of 14 sheets) and create a new one to work on what I wanted. This is a perfect example of how the best answers to your questions sometimes come not from the existing data set but from a new one you create using the existing one. I introduced 3 parameters here: % H1Bs of the total, % F1s of the total, and the ratio of F1/H1, which gives a little indication of how many of the students who come here end up wanting and getting a job in the US. Again, click on “Click to interact” to see what we find.
What you will see is that the # of student visas is trending gently upwards, and the % of student visas out of the total has also gone up. The # of work visas has been fairly constant (within statistical limits) over the last ten years, and the % of work visas out of the total has gone up slightly. The ratio of student visas to work visas has been on the rise and in 2009 was 3, implying that for every 3 students who came to study here, fewer than 1 wanted and got accepted to work here! I say “fewer than” because quite a few work visas are given out to people who did not study in the US.
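For anyone who wants to build the same three parameters themselves, here is a rough sketch in Python. It assumes you already have, for each year, the F1 count, the H1B count, and the total across all visa classes. The function name `visa_metrics` and the numbers are hypothetical; the counts were picked only so that the 2009 ratio comes out to 3, as in the post.

```python
def visa_metrics(f1, h1b, total):
    """Compute the three derived parameters for one year.

    f1, h1b: number of F1 and H1B visas issued that year
    total:   number of visas issued across all classes that year
    """
    if total <= 0 or h1b <= 0:
        raise ValueError("total and h1b must be positive")
    return {
        "pct_f1": 100.0 * f1 / total,    # % F1s of the total
        "pct_h1b": 100.0 * h1b / total,  # % H1Bs of the total
        "f1_to_h1b": f1 / h1b,           # ratio of F1/H1
    }


# Placeholder counts, NOT the real figures from the data set.
yearly = {
    2008: {"F1": 280, "H1B": 100, "total": 6000},
    2009: {"F1": 300, "H1B": 100, "total": 6000},
}

for year in sorted(yearly):
    counts = yearly[year]
    m = visa_metrics(counts["F1"], counts["H1B"], counts["total"])
    print(year, round(m["pct_f1"], 2), round(m["pct_h1b"], 2), round(m["f1_to_h1b"], 2))
```

A ratio of 3 means 3 student visas were issued for every work visa, which is where the "fewer than 1 in 3" reading comes from.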

But what is most interesting is that the total # of visas has decreased quite a bit: over 7 million in 2000, and nearly 6 million in each of the last couple of years.

What could have been the cause of this? Maybe it’s a post-9/11 effect, or maybe something else... we’ll find out. I will save it for next week’s data post. I think it will be a perfect introduction to data segmentation as I go through the steps to uncover the answer!

