As part of my internship, I studied the basic functions of ggplot using its predefined data sets. Before we dive into the analysis, let us first look at what ggplot is all about.
What is ggplot?
ggplot is a plotting system for R, created by Hadley Wickham in 2005. Since then it has grown and has been ported to Python and SAS. ggplot is built around the idea of producing visualizations easily with minimal code. Its flexible features let the user customize themes and layering. Throughout this series of evaluations, we will focus on ggplot in Python.
To cut a long story short:
EASY+FUN+POWERFUL+VERSATILE = ggplot.
- IPython/IPython Notebook (preferred but not necessary)
METHOD 1: pip install ggplot
METHOD 2: pip install git+https://github.com/yhat/ggplot.git
Basic Components of ggplot:
- ggplot API– An easy-to-use interface that removes the need for granular coding.
- Data– Data sets in ggplot are handled as DataFrames, the same as in pandas. In fact, for the most part (and necessarily so), ggplot works in close association with pandas.
- Aesthetics– These help determine how our plot is going to look, from the limits of the axes to the title and themes.
- Layers– The initial plot is just an empty x-y coordinate system. We can then add layers of information (i.e. one piece of information on top of another).
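To make the components above concrete, here is a minimal sketch of how they could fit together. It assumes the yhat `ggplot` package is installed and imports cleanly (the package is no longer actively maintained and may fail to import with recent pandas versions, hence the guard). The choice of `carat` and `price` as axes, the plot title, and the helper name `build_plot` are illustrative assumptions, not part of the original write-up.

```python
# Sketch only: the guarded import is an assumption to keep the example
# runnable even where the yhat ggplot package cannot be imported.
try:
    from ggplot import ggplot, aes, geom_point, ggtitle, diamonds
    HAVE_GGPLOT = True
except Exception:  # the import can fail in several ways on newer stacks
    HAVE_GGPLOT = False


def build_plot():
    """Return a layered scatter plot of price against carat, or None."""
    if not HAVE_GGPLOT:
        return None
    # Data + aesthetics: a DataFrame and a mapping of its columns to the axes.
    base = ggplot(aes(x="carat", y="price"), data=diamonds)
    # Layers: each `+` stacks one more piece of information on the base plot.
    return base + geom_point() + ggtitle("Diamond price by carat")


plot = build_plot()
```

In an IPython notebook, evaluating `plot` in a cell should render the figure when the package is available.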
Our data set– ggplot comes with a number of data sets that we can use for learning. We shall use the diamonds data set for our evaluation.
Import necessary packages.
Check for our data set:
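The original notebook cells for these two steps are not reproduced here, so the following is only a sketch of what they might look like. It assumes the yhat `ggplot` package exposes the `diamonds` DataFrame at the top level; the helper `summarize` is a hypothetical name introduced for illustration, and the guarded import is there because the package may not import on newer environments.

```python
# Sketch only: guarded import, since the yhat ggplot package may be
# missing or fail to load with recent pandas versions.
try:
    from ggplot import diamonds
except Exception:
    diamonds = None


def summarize(df, n=5):
    """Return (shape, column names, first n rows) of a DataFrame-like object."""
    return df.shape, list(df.columns), df.head(n)


if diamonds is not None:
    shape, columns, preview = summarize(diamonds)
    print(shape)    # number of rows and columns
    print(columns)  # column names available for plotting
    print(preview)  # first few rows of the data set
```

In a notebook, simply evaluating `diamonds.head()` gives the same quick look at the data.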
With our data set in place, we are ready to explore more.