Continuing our study of the diamond data, let us apply some basic exploration functions and see what information we can draw from them.

**EXPLORE THE DATA:**

##### 1. Length of the data:

Use the **len()** function to see the number of rows of data.
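A minimal sketch of this step, assuming the data is held in a pandas DataFrame named `diamonds`. The small frame built here is a hypothetical stand-in with illustrative values, not the real data set.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.29, 0.31],
    "color": ["E", "E", "I", "J"],
    "price": [326, 326, 334, 335],
})

n_rows = len(diamonds)            # number of rows
shape_rows, n_cols = diamonds.shape  # (rows, columns) in one call
print(n_rows, n_cols)
```

`diamonds.shape` is a convenient companion to `len()` because it reports the number of columns as well.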

##### 2. Names of the columns:

Use the **columns** attribute. It is an attribute rather than a function, so it takes no parentheses.
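As a rough illustration, again on a hypothetical stand-in for the `diamonds` DataFrame (the column names here are assumed for the example):

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.29],
    "color": ["E", "E", "I"],
    "price": [326, 326, 334],
})

# .columns returns an Index object; wrap it in list() for a plain list.
column_names = list(diamonds.columns)
print(column_names)
```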

##### 3. Analyse the first few values:

Use the **head()** function; by default, the first 5 rows are displayed.
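A short sketch of `head()` on a hypothetical seven-row stand-in for `diamonds`, showing both the default and an explicit row count:

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24],
    "price": [326, 326, 327, 334, 335, 336, 336],
})

first_five = diamonds.head()    # default: first 5 rows
first_three = diamonds.head(3)  # pass a number to change the count
print(first_five)
```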

##### 4. Analyse the last few values:

Use the **tail()** function; by default, the last 5 rows are displayed.
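The same idea for `tail()`, sketched on a hypothetical seven-row stand-in for `diamonds`:

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24],
    "price": [326, 326, 327, 334, 335, 336, 336],
})

last_five = diamonds.tail()   # default: last 5 rows
last_two = diamonds.tail(2)   # pass a number to change the count
print(last_five)
```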

##### 5. Random selection:

Use the **sample()** function to view rows chosen at random from a large data set.
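One way to draw random rows in pandas is `DataFrame.sample()`. The sketch below uses a hypothetical stand-in for `diamonds`; the `random_state` value is an arbitrary choice that only makes the draw repeatable.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24],
    "price": [326, 326, 327, 334, 335, 336, 336],
})

# Draw 3 rows at random, without replacement.
random_rows = diamonds.sample(n=3, random_state=42)
print(random_rows)
```

Omitting `random_state` gives a different selection on each run, which is usually what you want when browsing.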

##### 6. Statistical information:

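Numeric fields can be summarised with the **describe()** function, which reports the count, mean, standard deviation, minimum, quartiles (the 50% row is the median) and maximum. The range follows from the minimum and maximum.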
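A minimal sketch of reading the summary table, assuming a hypothetical stand-in for `diamonds`. Note that `describe()` skips the non-numeric `color` column by default.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24],
    "color": ["E", "E", "E", "I", "J", "J", "I"],
    "price": [326, 326, 327, 334, 335, 336, 336],
})

summary = diamonds.describe()  # numeric columns only by default

mean_price = summary.loc["mean", "price"]
median_price = summary.loc["50%", "price"]  # the 50% row is the median
price_range = summary.loc["max", "price"] - summary.loc["min", "price"]
print(summary)
```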

##### 7. Determine the correlation between fields:

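The **corr()** function computes the pairwise correlation between all numeric fields in the data set. In recent pandas versions, pass `numeric_only=True` when the DataFrame also contains non-numeric columns.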
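A small sketch of the correlation step on a hypothetical stand-in for `diamonds`. Selecting the numeric columns first with `select_dtypes` is one version-independent way to keep text columns out of the calculation.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24],
    "color": ["E", "E", "E", "I", "J", "J", "I"],
    "price": [326, 326, 327, 334, 335, 336, 336],
    "x": [3.95, 3.89, 4.05, 4.20, 4.34, 3.94, 3.95],
})

# Keep only numeric columns, then compute pairwise Pearson correlations.
numeric = diamonds.select_dtypes(include="number")
corr_matrix = numeric.corr()

# Each diagonal entry is a column's correlation with itself.
diagonal = [corr_matrix.loc[col, col] for col in corr_matrix.columns]
print(corr_matrix)
```

The diagonal entries come out as 1 because every column is perfectly correlated with itself.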

##### 8. Values stored in the non-numeric fields:

We can simply view the values with **diamonds['color']**, but the column has many repeated values, so it is better to view only the unique entries. Use the **unique()** function for this.
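As a sketch, on a hypothetical stand-in for `diamonds` with an illustrative `color` column. `nunique()` and `value_counts()` are optional extras that often help alongside `unique()`.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; values are illustrative only.
diamonds = pd.DataFrame({
    "color": ["E", "E", "E", "I", "J", "J", "I"],
    "price": [326, 326, 327, 334, 335, 336, 336],
})

unique_colors = diamonds["color"].unique()       # distinct values, in order of first appearance
n_colors = diamonds["color"].nunique()           # how many distinct values
color_counts = diamonds["color"].value_counts()  # rows per distinct value
print(unique_colors)
```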

### Observations so far:

**OBSERVATION 1:** The data has both numeric and non-numeric values.

**OBSERVATION 2:** The means and medians of x and y are approximately the same. Do diamonds have proportionate length and breadth?

**OBSERVATION 3:** The minimum of x, y and z is 0. Can the length, breadth or height of a real object be 0?
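To follow up on the zero dimensions, one option is to pull out the rows where any of x, y or z is 0. The sketch below uses a hypothetical stand-in for `diamonds` with a few zeros planted for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the diamonds data; zeros are planted for illustration.
diamonds = pd.DataFrame({
    "x": [3.95, 0.0, 4.05, 4.20],
    "y": [3.98, 0.0, 4.07, 4.23],
    "z": [2.43, 0.0, 2.31, 0.0],
})

# True for every row where at least one dimension equals 0.
has_zero_dim = (diamonds[["x", "y", "z"]] == 0).any(axis=1)
zero_rows = diamonds[has_zero_dim]
print(zero_rows)
```

Rows like these are likely missing or mis-recorded measurements, and how to treat them is a judgment call for the analysis.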

**OBSERVATION 4:** The diagonal entries of the correlation matrix are all 1. Why?

**OBSERVATION 5:** Price, carat, x, y and z seem closely related to each other. Can a predictive model be developed? What about the non-numeric values?