Exploring the Titanic Dataset Using Microsoft’s SandDance

Subarna Lamsal
Published in codeburst
4 min read · Dec 15, 2019


I have always believed in the immense power of data. Today, data from various sources has become a prominent factor in almost every decision, insight, and opportunity.

A few days ago, I came across a beautiful web-based data exploration application called ‘Sundance’. Oh, sorry, it’s ‘SandDance’.

Why is the sun dancing here?

As I explored it, I was completely fascinated by the way Microsoft’s researchers and engineers came up with this amazing tool. In short, SandDance makes it easy to visualize data and to spot patterns, trends, and insights. With its dynamic and customizable interface, it supports better decision making within just a few clicks and adjustments. Well, who doesn’t want such ease?

Let’s SandDance.

https://www.microsoft.com/en-us/research/project/sanddance/

The link above provides detailed information about SandDance, the people involved in it, and contact details. At the bottom of the page is the link https://sanddance.js.org/, which leads to another website. At the top of that website, select the “Try Online” link, and you will reach the main application, which looks like this.

https://sanddance.js.org/app/

There are two themes available: light and dark. The page is simple and uncluttered. On the left is the ‘Menu Section’ with chart, search, and settings tabs. It also contains options such as customizable chart layouts and the exploratory features needed to analyze and visualize the data. On the right is the main data visualization frame.

Now, let’s head over to the dataset.

SandDance has two built-in datasets. The first is “Demo Vote” and the other is “Titanic”. Here, we are going to do some simple exploration of the Titanic dataset, as promised in the title. We can also load our own dataset into the system.

Heading over to Titanic, I want to see how many people survived or died based on gender. SandDance automatically divides features into Numerical and Ordinal, so segregating them manually in Jupyter or other platforms is a thing of the past. On the left, under the chart section, we have column mapping, which maps columns onto the figure. We set the x-axis to gender and “color by” to survived, since these are the features we care about. Boom!

Two clicks and here we have the full picture.
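To get a feel for what an automatic numerical/non-numerical split of columns involves, here is a minimal sketch in plain Python. It is only an illustrative heuristic and not SandDance's actual logic: a column counts as numeric when every non-empty value parses as a number. The function name `infer_column_kind`, the column names, and the toy rows are all hypothetical.

```python
def infer_column_kind(values):
    """Return 'numeric' if every non-empty value parses as a number,
    otherwise 'categorical'. An all-empty column is treated as categorical."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "categorical"
    for v in non_empty:
        try:
            float(v)
        except (TypeError, ValueError):
            return "categorical"
    return "numeric"


# Hypothetical toy rows; names and values are illustrative only,
# not taken from the real Titanic dataset.
rows = [
    {"Gender": "male", "Age": "22", "TicketCost": "7.25"},
    {"Gender": "female", "Age": "38", "TicketCost": "71.28"},
    {"Gender": "female", "Age": "", "TicketCost": "7.92"},
]

# Classify each column by looking at all of its values.
column_kinds = {
    name: infer_column_kind([row[name] for row in rows])
    for name in rows[0]
}
print(column_kinds)
```

A real tool would likely need extra rules, for example treating a numeric column with only a handful of distinct values (such as a class number) as a category.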

By default, it puts the count on the y-axis. From the figure, we can see that around 80% of males didn’t survive, whereas around 75% of females survived. One question that might arise is why the percentage of male survivors is so much lower than that of female survivors. Women and children were given first priority, so they were the first to reach the lifeboats, and men followed. This is one of the reasons why female survivors outnumbered male survivors.
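The percentages read off the chart can also be computed directly if you have the rows at hand. The sketch below groups records by gender and reports the share that survived. The `passengers` list is a tiny made-up sample used only to show the calculation; it is not the real Titanic data, and `survival_rate_by_group` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical toy records (gender, survived). NOT the real Titanic data.
passengers = [
    ("male", False), ("male", False), ("male", False), ("male", True),
    ("female", True), ("female", True), ("female", True), ("female", False),
]


def survival_rate_by_group(records):
    """Return {group: percentage of records in that group that survived}."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for group, did_survive in records:
        totals[group] += 1
        if did_survive:
            survived[group] += 1
    return {group: 100.0 * survived[group] / totals[group] for group in totals}


rates = survival_rate_by_group(passengers)
for gender, pct in sorted(rates.items()):
    print(f"{gender}: {pct:.1f}% survived")
```

This is the same "count, then color by survived" idea as the chart, only expressed as a ratio per group instead of stacked blocks.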

Tragedy aside, let’s look at the survivors based on where passengers joined the ship and their class. All we need to do is set the x-axis to “Joined”, the y-axis to “Class”, and “color by” to “Survive”. The chart offers several layout options: grid, scatter, density, column, treemap, bar, and stacks.

Let’s see our query in all these formats.

Since these are all 3D views, we can rotate them and see the story behind every feature in the dataset.

Let’s explore one more thing. I want to know the relationship between survival and ticket cost. Generally, lower-class passengers tend to have cheaper tickets. I want to see whether the cost of the ticket has anything to do with the survival rate.

From the image, we can estimate that those who paid more for their tickets were more likely to survive than those who paid less. Interesting. Many other questions arise as well; it is up to you to decide what to draw from the data.
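One way to back up a visual impression like this is to split ticket costs into bands and compare the survival rate in each band. The sketch below does that in plain Python. The `tickets` sample and the band edges are made up for illustration and say nothing about the real dataset; `survival_rate_by_cost_band` is a hypothetical helper name.

```python
from bisect import bisect_right

# Hypothetical toy records (ticket cost, survived). NOT the real Titanic data.
tickets = [
    (7.25, False), (8.05, False),
    (13.0, True), (26.0, False),
    (71.28, True), (80.0, True),
]


def survival_rate_by_cost_band(records, edges):
    """Split records into len(edges) + 1 bands using ascending `edges`
    and return the survival percentage per band (None for an empty band)."""
    counts = [[0, 0] for _ in range(len(edges) + 1)]  # [survived, total]
    for cost, did_survive in records:
        band = bisect_right(edges, cost)  # index of the band this cost falls in
        counts[band][1] += 1
        if did_survive:
            counts[band][0] += 1
    return [(100.0 * s / t) if t else None for s, t in counts]


edges = [10.0, 50.0]  # assumed cut points: below 10, 10 to below 50, 50 and above
band_rates = survival_rate_by_cost_band(tickets, edges)
for label, rate in zip(["low", "middle", "high"], band_rates):
    print(label, rate)
```

If the rate climbs from the low band to the high band on the real data, that supports what the chart suggests; the choice of cut points is a judgment call and worth varying.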

All in all, I have explored some of the features of the Titanic dataset, and all the visualizations are appealing. They also help us build a story and provide insights about an event, a topic, or anything else. SandDance is a very good platform for aspiring data enthusiasts like me to see data clearly.

You can explore the many other features available in SandDance. As mentioned earlier, we can load our own datasets and play with them. So, why delay? Head over to SandDance and begin your storytelling journey.

Thanks a lot for reading. Do leave your valuable comments and feedback in the comment section. Cheers!!
