Data Viz Makeover 1

Vikram Shashank Dandekar https://www.linkedin.com/in/vikram-dandekar-aab7aa31
01-28-2021

Initial Visualization

Critiquing the Initial Visualization

Clarity

  1. The lead-in statement and visualization are not synchronized. The graph does not bring out the key changes of the labor force highlighted in the lead-in. There is no LFPR data in the graph, only labor force percentages by age buckets. Furthermore, the age ranges described don’t match the age buckets in the graph.

  2. The data is provided as a table below the graph, without any indication of units. This makes it difficult to read and to relate the numbers to the chart as a single visualization.

  3. Lack of proper axes and labels - The x-axis is not labeled and there is no y-axis.

  4. The median age lines use the discrete x-axis scale (age range) to represent two continuous values (the median ages), which is incorrect and unclear. Furthermore, having these lines intersect with the line graphs does not provide any helpful visual insight.

Aesthetics

  1. Tick marks are shown even though the x-axis is categorical in scale.
  2. The chart is pixelated - image quality was not taken into account.
  3. It is hard to read the line graph against the points representing age buckets.

Suggestions for Graphical Presentation

The following table summarizes the key changes recommended to the initial visualization:

| Aspect | Critique | Recommendation |
| --- | --- | --- |
| Clarity | The lead-in statement and visualization are not synchronized. | Giving priority to the lead-in statement, the data used should consist of both labor force percentages and LFPR percentages for the specific age groups identified in the lead-in statement. |
| Clarity | Data is provided as a table below the graph, without any indication of units. | The data should be visually represented as part of the chart and not as a separate table below the chart. Clear units of measure should also be stated. |
| Clarity | Lack of proper axes and labels. | The chart should have a clear x-axis and y-axis, with appropriate labels and units of measure. There should also be a dual axis to represent the LFPR separately from the labor force percentages. |
| Clarity | Incorrect usage of the median age lines. | There is no need to force-fit median age lines onto the chart, especially when the axes do not have age as a unit of measure. These two numbers can be provided in a simple lead-in statement. |
| Aesthetics | Tick marks are shown even though the x-axis is categorical in scale. | Tick marks are not necessary as the x-axis is categorical in scale. |
| Aesthetics | Chart is pixelated. | Image quality must be taken into consideration when publishing a visualization. |
| Aesthetics | It is hard to read the line graph with points representing age buckets. | There should be light grid lines on the chart so that the reader can easily read the line graphs in relation to the axes. |

Sketch of proposed visualization

Key characteristics of proposed visualization:

Preparing the Proposed Visualization with Tableau

Data Preparation

  1. The table downloaded directly from the MOM site is heavily formatted, with several merged rows and columns. It is therefore not a good base file to import into Tableau, as a simple drag and drop leaves many null cells scattered through the data. A basic cleanup of the original source file is far simpler than making the corrections in the Data Source module of Tableau.
  2. Table 5 from the MOM site is not sufficient on its own to prepare a visualization that brings out the trends stated in the lead-in statement. Table 7 from the same site is also required, so it was downloaded as well.
  3. Finally, both Table 5 and Table 7 were cleaned and copied into a new Excel file named “Raw Input” as separate tabs. The main formatting work was to unmerge the formatted rows and columns to create clean data tables, as shown here:
Table 7 cleaned
Table 5 cleaned
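
The cleanup above was done by hand in Excel. For readers who prefer a scripted route, the following is a minimal sketch of the same idea in plain Python: it drops the rows and columns that are left entirely blank once merged cells are unmerged. The grid layout, the figures and the helper name `drop_empty` are assumptions made for illustration and are not taken from the MOM files.

```python
# Hypothetical sheet contents after unmerging: blank cells appear as None.
# The numbers are made up for the example and are not MOM data.
raw = [
    [None, None, None, None],
    [None, "Age (Years)", "2009", "2010"],
    [None, "15 - 19", 30.0, 28.0],
    [None, None, None, None],
    [None, "20 - 24", 150.0, 152.0],
]


def is_blank(cell):
    """Treat None and empty strings as blank; zero is a real value."""
    return cell is None or cell == ""


def drop_empty(grid):
    """Remove rows and columns that contain only blank cells."""
    rows = [row for row in grid if not all(is_blank(c) for c in row)]
    if not rows:
        return []
    width = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of columns.
    padded = [list(row) + [None] * (width - len(row)) for row in rows]
    keep = [
        col for col in range(width)
        if not all(is_blank(row[col]) for row in padded)
    ]
    return [[row[col] for col in keep] for row in padded]


clean = drop_empty(raw)
for row in clean:
    print(row)
```

The first remaining row can then serve as the header row, which is what Tableau expects when a sheet is dragged into the canvas.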

Importing the Data into Tableau

After opening Tableau Desktop, we import the Raw Input file. The Labor Force tab is then dragged and dropped into the logical tables panel:

Labor Force table - First view of data

We immediately notice that the column headers are not captured correctly. This is because the data in the raw input file does not start at the first row and column. This is easily fixed by checking the “Use Data Interpreter” box on the left to obtain the following:

Labor Force table - fixed row/column alignment!

Next, we drag in the LFPR table. Upon doing this, Tableau prompts us to link the LFPR logical table with the Labor Force table that is already present in order to form the relationship. Here, we select the Age (Years) data column as the common link. Doing so reveals the LFPR table, with a relationship established to the Labor Force table:

Establishing relationship between two logical tables - LFPR & Labor Force

Now that the data has been brought into Tableau, it is time to structure it so that the right visualization is easy to create.

Structuring the Data for analysis

First, we pivot all the individual year columns by selecting them simultaneously and using the Pivot option from the small drop down on the headers:

Pivoting Years data - Labor Force Table

The result is a pivoted table where each row now has an age bucket, the Year and the Labor Force population pertaining to that age bucket and year. We then re-label the newly formed columns and adjust the data type of the Year column to “Date”:

After Pivoting Labor Force Table

Recall that we want to categorize the age buckets into three distinct groups that are consistent with the information in the lead-in statement. To do this, we use the drop-down list on the Age (Years) column and choose the “Create Group” option:

Entering the Grouping editor

Once inside the Group editor, we highlight the age buckets that we want to group together, press the “Group” button, and then rename the group to a suitable name, as shown here:

Creating Groups and Renaming appropriately

After this, we click “Apply” and press “OK”. We now see that a column of Groups is successfully created:

Grouping successful for Labor Force Table

These groupings will help us to create the visualization very easily later on. Moving on, the same steps of pivoting and grouping the age buckets are performed on the LFPR table, which gives us this result:

Grouping successful for LFPR Table
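
For readers who want to see the shape of the data after these two steps, the pivot and the grouping can be sketched in plain Python. This is only an illustration under stated assumptions: the figures are made up, the three group names and the bucket-to-group mapping are inferred from the age ranges discussed in this post, and the helpers `pivot_years` and `add_group` are hypothetical names, not Tableau features.

```python
# Illustrative wide table: one row per age bucket, one column per year.
# The figures are made up for the example and are not MOM data.
wide_rows = [
    {"Age (Years)": "15 - 19", "2009": 30.0, "2010": 28.0},
    {"Age (Years)": "20 - 24", "2009": 150.0, "2010": 152.0},
    {"Age (Years)": "25 - 29", "2009": 220.0, "2010": 225.0},
    {"Age (Years)": "55 - 59", "2009": 140.0, "2010": 150.0},
]

# Assumed mapping from age bucket to group; a full table would list every bucket.
AGE_GROUPS = {
    "15 - 19": "15 to 24",
    "20 - 24": "15 to 24",
    "25 - 29": "25 to 54",
    "55 - 59": "55 & above",
}


def pivot_years(rows, id_column="Age (Years)", value_name="Population ('000)"):
    """Turn one column per year into one row per (age bucket, year)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: row[id_column], "Year": int(column), value_name: value}
            )
    return long_rows


def add_group(rows, mapping, source="Age (Years)", target="Age Years (Group)"):
    """Attach the group label for each row's age bucket."""
    return [{**row, target: mapping[row[source]]} for row in rows]


long_rows = add_group(pivot_years(wide_rows), AGE_GROUPS)
for row in long_rows[:3]:
    print(row)
```

Each input row with two year columns becomes two output rows, which is the same reshaping that the Pivot option performs on the year columns.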

The next step is to link the “Year” and “Year (Lfpr)” columns by matching them in the relationship between the logical tables:

Linking up Year data from both tables

Finally, we need to filter out the “All” grouping of the age buckets so that it does not get in the way of creating the visualization later. To do this, we click on the “Add” button under Filters in the top right and configure the “All” grouping to be excluded for both tables:

Filter away the “All” bucket from the tables
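
The effect of the filter and of linking the two tables on age bucket and year can be sketched as follows. A relationship in Tableau is resolved at query time, so the explicit match below is only a simplification for illustration. The rows, the column names and the helpers `exclude_group` and `relate` are assumptions and are not taken from the workbook.

```python
# Made-up rows standing in for the two pivoted and grouped tables.
labor_force = [
    {"Age": "Total", "Group": "All", "Year": 2009, "Population": 540.0},
    {"Age": "15 - 19", "Group": "15 to 24", "Year": 2009, "Population": 30.0},
    {"Age": "55 - 59", "Group": "55 & above", "Year": 2009, "Population": 140.0},
]
lfpr = [
    {"Age": "Total", "Group": "All", "Year": 2009, "LFPR": 65.0},
    {"Age": "15 - 19", "Group": "15 to 24", "Year": 2009, "LFPR": 12.0},
    {"Age": "55 - 59", "Group": "55 & above", "Year": 2009, "LFPR": 68.0},
]


def exclude_group(rows, group="All"):
    """Drop the rows that belong to the given group."""
    return [row for row in rows if row["Group"] != group]


def relate(left, right, keys=("Age", "Year")):
    """Pair each left row with the right row that has the same key values."""
    index = {tuple(row[k] for k in keys): row for row in right}
    combined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            combined.append({**match, **row})
    return combined


combined = relate(exclude_group(labor_force), exclude_group(lfpr))
for row in combined:
    print(row)
```

Matching on both the age bucket and the year matters: with the age bucket alone, every labor force row would pair with the LFPR rows of all years for that bucket.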

We are now ready to create the visualization.

Creating the Base Visualization

First, we drag in the “Age Years (Group)” as Column, “Year” as Column and “Population (’000)” as Row to get a basic skeleton:

Grouping of Ages with Y axis as Population and X axis as Year

Now, we add the “LFPR(%)” data as a second Y axis in the visualization using the “Dual Axis” option:

Using the Dual Axis option to add LFPR as second Y axis

The output will look like this:

First Dual Axis output showing both LFPR and Labor Force Data

Notice that the axes are not in the units we want. Therefore, we first amend the Population axis using a table calculation to show each data point as a percentage of the total labor force. Note that we need to use the “Specific Dimensions” setting and un-check the “Year of Year” option in order to correctly calculate the % of labor force within the Age Groupings:

Converting Labor Force axis from Absolute to Percentages

Next, it is time to adjust the LFPR. Notice that by default, the axis values are under the “SUM” calculation, which adds together the rates of all the age buckets within a group. This needs to be changed to “AVERAGE” to correctly show the LFPR percentages within each age group by Year:

Adjusting Summed LFPR axis to Average LFPR

We now see that the basic data is in place and showing correctly:

Base Visualization is completed!

It is now time to apply some visualization fundamentals to enhance the aesthetics of the chart and make it easy to read for the audience.
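
Before moving on, the two aggregation choices made for the base chart can be checked outside Tableau. The sketch below assumes that the percent-of-total calculation is meant to give each age group's share of that year's total labor force, and that the LFPR of a group is the simple average of its bucket rates, as the “AVERAGE” setting does. The rows are made up for illustration and the helper name `summarise` is hypothetical.

```python
from collections import defaultdict

# Made-up rows, one per age bucket and year; not MOM data.
rows = [
    {"Group": "15 to 24", "Year": 2009, "Age": "15 - 19", "Population": 30.0, "LFPR": 12.0},
    {"Group": "15 to 24", "Year": 2009, "Age": "20 - 24", "Population": 150.0, "LFPR": 64.0},
    {"Group": "25 to 54", "Year": 2009, "Age": "25 - 29", "Population": 220.0, "LFPR": 88.0},
    {"Group": "55 & above", "Year": 2009, "Age": "55 - 59", "Population": 100.0, "LFPR": 60.0},
]


def summarise(rows):
    """Return share of the yearly labor force and average LFPR per (group, year)."""
    population = defaultdict(float)  # (group, year) -> summed population
    rates = defaultdict(list)        # (group, year) -> bucket LFPR values
    year_total = defaultdict(float)  # year -> total population across groups
    for row in rows:
        key = (row["Group"], row["Year"])
        population[key] += row["Population"]
        rates[key].append(row["LFPR"])
        year_total[row["Year"]] += row["Population"]
    return {
        key: {
            "share_pct": 100.0 * population[key] / year_total[key[1]],
            "avg_lfpr": sum(rates[key]) / len(rates[key]),
        }
        for key in population
    }


summary = summarise(rows)
for key, values in summary.items():
    print(key, values)
```

Within any one year the shares add up to 100, which is a quick way to confirm that the table calculation is partitioned by year. Note that a simple average of bucket rates gives every bucket equal weight regardless of its size, so it approximates a group's participation rate rather than reproducing it exactly.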

Beautifying the Visualization

Firstly, the following are performed:

Thus far, we have the following visualization:

Basic formatting of axes and assignment of bars/lines and colors

Next, the following fine modifications are made in order to make the visualization more appealing:

After performing these changes, the visualization looks like this:

Second Level formatting of lines/bars completed

To finish up the visualization, the following final touches are made:

Final Visualization

Here is the Final Visualization at the end of the Makeover process:

Final Visualization

Key Observations revealed by the Visualization

  1. For the labor force aged 25-54, the Labor Force Participation Rate has increased from 2009 to 2019, but this group is shrinking as a portion of the total labor force. This is most likely because the influx of residents from the 15-24 group into the 25-54 group is slower than the outflow into the “55 and above” bracket.

  2. The percentage of the labor force aged 55 and above has been steadily increasing from approximately 40% in 2009 to 50% in 2019. This is a clear indicator of an aging population.

  3. The percentage of the labor force aged 15 to 24 has generally been slowly decreasing from ~9% in 2009 to below 8% in 2019. This points to a slowdown in the birth rate in Singapore. This would also explain the flat LFPR trend of this age group.

References

  1. Tableau Help Guide - https://www.tableau.com/support/help
  2. Fundamentals of Visualization - https://clauswilke.com/dataviz/
  3. YouTube guide for Tables - https://www.youtube.com/watch?v=a0rM8wAp4U0
  4. Table 5: RESIDENT LABOUR FORCE PARTICIPATION RATE BY AGE AND SEX, 2009 - 2019 (JUNE) - https://stats.mom.gov.sg/Pages/Labour-Force-Tables2019.aspx
  5. Table 7: RESIDENT LABOUR FORCE PARTICIPATION RATE BY AGE AND SEX, 2009 - 2019 (JUNE) - https://stats.mom.gov.sg/Pages/Labour-Force-Tables2019.aspx