What are characteristics of data that would influence how you visualize it?
What information do you have that would be visually interesting?
What information do you not have that you need to understand the importance of the data?
Example: A banking database where each record is a bank transaction and the fields include date, deposit or withdrawal amount, customer id, and the interest rate of the account.
Viz for Self
Let's talk about exploration.
What are characteristics of data that would influence how you visualize it?
What information do you have that would be visually interesting?
What information do you not have that you need to understand the importance of the data?
Example: A spreadsheet of experimental crop growth measurements where each record is a measurement, and the fields include date, plant species, plant id number, number of leaves, plant height, number of internodes, and average leaf length.
Viz for Self
Let's talk about exploration.
What are characteristics of data that would influence how you visualize it?
What information do you have that would be visually interesting?
What information do you not have that you need to understand the importance of the data?
Example: A computational simulation of a galaxy where each record is a timestep in the evolution of the 3D grid, and the fields include time, X position, Y position, Z position, gas density, gas temperature, gas metallicity, and number of stars.
Viz for Self
What do you want to get out of visualization for yourself?
Do you want to find meaning?
Do you want to understand how to guide further visualizations?
Is the story you want to tell already known to you?
What shortcuts can you take?
Viz for Experts
To design a visualization for experts, you need to analyze how they process information.
What do they know?
What conventions will they assume?
Are they able to fill in the blanks of information?
Viz for Experts
Experts often want to interrogate the data themselves.
How can they do that?
Linked Dashboards
Side-by-side comparison plots
Text annotation with specific values listed
Color bar annotation
Viz for Experts
Experts are looking to isolate variables to make scientific conclusions.
How can we make visualizations more analytical?
Reduce the dimensionality of the image (slices)
Viewpoint from "outside the box"
Extremely high-contrast color choices (or highlighting different features)
Viz for the Public
This is what you're most accustomed to, because usually YOU are the public.
Final Project: Part 1
Explore the dataset in a Jupyter notebook. Make sure you include things that did and did not work.
Summarize the characteristics of the dataset in words: what does it represent, what are the fields/columns/rows, what data types are they, etc.
Final Project: Part 1 (cont)
Your datasets need to be submitted as well. To do this, include this
information in your Jupyter notebook:
What is the "name" of the dataset?
Where did you obtain it?
Where can we obtain it? (i.e., URL)
What is the license of the dataset? What are we allowed to do with it?
How big is it in file size and in items?
Make a simple plot showing a relationship of interest. You can use matplotlib, pandas, or another library. Don't worry about colors, labels, or anything else of that nature!
Final Project: Part 1 (cont, cont)
You can share raw datasets and sources, ask questions about reading or modifying the dataset, and post code for doing so that isn't working.
Please do not share processed or cleaned datasets.
Final Project: Part 2
Submit in a Jupyter notebook.
See Assignment description for more details.
Final Project: Part 3
Visualization for the public -- see Assignment description for more details.
You will submit this as your final project and get some feedback -- both from the instructors and in the forum.
You will also provide feedback for 3 other students (more on this later).
Final Project: Part 3 (cont)
You may submit one or more of the following items: Idyll webpage repository, narrative Jupyter notebook.
This component will include a "for others" visualization that is deeply narrative, has appropriate interactive (or static) content, and is ideally shareable on a website.
Some possible ways to approach this:
Infographic
Idyll
Jupyter notebook
Raw HTML
TOPIC 3: More vega-lite
Today
Vega-lite - II
Marks - more
Selections
Transformations
Computations
Vega-Lite - I
Recall that a vega-lite visualization is defined by a JSON specification. This specification will typically take a form similar to this:
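As a minimal sketch of what such a specification can look like, here is a small bar chart written as a JavaScript object. The inline data, the field names (category, amount), and the choice of the v4 schema URL are assumptions made for illustration; the variable name yourVlSpec matches the one used in the embed call.

```javascript
// A minimal Vega-Lite specification written as a JavaScript object.
// The data values and field names ("category", "amount") are made up for illustration.
const yourVlSpec = {
  $schema: "https://vega.github.io/schema/vega-lite/v4.json", // assumed version
  description: "A simple bar chart with inline data.",
  data: {
    values: [
      { category: "A", amount: 28 },
      { category: "B", amount: 55 },
      { category: "C", amount: 43 }
    ]
  },
  mark: "bar",
  encoding: {
    x: { field: "category", type: "nominal" },
    y: { field: "amount", type: "quantitative" }
  }
};

// The spec is plain JSON, so it can be serialized and inspected like any other object.
console.log(JSON.stringify(yourVlSpec, null, 2));
```

The three parts to look for are data (where the values come from), mark (what is drawn), and encoding (which fields drive which visual channels).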
We principally used a vega-lite "embed" mechanism:
var embedded = vegaEmbed('#vis', yourVlSpec);
We can also pass an options object, opt, as a third argument; one of its entries is the config option. You may find it useful to set the actions option in opt, which controls which items are available in the menu:
var embedded = vegaEmbed('#vis', yourVlSpec, {'actions': false});
Vega-Lite - I: Embedded
The object returned by vegaEmbed is a Promise.
This means the result may not be available yet when vegaEmbed returns. Instead of using the result directly, we supply a function to be called once the promise has been resolved; that function is called with the result object.
(This style of deferring actions to the future is very common in JavaScript.)
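The pattern can be sketched as follows. So that the example runs outside a browser, it uses fakeEmbed, a hypothetical stand-in that returns a Promise the way vegaEmbed does; in a real page you would call vegaEmbed('#vis', yourVlSpec) instead and attach the same kind of .then callback.

```javascript
// Hypothetical stand-in for vegaEmbed so the sketch runs without a browser.
// It resolves a short time later with an object holding the spec and a view.
function fakeEmbed(selector, spec) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ view: { selector: selector }, spec: spec });
    }, 10);
  });
}

const embedded = fakeEmbed('#vis', { mark: 'point' });

// Supply a function to run once the promise has been resolved.
embedded.then(function (result) {
  console.log('ready:', result.spec.mark);
});

// This line runs first, because the promise has not resolved yet.
console.log('requested');
```

Anything that needs the finished result, such as reading from the view, belongs inside the callback rather than after the call.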
Vega-Lite - II
Last week we discussed marks and encodings.
This week we will continue with marks, adding on transformations and selections.
Marks - I
vega-lite has numerous different mark types. We can break these down by the type of data they can represent. We will only consider "primitive" marks today.
area & line
bar & rect
point & circle & square
rule & text
tick
geoshape
We will demonstrate several of these using our datasets, but first we need to learn how to transform data.
Transformations - I
At the view-level of your definition, you can specify transformations that modify, filter, or reshape the data.
A transformation can work within a given dataset, by computing a new attribute for each data point, or it can reshape the data.
The types of transformations we will cover today are filter and calculate.
Transformations - II
We apply a filter transform by specifying the field to filter on and the filtering characteristic. This can be a selection, an expression, or a logical definition. We will address selection and expression filtering later.
Transformations - III
A logical filtering operation might look like one of these:
We can use the predicates lt, gt, lte, gte, equal, oneOf, range, and valid.
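A few logical filters, sketched as JavaScript objects. The field names (height, species, leaves) are borrowed loosely from the crop growth example, and the file name crops.csv is hypothetical.

```javascript
// Each object is one entry for a view's "transform" array.
// Field names and values are hypothetical.
const tallPlants = { filter: { field: "height", gt: 10 } };
const twoSpecies = { filter: { field: "species", oneOf: ["maize", "soybean"] } };
const leafRange = { filter: { field: "leaves", range: [5, 20] } };
const hasHeight = { filter: { field: "height", valid: true } };

// Transforms are applied in the order they are listed.
const filteredSpec = {
  data: { url: "crops.csv" }, // hypothetical file
  transform: [hasHeight, tallPlants, twoSpecies],
  mark: "point",
  encoding: {
    x: { field: "leaves", type: "quantitative" },
    y: { field: "height", type: "quantitative" }
  }
};

console.log(JSON.stringify(filteredSpec.transform, null, 2));
```

Listing several filters in one transform array keeps only the data points that pass every one of them.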
Transformations - IV
We can also compute a new field using the calculate transform. This is an expression that is evaluated on every data point, which is supplied as the variable datum to the expression.
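A small sketch of a calculate transform, assuming hypothetical fields height and leaves. The last few lines imitate in plain JavaScript what the expression does to each data point; they are only an illustration and not how Vega-Lite evaluates it internally.

```javascript
// "calculate" holds an expression string; "as" names the new field.
// Field names and data values are hypothetical.
const calcSpec = {
  data: {
    values: [
      { height: 12, leaves: 4 },
      { height: 30, leaves: 9 }
    ]
  },
  transform: [
    { calculate: "datum.height / datum.leaves", as: "heightPerLeaf" }
  ],
  mark: "point",
  encoding: {
    x: { field: "height", type: "quantitative" },
    y: { field: "heightPerLeaf", type: "quantitative" }
  }
};

// Plain-JavaScript imitation of the same computation, one datum at a time.
const preview = calcSpec.data.values.map(function (datum) {
  return Object.assign({}, datum, { heightPerLeaf: datum.height / datum.leaves });
});
console.log(preview);
```

Once computed, the new field can be used in encodings like any field that was in the original data.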
Selections - I
Selections are defined with names -- this seems to be the most common stumbling block. You get to choose the name!
We use selections in one of a few ways.
We can conditionally encode data -- for instance, change visibility, or alpha, or color.
We can use selections as input for filtering data. Typically this is done with one plot showing unfiltered data and another using a filter from that selection.
We can scale a domain based on a selection.
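The three uses above can be sketched as spec fragments. This assumes the selection property of Vega-Lite version 4 (later releases changed this syntax); the selection name picked, the field names, and the file crops.csv are all hypothetical.

```javascript
// 1. Conditional encoding: points inside the selection are colored by species,
//    everything else is drawn in light gray.
const conditionalColor = {
  condition: { selection: "picked", field: "species", type: "nominal" },
  value: "lightgray"
};

// 2. Filtering: a transform entry that keeps only the selected data points.
const filterBySelection = { filter: { selection: "picked" } };

// 3. Scale domain: an x encoding whose axis range follows the selection.
const xFollowingSelection = {
  field: "date",
  type: "temporal",
  scale: { domain: { selection: "picked" } }
};

// Two stacked views: the top one defines the selection and uses it for color,
// the bottom one uses the same selection as a filter.
const linkedSpec = {
  data: { url: "crops.csv" }, // hypothetical file
  vconcat: [
    {
      selection: { picked: { type: "interval", encodings: ["x"] } },
      mark: "point",
      encoding: {
        x: { field: "date", type: "temporal" },
        y: { field: "height", type: "quantitative" },
        color: conditionalColor
      }
    },
    {
      transform: [filterBySelection],
      mark: "bar",
      encoding: {
        x: { field: "species", type: "nominal" },
        y: { aggregate: "mean", field: "height", type: "quantitative" }
      }
    }
  ]
};

console.log(Object.keys(linkedSpec.vconcat[0].selection));
```

The name used where the selection is defined and the name used wherever it is referenced must match exactly.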
Selections - II
There are three types of selections:
single -- a single point
multi -- multiple points
interval -- a range of values along the encoding axes
We will focus on the interval selection.
Selections - III
We can define a box-based selector that operates along the x axis by specifying which encoding it is linked to. Here, we name it valrange, but we can choose whatever name we like.
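A sketch of such a definition, again assuming the Vega-Lite version 4 selection property. The inline data and the field names x and y are made up; valrange is the name chosen in the text.

```javascript
// An interval selection named "valrange" that brushes along the x axis only.
// Data values and field names are hypothetical.
const valrangeSpec = {
  data: {
    values: [
      { x: 1, y: 2 },
      { x: 2, y: 5 },
      { x: 3, y: 3 }
    ]
  },
  selection: {
    valrange: { type: "interval", encodings: ["x"] }
  },
  mark: "point",
  encoding: {
    x: { field: "x", type: "quantitative" },
    y: { field: "y", type: "quantitative" }
  }
};

console.log(JSON.stringify(valrangeSpec.selection));
```

Listing only "x" in encodings restricts the box to the horizontal direction, so it spans the full height of the plot.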