# Prep Notebook, Week 11

Here we'll redo some of our Altair plots, this time saving them in the appropriate place in our Jekyll template so that they show up on our project page.

We'll start with copying directly from the vega editor, just in case folks want to use that instead of Altair in Python.

## Setup -- Making a new project 

We'll first start by making a new project markdown file in order to add our visualizations.

We can do this by copying a file that is already there and then modifying the following for our new project:
1. the project name
1. the "tools" tag
1. the image (we might wait to do this until we can take a screenshot of one of our new plots)
1. the short description

Note: we want to make sure to keep all the `custom_js` tags in order to be able to include our interactive creations!

## 1. Copying directly from vega-editor

We can start with saving vega-lite code directly as json if we do our development in the vega-editor.

If we start from this [simple vega-example](https://vega.github.io/vega-lite/examples/stacked_bar_h.html) and [open it in the vega-editor](https://vega.github.io/editor/#/examples/vega-lite/stacked_bar_h) we can export from the vega-lite editor.

If we click the Export button in the upper tool bar, we see that we have a few options for how to export visualizations.

For this to work, you need to make sure that you have the full raw URL for the dataset from the [list of vega-datasets](https://github.com/vega/vega-datasets/tree/master/data). Double check that it is the raw-data link!

![click on the "Export" button in the vega editor](images/vegaeditor/vega_editor1.png)


We can save with JSON (config file un-checked) using the middle upper panel's interface:


![make sure "include config" is checked](images/vegaeditor/vega_editor2.png)


We can then save the downloaded .json file in the `assets/json/` folder in our main Jekyll page directory.

Once we have that, we can link to it using the `vegachart` tag in our Jekyll project page:

```html
<vegachart schema-url="{{ site.baseurl }}/assets/json/direct_from_editor.json" style="width: 100%"></vegachart>
```
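Before linking an exported spec, it can help to confirm that its data source is a full URL rather than a relative path, since a relative path that works in the editor will likely not resolve from our project page. The following is a minimal sketch using only the standard library; the function name `data_url_is_absolute` and the two example spec dictionaries are hypothetical and only for illustration.

```python
from urllib.parse import urlparse


def data_url_is_absolute(spec):
    """Return True if the spec's top-level data URL has a scheme and a host."""
    url = spec.get("data", {}).get("url")
    if url is None:
        return False  # no URL-based data at the top level
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical specs: one with a relative path, one with a full raw URL.
relative_spec = {"data": {"url": "data/barley.json"}, "mark": "bar"}
absolute_spec = {
    "data": {
        "url": "https://raw.githubusercontent.com/vega/vega-datasets/master/data/barley.json"
    },
    "mark": "bar",
}

print(data_url_is_absolute(relative_spec))  # False
print(data_url_is_absolute(absolute_spec))  # True
```

For a downloaded file, `json.load` gives the dictionary to pass into the function.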


## 2. Save from Python -- data at URL

In [1]:
import pandas as pd
import numpy as np
import altair as alt

Let's use our Mobility dataset from last time and remake our dashboard plot:

In [2]:
mobility_url = 'https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_data/main/mobility.csv'

In [3]:
brush = alt.selection_interval(encodings=['x','y'])

chart1 = alt.Chart(mobility_url).mark_rect().encode(
    alt.X("Student_teacher_ratio:Q", bin=alt.Bin(maxbins=10)),
    alt.Y("State:O"),
    alt.Color("count()")
).properties(
   height=400
).add_params(
        brush
)

chart2 = alt.Chart(mobility_url).mark_bar().encode(
    alt.X("Mobility:Q", bin=True,axis=alt.Axis(title='Mobility Score')),
    alt.Y('count()', axis=alt.Axis(title='Mobility Score Distribution'))
).transform_filter(
    brush
)

chart = (chart1.properties(width=300) | chart2.properties(width=300))

chart

Now that we have this plot, we want to save it in a place we can access from our Jekyll template. As with copying from the vega-editor, this place is `<YOUR GITHUB PAGES DIRECTORY>/assets/json`.

If working locally, we want to make sure we save this in this directory.

If working on PrairieLearn, we just need to save the json somewhere and then make sure we download and move it to the `assets/json` folder in our template:

In [4]:
# from PL:
# myJekyllDir = "./"
# --> then be sure to download these jsons to your "assets/json" folder with a right-click in PL!

# working locally:
myJekyllDir = '/Users/jnaiman/jnaiman.github.io/assets/json/'

Now that we have the path set, let's save our chart:

In [5]:
chart.save(myJekyllDir+"altair_mobility_dashboard.json")
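Joining the directory and filename with `+` only works when `myJekyllDir` ends in a slash. As a hedged alternative, the sketch below builds the path with `pathlib` and creates the folder if it is missing. `spec_path` is a hypothetical helper name, and a temporary directory stands in for the real Jekyll folder.

```python
import tempfile
from pathlib import Path


def spec_path(jekyll_json_dir, filename):
    """Build the output path for a chart spec, creating the folder if needed."""
    folder = Path(jekyll_json_dir)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / filename


# A temporary directory stands in for <YOUR GITHUB PAGES DIRECTORY> here.
with tempfile.TemporaryDirectory() as tmp:
    out = spec_path(Path(tmp) / "assets" / "json", "altair_mobility_dashboard.json")
    print(out.name)             # altair_mobility_dashboard.json
    print(out.parent.is_dir())  # True
```

The resulting path can then be handed to the save call as `chart.save(str(out))`.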

### Saving within a "container"

The mobility dashboard is a "faceted" chart (in Altair terms, a concatenated chart), since it is made up of two plots. As of the last time I checked, you can't save faceted charts within a variable "container" that will update if the user changes the size of the window (e.g. on a mobile device, or when resizing a browser window).

However, for single-facet charts, you can save in a container that is variable.  For example:

In [6]:
scatter = alt.Chart(mobility_url).mark_point().encode(
    x='Mobility:Q', # "Q" for quantitative
    y=alt.Y('Population:Q', scale=alt.Scale(type='log')),
    color=alt.Color('Income:Q', scale=alt.Scale(scheme='sinebow'),bin=alt.Bin(maxbins=5))
)
scatter

In [7]:
scatter.properties(width='container').save(myJekyllDir+"population_scatter.json")

The above will now resize (within reason) if your browser window changes.

Support for multi-faceted charts is on the development pathway within Vega-Lite, so hopefully that will be supported in the near future.
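One way to check a saved spec before relying on container sizing is to look at its top-level keys. The sketch below assumes that a resizable spec has `"width": "container"` and none of the Vega-Lite composition keys (`hconcat`, `vconcat`, `concat`, `facet`, `repeat`). The function name and the two example dictionaries are hypothetical, trimmed-down stand-ins for the saved JSON files.

```python
# Top-level keys that mark a multi-view Vega-Lite spec.
COMPOSITION_KEYS = ("hconcat", "vconcat", "concat", "facet", "repeat")


def is_container_sized(spec):
    """Return True for a single-view spec whose width is set to 'container'."""
    multi_view = any(key in spec for key in COMPOSITION_KEYS)
    return (not multi_view) and spec.get("width") == "container"


# Hypothetical stand-ins for the scatter plot and the two-panel dashboard.
scatter_spec = {"mark": "point", "width": "container"}
dashboard_spec = {
    "hconcat": [
        {"mark": "rect", "width": 300},
        {"mark": "bar", "width": 300},
    ]
}

print(is_container_sized(scatter_spec))    # True
print(is_container_sized(dashboard_spec))  # False
```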

## 3. Save from Python -- using locally stored data

Like before, let's remake the dashboard with data that is local:

In [8]:
mobility = pd.read_csv('https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_data/main/mobility.csv')

In [9]:
mobility

Unnamed: 0,ID,Name,Mobility,State,Population,Urban,Black,Seg_racial,Seg_income,Seg_poverty,...,Migration_out,Foreign_born,Social_capital,Religious,Violent_crime,Single_mothers,Divorced,Married,Longitude,Latitude
0,100,Johnson City,0.062199,TN,576081,1,0.021,0.090,0.035,0.030,...,0.005,0.012,-0.298,0.514,0.001,0.190,0.110,0.601,-82.436386,36.470371
1,200,Morristown,0.053652,TN,227816,1,0.020,0.093,0.026,0.028,...,0.014,0.023,-0.767,0.544,0.002,0.185,0.116,0.613,-83.407249,36.096539
2,301,Middlesborough,0.072635,TN,66708,0,0.015,0.064,0.024,0.015,...,0.012,0.007,-1.270,0.668,0.001,0.211,0.113,0.590,-83.535332,36.551540
3,302,Knoxville,0.056281,TN,727600,1,0.056,0.210,0.092,0.084,...,0.014,0.020,-0.222,0.602,0.001,0.206,0.114,0.575,-84.242790,35.952259
4,401,Winston-Salem,0.044801,NC,493180,1,0.174,0.262,0.072,0.061,...,0.019,0.053,-0.018,0.488,0.003,0.220,0.092,0.586,-80.505333,36.081276
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
736,39205,John Day,0.115854,OR,7935,0,0.001,0.002,0.002,0.004,...,0.015,0.015,0.208,0.331,0.000,0.195,0.108,0.628,-118.531197,44.594025
737,39301,Friday Harbor,0.101695,WA,14077,0,0.002,0.010,0.012,0.022,...,0.021,0.060,2.716,0.171,0.000,0.219,0.148,0.604,-123.052956,48.525379
738,39302,Bellingham,0.115575,WA,166814,1,0.006,0.057,0.046,0.051,...,0.028,0.098,0.063,0.294,0.001,0.195,0.099,0.538,-121.263443,48.831154
739,39303,Port Angeles,0.085840,WA,90478,0,0.007,0.122,0.025,0.028,...,0.021,0.043,0.476,0.260,0.001,0.235,0.124,0.598,-123.544647,47.912067


In [10]:
brush = alt.selection_interval(encodings=['x','y'])

chart1 = alt.Chart(mobility).mark_rect().encode(
    alt.X("Student_teacher_ratio:Q", bin=alt.Bin(maxbins=10)),
    alt.Y("State:O"),
    alt.Color("count()")
).properties(
   height=400
).add_params(
        brush
)

chart2 = alt.Chart(mobility).mark_bar().encode(
    alt.X("Mobility:Q", bin=True,axis=alt.Axis(title='Mobility Score')),
    alt.Y('count()', axis=alt.Axis(title='Mobility Score Distribution'))
).transform_filter(
    brush
)

chart3 = (chart1.properties(width=300) | chart2.properties(width=300))

chart3

We can save like before:

In [11]:
chart3.save(myJekyllDir+"altair_mobility_data_dashboard.json")

In principle, there is nothing different between the *visual* aspects of the chart, but what are the pros and cons of using locally stored data?

One pro is that we can plot modified data: instead of doing all data manipulations in vega-lite/Altair syntax, we can use Python, which might flow more naturally from a data-analysis workflow.

One con is file size.  Let's take a look at this:

In [12]:
import os

In [13]:
os.stat(myJekyllDir+"altair_mobility_dashboard.json").st_size # size in bytes - so ~1kb

954

In [14]:
os.stat(myJekyllDir+"altair_mobility_data_dashboard.json").st_size # so ~700kb

699495

So this file size is much larger -- this can be an issue for saving and loading our interactive plots on our webpage!
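Raw byte counts are easy to misread, so a small formatter can help when comparing files. This is a minimal sketch using decimal units (1 kB = 1000 bytes); `human_size` is a hypothetical helper, and the byte counts passed in are the two reported above.

```python
def human_size(num_bytes):
    """Format a byte count using decimal units (1 kB = 1000 B)."""
    size = float(num_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1000 or unit == "GB":
            # Whole bytes need no decimal place; larger units get one.
            return f"{int(size)} B" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1000


print(human_size(954))     # 954 B
print(human_size(699495))  # 699.5 kB
```

To apply it to a saved file, pass in the `st_size` value returned by `os.stat`.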

Right now, there is no real reason to use this last method -- the data isn't being manipulated at all, so we don't need to save it with our JSON output.

However, if we did have local data that we did want to use, one way around the file size could be passing a subset of the columns to our plot -- only the ones we are actually using:

In [15]:
mobility_small = mobility[['Student_teacher_ratio','State', 'Mobility']]

In [16]:
brush = alt.selection_interval(encodings=['x','y'])

chart1 = alt.Chart(mobility_small).mark_rect().encode(
    alt.X("Student_teacher_ratio:Q", bin=alt.Bin(maxbins=10)),
    alt.Y("State:O"),
    alt.Color("count()")
).properties(
   height=400
).add_params(
        brush
)

chart2 = alt.Chart(mobility_small).mark_bar().encode(
    alt.X("Mobility:Q", bin=True,axis=alt.Axis(title='Mobility Score')),
    alt.Y('count()', axis=alt.Axis(title='Mobility Score Distribution'))
).transform_filter(
    brush
)

chart4 = (chart1.properties(width=300) | chart2.properties(width=300))

chart4

In [17]:
chart4.save(myJekyllDir+"altair_mobility_data_small_dashboard.json")

In [18]:
os.stat(myJekyllDir+"altair_mobility_data_small_dashboard.json").st_size # so ~55kb

55609
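To see why dropping unused columns shrinks the file, recall that a spec built from a DataFrame embeds the rows and columns it was given as JSON. The toy sketch below uses made-up records (not the mobility data) and only the standard library to compare the serialized size with and without the unused columns. The `Unused_*` column names are hypothetical.

```python
import json

# Made-up records standing in for DataFrame rows; the values are not real data.
rows = [
    {
        "State": "TN",
        "Mobility": 0.06,
        "Student_teacher_ratio": 15.0,
        "Unused_a": 0.123456,
        "Unused_b": 0.654321,
        "Unused_c": "placeholder",
    }
    for _ in range(100)
]

# Keep only the columns the charts actually encode.
used = ("State", "Mobility", "Student_teacher_ratio")
rows_small = [{key: row[key] for key in used} for row in rows]

full_bytes = len(json.dumps(rows).encode("utf-8"))
small_bytes = len(json.dumps(rows_small).encode("utf-8"))

print(full_bytes, small_bytes)
print(small_bytes < full_bytes)  # True
```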