class: center, middle, inverse, title-slide

# Why data visualization?
### Abhijit Dasgupta
### BIOF 339

---
layout: true

<div class="my-header">
<span>BIOF 339: Practical R</span>
</div>

## The failed promise of data summaries

---

It is tempting to describe data using only summary statistics, like means, medians or standard deviations

Summaries, by their very nature, compress information, which means .heatinline[some information gets thrown out]

--

Visualization lets us see patterns in the data that would not be obvious from data summaries

It also lets us use our natural visual ability for pattern recognition to understand our data

---

### Anscombe's data

.pull-left[
![](01-why_plot_files/figure-html/05-Plotting-5-1.png)<!-- -->
]

.pull-right[

| Statistic  | Value |
|:-----------|------:|
| `mean(x)`  | 9     |
| `mean(y)`  | 7.5   |
| `var(x)`   | 11    |
| `var(y)`   | 4.13  |
| `cor(x,y)` | 0.82  |

All four data sets share nearly identical values of these summaries
]

---

### The [Datasaurus](https://www.autodeskresearch.com/publications/samestats) dozen

.pull-left[
![](01-why_plot_files/figure-html/05-Plotting-6-1.png)<!-- -->
]

.pull-right[

| Statistic  | Value |
|:-----------|------:|
| `mean(x)`  | 54.3  |
| `mean(y)`  | 47.8  |
| `var(x)`   | 281   |
| `var(y)`   | 725   |
| `cor(x,y)` | -0.07 |

All of the data sets share nearly identical values of these summaries
]

---

.left-column30[
A single point can completely change the computed correlation
]

.right-column70[
![](anim/corr1.gif)
]

---

Data summaries are meant to help distinguish between different data sets

Both Anscombe and Datasaurus show that this promise is not met by standard data summaries

The previous example shows how a single point can change a data summary

---
layout: true

<div class="my-header">
<span>BIOF 339: Practical R</span>
</div>

---

## Why visualize data?
- Summary statistics cannot always distinguish data sets
- Take advantage of humans' ability to visually recognize and remember patterns
- Find discrepancies in the data more easily

---
class: middle, center

# Some examples

---
layout: true

<div class="my-header">
<span>BIOF 339: Practical R</span>
</div>

## Gallery

---

<img src="01-why_plot_files/figure-html/05-Plotting-7-1.png" style="display: block; margin: auto;" />

???

This is a typical plot in scientific journals

---

### Kaplan-Meier plots

![](01-why_plot_files/figure-html/km1-1.png)<!-- -->

---

<img src="01-why_plot_files/figure-html/05-Plotting-8-1.png" style="display: block; margin: auto;" />

???

We can easily put ggplot figures together in a panel, with some annotations, using the cowplot package. These graphs could still use some cleanup.

---

<img src="01-why_plot_files/figure-html/05-Plotting-9-1.png" style="display: block; margin: auto;" />

???

This is a plot of the diamonds dataset that comes with ggplot2

---

### Manhattan plot

<img src="01-why_plot_files/figure-html/05-Plotting-10-1.png" width="600px" height="500px" style="display: block; margin: auto;" />

???

Manhattan plots are often used in GWAS. You can customize the annotations and the line marking the significance level

---

### Circular Manhattan plot

<img src="/Users/abhijit/ARAASTAT/Teaching/BIOF339/slides/lectures/week3/img/Circular-Manhattan.trait1.trait2.trait3.jpg" width="500" height="500" />

???

This gives a different representation of the Manhattan plot.
This example looks at three traits simultaneously

---

### Maps

.pull-left[
<iframe src="img/map2.html" width="1200" height="500" scrolling="no" seamless="seamless" frameBorder="0"> </iframe>
]

.pull-right[
<iframe src="img/map1.html" width="1200" height="500" scrolling="no" seamless="seamless" frameBorder="0"> </iframe>
]

---

### Heatmap

.pull-left[
![](01-why_plot_files/figure-html/heatmap-1.png)<!-- -->
]

.pull-right[
![](01-why_plot_files/figure-html/hm2-1.png)<!-- -->
]

---

### OncoPrint

![](01-why_plot_files/figure-html/unnamed-chunk-3-1.png)<!-- -->

---

### Interactive graphs
<!---<iframe src="why_plot_files/pl.html" width="1200" height="500" scrolling="yes" seamless="seamless" frameBorder="0"> </iframe> --->

???

These graphs are clickable. For example, click on a symbol in the legend, or drag the mouse over a region with the left button held down.

---

## Network graphs

<iframe src="anim/fn.html" width="1200" height="500" scrolling="no" seamless="seamless" frameBorder="0"> </iframe>

---

## Animated graphs

![:scale 40%](anim/gapminder.gif)

---
layout: true

<div class="my-header">
<span>BIOF 339: Practical R</span>
</div>

---
class: middle, center, inverse

# For more in-depth looks at data viz, consider BIOF 439 in the Spring
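
---

### Appendix: checking the summaries in R

The Anscombe summaries quoted earlier in the deck can be reproduced with a few lines of base R. This is a minimal sketch, not necessarily how the slide's table was built: it assumes the built-in `anscombe` data frame (columns `x1`..`x4`, `y1`..`y4`), and the second part uses made-up illustrative data (not the data behind the animation) to show a single point changing a correlation.

```r
# `anscombe` ships with base R (datasets package): columns x1..x4 and y1..y4.
# Compute the same five summaries for each of the four data sets.
summaries <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y),
    var_x = var(x), var_y = var(y), cor_xy = cor(x, y))
})
colnames(summaries) <- paste0("set", 1:4)
print(round(summaries, 2))

# Illustrative (made-up) data: one extreme point turns a weak negative
# correlation into a strong positive one.
x <- 1:10
y <- rep(c(1, -1), 5)
cor_before <- cor(x, y)                # weakly negative
cor_after  <- cor(c(x, 30), c(y, 30))  # strongly positive
print(c(before = cor_before, after = cor_after))
```

Each column of `summaries` is one data set, so near-identical columns mean the summaries alone cannot tell the four data sets apart; plotting `y` against `x` for each set is what reveals the differences.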