BIOF 439: Data Visualization using R
Number of credits: 1
Summer 2021
Syllabus
Abhijit Dasgupta, PhD
Prerequisites, if any: None
This course will demonstrate and practice the use of R in creating and presenting data visualizations. After a short introduction to R tools, especially the tidyverse packages, we will look at good principles for data visualization, examples of good and bad visualizations, and the use of ggplot2 to create static publication-quality graphs. We will also explore modern web-based interactive graphics using the htmlwidgets packages as well as dynamic graphics and dashboards that can be created using flexdashboard and Shiny. We will explore ways in which bioinformatics data can be presented using static and dynamic visualizations. Finally, we will use RMarkdown and several packages to develop web pages for presenting data visualizations as self-explanatory, and possibly interactive, storyboards.
All course materials (lectures, videos, homework, discussions) will be available on the class Canvas site.
Required and Recommended Texts: There are no required texts for this class. However, the following texts, freely available online, will be used for reference:
Required Journal Articles: There are no required journal articles for this class.
When you complete the course successfully, you will be able to:
This class will communicate primarily via Slack.
You will see a channel #spring2021-a. Please join this channel. Please use Slack for broadcasting messages, answering questions and the like. When you ask a question, please ask it in the #general or #spring2021-a channel, so others can learn as well. I should respond within 24 hours.
The Canvas Discussion forum will be used for graded class discussions.
I will also hold virtual office hours on Zoom, time TBD, one evening a week.
This course will run for 7 weeks. Of these, there will be instructional material, including videos, lectures, slides, discussion, tutorials and homework, for 6 of the weeks. The seventh week will be dedicated to a culminating project that will be submitted by the end of the seventh week. Your grade will be determined by class participation, i.e., discussions & Slack participation (30%), homework assignments (50%) and the final project (20%).
Detailed course outline
flexdashboard package
Readings: R4DS Chapters 4 and 27
Resource: PDV Chapters 2-4
Theme: Descriptive plots
Theme: Analytic plots
Theme: R for Bioinformatics
Theme: Dynamic visualization
Theme: Presenting your graphs
Reference: R Markdown: The Definitive Guide by Yihui Xie, J.J. Allaire and Garrett Grolemund (available online)
flexdashboard
Class presentations and discussion
I believe in teaching practical methods for using R as a tool in achieving informative data-driven visualizations. As such, this course is opinionated, in that I make certain choices of what parts of R to teach to make things most accessible and useful. The course will be a mixture of didactic lessons, interactive tutorials and exercises, culminating in a final project that brings different aspects of the course together into a single dashboard.
R is a tool to be used, not studied, so I promote active learning by doing: using R regularly throughout the course to become familiar with it, its advantages and disadvantages, and its capabilities for visualizing data. Students will be expected to create simple dashboards that tell their data story from the first day, thus learning how to apply what they learn to their own workflows and work environments.
Methods for students to achieve success
Time commitment: Daily practice for even 30 minutes is good, but for particular class work I don’t expect more than a couple of hours a week.
Students can be successful in this course by following the teaching materials, participating in discussions on Slack, and practicing. R is a language in the same way that French or Japanese is a language (you’re just talking to a computer), and so the only way to retain the knowledge gained in this class is to use it. The exercises and tutorials are meant to get you used to using R for different purposes, so please do them diligently.
This course should take around 4-6 hours of time weekly, depending on the week.
The most important thing is to be polite, considerate and empathetic in all communications and discussions. There are different levels of knowledge about R in this class, and so some questions may appear trivial to some but are essential for others. Be kind, and if you can help a classmate, do so with grace and civility. The class learns best if we all help and support each other.
This course adheres to all FAES policies described in the academic catalog and student handbook, including the Academic Integrity policy listed on page 11 of the academic catalog and student handbook. Be certain that you are knowledgeable about all of the policies listed in this syllabus, in the academic catalog and student handbook, and on the FAES website. As a student in this program, you are bound by those policies.
All course materials are the property of FAES and are to be used for the student’s individual academic purpose only. Any dissemination, copying, reproducing, modification, displaying, or transmitting of any course material for any other purpose is prohibited, will be considered misconduct, and may be cause for disciplinary action. In addition, encouraging academic dishonesty by distributing information about course materials or assignments which would give an unfair advantage to others may violate the FAES Academic Integrity policy. Course materials may not be exchanged or distributed for commercial purposes, for compensation, or for any purpose other than use by students enrolled in the course. Distributions of course materials may be subject to disciplinary action.
FAES is committed to providing reasonable and appropriate accommodations to students with disabilities. Students with documented disabilities should contact Dr. Mindy Maris, Assistant Dean of Academic Programs.
Students are responsible for understanding FAES policies, procedures, and deadlines regarding dropping or withdrawing from the course or switching to audit status.
FAES adheres to the NIH’s harassment policies, which can be found at https://hr.nih.gov/working-nih/civil/statement-workplace-harassment. Faculty and students in FAES courses are responsible for being familiar with the NIH’s harassment policies and adhering to them.
It is in your best interest to use, question and understand all the instructional material provided, and to submit questions and homework in a timely manner. Since this course is completely asynchronous, there is no attendance required at particular times.
Participation will be judged through the assigned discussions as well as through activity on Slack.
Assignment submission is through Canvas. Each submission will consist of an R Markdown file and the corresponding HTML file. Both are required. Submitting only the R Markdown file doesn’t let us see the results easily, and submitting only the HTML file doesn’t let us evaluate your code. If you have trouble knitting the R Markdown to HTML, let me know and I can help. If it’s really impossible and you’re tearing your hair out, reach out to me by Saturday at the latest so I can see whether (a) I can help, or (b) a reasonable accommodation can be made. The latter will be a rarity, generally.
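One way to knit an R Markdown file to HTML from the R console is sketched below. It assumes the rmarkdown package and pandoc are installed; the file name homework1.Rmd and its contents are placeholders, written to a temporary folder only so that the example is self-contained.

```r
# Placeholder file in a temporary folder; use your own homework file instead.
rmd_file <- file.path(tempdir(), "homework1.Rmd")

# Build the code-chunk fence programmatically (three backticks).
fence <- strrep("`", 3)

# Write a tiny R Markdown document so the sketch can run on its own.
writeLines(c(
  "---",
  "title: \"Homework 1\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Knit to HTML only if rmarkdown and pandoc are available on this machine.
html_file <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # render() returns the path of the HTML file it created.
  html_file <- rmarkdown::render(rmd_file,
                                 output_format = "html_document",
                                 quiet = TRUE)
}
```

When knitting succeeds, html_file holds the path of the HTML file, which sits next to the .Rmd file. These are the two files that make up a submission.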
Homework is assigned at 10am each Monday and is due by 11:59pm the following Sunday.
No late submissions of homework or discussion are allowed. However, for homework, I will only use the top 4 scores for your grade, so you will have the option of not submitting or doing poorly on 2 of them.
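The arithmetic can be sketched in R. This is an illustration only: all scores are made up, a 0-100 scale is assumed, six homework assignments are assumed (four counted plus two dropped), and averaging the four best scores with equal weight is an assumption rather than a stated rule. The component weights are the ones given earlier in this syllabus.

```r
# Made-up homework scores for six assignments (0-100 scale assumed);
# NA marks an assignment that was not submitted.
homework <- c(85, 92, NA, 78, 60, 95)

# Assumption: a missing submission counts as a score of zero.
homework[is.na(homework)] <- 0

# Keep the four best scores and average them (equal weighting assumed).
top4 <- sort(homework, decreasing = TRUE)[1:4]
homework_score <- mean(top4)

# Made-up scores for the other two components.
participation_score <- 90
project_score <- 88

# Weights from this syllabus: participation 30%, homework 50%, final project 20%.
weights <- c(participation = 0.30, homework = 0.50, project = 0.20)
scores <- c(participation = participation_score,
            homework = homework_score,
            project = project_score)

# Weighted sum of the three components.
final_grade <- sum(weights * scores)
```

With these made-up numbers the two lowest homework scores (60 and the missing one) are dropped, homework_score is 87.5, and final_grade is 88.35.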
The guidelines for submitting assignments will be posted as a screencast during the first week of class.
We will get your assignment grades and feedback to you within a week of submission.
Grades will be based on the following requirements:
Final project