
Library Instruction: Software / Data Carpentry

Descriptions and resources for workshops offered by the Caltech Library.

Carpentry @ Caltech Library

The Caltech Library is proud to be a member of the Carpentry Foundation. The Carpentry initiatives organize hands-on workshops to teach researchers essential computing skills. Open-source platforms are taught wherever possible, including OpenRefine, Git/GitHub, Python, and R, among others. To date, we have five instructors trained to offer Software and Data Carpentry workshops. Here, you can learn more about who we are and what we teach. We periodically offer two-day trainings hosted at the Library; check our schedule to see if one is coming up.

The Caltech Library is also happy to partner with campus groups to design and host Carpentry workshops for their target audiences. Contact us if you would like more information on co-sponsoring a workshop for your group!

Past Workshops:

Data Carpentry Workshop with R
Held July 20-21, 2017

An Introduction to Data Carpentry for Humanists
Held May 5-6, 2017

Data Carpentry Workshop for Caltech Graduate Students
Co-Sponsored by the Caltech Graduate Student Association (GSA)
Held April 26-27, 2017

Software Carpentry

Lessons Offered:

The Unix Shell: Covers the basics of file systems and the shell, which are fundamental to using a range of other tools and computing resources.

Version Control with Git: Covers the basics of using the Git version control environment from the command line.

Programming with Python: Teaches basic programming concepts using Python.

Programming with R: Teaches basic programming concepts using R.

R for Reproducible Scientific Analysis: Teaches the fundamentals of using R to write modular code, and covers best practices for using R for data analysis.

Using Databases and SQL: Covers the basics of using a database to explore experimental data.

Data Carpentry

Lessons Offered:

Data Organization in Spreadsheets: Covers good data entry practices, formatting data tables in spreadsheets, avoiding common formatting mistakes, handling dates in spreadsheets, basic quality control and data manipulation, and exporting data.

Data Cleaning with OpenRefine: Teaches how to use OpenRefine to effectively clean and format data and automatically track any changes that you make.

Data Analysis and Visualization in Python: Covers basic Python syntax, the Jupyter notebook interface, importing CSV files, using the pandas package to work with data frames and calculating summary information from them, and a brief introduction to plotting.

Data Analysis and Visualization in R: Covers basic R syntax, the RStudio interface, importing CSV files, the structure of data frames and calculating summary statistics from them, factors, and a brief introduction to plotting.

Data Management with SQL: Covers what relational databases are, how to load data into them, and how to query databases to extract just the information that you need.
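As a rough illustration of what the SQL lesson describes (loading data into a relational database and querying for just the information you need), here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and records are hypothetical and are not taken from the workshop materials.

```python
import sqlite3

# Hypothetical survey records: (species code, plot number, weight in grams).
records = [
    ("DM", 1, 40.0),
    ("DM", 2, 48.0),
    ("PF", 1, 7.0),
    ("PF", 2, 9.0),
    ("NL", 2, 160.0),
]


def mean_weight_by_species(rows, min_weight=0.0):
    """Load rows into an in-memory database and return mean weight per species."""
    conn = sqlite3.connect(":memory:")
    try:
        # Create a table and load the data into it.
        conn.execute(
            "CREATE TABLE surveys (species TEXT, plot INTEGER, weight REAL)"
        )
        conn.executemany("INSERT INTO surveys VALUES (?, ?, ?)", rows)
        # Query only what is needed: one averaged row per species.
        cursor = conn.execute(
            "SELECT species, AVG(weight) FROM surveys "
            "WHERE weight >= ? GROUP BY species ORDER BY species",
            (min_weight,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


for species, mean_weight in mean_weight_by_species(records):
    print(species, mean_weight)
```

The in-memory database keeps the sketch self-contained; a workshop exercise would more likely use a database file on disk.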

Meet Our Instructors!

Gail Clement, Stephen Davison, Tommy Keswick, Tom Morrell, Donna Wrublewski

The Unix Shell (SWC)

Data Organization in Spreadsheets (DC)

Data Cleaning with OpenRefine (DC)

Author Carpentry (Multiple Lessons)

Using Databases and SQL (SWC)

Data Management with SQL (DC)

The Unix Shell (SWC)

Version Control with Git (SWC)

The Unix Shell (SWC)

Version Control with Git (SWC)

Programming with Python (SWC)

Data Analysis and Visualization in Python (DC)

Author Carpentry (Multiple Lessons)

Programming with R (SWC)

R for Reproducible Scientific Analysis (SWC)

Data Analysis and Visualization in R (DC)