Below, you will find download info, tutorials, readings and links to other useful websites – but first: an analogy that I found helpful in my teaching, for instance when students forget to load a package after installing it, forget to check data sets for typos etc. before carrying out analyses, …

P-Values, Replication, Reproducibility, and Open Science

There is an ongoing debate about how we can avoid the abuse of statistics and ensure that research is as unbiased as possible. You can find some discussion in recent articles, and more references and links on other pages of the Experimentalfieldlinguistics-Blog:

R(studio): Downloads and Info

R: http://www.r-project.org/

RStudio: http://www.rstudio.com/

Wikipedia article about R: http://en.wikipedia.org/wiki/R_(programming_language)

Resource Sites, Blogs, and Groups

R Seek: targeted searches for R resources and functions: http://rseek.org/

R-statistics Net: an educational resource for all things related to the R language and its applications in advanced statistical computing and machine learning: http://rstatistics.net/

Quick R / Statsmethods Net: a resource site aimed at people with some statistical background who want to learn R: http://www.statmethods.net/

R-Bloggers: http://www.r-bloggers.com/

Data Science Central: an online resource for big data practitioners: http://www.datasciencecentral.com/

Shravan Vasishth’s website: http://www.ling.uni-potsdam.de/~vasishth/

Statistics 545: http://stat545-ubc.github.io/index.html

Data Sharkie blog: https://datasharkie.com



MOOCs, Youtube, Webinars

Youtube Channels and Playlists:

Data Camp: https://www.youtube.com/channel/UC79Gv3mYp6zKiSwYemEik9A/featured

MarinStatsLectures: https://www.youtube.com/user/marinstatlectures/featured

R-Programming: https://www.youtube.com/user/pradeeppandu

LearnR: https://www.youtube.com/user/TheLearnR

The New Boston: https://www.youtube.com/playlist?list=PL6gx4Cwl9DGCzVMGCPi1kwvABu7eWv08P

Christoph Scherber: https://www.youtube.com/channel/UCREyQL8aE7mLWkb6_KOMKIg

Data Science Central Webinars: http://www.datasciencecentral.com/video/video/listFeatured

MOOCs:

EdX (online courses from MITx, HarvardX, BerkeleyX, UTx, etc.): https://www.edx.org/

Khan Academy: https://www.khanacademy.org/

Coursera: https://www.coursera.org/

Udacity: https://www.udacity.com/

Data Camp: https://www.datacamp.com/

Searchable MOOC lists: https://www.mooc-list.com https://www.class-central.com/subject/statistics

General Introduction, Cheat Sheets, and Overview

RStudio Cheat Sheets: https://www.rstudio.com/resources/cheatsheets/

Datascience Cheat Sheet (including info about data formats, tools, tutorial links, etc.) https://www.datasciencecentral.com/profiles/blogs/20-cheat-sheets-python-ml-data-sciencet

Useful Packages for Linguistics & Psychology

Tidyverse packages (Hadley Wickham): https://www.tidyverse.org/packages/

Note: the tidyverse is based on the following principles: each variable is a column, each observation is a row, and each type of observational unit is a table. For instance:

dplyr for work with data frames

manual: https://cran.r-project.org/web/packages/dplyr/dplyr.pdf

Introduction to dplyr: https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html

dplyr tutorial: http://genomicsclass.github.io/book/pages/dplyr_tutorial.htm

ggplot2 for graphics. Manual: https://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf

tidyr for tidying up data, data wrangling: https://www.youtube.com/watch?v=RbUWwuJeUC8

stringr for work with strings (very useful for corpus work and work on production experiment results; for more info, see below)

List of useful packages: http://citizen-statistician.org/2015/08/09/r-packages-for-undergraduate-stat-ed/; in a tutorial for R in typology: http://www.comparativelinguistics.uzh.ch/bickel-files/ezr.html

languageR (with psycholinguistic data sets): https://cran.r-project.org/web/packages/languageR/languageR.pdf

PraatR, a package for controlling Praat: http://allthingslinguistic.com/post/103840914592/praatr-an-r-package-for-controlling-praat

The childes-db project is an open database storing data from the Child Language Database (CHILDES) in an easily accessible, tabular format. Researchers can interface with CHILDES through interactive visualizations or the childesr R package: http://childes-db.stanford.edu/. For some worked examples, see this publication.

lme4 for mixed effects regression models (for more info, see below)

prettyR or gmodels (crosstabs, etc. for people with SPSS withdrawal symptoms)

For importing data: foreign, readr.
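
The tidy-data principles noted above (each variable a column, each observation a row) can be illustrated with a small sketch. The data set, its column names, and the values are invented for illustration only. The summary uses base R so that it runs without extra packages; the commented lines show how the same step might look with dplyr, assuming it is installed.

```r
# Hypothetical reaction-time data in tidy form:
# each variable is a column, each observation (one trial) is a row.
rt_data <- data.frame(
  participant = c("p1", "p1", "p2", "p2", "p3", "p3"),
  condition   = c("match", "mismatch", "match", "mismatch", "match", "mismatch"),
  rt          = c(510, 640, 480, 600, 530, 650)
)

# Mean reaction time per condition, using base R.
mean_rt <- aggregate(rt ~ condition, data = rt_data, FUN = mean)
print(mean_rt)

# With dplyr installed, the same summary could be written as:
# library(dplyr)
# rt_data %>% group_by(condition) %>% summarise(rt = mean(rt))
```

Because the data are already tidy, grouping by any other column (for example participant) only requires changing the formula.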

Working with Strings and Regular Expressions (RegEx)

Regular Expression Tutorial: http://www.regular-expressions.info/tutorial.html

Regular Expressions with The R Language http://www.regular-expressions.info/rlanguage.html

Handling and Processing Strings in R (Gaston Sanchez): http://gastonsanchez.com/Handling_and_Processing_Strings_in_R.pdf

Introduction to String Matching and Modification Using R and Regular Expressions (Svetlana Eden): http://biostat.mc.vanderbilt.edu/wiki/pub/Main/SvetlanaEdenRFiles/regExprTalk.pdf

stringr for string location and manipulation:

Manual: https://cran.r-project.org/web/packages/stringr/stringr.pdf

Overview: http://journal.r-project.org/archive/2010-2/RJournal_2010-2_Wickham.pdf

Easy examples: http://thebiobucket.blogspot.co.uk/2011/11/simple-but-propable-useful-regex.html#more
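
As a minimal sketch of what regular expressions can do with experimental or corpus data, the following uses base R functions only. The example responses are invented, and the patterns are just one possible way to write these searches.

```r
# Hypothetical transcribed responses from a production experiment.
responses <- c("the dog barked", "a cat sleeps",
               "the dogs were barking", "um the bird")

# Which responses contain "dog" or "dogs" as a whole word?
has_dog <- grepl("\\bdogs?\\b", responses, perl = TRUE)

# Remove a leading hesitation marker ("um" or "uh") and the spaces after it.
cleaned <- sub("^(um|uh)\\s+", "", responses, perl = TRUE)

# Extract the first word ending in -ed or -ing from each response.
# regmatches() returns only the responses that actually contain a match.
m <- regexpr("\\b\\w+(ed|ing)\\b", responses, perl = TRUE)
verbs <- regmatches(responses, m)

print(has_dog)
print(cleaned)
print(verbs)
```

If stringr is installed, str_detect(), str_replace(), and str_extract() cover the same three steps with a more uniform interface.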



Importing and Exploring Data

see info about tidyverse above

This shows you how to import a data set in txt format and gives hints for other formats. It also demonstrates how you can use Notepad to create a txt file with data in columns that can be easily imported into R: https://www.r-bloggers.com/importing-data-into-r/

This is a really comprehensive guide to data import (with a link to further tutorials, especially for xls):

http://www.r-bloggers.com/this-r-data-import-tutorial-is-everything-you-need/

Comprehensive Guide for Data Exploration in R:

https://www.analyticsvidhya.com/blog/2015/04/comprehensive-guide-data-exploration-r/

An introduction to sorting, merging, etc.: http://www.r-bloggers.com/working-with-the-data-frame-in-r/
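
Importing a txt file with data in columns, as described above, might look like the following sketch. The file is written to a temporary location first so that the example is self-contained; the column names and values are invented. In practice you would pass the path to your own file instead.

```r
# Write a small, hypothetical tab-separated file to a temporary location.
tmp <- tempfile(fileext = ".txt")
writeLines(c("participant\tcondition\trt",
             "p1\tmatch\t510",
             "p2\tmismatch\t640"), tmp)

# read.delim() expects tab-separated columns and a header row by default.
imported <- read.delim(tmp, header = TRUE, stringsAsFactors = FALSE)

# First checks after import: structure, first rows, and a numeric summary.
str(imported)
head(imported)
summary(imported$rt)
```

Checking the structure right after import is a quick way to spot columns that were read as text because of typos in the data.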

Saving and Exporting

In RStudio, you should create a project and save it when you leave RStudio (you will be asked whether you want to save). This will save your workspace and keep the objects that you have created (e.g. through data import). It will also save your history (your list of commands, which you can see in the “history” window). For more info about projects, see: https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects. This tutorial can be helpful: https://www.stat.ubc.ca/~jenny/STAT545A/block01_basicsWorkspaceWorkingDirProject.html#workspace-and-working-directory

In R, you can save the console content (your commands and the output) as a text file quite straightforwardly, using the save file option.

In RStudio, things are a bit more complicated: You can copy and paste the console content to a text editor and save it there. You can use the save option in your history window for your list of commands, and you can use “sink” for your outputs, e.g.

> sink("sink-examp.txt")
> 3+4
> sink()

This will create a text file with your output. For this example, this is a single line ([1] 7), but it could also be a list of lines or a table etc.

You can use Markdown to publish your data: https://rmarkdown.rstudio.com/.



Descriptive Statistics and Basic Test Statistics

See info about tidyverse above

Descriptive Statistics: http://www.statmethods.net/stats/descriptives.html

Frequencies and Crosstabs: http://www.statmethods.net/stats/frequencies.html

The “apply-family” of grouping functions: http://www.r-bloggers.com/using-apply-sapply-lapply-in-r/ http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega
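
The functions covered in these links can be tried out on a small data set. The following is a sketch with invented data and column names; it shows a numeric summary, a crosstab, and two members of the apply family.

```r
# Hypothetical data: two speaker groups, answer accuracy, reaction time, age.
scores <- data.frame(
  group  = c("L1", "L1", "L1", "L2", "L2", "L2"),
  answer = c("correct", "correct", "wrong", "correct", "wrong", "wrong"),
  rt     = c(500, 520, 610, 580, 640, 700),
  age    = c(21, 34, 28, 25, 30, 42)
)

# Descriptive statistics for one numeric variable.
summary(scores$rt)
sd(scores$rt)

# Frequencies and crosstabs: group by answer.
crosstab <- table(scores$group, scores$answer)
print(crosstab)

# tapply(): apply a function to one variable, split by a grouping variable.
group_means <- tapply(scores$rt, scores$group, mean)
print(group_means)

# sapply(): apply a function to each of several columns.
col_means <- sapply(scores[, c("rt", "age")], mean)
print(col_means)
```

tapply() answers "one variable by group", while sapply() answers "the same function over several columns"; the linked posts discuss the remaining variants.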



(Mixed Effects) Regression Models

Vowel Analysis

J. Stanley’s Tutorial: http://joeystanley.com/blog/making-vowel-plots-in-r-part-1

Thomas Kettig and Bodo Winter’s Canadian vowel shift analysis:

paper (2017): https://www.cambridge.org/core/journals/language-variation-and-change/article/producing-and-perceiving-the-canadian-vowel-shift-evidence-from-a-montreal-community/A45A2F348CBC7AA652035F17177AFE30

materials and scripts: https://github.com/bodowinter/canadian_vowel_shift_analysis

Online R and Statistics textbooks

See also my Statistics Reading List

R vs. Python: Free books etc.

http://ucanalytics.com/blogs/r-vs-python-comparison-and-awsome-books-free-pdfs-to-learn-them/

You can also look at my Pinterest or teaching material list, but first – how about some cooking with R?