You'll be a better data scientist if you're comfortable working in a Unix (or Linux or Mac OS X) command-line environment and can make use of command-line tools. For example, most data needs to be cleaned, and often reshaped and combined with other data, before it can be easily viewed or used to obtain descriptive statistics or estimate multivariate models. Command-line tools provide flexible and efficient ways to handle these cleaning and data management tasks, regardless of how big the data is. The topics covered in the Tour of the Terminal workshop are: using an interactive shell; file system structure, pathnames, and permissions; pipelines, sequential execution, background execution, and I/O redirection; the Emacs text editor; commands, options, and arguments; building shell scripts; regular expressions; and the Unix stream editor (sed). So consider taking a break from the point-and-click interface and enhance your data science toolset.
May 5, 2014, 9:30 am – 12:00 pm