The last regularly scheduled meeting of the DST4L class on November 26th, 2013 was devoted to course feedback (the originally scheduled topic “Putting it all Together: On the Web, PDF, Reporting” was changed to account for a short holiday week and the hackathon planned in January).
Chris Erdmann opened the session by reiterating the original intentions of the course. As Chris was listing the course intentions, I found myself checking off the goals as “achieved” in my head. Yes, I have a better understanding of the data lifecycle and how researchers use data. Yes, I learned some new tools that I will incorporate (or already have incorporated) into my workflow to make my work more efficient, upgrading my skill set in the process. Yes, I practiced working in a non-linear way.
Yet, despite all these successes, Chris opened the feedback part with the question, “What went wrong?” This was in keeping with the spirit of full disclosure and open communication for the class, but while I have many suggestions for how the course might have worked better for me as an individual, I really want to first acknowledge where I think the course has succeeded.
Exposure to tools
It is difficult to know which tool is best for a particular job when you’ve never been exposed to the possibilities. I had heard of Python before the course, but I had no idea how well-suited it is for batch text processing of metadata. I have created some sophisticated macro scripts comprising hundreds of lines of code, the results of which can also be accomplished using only a few lines of Python code, a Python library, and regular expressions. I still have a lot of learning and work to do in order to replace and augment the tools I’ve already created, but this was a game-changing revelation for me. I’ve already used (and recommended to others) OpenRefine for data clean-up and transformation projects. It’s possible I might try out some of the web scraping tools for certain projects. Having a better understanding of APIs will allow me to work more efficiently with existing structured metadata. Exposure to all of these tools was a definite success of the course.
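To illustrate the kind of batch text processing described above, here is a minimal sketch of cleaning up metadata strings with Python's built-in `re` module. The sample records and the `normalize` helper are hypothetical, made up for this example; they are not taken from the course materials or from any actual macro.

```python
import re

# Hypothetical sample records with inconsistent spacing around commas.
records = [
    "Smith,  John ,1950-",
    "  Doe, Jane,  1962-2010 ",
]

def normalize(record):
    """Trim the record, collapse whitespace runs, and tidy comma spacing."""
    # Strip leading/trailing whitespace and collapse internal runs to one space.
    record = re.sub(r"\s+", " ", record.strip())
    # Remove any space before a comma and put exactly one space after it.
    record = re.sub(r"\s*,\s*", ", ", record)
    return record

cleaned = [normalize(r) for r in records]

for line in cleaned:
    print(line)
```

The same two substitutions could be applied to every line of a file in a loop, which is the sort of task that might otherwise take a long macro script.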
Another important success of the class is the community that has formed around it. While I already knew many people going into the course, I also met many new people, and working with my classmates through some challenging learning topics has brought us closer together. I am hopeful that the Harvard Library will be able to form an ongoing learning community around the DST4L class (including the local data science community outside of Harvard) and that we will be able to continue to expand our knowledge by learning from and teaching each other new skills. I think there have been successful models for this concept in the Library’s past that we can learn from (the Macro Express Working Group and the Aleph Reporting Users Group come to mind), and I am excited to work with my DST4L colleagues to ensure that this local community thrives moving forward.
As for “What went wrong?”, I could probably expound on how the class could have been better structured to suit my specific needs, but I think that would miss part of the point of the course. The idea was to expose us to things that we weren’t familiar with or that might expand our understanding of our roles in the library. I think that, overall, the course was effective in this regard.
For my part, I also would have liked:
- the option to work individually on my own project (mainly due to time constraints)
- more homework exercises (e.g. reading and writing files in Python, regular expression challenges, linked data in OpenRefine)
- greater exposure to pymarc
The scope of topics for the course was ambitious and there was no realistic way to learn in depth everything that we covered in the timeframe that we were together. I tried to supplement my class exposure to Python with Codecademy and Lynda.com tutorials and quickly found that there simply isn’t enough time in the day to juggle work, the class, a group project, and extra tutorials. My time outside of class might have been better spent trying to replicate in Python one of my existing text processing macros rather than working on a group project or systematically trying to learn Python.
As it is, I will need to go back to certain sections of the course, re-read the class notes, and figure out where I might be able to get more training in certain skills. There is tremendous value in now knowing what more I want to learn and the resources available for accomplishing this.
I’m excited to continue to build on some successes of the course and can see paths to achieving greater accomplishments because of what we’ve learned in DST4L. I’m grateful for the opportunity to have participated in the course and also to the course instructors and my colleagues for sharing their knowledge. I’m guessing many of my classmates feel the same way.