In this series of posts, we reflect on the Methodology Workshop for Natural Language Programming, coordinated by Crow developer Mark Fullmer and hosted at Purdue.
In our first session of the workshop, our Crow team gained a basic understanding of how to approach coding. The principles we took away from the session gave us a solid theoretical framework on which to build practical coding skills. We’ll be learning Python, given its simplicity, flexibility, and suitability for the text processing that is part of Crow research.
Principle 1: Automate everything. Automating as much of our data processing as possible not only increases our efficiency and accuracy, it also lets us plug pre-programmed segments into future projects with minimal fuss or extra effort.
Principle 2: Separation of concerns. Like many things in life, programming is easier when broken down into small steps, each performing a different function, like assembly lines in a factory. Step by step, we worked through an example to consider the process of writing code. First, we separated the text into individual words using a delimiter, such as a comma. Next, we ran the words through the program, producing a frequency count for each word. Lastly, we displayed the list for frequency analysis. Splitting our code into separate “factories” provides two advantages: 1) The code can easily be recycled for future programs. It is much more efficient to tweak a small segment of code than to rewrite an entire program. 2) It’s easier to test the accuracy of our program when it’s broken up into short code segments. Simply modify your test when you want to reach a different result for that portion of code.
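The three steps described above could be sketched in Python roughly as follows. This is only an illustration, not the code from the workshop: the function names (split_words, count_words, format_counts) and the sample text are our own assumptions.

```python
def split_words(text, delimiter=","):
    """Step 1: break the text into individual words at each delimiter."""
    pieces = (piece.strip() for piece in text.split(delimiter))
    return [piece for piece in pieces if piece]


def count_words(words):
    """Step 2: tally how many times each word appears."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def format_counts(counts):
    """Step 3: build display lines, most frequent word first."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{word}: {count}" for word, count in ordered]


# Hypothetical sample input, made up for this illustration.
sample = "crow, corpus, crow, text, corpus, crow"
for line in format_counts(count_words(split_words(sample))):
    print(line)
```

Because each “factory” is its own function, any one of them can be tested or swapped out without touching the other two.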
Principle 3: Don’t make assumptions. We learned that writing code on the assumption that the results we need now will be the same results we need tomorrow is a serious mistake. For instance, hardwiring a text processor to remove all apostrophes will make it useless if down the road we need to analyze possessive nouns. Instead, it is better to create an optional “factory” that can be removed or upgraded to obtain the desired result. Also, we shouldn’t assume that a computer can read the text in its current format. Elements such as capitalization, punctuation, and character encoding are not read by computers the way we read them, so they must be normalized before the text can be analyzed.
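One way to picture an optional “factory” is a cleaning function that takes a list of steps, so that apostrophe removal can be left out when possessives matter. This is a minimal sketch under our own assumptions: the names clean, lowercase, and remove_apostrophes are hypothetical and do not come from the workshop.

```python
def lowercase(text):
    """Normalize capitalization so 'Crow' and 'crow' count as one word."""
    return text.lower()


def remove_apostrophes(text):
    """Optional step: strip both straight and curly apostrophes."""
    return text.replace("'", "").replace("\u2019", "")


def clean(text, steps):
    """Apply each cleaning step in order; the caller decides which to use."""
    for step in steps:
        text = step(text)
    return text


# Hypothetical sample input, made up for this illustration.
sample = "The Writer's draft"
print(clean(sample, [lowercase, remove_apostrophes]))  # apostrophes removed
print(clean(sample, [lowercase]))  # possessive kept for later analysis
```

Nothing about apostrophes is hardwired here: dropping the step from the list is enough to keep possessive nouns intact.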
Principle 4: Avoid hardcoding. When labeling our different “factories,” we should leave room for flexibility, or else the name won’t match the function when we make changes. We must maintain a balance between generalization and specificity.
Principle 5: Keep it simple, stupid. A hallmark of good Pythonic code is that it uses the simplest methods, even if that requires writing more code.
Principles 6 and 7: Convention over configuration, and Write your code for the next programmer. Following standardized coding formats will make our code more accessible to other programmers than if we personalize code to our own preference. To further help other programmers decipher our work, it is helpful to add inline comments that explain difficult aspects of our code and to follow the syntactic conventions already in existence.
Principle 8: Don’t repeat ourselves. The goal with coding is to reuse and recycle. Instead of rewriting a slightly different version of the same code for multiple programs, we should write the code once, then modify it for different uses. Writing code is much like creating a résumé: build one, but tailor multiple drafts to different employers.
Principle 9: We don’t want to write new code unless it’s absolutely necessary. More than likely, someone else has already written the code we need, so there is no point in reinventing something we can borrow. This principle is the programmer’s version of “think smarter not harder.”
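Word counting is a good example of code we can borrow: Python’s standard library already includes collections.Counter, which tallies items for us. The word list below is a made-up sample for illustration.

```python
from collections import Counter

# Hypothetical sample data, made up for this illustration.
words = ["crow", "corpus", "crow", "text", "corpus", "crow"]

# Counter does the tallying, so we don't need to write our own loop.
counts = Counter(words)

# most_common(n) returns the n most frequent items with their counts.
for word, count in counts.most_common(2):
    print(f"{word}: {count}")
```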
We concluded our session by discussing these principles and articulating them in other ways to see how well we understood them. After learning more about the coding process, Crow members felt confident about moving forward into writing actual Python code. More on that in our next post!