Comp 15 Code Style Guide
How to write clear, concise, and modular C++ for Comp 15
"A computer language . . . is a novel formal
medium for expressing ideas about methodology, not just a way to
get a computer to perform operations. Programs are written for
people to read, and only incidentally for machines to
execute."
-- Structure and Interpretation of Computer Programs Harold Abelson, Gerald J. Sussman, Julie Sussman
A program is a written description of your
approach to solving a problem. It is a description that a computer
can execute, but human readers are the target audience for whom
you write. The compiler doesn't care about your variable names,
code formatting, and just ignores comments. People on the other
hand rely on all those things, and you want them to understand and
have confidence in your solution! Therefore, endeavor to write
clear, concise, modular code.
All code you submit will be graded for structure
and organization (including readability) as well as for
functionality. Formatting matters. When you submit your work, it
should be beautiful, and there should be no question of your
dedication and correctness. Prepare it as if you were writing up
an example for a good textbook. You wouldn't use a book that had
no chapters, figures, captions, section headings, paragraph
breaks, etc. Similarly, no one wants to read unorganized,
undocumented code with huge functions and 500 character
lines.
Just as with a paper for a literature course, the
first draft of a program that works is not usually the version to
submit. We allocate points of your homework/project grades for
code that meets the criteria for readability. For example, we
expect that you will clearly and consistently indent your code and
adhere to an 80 column limit. We will also grade your internal
documentation.
This document contains requirements and also hints for how to write programs that others, including the graders, will find easy to read. No document can be complete, and if you ever have questions about what is readable and what is not, please come see us!
Furthermore, although some of these style
guidelines are specific to our course, and others would not be
applicable to all programming languages, writing readable, concise
code is an extremely valuable skill that is important to practice
early in your CS career: it will benefit you forever. No matter
whom you're writing your code for, they will appreciate being able
to read it without difficulty.
Finally, this style guide is written for Comp 15 students, and refers to key Comp 15 concepts. Therefore, it's important to continually refer back to this style guide as you progress through the course and gain a better understanding of the ideas that will contextualize these guidelines.
Table of Contents
Internal Documentation (Comments)
Internal documentation comes mainly in the form
of comments, and is crucial to having easy-to-understand
code. Ideally, it would allow a reader with little to no prior
knowledge of C++ to understand the gist of what is happening in
your code, and why. For this reason, much of your internal
documentation will focus on the whys of
your program — how does a given section of code contribute
to the overall functionality of your program? This kind of
documentation is important to include at many levels, from the
README that describes your project as a whole, to the files that
make up your program, to the individual functions you write, and
even down to short but potentially confusing snippets of
code.
There are two options for when to comment your code:
- Before writing your code. Use comments (potentially combined with pseudo-code) to plan what you're going to write. As you write your code, update the comments to reflect changes in your plan.
- As you write your code. Describe what you're doing in whatever piece of code you're currently writing. As your code becomes more complete, you can edit and potentially remove comments.
Writing your comments after finishing your code is not an option! If you show code to a member of the course staff, it is acceptable for them to say "I can't read this code without documentation. I'll help someone else while you work on that. Let me know when the documentation is complete."
The most important part of commenting is that the comments and code are in-sync with each other. Regardless of when you choose to write your comments, give them a once-over before submitting them to make sure that they still accurately represent your implementation. We have pondered for hours why code seemed to be different from what the comments said, only to conclude eventually that the code was different from what the comments said. Then we had to find out which (if either!) was correct for the problem. Confidence in the code goes down after this.
File header comments go at the top of each file.
They must include some straightforward but important information
— your name, the date, and the assignment title. When you
edit a provided file that already has an author's name in it, you
should add "Edited by" and then your name underneath the original
author's name.
Another thing we look for in file headers is the purpose of the file. Most notably, this should not be something we can deduce from the file name!
Header comments should also include any known bugs or "to do" items. For example, "Currently only supports sophomores. TODO: support students of other years."
For this course, you will often write your own classes, with one .h file and one .cpp file for each class. If your class were called "MyList," for example, then you would have a MyList.h and a MyList.cpp. One thing we often see when grading file header comments is, for example, MyList.h having the stated purpose "This is the header file for the MyList class." This is not helpful! Because we are already in agreement on the file naming scheme used in our course, we already know before opening the file that it is a header file for a class named MyList.
Instead, write thorough, detailed descriptions of your files that keep potential users of your code in mind. Here are some things to consider that will help you write informative file purposes comments:
- When a user opens your driver file (the file that contains main()), they are probably wondering: What does this program do as a whole? What purpose does it serve? Is it a good fit for what I need to accomplish?
- When a user opens a header (.h) file, they probably have noticed that it is a class header and are wondering: What is this class? What role does it play in the overall program? How does it interact with the other classes? What needs are best met by this class? They may be deciding whether to repurpose the class you wrote for a completely different program or, conversely, whether to use the rest of your program but replace that class with something else.
- For each class, the comments should describe what instances of the class represent and what the abstract state of an instance is. For example: "StudentList is a class that represents an ordered list of Student instances. Every new StudentList begins empty, and clients can then add and remove Students from the list." You may go on to define the main behaviors (public functions) or you may document those functions as they are declared, but the overall purpose/intended uses of the class should be documented at the top. If there are limitations, document them: "This class only handles lists of length 100 or less."
- It is never the purpose of a class to "contain nodes." That is an implementation detail that clients don't care about. Implementation-specific information should go in the private section of the class and/or in the implementation.cpp file.
- When a user opens a .cpp file for a class, it is probably because they have decided to use your class, either in your program or a program of their own, and have run into some confusion regarding how it works. They are probably wondering: What's the best way to instantiate and then use this class? How do I accomplish the purposes laid out in the header file? Is there some bug still in this class I should know about? Is there some weird quirk or subtlety in the way this class is meant to be used?
Keeping an imaginary user in mind will allow you to write helpful, non-obvious descriptions of your files, even for relatively straightforward programs and classes.
A Note: This will be reiterated below in the section on README files, but it is important to remember here that code's purpose is never determined by its implementation details! When deciding whether to use a program or class, people care about whether it will accomplish their goals, not how it will accomplish those goals.
Function Contracts
Function contracts appear just before the function header and are meant to explain how the function fits into the larger program. They must include the function's purpose, any parameters, any return values, and any other information that might affect the rest of the program. A function contract should not include implementation details, nor should it include information that can be found in the function prototype.
- Purpose should say what the function
does, not how the function does
it. Often this takes the form of answering how does this
function fit into the larger program. It should not include
any implementation details. E. g., a function's purpose
might be "To update the table so it maps the given key to the
given value. The previous value associated with that key, if
any, is lost."
- Parameters should include information about what they represent in the function and larger program. This section should also include any restrictions or expectations on what will be passed as a parameter.
- This section should not include information that can be gleaned from the function prototype: saying "two ints and a string" is unhelpful.
- Return value (if the function is non-void). "Return the
number of distinct items in the table" or "Return the first
item in the list greater than or equal to the given
value."
- Other information: This section should include side
effects caused by the function. For example, in
removeFromBack(), a list would change size (decrease in size
by one). Input and output would be considered side effects,
too. Other information in this section might be memory
management that occurs or needs to occur as a result of the
function running, any exit or failure conditions that would
cause the program to halt or the function to throw an
exception, or identified or suspected bugs within the function
("this function does not work for __ reason" or
"I think this function is causing my
program to __").
Include:
- Bugs. If you have identified a bug to a specific line of your program, make a note. Graders are more understanding of documented bugs versus bugs they have to suss out. At least, it shows you know it's there.
- Confusion. If there's an unclear or complicated section of your code, something that would not be obvious to someone with experience coding, explain it.
- Cases. If you are writing the code to deal with a specific case that's relevant to key invariants of the data structure, you should identify the relevance of the invariant.
Don't Include (in your submission):
- Questions: In-line comments asking questions about the code or what should be done ("Do I need to increment this here?", "I don't know if this works") are not helpful and should not be in submitted code. Those kinds of questions can and should be addressed in office hours.
- TODO: Comments that indicate things that need to be written and/or fixed are useful while you are working but should not be submitted.
Commented-out code should not be submitted! It's great to test out multiple potential solutions to a problem, but by the time you submit, you should have chosen the code that works the best and deleted anything you decided not to use. The occasional exception to this rule is unit testing code that is meant to crash your program.
The README
A README is a file used to describe a program. Every open source project has a README file that describes what is going on in a program, what is necessary for it to work properly etc. In this course, it additionally will answer questions related to the topics/data structures covered in the assignment. The first thing someone opens when they look at a project is the README. When we are grading, that is the first file we open and the place we will go if we have questions or confusion about the code. It is crucial to fill it out thoroughly and in detail.
README files contain some straightforward, but important information:
- Your name, the date, the assignment title, and the course (Comp 15)
- How to compile and run your program
- Acknowledgements of any help you received — TAs you
sought help from in office hours, peers you vented about your
coding struggles with (without showing them your code, of
course!), snippets of code from lecture or the course website
you used as inspiration, online resources you found helpful
(including our class' forum), etc. Even if an assignment was
particularly easy for you and you didn't receive help from any
of these sources, this section should not be left blank. You
could instead list one obvious source of information, such as
the professor's lecture on a relevant topic.
- Any bugs still in your program that you were unable to fix, and where they can be found in the code
- Data structures section for you to describe the data structures used in the assignemnt
- Sometimes the spec will include questions that should be answered in your README. It's worth noting here that those tend to be a significant portion of your grade.
- Time spent section for you to note how many hours you spent on a particular assignemnt
There are also parts of the README that are a little less straightforward:
- You should describe the purpose of your program. This can and should be done in three sentences or less, but it's important to think hard about what information would be useful to a potential user of your program. Namely, the implementation details of a program are irrelevant to its purpose. When I was deciding between Atom, VSCode, Vim, Emacs, and Sublime, I considered the features of each editor and how effectively each one suited my needs. I did not wonder about the underlying implementation details of each editor at all.
- You should describe each file you have submitted. Guidance for describing source code files can be found above, in the section on file headers. Other files you might submit, which will have simpler descriptions, are the README (which can be described by just saying "this file"), unit testing source code files, and sample inputs that you used for testing. It's important to list every file you submitted, even if the description is obvious! For example, if one of your files is mistakenly left out of your submission, having it listed in the README will help the TA understand what happened and remedy that.
- You should describe, chronologically and in detail, how you tested and debugged your program as you were working on it. In this course, you will be required to unit test every program you write. Unit testing is necessarily something you do as you write your code, so the full process is not always perfectly visible from the final files you submit. This is your chance to convince the grader that you thoughtfully and thoroughly tested your program as you were writing it. Include, for example, information about how you isolated and tested particularly tricky functions, and describe bugs you found while unit testing and how you fixed those, or edge cases that came up through your testing.
- Last but not least, seeing as this is a Data Structures course, you should describe the data structures you used in your program! This is your chance to explain the implementation details of your program. What data structures did you practice writing this week? How do they work? What are their advantages and disadvantages? Similarly, if you learned and practiced implementing a new algorithm, briefly explain how it works. Write as if this section is meant to be read by somebody who has only taken an introductory computer science course, but is really interested in learning about the concepts from our course.
The Code Itself
Organization into Classes and Structs (Data Structures!)
- If a struct is meant to be used in a class, it should be
defined inside of that class. If it is used internally to the
class, it should be private. For example, a Node struct used
inside a linked list should be private. If a struct needs to
be declared outside of any other class for whatever reason, it
should be declared in its own header file.
- Non-interface functions in a class should be declared
private. The only functions that should be public are the ones
required for the client to be able to use the class.
- All data members should be declared private.
Recursion
- Recursive functions should only be concerned with the current element. Avoid dereferencing "next" or children pointers in the function body; rather, just recurse to the next or child node.
Variables
- No global variables: no variables
should be declared or defined outside of functions with the
exception of global constants.
- Use of literals should be avoided; use global (or static)
constants instead. If your program calls for an 8x8
two-dimensional array, for example, create a global
constant ARRY_SIZE = 8, then declare
your array
with int[ARRY_SIZE][ARRY_SIZE]
rather than int[8][8].
- Variable names should be descriptive, with the exceptions
of variables with very limited scope: i and j for loops
(triple-nested loops should not
occur); curr_<type>
(e.g. curr_node) as a
pointer; temp
or aux
for very temporary variables
(accessed by 3 lines of code or fewer). One appropriate use
for temp would be swapping two
elements in an array, for example. Foo,
var, and x are not
descriptive variable names and should never be used.
Brevity
- Use boolean expressions and values. For example, there is no reason to write
if (isBig == true)
when you can write
if (isBig)
- Don't write functions longer than 30 lines between the opening and closing brace. This will affect your modularity grade.
- Do not rewrite code that is given to you, create functions with nearly identical uses, or write in-place code when there is a function you can call to do the same work.
- Use helper functions to simplify code, especially if they can be called in many places. But a helper function can also be useful if it provides a name for a computation that makes other code clearer. For example, rather than test
if (front == nullptr)
all the time, you can define a function called
if (isEmpty())
- Try to avoid doing unnecessary work. For example, rather than recomputing something several times, compute it once and save it in a well-named variable.
Whitespace
- Indentations should be made up of either 4 or 8 spaces, not tab characters (with the exception of makefiles). Tabs display differently based on a computer's settings, but spaces always look the same. Most text editors have a setting to output a specified number of spaces instead of a tab character whenever you press the "tab" key.
- Furthermore, indentation should be consistent! Indentation is a crucial part of code's readability, as it helps readers understand how code fits together. Make sure the levels of indentation accurately reflect which function or code block(s) any given line is part of.
- In our course, the width of your indentation should be either 4 or 8 spaces. Indentation of only two spaces makes your code more difficult for us to read. If this causes you to struggle to adhere to the 80 column rule, then your code has too many levels of nesting, and needs to be more concise, more modular, or both.
- Binary operators (operators that take two pieces of input, such as '+', '*', '=', and '=='') should have spaces around them.
- Unary operators (operators that take only one piece of input, such as '++'') should not have spaces around them.
- When declaring pointers, the asterisk should be attached
to the variable name, not the type
(Node *curr rather than
Node* curr).
Rationale: The compiler interprets the * as a
decoration of the variable name on its right, not part of
the type: Node* np1, np2; declares one
pointer variable and one variable that contains a
Node, which is confusing.
Putting the * next to the
variable it modifies makes it clear to the reader which
variables, if any, hold pointer values:
Node *np1, np2;.
(np2 is poorly named, but its
status in the declaration is clear.)
- In lists declaring and initializing variables, there should be a space after every comma.
- There should be a single space between a loop or conditional keyword (such as for, if, and while) and its corresponding opening parenthesis, but no space between the name of a function and its corresponding opening parenthesis.
- This should go without saying, but do not violate any of these guidelines in order to make your code adhere to the 30 line or 80 column rules. This will make your code difficult for the grader to read, and therefore will not help your score.
Other Guidelines
- No line should be longer than 80 characters. In other words, no files that you wrote should have more than 80 columns. Typing "wc -L *" into the terminal will display the number of columns in each file in your current folder.
- There are some style rules that pertain to conditional statements and loops, as well. These determine the control flow of your program, so having easy-to-read conditional statements and loops is immensely important to a reader's ability to understand what is happening in your code.
- Use the keywords "and," "or," and "not" as opposed to the equivalent operators "&&," "||," and "!" so your code is both easier to read and less prone to difficult-to-spot bugs caused by typos. (Note: You should use "!="" when checking for inequality)
- Don't treat non-boolean variables
as booleans — we know that when a number is 0 or
pointer is null, it will evaluate to false. For
readability, however, you should explicitly compare the
variable to zero. For example,
write while (n > 0) rather
than while (n)
- The break
keyword should only be used when you have
a switch statement. In addition
to making loops more difficult for a reader to understand,
it undermines one objective of our course, which is to
learn to write code (including loop conditions)
thoughtfully.
- Returning from inside of a loop,
however, is often useful and
encouraged. The continue keyword
may be used occasionally, but should be avoided when
possible.
- Curly braces. We are somewhat flexible on this, but be
consistent, but here are some notes:
- If an open curly brace ({) is placed at the end of a line of code,
there should be a space before it to separate it from
what came before. It may also be placed on the next
line lined up under the beginning of the statement it's
part of.
- Close curly braces (})
should be on a line by themselves unless the statement
they are part of continues. For example, you may write
"} else {" all on
one line (or you can put each item on its own line).
- We do not require the use of curly braces for blocks
of a single statement. That is, you may write
an if statement or
a for loop without any curly
braces if the body is one line long. However, as a
rule, do not put an entire if statement on one line.
- The keyword auto
should only be used when declaring
variables of the iterator type. This
is a Data Structures course — you're expected to develop
a very strong grasp of which class, struct, or primitive data
type you are working with at any given time.
- Finally, when working with pointers, the
keyword nullptr is preferable to the
old-fashioned, C-style
constant NULL.
How We Evaluate
When we grade functionality of your code, one
thing we do is to use automated tests and compare your output to
the output we expect. If your results differ from ours on multiple
tests, we try to fix bugs that may have caused multiple tests to
fail, and will refund points for tests that failed due to a
"cascading error" — basically, we won't deduct points for
the same bug twice. HOWEVER, this process breaks when we spend an
hour looking at your code and cannot find a bug that we know must
exist. It takes way longer for us to do this debugging process if
we can't read your code. If you have readable, well documented
code, it is much more likely that you will get points back for
these errors, which often amounts to a difference of multiple
letter grades.
This is another reason it is especially important to document wherever bugs occur in your code — any time we save by not having to find the bug ourselves, we can spend fixing the bug and then giving you back those cascading error points.
Finally, our parting word of advice is to always give your code a quick once-over immediately before submitting. Maybe there are function contracts or file headers that are no longer relevant; maybe you forgot to document something as you were writing; or maybe there is a particularly complicated function that you will realize needs better in-line commenting. Documentation and Style points are, in many ways, the easiest points to earn on any given comp 15 assignment — whether or not you have time to get your code working perfectly (or at all) before the due date, you can always get full points for this section. Don't let yourself miss out on what should be very reliable credit just because something slipped your mind!