The Grammar According to West

by Douglas B. West

Summary

I have been accumulating these observations for many years. Writing textbooks has led me to think about how best to present mathematics. I have also noted writing errors commonly made by my thesis students and in papers submitted to journals. Here I collect my conclusions.

My first objective for this document was to educate my students, thereby reducing the time needed to edit their theses. Since it exists, I have made it publicly available in the hope that others may find it useful. If you don't find it useful (or if you object to it on principle), then please ignore it. I hope to make some writers of mathematics (especially students) aware of issues they may not have considered, where small changes can produce mathematical writing that is easier to read by wider audiences.

After an introductory explanation of why care in writing mathematics is needed, I discuss (1) mathematical style, (2) notation and terminology, (3) punctuation and English grammar as used in mathematical writing, and (4) English usage for non-native speakers. Some points are minor distinctions, but even these make mathematical writing clearer when used consistently. My intent is not to make writing rigid, but rather to make it transparent to avoid distracting the reader by ambiguities or awkwardness in the flow of the narrative.

Index of specific items
Mathematical style
Abstract/Intro/Conclusion
syntax for definitions
"where" in definitions
"double-duty" definitions
"Let G=(V,E) be a graph"
expressions as units
separation of formulas
notation starting sentence
"let x,y be"
conditions in parentheses
mixing words & notation
"Let .... Then"
"When/For/Since"
"As/For" as reasons
"Hence/Thus/Therefore"
"by Theorem X"
"so" vs. "so that"
"such that" vs. "so that"
"Assume/Suppose/Let"
"any/each/every"
universal quantifiers
"less/fewer"
sets vs. sizes
possessives on notation
nested proofs
"best possible"
numerals vs. words
Terminology/Notation
":=" (for definitions)
"such that" in set definitions
"sequence/series/list"
"v1,v2,…,vn"
lists under relations
"k=1,2,...,n"
"Big Oh" notation
"maximum degree Δ"
hyphenation
"a-b path"
order of a graph
digraphs and hypergraphs
connected components
"maximal" vs. "maximum"
multicharacter operators
"induct on", "by induction"
"clique" or "complete subgr."
isomorphism vs. subgraphs
"proper coloring"
"partitions" vs. "parts"
"pairwise" vs. "mutually"
"disjoint sets"
"union/join"
edge or path "between"
set minus
"left hand side"
English usage
introductory words
quotations/periods
which/that
antecedents
naked "this"
"distinct/unique"
contractions
"i.e." and "e.g."
"different than"
articles ("a/the")
possessives & titles
adjectival names
conjunctions & commas
semicolons
excessive commas
serial comma
appositives
passive voice
"the below"
"either"
"we have been proving"
"non-"
For Non-native speakers
"bound of"
"a joint work"
"few" vs. "a few"
"usual"
"partial case"
"passing a vertex"
"can not" and "may be"
"evidently"
"as evidenced by"
"principal" vs. "principle"

Introduction and Motivation

In live mathematical conversations, one takes many shortcuts that are inappropriate in precise mathematical writing. The context is known by all participants, and shortcuts evolve to save time. Furthermore, the speaker can immediately clarify ambiguities. Without immediate access to the author, written mathematics must use language more carefully. In addition, mathematical concepts are abstract, without context from everyday experience, so the writing must be more consistent to make the meaning clear. Outside mathematics, imprecise writing of English can still be understood because the objects and concepts discussed are familiar.

Some mathematicians object to some of my recommendations. Many time-honored practices in the writing of mathematics are grammatically incorrect. These mistakes in writing cause no difficulty for readers with sufficient mathematical sophistication or familiarity with the subject, but it is unnecessary to restrict the audience to such readers. A bit of care leads to clearer writing that makes mathematics more easily accessible and readable to a wider and less specialized audience.

Various languages other than English have conventions of usage or grammar that lead to typical errors in English mathematical writing by their native speakers. I have put discussion of these special items in a separate section at the end. My explanations use terms for English parts of speech and punctuation, giving technical reasons for some recommendations. I hope that readers who are unfamiliar with these terms will still benefit from seeing what the choices are.

Before I start, several disclaimers are in order. I apologize in advance for my own grammatical errors. Habits die hard, and it is easy to err in applying principles of writing. Note in particular that there are inconsistencies between what I propose here and what I wrote in my earlier books. Those books were written in the previous millennium, and I have learned many things about clear writing since then. Also, I am a speaker of American English, and some points are consistently different in British English (such as the treatment of "which" vs. "that" and the aversion to serial commas).

Some of my conclusions conflict with manuals of English style. The conclusions I have drawn are intended to produce clear mathematical writing that is more logically consistent than publishers' conventions. This applies especially to punctuation and to words that serve as logical connectives.

I welcome corrections, suggestions/inquiries and "pet peeves" that may lead to inclusion of further items in later versions of this guide.

    Mathematical style

  1. Abstract, Introduction, and Conclusion. We begin with the overall structure of a research article in mathematics. The abstract states the results as fully as possible in a brief presentation. Crucial terms without which the typical reader won't know what is being said should be defined. The abstract must stand on its own, especially in the age of electronic communication where it may be seen separately from the rest of the paper, and hence it must contain no numbered reference to the bibliography.

    The first section of the paper is an "Introduction" that should motivate the problem, discuss the related results, state more completely what the results are, and perhaps summarize the techniques or the structure of the paper. In addition, the introduction should contain the concluding remarks or key conjectures.

    There is generally little or no value in a separate section of concluding remarks. Such remarks either are redundant or contain information that readers will look to the introduction to find. Readers who study the full details of the proofs are well aware of the statements that summarize what has been done. Readers who do not read the full details have no reason to go on to the concluding remarks. A mathematical research article is not read like a novel or even like an essay that seeks to "persuade" the reader; it does not need an epilogue.

  2. Definitions. Words being defined should be distinguished by italics (or perhaps boldface in a textbook context). When italics are used to indicate a word being defined, it is unnecessary to use "called" or "said to be"; the use of italics announces that this is the term being defined and replaces these words.

    Many definitions are phrased as "An object has property italicized term if condition holds." We use just "if" even though subsequently it is understood that an object has the property if and only if the defining condition holds. The italicization alerts the reader to this situation. The convention can be justified by saying that the property or object does not actually exist until the definition is complete, so one does not yet in the definition say that the named property implies the condition.

    Definitions written by non-native speakers sometimes contain errant commas. In each sentence below, the comma should be deleted.
       "A bipartite graph, is a graph that is 2-colorable".
       "A graph is bipartite, if it is 2-colorable".
    The first example is a mistaken placement of a comma inside a clause (see discussion of Commas).

    Note the difference in italicization above. When written as an adjective-noun combination, the term being defined is the name for structures that have the property; hence the full term bipartite graph is italicized. When the property alone is being defined and is positioned as a predicate adjective, only the adjective is italicized.
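
    In LaTeX source, the two italicization patterns above might be sketched as follows. The sentences are the examples from this item; the use of \emph (rather than a dedicated macro for defined terms) is an assumption made only for illustration.

```latex
\documentclass{article}
\begin{document}
% Adjective-noun combination: the full term names the structure,
% so the full term is italicized. No "called" or "said to be" is needed.
A \emph{bipartite graph} is a graph that is $2$-colorable.

% Predicate adjective: only the property is being defined,
% so only the adjective is italicized. Note "if", with no comma before it.
A graph is \emph{bipartite} if it is $2$-colorable.
\end{document}
```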

  3. "Where". A formula may contain notation that has not yet been defined, if the definition of that notation follows immediately in the same sentence. The formula is then followed by a comma and the word "where" to introduce the definition. For example, "If G is a bipartite graph, then χ'(G)≤Δ(G), where χ'(G) is the edge-chromatic number and Δ(G) is the maximum degree of G." (Technically, the comma is needed because the definition of the notation is an appositive.)

  4. Double-Duty Definitions. One cannot make a statement about an object before the object has been defined. Similarly, one cannot use notation in a formula unless the notation has previously been defined. In particular, these tasks cannot correctly be accomplished at the same time with one instance of the notation. For example, "The neighborhood of a vertex v is N(v)={u: uv∈ E(G)}" is incorrect. With a subject and a verb before the equation, the equation is a single unit (see expressions as units). This sentence defines the neighborhood of v to be a particular equation, and it does not define the notation N(v).

    Of course, readers sufficiently familiar with the context have no trouble understanding what is meant, but why disenfranchise other readers? One can just as easily write "The neighborhood of a vertex v, denoted N(v), is {u: uv∈ E(G)}". Alternatively, one can introduce the notation as an appositive in a conventional position immediately after the term defined: "The neighborhood N(v) of a vertex v is {u: uv∈ E(G)}".

    A common Double-Duty Definition is "Let G=(V,E) be a graph". The sentence defines the equation G=(V,E) to be a graph. Of course, the writer intends simultaneously to introduce notation for a particular graph and its vertex set and edge set, but that is not what the sentence says. It is better to write "Let G be a graph" and use operators V and E to refer to the vertex and edge sets of G as V(G) and E(G) (see also Operators vs. constants).

    A more subtle example is "For each 1≤ i≤ n,". The introduction of the notation i has been lost because the inequalities impose conditions on it before it is defined. Since the expression is a unit, grammatically the phrase is referring to each inequality written in this way. Correct alternatives that express the intended meaning include "For all i such that 1≤i≤n", "For i∈[n]", and "For 1≤i≤n". The third option is slightly different from the others; it means "whenever i is such that the conditions hold", implicitly introducing i in a specified range but avoiding the grammatical problem.
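
    As a hedged illustration, the rewrites recommended in this item might look like this in LaTeX source. The wording is taken from the examples above; the document skeleton and the use of \emph for the defined term are assumptions.

```latex
\documentclass{article}
\begin{document}
% Double-duty version (to avoid), kept here only as a comment:
%   The neighborhood of a vertex $v$ is $N(v)=\{u\colon\, uv\in E(G)\}$.

% Notation introduced with "denoted":
The \emph{neighborhood} of a vertex $v$, denoted $N(v)$,
is $\{u\colon\, uv\in E(G)\}$.

% Notation introduced as an appositive right after the term:
The \emph{neighborhood} $N(v)$ of a vertex $v$
is $\{u\colon\, uv\in E(G)\}$.

% Graph introduced without the double-duty "G=(V,E)":
Let $G$ be a graph, with vertex set $V(G)$ and edge set $E(G)$.
\end{document}
```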

  5. Expressions as units. In order to be consistent and avoid confusion for the reader, one must decide whether an equation or inequality is a noun unit or is read with the relational symbol as a verb. Treating the symbol as a verb often forces rereading to clarify the meaning, and one often wants to have another verb in the sentence. For these reasons, it is best to treat notational expressions as single objects (nouns, essentially).

    For example, "there exists i<j with xi=xj" ascribes a property to the inequality i<j (and is a Double-Duty Definition of i). Without context, it is hard to tell that the author meant "there exists i such that i<j and xi=xj". Consider also "The number of nonneighbors is n-1-d(u)≥ i." The number of nonneighbors is not an inequality, it is a number; the author is trying to make two statements in one inequality. For clarity, separate the statements: "The number of nonneighbors is n-1-d(u), which is at least i".

    Exceptions. Applying this principle with very simple expressions leads to ponderous writing. Here are two notable exceptions:
       1) In "Choose x∈ V(G) such that x has minimum degree," we are choosing x, not the expression "x∈ V(G)". The justification for this exception is that the membership or containment symbol is read as "in", which is not a verb. (One can treat nonmembership in the same way.)
       2) "Let G'=G-x". When introducing notation for an object or expression by a single imperative verb ("let", "set", "put", "choose", etc.), we read the equality symbol as the verb "equal", truly an exception. This exception can be recognized by the lack of any verb outside the notational expression. Continuing with another verb, as in "Let G'=G-x be ...", would produce a Double-Duty Definition.

    If the introductory part of the sentence is longer, then we may already have a noun and a verb, and the expression again becomes a unit. For example, "Include each vertex independently with probability p=(ln n)/n" should be "Include each vertex independently with probability p, where p=(ln n)/n".

  6. Separation of formulas. Avoid placing two formulas consecutively, separated only by a comma. For example, "For x<0, x²>0" may be read as something other than a hypothesis and a conclusion. Similarly, "For some k with k<n, n-k+f(n)<n/2" requires the reader to stop and go back to insert the missing words. The mathematics will be easier to read if the formulas are separated by the comma plus "it follows that", "we have", etc.; include the words that enable the reader to understand the sentence the first time. The difficulty arises because commas occur also in notation, and the eye does not immediately distinguish between commas that occur in notation and commas that are intended to cause a pause or to substitute for words.

    When the second formula just specifies an object, the separation can be accomplished by specifying the type of object, as in "When k=2, the graph G is Eulerian" instead of "When k=2, G is Eulerian." One can always rewrite to avoid notational expressions separated only by a comma. Sometimes it is very easy, as in changing "For every bipartite graph G, χ(G)≤2" to "If G is bipartite, then χ(G)≤2".

  7. Initial notation. Never begin a sentence with notation. Always one can prepend a specifier (such as "The graph G is" instead of "G is") or rewrite the sentence in another way to avoid starting with notation. Following this rule makes mathematics easier to read. The principle here is similar to the separation of formulas.

  8. Lists of size 2. It is common but ungrammatical to write "Let x,y be vertices in G"; we would not write "My friends John, Mary came to dinner." The concatenation is an instance of two formulas separated by a comma. To see what can go wrong, consider the following clause: "Since a|b and a,b are maximal and minimal,". What was meant was: "Since a|b, with a maximal and b minimal,". In general, the comma within a list of two elements should be replaced with "and" when discussing the two elements as individual items. For example, "If x,y are adjacent" should be "If x and y are adjacent" or "If {x,y} is a pair of adjacent vertices".

    Exceptions. With a list of size at least three, omission of "and" does not cause as much confusion, and including it is awkward. Here the objection to the common mathematical convention is much weaker: we accept "Let x,y,z be the vertices of T," although writing "Let {x,y,z} be the vertex set of T" would be more precise.

    Another sensible exception is "Choose x,y∈ V(G)". Here the relation is between each variable and the set, and we accept this as a single formula. Again a justification is that we can read "∈" as the single word "in", without a verb. Similarly, many mathematicians write "For n,m≥2" to mean the conjunction of n≥2 and m≥2. The exception for the membership symbol is consistent with other exceptions for the membership symbol; doing it with inequalities is more questionable. Avoid doing it with equalities (see Variable equal to list); it unnecessarily requires a pause for the reader to figure it out.

  9. Parenthetic or wordless restrictions. Many writers of mathematics impose restrictions parenthetically or via commas, thereby omitting words in sentences. Although this is common, it imposes unnecessary hardship on the reader, or at least an unnecessary pause to extract the meaning. Parentheses next to notation are mathematical objects and therefore cannot substitute for words. A phrase like "Let m(m≤n) be the size" is immediately clear only to the author.

    Other examples: "Suppose there is an edge xy (≠e) in G such that" should be "Suppose that G has an edge xy other than e such that". Similarly, "For k≤m with k even" improves on "For k≤m (k even)" or "For k≤m, k even", and "Consider ai for 1≤i≤n" is better than "Consider ai (1≤i≤n)". One can also separate by putting words into the parentheses: "For k≤ m (where k is even)". Note that "Suppose that there is an edge xy≠e in G such that" is a Double-Duty Definition; "xy≠e" is not an edge.

  10. Mixing words and notation. Words cannot be compared with notation via a relational symbol. Do not write "Consider a graph G with maximum degree ≤ k". Grammatically, the sentence does not indicate where the inequality starts. If one side is written in words, then the relation must also be written in words. This restriction is a logical consequence of treating the notation as a unit; in that sense, the sentence above says that the maximum degree of G equals the expression "≤ k".

    The same principle applies to logical symbols. In written mathematics, do not use shorthand (such as ∃, ∀, ⇒, iff) to substitute for words in sentences. Shorthand notation used to save space on lecture slides need not follow these restrictions, since the slides summarize the lecture and are accompanied orally by sentences.

  11. Statements of implication ("Let ... Then"). The common two-sentence mathematical construction
         "Let hypothesis. Then conclusion."
    is grammatically incorrect. The second sentence is not a sentence, since the implicative sense of "then" plays the role of a conjunction. The simpler form
         "If hypothesis, then conclusion."
    is less choppy, easier to read, grammatically correct, and faithful to the mathematical sense of a conditional statement. When there are many hypotheses, resulting in too long a sentence, some creativity can be applied. First a sentence (perhaps beginning with "Let") sets the context. The last crucial hypothesis is saved for a statement of implication, using the "If/then" form.

    Used at the beginning of a sentence, the English word "Then" is temporal, as in "Then we left." Since the implicative sense of "then" is so common in mathematics, the temporal sense should rarely be used, to avoid confusion. Usually the temporal "then" at the beginning of a sentence can be changed to "Now" or "Next" with less confusion and essentially the same (and more accurate) meaning, especially in a proof.

  12. Words of hypothesis: "If", "When", "For", "Since". For ease of understanding, a sentence that begins with "If" should later have ", then" to start the conclusion. The word "then" should not be omitted, and a comma should precede it. The comma can be omitted in a brief implication contained within a clause already set off by a comma, as in "Since f is the squaring function, if x=0 then f(x)=0".

    When readability would be improved by omitting "then", the sentence should instead start with "When" or "For", as in this sentence itself. A comma still follows the condition introduced by "When" or "For". The structure of a sentence beginning with "Since" is like those beginning with "When" or "For"; a comma follows the first clause. After "Since" or "Because", the concluding clause cannot begin with "then" or "so"; "then" is used only with "If".

  13. "As" and "For" introducing reasons. In English, the words "as" and "for" may be used to introduce a reason given after the statement of the conclusion from that reason. For example: "I ate early today, for I was hungry," or "He stopped writing his answer, as time had expired." Banish these uses from mathematical writing; they introduce confusion, especially for non-native readers. "As" also means "like", and "for" is most often used to specify a universe. Compare "The degree is at least one, for a vertex in the neighborhood" with "The degree is at least one, for a vertex in the neighborhood is not isolated." The meanings of "for" differ, but the reader does not discover that until the end of the sentence.

  14. Words of conclusion: "Hence", "Thus", "Therefore". A long proof does not fit in a single sentence; hence often one needs a word to start a sentence that states a conclusion. Among the choices are "Therefore", "Hence", and "Thus". Purists (and copy editors) desire a comma after every such introductory word or phrase (as they do after "Finally", "On the other hand", "In 1965", etc.). This can make language overly formal.

    Among these choices, I treat "Therefore" as the most formal, introducing a major conclusion and hence taking a comma. Because "Hence" and "Thus" are single syllables, I use them without commas to indicate the flow of argument without making the writing choppy. This choice modifies strict English punctuation in the service of mathematical understanding. It is not incorrect to put commas after all these introductory words, but it enhances mathematical communication to omit the commas after short words introducing short conclusions that are just a step along the way.

  15. "by theorem X". Consider the sentence "Since G has at least 3n-5 edges, by Theorem X, we know that G is not planar." Does Theorem X imply that G has at least 3n-5 edges or that G is not planar? Since the reader will not know the author's intent, "by Theorem X" should never be placed between a reason and a conclusion. The options are "By Theorem X, G has at least 3n-5 edges, and therefore G is not planar" and "Since G has at least 3n-5 edges, Theorem X implies that G is not planar."

  16. "So" and "so that". Because of its other uses in English, "So" is too informal to introduce a sentence of conclusion (with or without being followed by a comma). It is best to reserve "so" for use as a conjunction, like "but": "The graph is connected, so each vertex is reachable from every other vertex." In this usage, no word is needed to introduce the reason that precedes the conclusion. As a conjunction, "so" is preceded by a comma, not a semicolon: "The graph has no odd cycles, so it is bipartite." This form is best used when the conclusion is short. When "so" is used as a conjunction, there is no "that". Thus "We have x²=0, so that x=0" should instead be "We have x²=0, so x=0".

  17. "Such that" vs. "so that". "So that" means "in such a way that". Use "such that" when imposing a condition and "so that" when producing a construction in a certain way. In particular, "so that" requires a verb or action and specifies the way in which the action is done. "Such that" generally imposes a condition on a noun structure. Compare "Consider a graph such that no vertex is isolated" and "Color the graph so that no two adjacent vertices have the same color"; what follows "such that" modifies "graph", but what follows "so that" is a condition on the action of coloring.

  18. "Assume", "Suppose", and "Let". A statement that is assumed is an axiom, considered throughout to be true. Something supposed is a hypothesis. Hence "Suppose" or "Suppose that" is more appropriate to introduce a case or an argument by contradiction. In contrast, "we may assume" introduces a consequence of an argument or symmetry and henceforth will be true. I do not really understand what the phrase "Assume for a contradiction that" actually means; use "Suppose to the contrary that". Similarly, change the incomprehensible "By way of contradiction" to "To the contrary".

    "Suppose" vs. "Suppose that". After words of hypothesis or conclusion ("suppose", "assume", "implies", "conclude", etc.), use "that" when what follows is a clause with an English verb. Omit "that" when what follows is just a noun unit, such as a notational expression. For example, "Assume the hypothesis" is a complete "imperative verb - object" sentence. The principle is the same in "Suppose x+y≤10".

    The distinction made here is a matter of some debate. Some authors are more formal and want to use "that" after the introductory word when what follows is a notational formula containing a relational symbol, treating that symbol as a verb. However, I think it is better to maintain the consistency of treating formulas as noun units. In addition, the role in clarification played by "that" when a clause with a verb follows becomes unnecessary when the clause is condensed into notation. Finally, the notation may be displayed, which emphasizes its role as a fact (noun) and makes "that" especially unnecessary. For consistency, the use of "that" should be the same when the formula is not displayed. A related example is "the case k=2", as opposed to "the case that k=2"; here "k=2" is the case, which is a noun, so there is no "that".

    In English, we also do not always use "that" when a verb is present. When the instruction is informal, without abstract concepts, "that" is usually dropped to avoid ponderous language. For example, "Suppose the hypothesis is true" would be awkward with "that". Similarly, the very short "Suppose there is" would be awkward with "that" after "Suppose", because the verb is gone before one even notices it; this is almost like "Suppose [notation]".

    This exception may seem awkward. A better solution when introducing notation is to avoid "Suppose x is" entirely: "Let G be a graph" is better than "Suppose G is a graph". Compare "Suppose x=1" and "Let x=1"; the second sentence is better. The first assumes the truth of an equality; the equation is a unit. The second is more active. Because we never say "Let that . . .", we either view "Let" as the entire verb or view the equality sign as the verb. This usage of "Let" is an exception to the treatment of expressions as noun units; it is not used with inequalities, because an inequality sign would need to be read as the lengthy "be less than or equal to" to become a verb.

  19. Universal quantifiers. The word "any" can mean "some" or "all" in different contexts, so it can be imprecise. It is clearer to use "each" or "every" as a universal quantifier when referring to a singular object.

    Numbered plural variables cause difficulty. In English, "for every two elements" is awkward because "every" is singular. Thus here it is better to say "for any two elements". The presence of "for" is suggestive of the universal quantification and helps avoid ambiguity. Nevertheless, there may still be confusion: consider the sentence "Form G' from G by adding an edge joining any two vertices with distance 2 in G." Some readers will think that only one edge is added, so this exception must be used with care.

    Avoiding "any" is not imperative. Evaluate its use in context, making sure to prevent misinterpretation. "Any" is a good substitute for "an arbitrary", and the meaning of "not any" is fairly clear.

    Using an indefinite article ("a" or "an") as a universal quantifier can be dangerous, as in "Prove that a bipartite graph has no odd cycle." Some readers may interpret "a" as "one" or "some", turning universality into existence. Using "every" is clearer. Putting "must" before the conclusion can suggest universality but is usually unnecessary.

  20. Position of universal quantifiers. Although logical formulas specify the universe over which a formula holds before stating the formula, when there is a single universal quantification the sentence may read better with the quantifier at the end. This order has the added benefit of emphasizing the conclusion when the context is easily understood. For example, one might prefer to write "For every graph G that is bipartite, χ(G)≤2" as "Always χ(G)≤2 when G is bipartite". Similarly, "ai∈S for 1≤i≤n" improves on "for 1≤i≤n, ai∈S".

  21. "Less" vs. "fewer". Use "less" when comparing numbers, and use "fewer" when referring to a set of objects. For example, "the number of edges is less than k" is correct, as is "the graph has fewer than k edges" or "G' has fewer edges than G".

  22. A set differs from its size. Comparing incomparable quantities is often called "comparing apples and oranges". One cannot compare a set with an integer; it is incorrect to write "Sperner proved that no antichain of subsets of an n-set is larger than C(n,n/2)". One must distinguish between a set and its size. Here one can write "no antichain has size greater than C(n,n/2)" or "no antichain has more than C(n,n/2) elements". (Due to the inadequacy of html, we use the notation C(n,k) for the binomial coefficient "n choose k".)
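
    In TeX, where the binomial coefficient is available directly, the corrected sentence might be typeset as in the sketch below. The \binom command comes from the amsmath package; the floor in the lower index is an addition here (it is the usual statement of Sperner's Theorem and covers odd n), since the text above writes simply C(n,n/2).

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Compares a size with a number, not a set with a number.
No antichain of subsets of an $n$-set has size greater than
$\binom{n}{\lfloor n/2\rfloor}$.

% Equivalent phrasing using "more than ... elements".
No antichain of subsets of an $n$-set has more than
$\binom{n}{\lfloor n/2\rfloor}$ elements.
\end{document}
```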

  23. "Estimate". Many mathematicians, particularly analysts, use the English word "estimate" as if it had the same meaning as the English word "bound" (both as a noun and as a verb). They write "now we estimate this quantity" when they mean "now we prove an upper bound on this quantity". In English, "estimate" means "approximate"; both upper and lower bounds are needed to give an estimate. This common usage by analysts is incorrect English and does not say what is meant, even if they are assuming an unstated implicit lower bound of 0.

  24. Possessives on notation. Do not write "Let x and y be v's neighbors"; always use "of" ("the neighbors of v") instead. Similarly, do not pluralize notation by referring to indexed elements or sets together as "the ai's". Usually "each ai" or "a1,…,an" or some other notation is preferable. Possessives and plurals of this sort should be reserved for informal oral communication.

  25. Nested proofs. Do not nest proof environments. No new proof label should occur before the end-of-proof marker for the current proof.
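
    One way to respect this rule in LaTeX is sketched below, assuming the amsthm package and its proof environment. Moving the auxiliary statement out as a separate lemma is only one possible restructuring, chosen here for illustration; the statements and the label name are placeholders.

```latex
\documentclass{article}
\usepackage{amsthm}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
\begin{document}

% The auxiliary fact is stated and proved on its own,
% so its proof ends before the next proof begins.
\begin{lemma}\label{lem:aux}
Statement of the auxiliary fact.
\end{lemma}
\begin{proof}
Argument for the auxiliary fact.
\end{proof}

\begin{theorem}
Statement of the main result.
\end{theorem}
\begin{proof}
% No \begin{proof} occurs between here and the matching \end{proof}.
By Lemma~\ref{lem:aux}, the main result follows.
\end{proof}

\end{document}
```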

  26. "Best possible". "Best possible" is an adjective used as a single term; it indicates sharpness. We write "This result is best possible", just as we would write "This result is sharp". "This result is the best possible" indicates that this result is better or more valuable aesthetically than all other results in the world, which is not what is meant. The definite article should not be used here. Think of "best possible" as a technical term that is already a specific predicate adjective, so no definite article is needed.

    The informal phrase "is most likely" is similar to "is best possible"; there is no article because "most likely" is used as a single term. Another example is "best practice", which is a single technical term in areas of management science, etc. It is used as a single term, without "the".

    Although "This result is best possible" is a complete sentence, it is somewhat vague, since it does not specify the sense in which the result cannot be improved. Often it is more informative to say something like "the constant in the upper bound cannot be improved". For this reason, some writers suggest avoiding the term "best possible" in written mathematics.

  27. Numerals and spelled-out numbers. In standard English writing, numbers less than 10 usually are spelled in full, while numbers more than 10 are written in numerals. In mathematical writing, the basis for the distinction is different. Numbers less than 10 are spelled out only when used as adjectives expressing the quantity of objects in a set. They must remain as numerals when designating the value that a quantity equals. For example, "The two vertices both have degree 3" or "A cycle of length 4 has four edges". A reader contributes another excellent example; compare the two sentences below:
       Although X is not a cycle, its Betti invariant is 1.
       Although X is not a cycle, its Betti invariant is one.
    The first sentence says that the Betti invariant of X equals 1. The second sentence says that the Betti invariant of X is a cycle.

    Terminology and notation (especially in discrete mathematics)

  28. Definition symbol ":=". Some mathematicians use this symbol to indicate that the preceding symbol is being defined to mean the subsequent object. If this occurs in a sentence like "Let [n]:={1,…,n}", then the verb states that the notation is being defined, and the special notation is unnecessary. If it occurs in a sentence about the object being defined, such as "Consider a coloring of [n]:={1,…,n}", then it is an improper Double-Duty Definition and should be rewritten: "Consider a coloring of [n], where [n]={1,…,n}." Reading ":=" requires thinking "be defined to be" when preceded by "let", and it requires even more convoluted phrases when placed in a Double-Duty Definition. This awkward notation is never needed and encourages grammatical errors.

  29. "Such that" in set definitions: ":" vs. "|". For many reasons, the colon ":" is a far better choice than the vertical bar "|" to mean "such that" in a "notation/condition" definition for a set. For example, we may write "{3n+1: n∈N}". The vertical bar is heavily used in mathematics, most notably for size of sets, but also for divisibility and other purposes. Using it for this purpose leads to such messes as "{|A|||A|||B|}", which purports to describe the set of sizes of sets A that divide the sizes of sets B. The colon is far less used in mathematics. Even so, the best reason for using the colon is that this mathematical usage is similar to the meaning of the character in English. Finally, since "such that" is not a binary operator, this usage should be expressed in TeX using "\colon\," instead of ":". As in English, there should be space after the colon but not before it.
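    The spacing recommended above can be sketched in LaTeX as follows. This is only an illustration, assuming the default LaTeX kernel definition of "\colon" (packages such as amsmath redefine its spacing); the third set is an illustrative variant of the example in the text, with "\mid" for divisibility.

```latex
\documentclass{article}
\usepackage{amssymb} % for \mathbb
\begin{document}
% Colon as "such that": \colon is punctuation, and \, adds a thin space after it.
$\{3n+1\colon\, n\in\mathbb{N}\}$

% A bare ":" is treated as a relation, so space appears on both sides of it.
$\{3n+1: n\in\mathbb{N}\}$

% Vertical bars remain free for sizes and divisibility.
$\{|A|\colon\, |A| \mid |B|\}$
\end{document}
```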

  30. Sequences, series, and lists. In mathematics, a sequence is a function whose domain is the set of natural numbers (perhaps with a shift of the initial element). Discrete mathematicians abuse this term in using it for an ordered finite set. A good name for such an object is list. An n-tuple is a list of length n. It is an abuse of terminology to say "a sequence of length n". (For finite graphs, in particular, "degree sequence" should be changed to "degree list". To avoid this problem, one can sometimes refer to the "vertex degrees" rather than "degree sequence" or "degree list".)

    The usage of "series" in English is contrary to its usage in mathematics. In English a "series" usually consists of finitely many occurrences in order, as in the "World Series" or the title "A Series of Unfortunate Events". In mathematics a series is an infinite sum.

  31. Listing the second element of a list. The expression "v1,v2,…,vn" for an indexed n-tuple is a style used to suggest that the elements are indexed by the first n positive integers with no skips. However, the most natural interpretation of the expression "v1,…,vn" is exactly the same. The appropriate convention is that indices in a list are consecutive unless explicitly indicated otherwise. Another reason to eliminate v2 from the expression is that "v1,v2,…,vn" forbids the possibility n=1.

  32. A list with relations. The sentence "Let x1≤…≤xn be a list of integers" is a Double-Duty Definition; the writer attempts simultaneously to introduce notation for the elements of a list and to impose inequalities on them. The expression "x1≤…≤xn" denotes a set of relations, not a list; what is meant is "Let x1,…,xn be integers such that x1≤…≤xn." (To avoid repeating the notation, it is better to write "Let x1,…,xn be integers, indexed in nondecreasing order.") Similarly, a chain of sets under inclusion is a list A1,…,Ak such that A1⊆…⊆Ak; the expression "A1⊆…⊆Ak" is not itself a chain.

    Although HTML does not have a standard character for line-centered dots, the ellipsis in an indexed list with relations should be vertically centered on the line ("\cdots" in TeX), while the ellipsis in an indexed list separated by commas should be on the baseline ("\ldots" in TeX).
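    A minimal LaTeX sketch of the two kinds of ellipsis, using the rewritten sentences from item 32 (the subscripted variable names are those of the examples above):

```latex
\documentclass{article}
\begin{document}
% Comma-separated list: dots on the baseline (\ldots).
% Chain of relations: dots centered on the line (\cdots).
Let $x_1,\ldots,x_n$ be integers such that $x_1\le\cdots\le x_n$.

A chain is a list $A_1,\ldots,A_k$ such that $A_1\subseteq\cdots\subseteq A_k$.
\end{document}
```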

  33. Variable equal to list. Many mathematicians write "for m=1,2,…,n" (with or without the "2") to mean "for m∈{1,…,n}" or "for 1≤ m≤ n". The expression "for m=1,…,n" is mathematically incorrect; it sets the value of m to be a list of numbers. The same principle applies to writing "i=1,2" to name two cases; this should be i∈{1,2}.
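    In LaTeX source, the preferred forms might look like the sketch below; the surrounding sentences are invented only to give the notation a context.

```latex
\documentclass{article}
\begin{document}
% Instead of "for $m=1,\ldots,n$":
The claim holds for $m\in\{1,\ldots,n\}$.
The claim holds for $1\le m\le n$.

% Instead of "for $i=1,2$":
Let $v_i$ be an endpoint of $e_i$ for $i\in\{1,2\}$.
\end{document}
```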

  34. "Big Oh". Common usage of "Big Oh" notation is another instance of setting expressions equal when they cannot be equal. The expression "f(n)=O(n²)" does not mean that the value f(n) equals the set represented by the notation O(n²). What is meant is "f(n)∈ O(n²)"; Knuth has written at length on this subject. An alternative that is roughly correct is to be more informal, writing "f(n) is O(n²)", in which "is O(" can be read as "is on the order of". Since it is convenient to do arithmetic with these classes of functions, this problem will not go away. An unsatisfying compromise is to use the membership symbol where the grammar of computation permits, in order to ensure that the meaning of the concept is understood.
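    The two acceptable forms can be typeset as in the following sketch; the particular function f(n)=3n²+5n is an assumption chosen only for illustration.

```latex
\documentclass{article}
\begin{document}
% Membership form, used where the grammar of the computation permits it.
If $f(n)=3n^2+5n$, then $f(n)\in O(n^2)$.

% Informal form: read "is O(" as "is on the order of".
The function $f(n)$ is $O(n^2)$.
\end{document}
```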

  35. Operators vs. constants. We never use f to denote the value of a function f at a point x. The same principle applies to graph parameters and other operators. For example, the maximum degree of a graph G is denoted Δ(G). Here Δ is a function, not a number, and hence Δ should not be used to denote the value of the function Δ on a particular graph.

    It is tempting for mnemonic reasons to write "We write V=V(G) and Δ=Δ(G)". Admittedly, this usage is not confusing when discussing only one graph at a time; the difference between a graph invariant and a real-valued function is that we rarely focus on the value of a real-valued function at just one point. Nevertheless, it is rare that a paper discusses only one graph, and hence it is better to use V(G) and Δ(G) for objects associated with G. The problem is particularly bad with Δ, since this character also occurs in mathematics as a difference operator. One often sees "Δn" meaning the change in the value of n, so one should not use "Δn" to mean the maximum degree times the number of vertices in a graph. (In my textbook I violated this principle by using n(G) and e(G) for the numbers of vertices and edges in a graph G while using n for the number of vertices of a particular graph and e as a particular edge; the error will be corrected in the third edition.)

  36. Hyphenation. A k-edge connected graph is a connected graph with k edges (compare with "n-vertex connected graph"); the meaning is different from "k-edge-connected graph". When the hyphen is missing, "k-edge" modifies "connected graph" because adjectives modify only nouns, not other adjectives. (Similarly, a non-specialist reader would think that a k-edge coloring is a coloring of k-edges, not a coloring of edges using k colors.)

    Two-word terms used as single concepts to modify nouns must be hyphenated when so located (without the hyphen in this sentence, we would be discussing two "word terms"). This principle applies in the correct sentence "A well-known theorem is a theorem that is well known." The same principle applies to parameters in adjectives: "k connected graphs" would be k graphs that are connected, in contrast to "k-connected graphs". Adverbs behave differently, since they can modify adjectives; for example, we may write "upper chromatic number" without hyphens.

    Another hyphenation issue arises in graph theory with analogous concepts for vertices and edges. Often a concept for edges is an analogue of a fundamental concept using vertices. In this setting, we do not need "vertex" as an adjective to specify "connectivity" or "chromatic number", but we add "edge" for the analogous edge concept. We then hyphenate "edge-connectivity" and "edge-chromatic number". This makes sense because in both cases the problem for edges is a special case (for line graphs) of the general coloring or connectivity problem. When comparing "edge-coloring" and "list coloring", the difference is then that we are not coloring the lists, so the format of the term is different from that for edge-coloring.

    When an expression involving addition or subtraction is used as a parameter modifying a noun, it should be enclosed in parentheses. For example, write "(k+1)-connected graph", not "k+1-connected graph".

  37. "a-b path". In "a-b path", "a-b" is not a word and has no notational meaning by itself. Even worse, often a-b is treated as a mathematical expression in TeX and is typeset using a long minus sign with extra space around it. The intent is to specify a path with endpoints a and b. Thus a and b are parameters designating a certain type of path. Under the principles of hyphenation above, there must be a hyphen between "b" and "path". Furthermore, the endpoints are independently expressed parameters, with no operation being performed on them. Hence the correct notation is "a,b-path".
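    One way to realize these hyphenation conventions in LaTeX is to keep the hyphens outside math mode, as in this short sketch:

```latex
\documentclass{article}
\begin{document}
% Text hyphens stay outside math mode, so TeX does not set them as minus signs.
A $k$-edge-connected graph; a $(k+1)$-connected graph; an $a,b$-path.

% Inside math mode the hyphen becomes a binary minus with space around it.
An $a-b$ path (the form to avoid).
\end{document}
```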

  38. Graphs are not sets. When h is a vertex in a graph G, it makes no sense to write h∈ G, since h could just as easily be an edge. A graph consists of a vertex set and an edge set; one should write v∈ V(G) and e∈ E(G). This is also the reason why the convenient notations |G| and ||G|| are mathematically inconsistent for the order and size of a graph.

    The term "order" for the number of vertices of a graph is not as popular as it once was. Some readers find it confusing and prefer "number of vertices". On the other hand, it is very convenient, while overuse of "number of vertices" becomes quite awkward.

  39. Directed graphs and hypergraphs. These models are variations or generalizations of graphs. In a digraph, the edge set consists of ordered pairs. The redundancy of saying "directed edge" or "directed path" or "directed cycle" is not helpful, as it suggests that the digraph contains such objects that are not directed (the term "weak path" is available for a path in the underlying undirected graph). Using these terms also eliminates the possibility of making statements that hold for both graphs and digraphs, like Menger's Theorem.

    Similarly, one should not use "hyperedges" to refer to the edges of a hypergraph. Hypergraphs generalize graphs by allowing edges to have arbitrary size. Calling them "hyperedges" eliminates the possibility of saying that graphs arise as a special case, since graphs have edges, not hyperedges.

  40. "Connected components". Unnecessary redundancy has similar disadvantages. We should not speak of the "connected components" of a graph, because there are no disconnected components of a graph. Writing "connected components" suggests that there are components that are not connected.

  41. "Maximal" vs. "maximum". Many mathematicians use these words interchangeably. One can make a useful distinction by using "maximum" to compare numbers or sizes and "maximal" to compare sets or other objects. Thus a maximal object of type A is an object of type A that is not contained in any other object of type A. A maximum object of type A is a largest object of type A; here "maximum" is an abbreviation for "maximum-sized". For example, in a graph we may speak of "maximal independent sets" and "maximum independent sets"; these are convenient terms for distinct concepts that are both important.

    Although this distinction is sensible and has become established in many settings (such as "maximum antichain" and "maximum independent set"), potential confusion can be reduced by using "largest" and "smallest" instead of "maximum" and "minimum". For example, it is harder to misinterpret "a largest matching" than to misinterpret "a maximum matching".

    For consistency, then, one should not write "a vertex of maximal degree" or "the maximal number of edges"; that is, "maximal" should not be applied to numerical values. This is consistent with usage in continuous mathematics, where we write that a continuous function "attains its maximum" on a closed and bounded set.

  42. Multicharacter operators. A string of letters in notation denotes the product of individual quantities. Therefore, any operator whose notation is more than one character should be in a different font, generally roman. This convention is well understood for trigonometric, exponential, and logarithmic functions, and it applies equally well to such operators as dimension (dim), crossing number (cr), choice number (ch), maximum average degree (Mad), etc.
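    A possible LaTeX setup for such operators is sketched below, assuming the amsmath package; the macro names \crnum, \chnum, and \Mad are hypothetical choices for this illustration (note that "\cr" itself is a TeX primitive and cannot be reused).

```latex
\documentclass{article}
\usepackage{amsmath} % provides \DeclareMathOperator
\DeclareMathOperator{\crnum}{cr} % crossing number (hypothetical macro name)
\DeclareMathOperator{\chnum}{ch} % choice number (hypothetical macro name)
\DeclareMathOperator{\Mad}{Mad}  % maximum average degree
\begin{document}
% Roman operator names, like the predefined \dim and \log.
$\dim V$, \quad $\crnum(G)$, \quad $\chnum(G)$, \quad $\Mad(G)$

% In math italic, the same letters read as a product of c and r.
$cr(G)$
\end{document}
```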

  43. "Induct on" and "By induction". The phrase "We induct on n" is convenient but not correct. From given hypotheses, we deduce a conclusion; we don't "deduct" it. When we announce the method of induction, we must instead say "We use induction on n." The verb "to induct" is used when a person is inducted into an honorary society, for example.

    A different problem arises in the induction step. When we cite the induction hypothesis, we must write "By the induction hypothesis", not "By induction". To obtain the conclusion for the smaller instance, we are invoking the hypothesis that the claim holds for smaller values; we are not invoking the principle of mathematical induction.

  44. Cliques vs. complete subgraphs. These terms traditionally were used interchangeably in graph theory, but it is useful to distinguish them. There is a difference between a set of pairwise adjacent vertices in a graph (dual to an independent set of vertices) and a subgraph isomorphic to a complete graph. Both concepts are needed, and the appropriate terms for them are "clique" and "complete subgraph". Thus "clique" should be reserved for a set of vertices, and then the meanings of "clique of size 5" and "5-clique" (the same) are clear. In previous centuries, "clique" was also sometimes used to mean "maximal clique"; this usage should be avoided.

  45. Isomorphism classes vs. subgraphs. A graph is a pair consisting of a vertex set and an edge set. Paths, cycles, and complete graphs are graphs whose edge sets are described in specific ways. The notations Pn, Cn, and Kn do not specify a vertex set, and hence in specifying paths, cycles, and complete graphs they must refer to the isomorphism classes.

    Hence we should never write "a Pn" for a member of that class. We can write that a graph "contains a path with n vertices", because that is a structural description of the subgraph, but we cannot write "contains a Pn" or "consider a Pn in G". We can say "contains ten copies of Pn" to refer to subgraphs that are n-vertex paths; each such subgraph is a member of the isomorphism class denoted by Pn.

    Nevertheless, complete strictness about this notation produces very awkward writing. Thus when H is the notation for an isomorphism class, we still write "H⊆G" to mean that some subgraph of G belongs to the isomorphism class or is "isomorphic to H", even though we are not specifying particular subsets of the vertices and edges of G. The reason we accept this slight abuse of the notation "H⊆G" and not the expression "a Pn" is that "a" is an English word whose meaning and grammatical usage cannot be changed, which emphasizes the difficulty that Pn is not a singular object.

  46. Proper coloring. A k-coloring (or k-edge-coloring) of a graph is a partition of the vertices (or edges, respectively) into k classes. In combinatorics generally, a k-coloring of a set partitions it into k classes, arbitrarily. This general concept appears in many areas of mathematics, including Ramsey theory, graph decomposition, and chromatic numbers. In the last context, a proper [edge-]coloring is one in which adjacent [or incident] elements do not receive the same color.

    Some authors who write extensively about chromatic number and edge-chromatic number drop the word "proper" and use k-[edge-]coloring for the restricted concept. The minor convenience gained by dropping this word is overwhelmed by the negative influence of introducing inconsistency of terminology in combinatorics. Use "proper k-coloring" when that is what is meant. For other variations, such as "acyclic k-coloring" or "dynamic k-coloring", the adjectives replace "proper" by imposing other restrictions on the k-coloring, so the word "proper" is then no longer needed.

  47. Partitions vs. parts. A partition consists of blocks or "parts". Do not use "partition" to refer to the members of a partition. (Students often make this mistake.)

  48. "Pairwise" and "mutually". Old-fashioned mathematics took the old-fashioned word "mutually" to describe a binary relation satisfied by all pairs in a set, as in "a set of mutually orthogonal Latin squares". In English usage, "mutual" indicates symmetry. Hence modern mathematics should avoid using "mutually" in this way. Instead, the word "pairwise" states exactly what is meant. The change becomes even more important in light of modern terms like "mutual independence" in which "mutual" explicitly does not mean pairwise. (Thus "mutually orthogonal Latin squares" is now ambiguous, but we cannot escape the notation "MOLS(n,k)" in design theory.)

  49. Disjoint sets. Disjointness is a binary relation. Hence "Consider disjoint sets A1,…,Ak" is technically incorrect; we should instead say "pairwise disjoint sets". However, this is a universally understood abuse of terminology, and including the word "pairwise" each time would be ponderous. This principle can be extended to other commonly used binary relations that do not make non-binary sense, such as "isomorphic".

  50. Disjoint union vs. join. In most of graph theory, it is common to use the notation of multiplication to denote a graph consisting of many disjoint copies of a single component. Thus rK2 is the graph consisting of r disjoint edges. Similarly, Pn1+…+Pnk denotes a linear forest, consisting of k components that are paths with orders n1,…,nk. For consistency, G+H should therefore denote the disjoint union of two graphs G and H.

    Some authors use G+H to denote the join of G and H, which consists of the disjoint union plus edges joining every vertex of G to every vertex of H. There is other notation available for the join, such as G∨H. However, authors unfamiliar with the join operation (x∨y) in lattices or boolean algebra may not like this. I think an overstruck "+" and "◊" would be reasonable and would suggest the operation, but "⊕" is unavailable because it often represents symmetric difference.

  51. Between. An object that is between two other objects separates them; this is the common mathematical sense of "between". Referring to an edge (or path) with endpoints u and v as an edge "between" u and v is somewhat inconsistent with the rest of mathematics. One can say "an edge joining u and v" instead. In a planar embedding of a graph, an edge shared by the boundaries of two faces is an edge between the faces.

  52. Setminus. The operator \setminus most often denotes difference of sets. Hence it is somewhat misleading or old-fashioned (and looks rather pompous) to use it for deletion of elements, as in "G\setminus e". Use "G-e" instead. Also, the notation G\setminus H is easily confused with G/H (especially by students). Of course, there are some contexts (matroids and various algebraic topics), where these notations have special meanings and are quite important, but for simple set difference A-B is preferable.
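    For comparison, the notations discussed here can be typeset as in this brief sketch (the symbols G, H, e, v, A, B are placeholders):

```latex
\documentclass{article}
\begin{document}
% Deletion of an edge or vertex, and simple set difference, with a minus sign.
$G-e$, \quad $G-v$, \quad $A-B$

% The backslash form next to the slash with which it is easily confused.
$G\setminus H$ \quad versus \quad $G/H$
\end{document}
```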

  53. "Left hand side". There is no "hand side", so this expression makes no sense. Even if one correctly hyphenates to make it "left-hand side", there is still no "hand". Just write "left side".

    English usage in mathematical writing

  54. Introductory words. Words or phrases like "nevertheless", "for example", "to the contrary", and "on the other hand" usually should be separated by commas from the rest of the sentence. Introductory prepositional phrases are a bit different. I am told that a phrase with one preposition ("In 1995") does not require a comma, but a phrase with two prepositions ("In August of 1995") does. Another reader tells me that an introductory phrase with at least five words (perhaps we should say five syllables) should be followed by a comma. I would use the comma unless the intent is to lead into what follows as a single thought (see Hence/Thus/Therefore).

  55. Quotations and ends of sentences. It is traditional correct style in English grammar that all terminal punctuation comes inside quotation marks. My understanding is that this convention arose from the technical aspects of printing presses. Its purpose was to lessen the danger of breakage of fixed metal type in printing presses. In the era of electronic publishing of mathematics, this justification is obsolete, and we can replace the convention with logical punctuation. When the material being quoted is treated as an item within the sentence and is not itself a sentence, the terminal punctuation logically comes outside the quotation marks. Copy editors trained in literary punctuation still object to logical punctuation but should be overruled.

  56. "Which" vs "that". The following two sentences have different meanings:
    1) "She will attend our meetings that concern calculus."
    2) "She will attend our meetings, which concern calculus."
    Sentence (1) states that among our meetings, she will attend those concerning calculus and perhaps no others. Sentence (2) states that all the meetings concern calculus, and she will attend them all. In common English, the distinction is perhaps even clearer: compare
    1) "I have two shirts that need cleaning."
    2) "I have two shirts, which need cleaning."
    In (1), two of my shirts need cleaning. In (2), I have only two shirts.

    When the phrase after the relative pronoun specifies a further restriction of the class that has just been introduced, the correct pronoun is "that", and the subsequent phrase tells which of the items in the class are those being discussed. If the subsequent phrase speaks about the totality of the class, then the proper pronoun is "which". When "that" and "which" both seem usable, use "that" when the sense is "having the property that", and use "which" when the sense is "all of which" or "the only one of which". Usually a comma is appropriate before "which". Usually "that" is correct when an indefinite article ("a" or "an") has been used on the word being modified. Beware: This distinction is not made or is made the opposite way in British English. Some American style manuals don't care, but in mathematics there are two distinct meanings to be expressed.

  57. Immediacy of antecedents. When using "which", "that", "where", or other words to introduce explanatory or descriptive phrases, the subsequent phrase modifies the most recent item. For example, "an embedding of G on a surface which has no crossings" indicates that the surface has no crossings, not that the embedding has no crossings. Making the comment on crossings apply to the embedding requires rewriting: "On a specified surface, consider an embedding of G, which has no crossings". Here "which" is proper because every embedding has no crossings; an embedding is a drawing that has no crossings.

  58. The naked "This". When "This" is used as the subject of a sentence, its antecedent is the most recent noun. If the desired antecedent is the preceding paragraph or some other object, then a noun should be inserted, as in "This discussion implies" or "This inequality implies" instead of merely "This implies". One way to understand this issue is to view "this" only as an adjective, not as a pronoun.

  59. "Every", "distinct", and "unique". The word "every" is singular; it means "each one". Because of this, we write "all values" or "every value", not "all value" or "every values".

    The word "distinct" has the same meaning as "different". Two things can be distinct, but one thing cannot be distinct. Thus the sentence "Every value is distinct" is incorrect; it has no meaning. Many beginning students think it means that each value is different from every other value, but it does not.

    The word "unique" indicates that there is only one of the items being described. It does not mean that this item is different from other items. Some students think that "The function f maps the points in A to unique points in B" is a statement that f is injective, but it is not. Every function from A to B maps each point in A to a unique point in B.

    The distinction between the words "distinct" and "unique" is made clear by a typical boast on the World Wide Web. The sentence "Our website has one million unique visitors" makes no sense. The intent is to say that among millions of hits there are one million distinct visitors; if there is a unique visitor, then there is no other visitor.

  60. Contractions. Because mathematical writing is formal, contractions ("can't", "won't", etc.) should usually be avoided. They introduce a sudden informality that is inconsistent with the tone of proof.

  61. "I.e." vs. "e.g.". "I.e." and "e.g." are abbreviations for Latin phrases. "I.e." means "that is" and is used to introduce an explanation or restatement of what came before. "E.g." means "for example" and introduces an example. In formal mathematical writing, abbreviations (except as notation) are like contractions; it is better to avoid "i.e." and "e.g." altogether.

  62. "Different than". It is not correct to write "A differs than B", and for the same reason it is not correct to write "A is different than B". The correct wording is "A is different from B". The incorrect wording is modern American laziness.

  63. Abstract nouns and articles. Nouns that specify abstract concepts rather than objects need no articles. For example, "graph colorability" is an indefinite concept, so we do not say "Next we discuss the graph colorability". In contrast, "chromatic number" may be abstract or specific. We may say "Next we discuss chromatic number", referring to the general concept, or "Next we discuss the chromatic number of this graph", since this graph has only one value as its chromatic number.

    Functions or parameters assign a number to each domain object. The resulting value is specific for the object; there is only one choice for it. Hence we do not say "the graph has a chromatic number 3" or "the vertex has a degree 3". These sentences suggest that the object may have more than one value of the parameter. The answer to the question "What is the degree of this vertex?" may be "This vertex has degree 3", but it cannot be "This vertex has a degree 3".

    We also do not say "This vertex has the degree 3", although "The degree of this vertex is 3" is correct. Several instances occur in the sentence "Every graph has an even number of vertices with odd degree, which means that the list of vertex degrees has even sum." The term "even number" takes the article "an" because we are saying which type of number is being used (it is one of the even numbers). The later "odd degree" and "even sum" do not, because these are properties that the vertices and the list do or do not satisfy. Articles are inappropriate when invoking a property.

    Articles also are not used with conceptual nouns. Compare with familiar conversation: we say "This chair has value $100" and not "This chair has the value $100." "Value" and "degree" are abstract properties. Here is another non-mathematical example: We say "I receive compensation for my work," not "I receive a compensation for my work." Compensation is an amount, but here only the abstract concept of receiving compensation is meant, not some number of things. Hence we do not use an article.

    Similarly, abstract properties do not take articles. We say "because transitivity of A implies transitivity of B", not "because the transitivity of A implies the transitivity of B". The property in question is "transitivity", not "the transitivity".

  64. Possessives and titles. The definite article "the" specifies uniqueness. Possessives also play this role. It is incorrect to use both together, because the possessive already provides definite specification. For example, we write "Greene's Theorem" but not "the Greene's Theorem"; this is a theorem proved by Greene, not by "the Greene".

    When discussing a result by two authors, we cannot put possessives on both names, and making only the second name possessive would be wrong. Hence we write "the Greene--Kleitman Theorem". Here "the" serves as a definite article for the unique object "Greene--Kleitman Theorem". When the result is less celebrated, one can indicate the possessive by "of", as in "the theorem of Greene and Kleitman".

    In the examples above, "Theorem" is capitalized. When there is only one instance of an object, and the name of it involves a person, it plays the role of a proper noun and its name is a title. Another example is "the Cauchy--Schwarz Inequality".

  65. Adjectival forms of names. Some graph theorists use "Hamilton cycle" to mean a spanning cycle in a graph, but they would never say "Abel group". When describing a type of cycle, the modifier should be an adjective, so it is better to use an adjectival form of the name when that is available: "Hamiltonian cycle". The same applies to "Euler circuit" and "Eulerian circuit". However, some uses of names as adjectives are heavily ingrained and unchangeable, such as "Fibonacci numbers" and "Catalan numbers" (I do use "Eulerian numbers", though).

  66. Conjunctions and commas. Punctuation shapes sentences; commas encourage the reader to pause at places where doing so aids understanding. Missing commas may require the reader to stop and re-read in order to understand what has been said. Excessive commas delay the reader and may cause unnecessary re-reading.

    Two clauses (in essence, two complete sentences) may be combined using a conjunction; the conjunction must be preceded by a comma. Examples of conjunctions are "and", "but", "then", and "so" (the last two should be treated as conjunctions in mathematical writing). Since a conjunction joins two things, sentences should not begin with these words. This is a logical approach that helps keep writing clear, though strict English usage (especially British) may call some of these words adverbs. See further comments on the use of then and so.

    Exception. The situation is more complicated when the second clause itself contains a conjunction. Compare "If A, then B holds and C holds" with "If A, then B holds, and C holds". In the first sentence, it is clear that A implies both B and C. The proper grouping or meaning in the second sentence is unclear. Since we only have one comma symbol and don't parenthesize sentences to indicate grouping, a short conjunction of two sentences within a larger conjunction is written without a comma.

  67. Semicolons. Compound sentences consist of two complete sentences with no conjunction separating them. This form is used especially when the second part clarifies or comments on the first. Such sentences need a semicolon (not a comma!) to separate the two parts; this sentence is an example. Do not use a semicolon before a conjunction; in particular, there should never be a semicolon before "and", "but", "then", or "so".

  68. Excessive commas. A clause requires a subject and a verb. When "and" joins two parts of a sentence that do not both stand on their own with a subject and a verb, there should be no comma before it. The comma in "We will prove the lemma, and then the theorem" is incorrect; one must delete the comma or add a subject and verb to the second part. This example came from a newspaper: "In February, the graduate student in Electrical and Computer Engineering, was awarded the A--B--C Prize"; here there should be no comma before "was". (See further examples of excessive commas in Definitions.)

  69. Serial commas. A serial comma is a comma after the next-to-last element of a list. Wikipedia gives a discussion of its use. It is generally safest to use a comma in this situation. For example, compare "Under the conditions 1≤ i,k≤ r and m even" with "Under the conditions 1≤ i, k≤ r, and m even"; the two sentences have different meanings. The issue arises often when listing three mathematical objects or three authors. (For a list of lists, clarity can be achieved by using semicolons or by changing "and" to "&" within list items.)

    One reason for using the serial comma in lists is to avoid confusion in sentences that do not contain lists. Consider the sentences "Like a, b and c have the same property" and "Later, Early and Jones proved the conjecture". These are not lists, and using a comma would be wrong, but when a document does not use serial commas these examples initially appear to be lists. Similarly, in that context an item in a list that itself joins two subitems with "and" looks like the last two items in a list.

    Omitting the serial comma can also cause confusion mathematically, as in "The value of f is positive at 2, negative at 1 and 0 at 0."

  70. Appositives. An appositive is a noun or noun phrase that renames or substitutes for another noun or noun phrase immediately preceding or following it. It can be recognized by the fact that omitting it would yield a clear and complete sentence and that the additional information in it is not grammatically essential to the statement being made. It should be set off by commas: "His book, the best book on the subject, took years to write." An appositive in the middle of a sentence cannot have a comma on only one side.

    When an appositive is short enough or contains essential information, the commas are omitted: "My friend Bob is a student." In mathematical writing, a similar situation applies when notation is introduced: "The degree d(v) of a vertex v is the number of neighbors of v." Here "d(v)" is a brief appositive. One could argue that the notation for "degree" is not essential to the sense of the sentence, but putting commas around very short appositives can produce very choppy sentences. A speaker need not pause for such appositives, and hence one may omit the commas.

  71. Passive voice. Good writers of English minimize the use of passive voice. This accepted principle applies also in writing mathematics. Active verbs make the exposition more engaging; for example, "It suffices to show" is preferable to "It is sufficient to show". Nevertheless, judicious use of the passive voice can be appropriate.

  72. "Above" and "Below". These words are adverbs; they do not directly modify nouns. Hence "the above graph" and "the below figure" are incorrect. We can write "the graph above" or "the figure below" as a short form for "the graph shown above" or "the figure located below".

  73. "Either". The word either is used to indicate exclusive or. If the alternatives are mutually exclusive by definition, then "either" is unnecessary.

  74. "We have been proving". Do not use the "perfect" tenses, which involve the helping word "have". Phrases like "In Section 3 we have been analyzing" or "in [4] we had shown" are either grammatically wrong or confusing. The simple tenses, as in "In Section 3 we analyzed" or "in [4] we showed", are almost always better. Even the future can be eliminated: "in Section 4 we show" rather than "in Section 4 we will show"; the justification for this is viewing the entire article as a unit, happening in the present.

  75. Hyphenation of words containing "non". When a word in English initially has a negation introduced by prefixing "non", the resulting word is hyphenated. The initial sense is the negation, so the hyphen is appropriate. As decades pass and the word is accepted on its own, it becomes a positive concept incorporating the "non". This and familiarity lead to dropping the hyphen. Some of the most familiar examples in mathematics are "nonsingular", "nontrivial", "nonzero", and "nonconstructive". Adding hyphens to these words is now jarring to more readers than is the absence of hyphens. I also use "nonempty", "nonnegative", "nonneighbor", and "nonadjacent". However, I would keep the hyphen in "non-word" and "non-edge", for clarity and infrequency.

    Mathematical English for non-native speakers

  76. "Bound of". Many non-native speakers use "bound of" when they mean "bound on". If x≤ k, then we have an upper bound of k on x. Using "bound of x" for "bound on x" can become confusing when comparing parameters. We do not want to say that the maximum degree Δ(G) is a bound of the chromatic number χ(G); when Δ(G)=k we want to say that Δ(G) establishes a bound of k on the chromatic number of the graph. (Writers from Asia typically overuse the preposition "of" when many others are more accurate, such as "on", "for", "about", etc.)

  77. "Few" vs. "a few". In English, "few" means "not many", while "a few" means "several". The sentence "In this paper we prove few good results" means that the paper does nothing worthwhile, while "In this paper we prove a few good results" means that it is worth reading.

  78. "Usual". It is a quirk of English that the word "usual" as an adjective usually requires the definite article "the". We cannot say "In this section we consider only usual chromatic number"; it must be "In this section we consider only the usual chromatic number". (This is a common error by speakers of languages that do not have articles.)

  79. "Partial case". In English, we do not say that one result is a "partial case" of another. We say that it is a "special case". (This is an issue of translation from one language to another.) However, it is correct to say that proving a special case of a conjecture is a partial result.

  80. "Pass" vs. "Pass through". In English, the word "pass" means "go by without entering". Thus a path that passes a vertex does not visit that vertex. To say that path P visits vertex v, one must say that P passes through v (this is a language translation issue).

  81. "Can not" and "may be". The expression "can not" should not exist in English (I am horrified to hear that it now does appear in some dictionaries). The logical meaning of "can not" is that it is possible for the statement to fail rather than that it must fail, which is the meaning of "cannot". All uses of "can not" must be eliminated, because the reader cannot be sure of what the author intends.

    The expression "may be" does exist in English, when used as a verb as in "It may be true" or "This may be the only component". However, when it appears at the start of a clause, most likely the word "maybe" is intended, as in "Maybe this proof will work". Note that in this situation there is another verb ("work"), and the initial expression means "Possibly", which is not a verb.

  82. "A joint work". We speak of "a theorem" or "a result", since these are definite specific items, but "work" is an abstract noun and does not take the indefinite article "a". We say simply "This is work of mine", not "This is a work of mine", and "This is joint work with my colleague", not "This is a joint work". This usage of "work" is different from "a work of art" or "the complete works of Shakespeare". In mathematics, "work" is equivalent to "research"; we do not say "this is a joint research". The same error occurs in "I will have a limited access to my email."

  83. "Evidently". Some non-native speakers write "Evidently" to mean "Clearly". Although this word is not technically incorrect, it has other connotations to native speakers. Since native speakers would themselves always write "Clearly" and never "Evidently", they are confused about what the writer means. Always change "Evidently" to "Clearly". ("Evidently" is quite close to "apparently", which in American English means "seems to be true" rather than "is true".)

  84. "as evidenced by". This is not used in English; change to "as shown by".

  85. "Principal" vs. "principle". "Principal" is an adjective meaning "foremost", used mathematically in "principal minor". "Principle" is a noun similar to "idea" or "method", as in "the Pigeonhole Principle".