Stefan Dollinger & Margery Fee
In the introduction to the first edition of the Dictionary of Canadianisms on Historical Principles (DCHP‑1), chief editor Walter S. Avis explained the rationale behind this dictionary, a rationale that has been taken over unchanged in this new edition. The Dictionary's purpose was - and is -
to provide a historical record of words and expressions characteristic of the various spheres of Canadian life during the almost four centuries that English has been used in Canada (Avis 1967: xii).
From the outset, we used Avis and associates' work as a springboard towards a lexicographic process that Avis (1967: xiii) considered "clearly quite different from that of lexicographers in the United States". This difference lies first and foremost in the comparative nature of DCHP. In DCHP‑1, the comparative approach is enshrined in the definition of a Canadianism, which was - and is - defined as
a word, expression, or meaning which is native to Canada or which is distinctively characteristic of Canadian usage though not necessarily exclusive to Canada (Avis 1967: xiii).
While this definition is also the core of DCHP‑2, we have refined what it means for a word or phrase to be "distinctively characteristic of Canadian usage". The study of distinct characteristics necessitates the systematic comparison of usage patterns and semantics of Canadian English with other varieties of English, which was one of the key features of DCHP‑1's editorial work. For the user of DCHP‑1, this comparative method is, unfortunately, utterly hidden, as generally no cross-references to other dictionaries are offered. This is perplexing, as Charles J. Lovell, driving force and first chief editor of DCHP‑1 (from 1958 until his death in 1960), emphasized the importance of "cross-checking in dictionaries" as the "chief factor in developing a knowledge of the specialized Canadian English vocabulary", a point that "cannot be too highly stressed" (1956: 31). DCHP‑1's comparative work can only be seen on the quotation slips, now housed at the University of Victoria Archives. Their premise was that "Canadianness is shown by [a word's] absence from dictionaries published outside Canada" (Görlach 1990: 1484). This "negative" evidence is included for the new entries in DCHP‑2. Today, we need not rely exclusively on non-Canadian dictionaries for claims of Canadianness, as we now can refer to lexicographical and dialectological work done on Canadian English since 1967. Drawing from decades of experience in corpus linguistics and a constantly growing number of digital sources, we can arrive at interesting insights that, however, should not diminish the achievement of the pre-computer pioneers behind DCHP‑1 with its c. 102,000 paper quotation slips.
From a comparative perspective, the question of evidence and documentation is key and the common practice of merely assigning labels to terms and meanings becomes problematic. In Canadian English lexicography, terms and meanings are labelled "Canadian" without a rationale. In DCHP-2, one goal has been to make explicit the rationale behind the inclusion of a particular word or meaning as Canadian. Looking at the DCHP-1 entry for parkade in Figure 1, a recent coinage at the time of DCHP-1's publication, one might question why the term is Canadian at all (the entry is shown here from DCHP‑1 Online [Dollinger, Brinton and Fee 2013], which is a verbatim copy of DCHP‑1). This problem exists in all dictionaries of Canadian English to date: either the readers are already familiar with the special status of parkade in Canadian English, in which case they find their hunches confirmed, or, if not, they are as likely to question the status of parkade as to accept it in the absence of evidence. Of course, one reason for the lack of a rationale was the lack of space in print publications. Online, we not only have access to vastly greater amounts of text, but also have more space to lay out supporting evidence for each entry.
Figure 1: Entry parkade from DCHP-1 Online
When the revision project was taking shape in the winter and spring of 2006, it had become clear that DCHP‑1 was "out of date to an extent that severely compromises its use today" (Dollinger 2006: section 1). Katherine Barber, at the time editor of the Canadian Oxford Dictionary, considered the lack of a historical perspective for many newer Canadian terms a huge problem (Barber & Considine 2010: 143). DCHP‑2 now contains some of the information she wanted to know then, such as when the term mangia-cake was first used (the earliest attestation is now 1988). However, we were not sure just where to put our energies in bringing the dictionary into the 21st century.
One of the first questions to be asked about the editing of a dictionary is, in Sidney Landau's words, "how fully" the dictionary should "cover the lexicon" (2001: 28). John Considine's comparison of historical dictionaries led him to project that a new edition might have twice as many entries as DCHP‑1 and at least double the quotation count (Barber & Considine 2010: 146). Until 2008, we felt that an increase by that measure would be feasible (Dollinger & Brinton 2008: 49-50). We realized soon after the initial data collection from 2008 to 2010 that we faced a greater degree of complexity than anticipated, and that a different route of action would be required. Because we decided to give evidence for our decision to label a word, meaning or expression "Canadian", we opted for more fully researched entries that would allow the reader to follow our reasoning. This decision entailed fewer but better argued entries. In order to facilitate classification, we developed a typology of various sub-types of Canadianisms as well as a methodology that would allow us to test our hypotheses on the status of a given meaning. The first such typology was published in Dollinger & Brinton (2008: 51-52), and, after two revisions, a final version in Dollinger (2015a).
In DCHP-2, meanings are the smallest unit. Each meaning is assigned to one of six dominant Types of Canadianism. These six classes are complemented by a class "Non-Canadian", which is used for meanings that were previously classified as Canadian or that were suspected to be Canadian but turned out not to be. The six types are Origin (Type 1), Preservation (Type 2), Semantic Change (Type 3), Cultural Salience (Type 4), Frequency (Type 5) and Memorial (Type 6).
Origin (Type 1) includes forms such as all-dressed, garburator, frosh week, humidex and parkade. Examples of Preservations (Type 2) include drinking box, dunch, invigilate and pencil crayon. Semantic change in Canadian English (Type 3) includes to acclaim 'elected by virtue of being the sole candidate', toque and Canuck. Culturally salient terms (Type 4) include terms relating to universal healthcare, e.g. health care card, hockey terms, e.g. backhander, goalie mask, and others relating to the Canadian experience, such as biculturalism, multiculturalism, Native Canadian and culturally modified tree.
Frequency (Type 5) typifies terms that occur in Canada at a higher frequency than elsewhere in the normalized internet searches we provide for most terms (see below under "Frequency Index" and Dollinger 2016a). These include advanced green, cube van, ginger group and to table legislation. This merger of comparative corpus linguistics methods with lexicography is perhaps the most important methodological innovation of DCHP‑2. We hope that the inclusion of Frequency Charts in DCHP‑2 will be seen as a useful addition to the traditional entry format, just as were the display maps in DARE (Cassidy & Hall 1985-2013) that indicated the location of all occurrences of a variant.
Memorial terms (Type 6), finally, was the last addition to the typology scheme, in 2013, and is the flip side of Cultural Salience (Type 4): Canadians take pride in the terms we mark as culturally salient, whereas terms categorized as Memorial connect to negative episodes and events in Canada's history. This category relates to the Aboriginal colonial experience, e.g. residential school, assimilation, potlatch ban, and includes terms of abuse now considered racist, e.g. chink, wop, iron chink, or terms that are, if not racist, now considered outdated, such as Eskimo. We thereby follow a practice that has a longer tradition in German than in English lexicography, where the OED-1, bound by Victorian values, tended to omit terms considered offensive. As the Brothers Grimm put it in Volume 1 of their historical dictionary of German: "The dictionary, if it wishes to be worth its salt, is not here to hide words, but to show them" (Grimm 1854: xxxiii, transl. SD). In other words, we include with Type 6 terms that were once widespread and form part of Canada's negative heritage today.
The category Non-Canadian includes terms that were thought to be Canadian, either in previous works or by the editorial team. This category is an integral part of a contrastive dictionary and includes terms such as ASA 'acetylsalicylic acid' (labelled Canadian in COD-2), candy floss (COD-2), friendship cake (ITP Nelson), icing sugar ("Cdn, Brit., Austral. & NZ," in COD-2), and the expression Bob's your uncle (which we suspected to be Canadian). Examples of other terms the editors suspected, incorrectly, to be Canadian include to write an exam vs. to take an exam, or clawback 'funds taken back', which proved to be most frequent in the US. The label "Non-Canadianism" is set in the top right of the entry on top of a red background to distinguish the two classes of entries (see the section on "Background colours" below).
If a term or meaning is deemed Canadian, only one of the six types was assigned although some terms belong to more than one category. In other words, the "more profound" type trumps, which often concerns the choice between Frequency and Origin. For instance, Scarberia would clearly qualify on the grounds of Frequency as a Canadianism, but Origin is highly likely and thus trumps Frequency. In other cases, the assessment is based on negative evidence. While bear-pit session is clearly Type 5 - Frequency, in the absence of any evidence of this word in this sense in any media originating from outside Canada, the type was "upped" to Type 1 - Origin. Here the inference was inspired by (considerable) negative evidence. Such assessments are, of course, subject to corrections once more non-Canadian data becomes available, e.g. in the case of a term that was invented in the US, adopted early in Canada, and spread here while the term fell into disuse in the US. If such cases come to our attention, we will be happy to adapt our ruling. Where two possible classifications were considered, we explain the rationale behind our classification in the Word Story.
Below is the DCHP-2 entry for parkade, a building [...] serving as a parking area for motor vehicles. It includes the header DCHP‑2 on a blue background (to distinguish it from the entry for parkade in DCHP-1 Online, see Image 1). The headword parkade follows, for which the most common present-day variant has been chosen. The main entry is found under the current spelling, but cross-references are inserted. If available, a derivation follows after '<', in this case from park(ing) (arc)ade, which is followed by an IPA transcription for terms that need disambiguation (e.g. SQ for Sûreté du Québec, transcribed as [es'kju:], the general pronunciation in the Quebec English of Anglophones, rather than the French [es'kjy]). On the right side is the date of the last revision.
Figure 2: Entry parkade from DCHP-2 (Part 1 of 2)
The next item is the heart of the entry, composed of the Typology, in this case Type 1 - Origin, followed by the Word Story, which offers a rationale for the Type choice. The Type and Word Story are shown here for the reader's convenience:
Type 1 - Origin: Parkade is probably of Canadian origin, linked to the Hudson Bay department stores, which first appeared in western Canada (see the first 1958 quotation). Boberg (2010: 179), with data gathered from self-reports, considers it primarily a Prairie and BC term (from Ontario eastwards parking garage is more frequently reported than parkade). Parkade is also the majority term in PEI (Boberg 2010). Chart 2 substantiates this finding. The term is most common in Alberta (see Chart 2), confirming overall the Prairie dominance from Boberg (MB, SK, AB) and BC, which can be partly explained by the western Canadian connection to the HBC department stores, as the first six of which, the "original six", opened in Victoria, Vancouver, Edmonton, Calgary, Saskatoon and Winnipeg (see HBC reference), where the term is frequent to this day.
Apart from Canada (see Chart 1), the term has currency in South Africa, where it is most likely an independent development. Some US locations have adopted parkade (see the one shown in Image 2, from Spokane, Washington). From a North American perspective, the term is Canadian also by virtue of frequency (Type 5).
See COD-2, which labels the term "Cdn". See Gravol and day parole for other terms with a Canada/South Africa parallel.
The Word Story refers to the supporting material, which can be Images (two for parkade, one from Canada and one from the US, to show the term's recent dissemination south across the border, though only the image from Canada is shown here).
After the Word Story come the cross-references to other entries, under the heading "See also" which link to parallel or similar cases referred to in the Word Story. Literature references, such as Boberg (2010), and external hyperlinks, e.g. to encyclopedia pages, are found at the end of the Quotation Paragraph. A click will either open a new tab and link directly to an online source, or show the bibliographic details in the right margin. Similarly, in the Quotation Paragraph, one or two symbols are found after each quotation: a book symbol and, if the source is online, a square symbol. The book symbol calls up the bibliographic reference of the source. In Figure 3, a click on the book symbol of the 1957 quotation, for instance, will show in the right margin the following information under the heading Bibliography:
Figure 3: Bibliographic details in the right-hand margin for parkade (1957 quotation)
The 1957 quotation is from the University of Alberta student publication The Gateway, from 25 January 1957, p. 5. The URL is shown as well. A click on the square symbol behind the quotation will open the page with the quotation in the original source, provided that the user has access rights and the link is still functional.
The Gateway is an open access publication. In other cases, such as Canadian Newsstream or other commercial databases, the user might be blocked by a pay wall. With some other links, the URL may have changed since 2007, when the first quotations were entered in the Bank of Canadian English (Dollinger, Brinton and Fee 2006-2016). Nonetheless, we considered it useful to offer the link we followed at the time the quotation was entered into the Bank.
Since DCHP-1, the linguistic study of Canadian English has made great strides. While not especially focused on the area of lexis, this tradition has offered many points of comparison. We have included results from the Survey of Canadian English (Scargill and Warkentyne 1972, e.g. chesterfield), the Dialect Topography of Canada (Chambers 1994, e.g. anymore), the North American Regional Vocabulary Survey (Boberg 2005, e.g. bank machine) and smaller surveys, including our own data such as the BC Linguistic Survey (Dollinger 2014), e.g. P.E. for physical education.
Frequency Charts, if part of an entry, are found after any images. Frequency Charts are site-restricted web searches that have been normalized to allow for comparison of "index points" (see next section). In the case of parkade, both international and provincial charts have been included (see Figure 4). Other terms may show one or more charts, while others have none, as not all meanings are searchable this way.
Figure 4: Entry for parkade from DCHP-2 (Part 2 of 2)
In some cases we would have liked to include charts, but the meanings overlapped with others, defying attempts to narrow and isolate the desired ones. Such terms include country marriage or transfer payment, where Google searches ignore sentence punctuation, listing hits where the first word ends a sentence and the second begins a new one. Another typical problem occurs with rez (2) 'reserve', which is conflated with rez (1) 'student dorm, residence', and no common phrase could be found that would clearly disambiguate the two meanings in practice (restrictions of the type "rez +student -Native" or "rez +Native -student", for instance, would yield too much noise). Likewise, examples such as Metro 'metropolitan region' and metro 'subway in Montreal', or joe job and bailiff proved impossible to disambiguate. Occasionally, we actually read the examples that the search produced in order to make a decision.
In most cases, however, we were successful in producing Frequency Charts (see Figure 4) and more than 57% of meanings include at least one. The international Frequency Chart (Chart 1 in Figure 4) shows the .ca domain for Canada as leading with 6.0 index points (a normalized measure in relation to the estimated size of the .ca domain), followed by South Africa (.za with 5.2 points). The other domains trail far behind: the United Kingdom (.uk with 0.1 points), Ireland (.ie with 0.3), New Zealand (.nz with 0.0), Australia (.au with 0.0) and the US (merging counts for the .edu, .gov, .mil and .us domains) with a mere 0.2. With Chart 1, which was produced on 24 May 2012, we have evidence that parkade is currently in use primarily in Canada and South Africa.
The provincial Frequency Chart (Chart 2 in Figure 4) uses governmental provincial web domains to gauge the term's distribution within Canada (.bc.ca for British Columbia, .ab.ca for Alberta and so forth). The results suggest a base of the term in Alberta with 30.8 points, where it was first used according to the historical record, followed at a considerable distance by Prince Edward Island, Saskatchewan, Manitoba and British Columbia, with Ontario, Nova Scotia and the Yukon showing only negligible counts. While the text types reflected in the provincial charts are much less diverse than in the top-level national domains, the data still offers interesting insights and most often striking congruence with anecdotal and other evidence.
A feature from DCHP-1 is the Fist Note, which does not appear in the parkade entry. Generally, a Fist Note adds information that helps the general reader to contextualize the definition, as in the Fist Notes in the first edition (Avis et al. 1967: xix), with the exception that in DCHP‑2 the more complex connections are explained in the Word Story. One such Fist Note is found under adhesion, 'agreement to join an existing treaty':
Treaties are only signed with the Crown, the federal government, yet provincial governments often play a role in the negotiations.
We give pronunciations only for those words that present a demonstrated problem. Transcriptions are given in the International Phonetic Alphabet above the horizontal line, next to the headword. For instance, coffee row has been transcribed as [rou] (to distinguish it from [rau]). We transcribe phonemically, not phonetically, which means that phonetic phenomena such as Canadian Raising or the Canadian Vowel Shift are not shown, as there are better sources outlining these phenomena (see, e.g., Chambers 2006 and Hoffman 2010).
The standard parts of speech have been marked as n. (noun), v. (verb), adj. (adjective), adv. (adverb) and expression. Where necessary we went beyond this basic classification and added transitive or phrasal verb, verbal n. (the latter, e.g. skidoo, meaning 2) or exclamation (e.g. padiddle). We also mark proprietary terms. The majority of terms in DCHP are nouns, many of which can be used as adjectives as well. In many cases we opted for the shortcut "n. & adj." for one meaning, e.g. matriculation, for which we listed quotations giving both nominal, e.g. in the way of matriculation, and adjectival uses, e.g. matriculation class.
As the Word Story is the primary vehicle for the semantic development of a term or meaning, the etymology field above the horizontal line, or in the case of isolated meanings, following the usage labels, is used to signal a donor language or dialect, such as in guichet 'bank machine' which shows [< Canadian French guichet 'counter'] as the source form, or square head 2 n. 'an English Canadian', which is labelled as [< loan translation of Canadian French 'tête carrée'].
All quotations are taken from Canadian materials or material by travellers reporting terms, meanings or usage that was heard or read in Canada. They come from the Bank of Canadian English (Dollinger, Brinton and Fee 2006-2016, Dollinger 2010). We went by the principle that words or meanings that have come into existence since World War II should be documented in 10-year steps, while older words were documented if possible in 25-year steps. This rule was not hard-and-fast; more often than not, we erred on the side of inclusion. Thus, the quotation paragraph for parkade in Figure 2 contains a total of 11 quotations between 1957 and 2016. Following the principles for quotation selection from DCHP-1 (Avis et al. 1967: xviii) we still list the earliest and (one of) the latest uses to offer explanations of the term and appropriate contexts in which the term occurs. Unlike DCHP‑1, however, we do not direct readers to a quotation in place of a definition.
As in DCHP-1, when we use non-Canadian English sources, we include the quotation text in square brackets, [ ], in order to show the development of the Canadian dimension. Such cases are found in, e.g., parks officer, bilingual, demob, where in the latter, demob (2), the 1919 quotation does not yet use the shortened noun form, but only the shortened verb form. In the earliest quotation for parks officer, we can see that the form likely derives from the juxtaposition of Niagara Parks with officer, but that the compound noun had not yet been formed, as shown in the 1960 quotation from the Toronto Star:
[The vest type lifejacket which Roger Woodward wore on his ride over Niagara Falls must be credited in part with saving his life, according to Niagara Parks officer Percy Baker.]
Quotations in square brackets provide important predecessor forms that would otherwise be lost, as can be seen in the 1867 quotation for bilingual. Taken from the British North America Act, the concept of bilingualism is described even though the term bilingual (2) 'pertaining to official use of English and French' is not yet used:
1867 (1868) [133. Either the English or the French language may be used by any person in the debates of the houses of the parliament of Canada and of the houses of the legislature of Quebec; and both those languages shall be used in the respective records and journals of those houses; and either of those languages may be used by any person or in any pleading or process in or issuing from any court of Canada established under this act, and in or from all or any of the courts of Quebec. The acts of the parliament of Canada and of the legislature of Quebec shall be printed and published in both those languages.]
Sometimes, we use square brackets to show a non-Canadian dimension. For instance, under laneway, the earliest quotation from 1873 is included in brackets in order to highlight the Irish connection in a reprint of an Irish term in a Canadian paper.
Quotations in historical dictionaries are selected as examples of language use, not because they contain accurate factual information or support the opinions of the editors. Therefore, readers should not assume that they can rely on these quotations as evidence to support any arguments except linguistic ones. Although we do try to indicate the general context for word meanings in our entries, anyone who needs practical information related to the terms in the dictionary, particularly financial or legal information, should turn to more specialized resources. Quotations are used for their illustration of linguistic and cultural use and, in many cases, report incorrectly or are factually wrong. For instance, the 1989 quotation on tax-free growth of RESPs (s.v. "Registered Education Savings Plan") is incorrect. So is the 1956 quotation for qulliq, which likens this lamp of great cultural significance to the Inuit peoples and to the Territory of Nunavut to a propane gas camping stove.
Sometimes, despite our best efforts, we could not decipher isolated words or characters in a given source, particularly when using unclear microfilm or manuscript sources. In that case, we follow the convention from the Corpus of Early Ontario English, which marks with "[xxx]" every word that was not decipherable (Dollinger 2008: 99-119). In rare cases, we include a quotation for a variant form that is, strictly speaking, not a variant of the headword, such as in the 2011 quotation in idiot string, which only includes "idiot mittens".
Quite frequently, one can find two dates in the quotations, such as in the quotation above for bilingual, with "1867 (1868)". Here, the boldfaced year is the year of composition or utterance of the text, while the year in round brackets is the year of publication. Following established practice in DCHP‑1, we provide both dates. While for bilingual the difference is not great, the first quotation of Canuck, meaning 1a, was published in 1948, but written more than a century earlier, in that case around 1830.
As an online dictionary, we made extensive use of cross-referencing. Using the tool from DCHP‑1, the "See also" reference, we link to all terms that are mentioned in the Word Story as well as to terms that are in a direct relationship with the headword, which can generate more extensive lists, such as in the case of residential school, which links to nine related terms:
Sixties Scoop, Truth and Reconciliation Commission, residential school survivor, reconciliation week, assimilation, reconciliation, missing and murdered women, First Nations language, aboriginal language
All linked terms also link back to residential school, so that the reader can browse by topic as well as by the alphabet.
How did we arrive at the Frequency Index that allows the comparison of internet-generated data across domains? Computational linguists prefer to work with a corpus, that is, a body of text that has been selected based on a set of principles, such as national origin, genre, and the like. Corpora using material taken from the web are usually "cleaned", that is, filtered in some way to adhere to these selection principles. While these cleaned web corpora may be big, often in the 10 or 20 billion word range today, searches for open class lexical items require resources that are much bigger, by several orders of magnitude. (An open class word is a word that is easily added to the lexicon, such as nouns for new inventions, like smart phone, or verbs for new actions, like upload in reference to the web. Closed class words are those with a grammatical function, like prepositions or modal verbs.) Canadianisms are by definition already limited nationally, and some of them are quite rare: searches on the open web are, for the time being, our best shot at finding enough evidence to track the geographical dimension crucial to our comparative focus in DCHP‑2.
Computational linguists have good reasons for compiling their own corpora that revolve around well-known problems with commercial search engines such as Google, Bing and Yahoo. There are five major problems that compromise any open web search, and they require workaround procedures to minimize them. The problems stem from the following features of commercial search engines:
There are other problems, such as non-lemmatization. A lemma comprises all forms of a given item, e.g. the lemma play (v.) would include the forms plays, playing, played as well as play and to play. As lemmatization is not available in commercial search engines, one must carry out multiple searches per lemma. In the light of these issues, computational lexicographers (Kilgarriff 2007, Pomikálek et al. 2012) have made a well-argued case against "googling" and for web-crawled and cleaned corpora that remedy the problems listed above.
As we felt we needed to work with commercial search engines nonetheless, we developed procedures that are anchored in estimating the size of URL domains for English-speaking countries and provinces on the one hand and, on the other hand, in monitoring the search engine behaviour at the time of particular searches (for full details see Dollinger 2016a). Google has the biggest search index and thus gives us the best chance at retrieving enough tokens to address a term's national and regional dimension.
The biggest problem in using web searches for comparative purposes is that we do not know the relative size of each domain, such as .ca or .uk, although we do know that the populations for the countries the domains stand for vary immensely in size from Ireland, the smallest, to the US, the largest. We tested an idea to address this problem with the GloWbE corpus (Davies 2013), which offers subcorpora of between 35 and 390 million words for twenty varieties of English. If a term could be found that remains constant in relation to the domain size of its variety, we could use this term to gauge the size of the domain (see Figure 5). For instance, the term shall occurs 354,939 times in the US domain of GloWbE, which has a total of 386,809,355 words. By dividing the number of hits by the overall size of the US data we get a quotient:
Quotient for shall in US = 354939 / 386809355 = 0.00092.
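The division above can be reproduced in a few lines of Python. This is only an illustrative sketch: the function name `relative_frequency` is hypothetical, and the two figures are the GloWbE counts quoted in the preceding paragraph.

```python
def relative_frequency(hits: int, corpus_size: int) -> float:
    """Return the share of a corpus taken up by one term (hits / total words)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits / corpus_size


# GloWbE figures for 'shall' in the US subcorpus, as quoted in the text.
shall_us = relative_frequency(354_939, 386_809_355)
print(f"{shall_us:.5f}")  # 0.00092
```

The same function can be applied to any term and any subcorpus for which a hit count and a total word count are available.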
We applied this process to a number of terms over the 20 domains. Figure 5 shows the comparison of the lexical items could, shall, must and like. If the line for a particular term were fully horizontal, it would mean that the term's frequency is proportionally identical in all varieties and thus a perfect normalizer for the web. The goal must be to come as close as possible to a fully horizontal line.
Figure 5: Visual comparison of four normalizing terms
In Figure 5 we see that shall, which is less frequent than the others, shows variation between 0.00010 in the UK and 0.00050 in the Philippines. Here we can see that shall is relatively bad as a normalizer because the fluctuation is a striking 400 percent (= 100/0.00010 * (0.00050 - 0.00010)). Table 1 charts 10 terms and their relative fluctuation:
snow | shall | pool | cat | sex | rain | like | must | house | could
728% | 400% | 273% | 225% | 200% | 173% | 78% | 70% | 69% | 53%
Table 1: Worst (left) to best (right) normalizers based on GloWbE
Grammarians distinguish between open class words, which are content words that can be easily added to the lexicon (e.g. smart phone, to upload something to the web) and closed class words, i.e. terms that serve a grammatical function (e.g. prepositions to, up, on; modal auxiliaries shall, will and the like) that cannot be added at will. Worst in this sample is open class term snow with 728%, while best is closed class term could with 53% fluctuation. Could was the normalizing term used in DCHP‑2 (see Dollinger 2016a: 76-77, Table 3 for discussion of a possibly better normalizer, the definite article and most frequent word in English: the).
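The fluctuation measure used to rank candidate normalizers can be sketched as follows. The function name `fluctuation_percent` is hypothetical; the two quotients for shall are the extremes quoted above (the values for the other varieties are not given in the text and are omitted), and the percentages are those reported in Table 1.

```python
def fluctuation_percent(quotients: dict[str, float]) -> float:
    """Spread between the highest and lowest quotient, as a percentage of the lowest."""
    lowest = min(quotients.values())
    highest = max(quotients.values())
    if lowest <= 0:
        raise ValueError("quotients must be positive")
    return 100 * (highest - lowest) / lowest


# Extremes for 'shall' as quoted in the text.
shall = {"UK": 0.00010, "Philippines": 0.00050}
print(round(fluctuation_percent(shall)))  # 400

# Fluctuation values from Table 1; the lowest value marks the best normalizer.
table_1 = {"snow": 728, "shall": 400, "pool": 273, "cat": 225, "sex": 200,
           "rain": 173, "like": 78, "must": 70, "house": 69, "could": 53}
best = min(table_1, key=table_1.get)
print(best)  # could
```

A lower percentage means a flatter line in a plot such as Figure 5, which is why the term with the smallest fluctuation is preferred.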
After having found a way to compare findings across different domains, it remains important to monitor the behaviour of the commercial search engine over time. Figure 6 shows such monitoring of google.ca, via the documented number of hits for normalizer could in the domains used in DCHP-2's Frequency Charts, from late 2012 and early 2013:
Figure 6: Top-level domain tracking (Nov. 2012-Jan. 2013) with normalizer could
Figure 6 is used to confirm or discard searches made on a particular day. As can be seen in the spikes in the .au (Australia) and .gov (American government) domains on 27 and 29 Nov. 2012, Google manipulated its index for Australia and, on 28 Nov. 2012, for .gov domains without increasing the index size. Results from such days must be thrown out. If the higher level had been maintained (which would have been possible with an increase of the size seen in the .gov domain), it would have meant that Google had permanently increased the size of its search index for that domain.
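This kind of monitoring can be approximated programmatically by flagging days on which the hit count for the normalizer departs sharply from its usual level. The sketch below is an assumption-laden illustration, not the procedure used for DCHP‑2: the function name `flag_unstable_days`, the 25% tolerance and the hit counts are all invented for the example.

```python
from statistics import median


def flag_unstable_days(daily_hits: dict[str, int], tolerance: float = 0.25) -> list[str]:
    """Return the days whose hit count deviates from the median by more than `tolerance`."""
    baseline = median(daily_hits.values())
    return [day for day, hits in daily_hits.items()
            if abs(hits - baseline) > tolerance * baseline]


# Invented hit counts for the normalizer in a single domain (not DCHP-2 data).
hits = {
    "day 1": 1_000_000,
    "day 2": 1_020_000,
    "day 3": 2_400_000,  # a temporary spike
    "day 4": 990_000,
    "day 5": 1_010_000,
}
print(flag_unstable_days(hits))  # ['day 3']
```

Searches made on a flagged day would be discarded and repeated once the counts have returned to their usual level. The median is used as the baseline here because a single spike does not shift it.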
Normalization and web monitoring are the two main procedures that underlie the DCHP‑2 charts and they make us fairly confident that the charts deliver reliable data (see Dollinger 2016a: 89-92 and Robert Lew's concerns in Dollinger 2016b for words of caution). In certain cases, Google's number estimates over-represent the actual results. This problem can be circumvented by clicking on one of the later hits, which leads Google to adjust the count (e.g. poutine). Like all statistical data, the chart data should be contextualized with other information if they are to be used as evidence, something that we do in the Word Stories.
The precise search terms are always shown in the chart headings. Multi-part words were always searched for with quotation marks, so the chart for entry Sobey's bag was created by entering:
"sobeys bag" site:.ca --> followed by other site searches, e.g. site:.edu
If the chart under toque reads toques (Image 4) as the headline, the search term was the plural form, which provided better information than the singular form, in this case because the term is part of the name of a monkey species (the toque macaque monkey).
Due to the polysemous nature of some terms, searches were narrowed by adding or excluding search terms, at times using more specialized phrases, which are reproduced in double quotation marks, such as in tick 'credit' with the search term "buy on tick" (see Image 4), or for off-reserve in the phrase "off-reserve population", which produced a usable number of hits. Other meanings were isolated by adding Boolean search terms (AND, NOT), as can be seen in the chart for day parole, arrived at via "day parole" AND "prisoner" or unemployment, as in she's on unemployment, which required us to exclude the term insurance (hence NOT insurance). The decision to narrow a search was made by reading through the quotations and deciding whether the targeted meaning and only (or almost only) the targeted meaning had been produced. We do not explain why a certain search term combination was used and not another one, as a discussion of such methodological questions could easily take up more space than the main part of the entry.
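The search strings described above can be illustrated with a small sketch. The searches themselves were entered into the search engine by hand; the helper build_query, its parameters and its quoting rules are hypothetical and only mirror the patterns quoted in the text (quotation marks for multi-part words, AND for required terms, NOT for excluded terms, and a site: restriction per domain).

```python
def build_query(term, domain, require=(), exclude=()):
    """Assemble one site-restricted search string (hypothetical helper)."""
    # Multi-part words are wrapped in quotation marks; single words are left bare.
    parts = [f'"{term}"' if " " in term else term]
    # Terms that must co-occur are added with AND, quoted as in "day parole" AND "prisoner".
    parts += [f'AND "{word}"' for word in require]
    # Terms to be excluded are added with NOT, as in NOT insurance.
    parts += [f"NOT {word}" for word in exclude]
    parts.append(f"site:{domain}")
    return " ".join(parts)

print(build_query("sobeys bag", ".ca"))
# "sobeys bag" site:.ca
print(build_query("day parole", ".ca", require=["prisoner"]))
# "day parole" AND "prisoner" site:.ca
print(build_query("unemployment", ".edu", exclude=["insurance"]))
# unemployment NOT insurance site:.edu
```

Repeating the same call for each domain keeps the search term constant across domains, which is what makes the resulting counts comparable.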
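We aimed to make the data as comparable as possible. All search dates are offered in the captions. The Frequency Index, discussed in detail in Dollinger (2016a: 79-87), is the quotient of the number of hits for the search term and the number of hits for the normalizer could in the same domain, multiplied by a fixed multiplier.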
The multiplier is always offered in the y-axis label (here 100,000 for sobeys bag, but 10,000 for toques). Different multipliers are used to make the charts more readable on the page, while allowing the cross-comparison across charts.
Figure 7: Reading Frequency Chart headers
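A minimal sketch of such an index follows. It assumes that the Frequency Index divides the hits for the search term by the hits for the normalizer could in the same domain and scales the result by the chart's multiplier; the exact definition is given in Dollinger (2016a: 79-87). The function name and all hit counts below are invented for illustration.

```python
def frequency_index(term_hits, normalizer_hits, multiplier):
    """Hits for the search term per hit for the normalizer, scaled by the chart multiplier."""
    if normalizer_hits <= 0:
        raise ValueError("normalizer hit count must be positive")
    # Multiply first so that round figures stay exact in floating point.
    return term_hits * multiplier / normalizer_hits

# Invented hit counts for two domains, for illustration only.
hits = {".ca": (50, 1_000_000), ".au": (20, 4_000_000)}
for domain, (term_hits, could_hits) in hits.items():
    print(domain, frequency_index(term_hits, could_hits, multiplier=100_000))
# .ca 5.0
# .au 0.5
```

Because each domain is divided by its own normalizer count, a small domain and a large domain can be compared directly, and the multiplier only moves the values into a readable range.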
The method at times produced gaps in the documentation. For instance, scrob, a Newfoundland term, is listed only with an international chart, in which Canada is leading by an overwhelming margin. We did not include a provincial chart because the provincial searches yielded fewer than five tokens per provincial domain, too few for meaningful conclusions (in addition, the results often include typos, such as for scrub, reducing the low token count further). If a provincial chart appears to be missing, it is for that reason.
In some rare cases, e.g. dozy 'dim-witted', we were unable to produce Frequency Charts; however, our assessment rests on some corpus data. In the case of dozy, we list negative American search results, as our classification of the meaning rests on a lack of the form in the US. Although we are aware that using negative evidence based on corpora might simply mean the corpora do not contain the right text-types or are not big enough, in particularly strong cases we have decided to build our argument on it. Given that one of our corpora is the open web, we can be fairly confident that the biggest text collection of written contemporary genres is free of the form.
Occasionally, when no satisfactory search term combinations were found for the headword, we devised feasible workarounds with a related term. As the search for cork boot proved impossible because of too much interference from related terms, we searched for the related form "caulk boots" instead and were able to show the BC provenance of the concept and thus, we believe, also for the headword proper.
In very few cases, we used a "wildcard" search. For "hang up his/her/their/the etc. skates", we used the asterisk to search for any term that occurred before skates. Thus "hang up * skates" returns results for "hang up her skates", "hang up his skates", "hang up the skates" and the like (including the erroneous result "hang up John's skates").
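What the wildcard slot picks up can be approximated offline with a regular expression. This is only a sketch: it assumes that the asterisk stands for exactly one whitespace-free word, which need not match the search engine's own behaviour, and the sample sentences are invented.

```python
import re

# Exactly one whitespace-free token between "hang up" and "skates" (an assumption).
PATTERN = re.compile(r"\bhang up (\S+) skates\b", re.IGNORECASE)

def wildcard_fillers(text):
    """Return the words that fill the wildcard slot in the given text."""
    return PATTERN.findall(text)

sample = ("She decided to hang up her skates. He will hang up his skates soon. "
          "Time to hang up the skates? They made him hang up John's skates.")
print(wildcard_fillers(sample))
# ['her', 'his', 'the', "John's"]
```

The last match shows why wildcard results need to be read through: the pattern cannot tell the idiom from a literal use with a possessive name.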
Our most common charts show normalized web frequency, yet occasionally we use other kinds of charts. Where meanings rather than forms are Canadian, we occasionally applied a more fine-grained analysis of a smaller sample by providing the results of a semantic analysis of unambiguous examples, such as for the term to table (legislation etc.), which can mean either 'to postpone' or 'to bring forward' for discussion.
Figure 8 shows the results of a reading of 100 unambiguous examples of "tabled" in each three years, 1990-2013, in Canadian and US newspaper data. As can be clearly seen, the Canadian meaning is 'bring forward' in around 90% of all cases, while the dominant US meaning is 'to postpone'.
Figure 8: Semantic analysis for tabled
In cases when all or the main competing variants were known, we opted to offer, in addition to the Frequency Chart, a close-up of the situation on the .ca domain, such as in Figure 9 (left) for dish soap:
Figure 9: Main competitors for dish soap on .ca domain (left) and Frequency Chart (right)
These data show that dish soap, although not distinguished by frequency compared to Ireland or New Zealand, is the preferred term here and does have a Canadian dimension in the North American context (Frequency Index points: Ireland 473, Canada 61, US 27, UK 12), as shown in Figure 9 on the right.
The quantitative data behind these various visualizations show the Canadian dimension of previously unidentified Canadianisms, such as take up meaning 9 (added in COD-1 & COD-2 but not marked as Canadian), joe job, 'menial job' (listed in Gage-3, Gage-5 but not marked), or reveal a regional preference, e.g. table cream 'coffee cream' (PEI), Sobey's bag (Maritimes, esp. NS, 'plastic shopping bag'), or strata (BC, 'collective real estate ownership'). A range of new Canadianisms, e.g. squarehead (meaning 2) or padiddle, was also found, and the previous identification of terms or meanings as Canadian was given further backing, e.g. backcatcher, gem jar, Molson muscle and poutine.
The project's first public milestone was the launch in 2013 of the digitized version of the first edition, DCHP‑1 Online (Dollinger, Brinton and Fee 2013). Digitizing legacy data has not been a glorious task, which holds even more true for the lexicographical world with its especially thick volumes. When OED-2 was published on CD-ROM in 1992, and later on the internet, it integrated the contents of the 12-volume OED-1 with five "supplement" and three "additions" volumes and added only a small percentage of new entries. This pioneering venture in digital humanities made possible the revision and update of the third edition, which was begun in 2000.
With DCHP-1, some of the same steps had to be retraced in a much smaller and different context. After initial digitization, which was carried out by the University of British Columbia Archives and Special Collections, the text was proofread by UBC students in 2010-11 (for a list of staff and volunteers see here). After a period of copyright negotiations, DCHP‑1 Online was made available in open access in 2013, free of charge for everyone at www.dchp.ca/dchp1. By that time, DCHP‑1 had been out of print for decades and had fallen into oblivion. Today, users can easily access the online version and bibliophiles may order remaindered paper copies of the 1991 reprint here.
DCHP‑2 contains 1002 new lexemes, which increases the headword count of DCHP‑2 by about ten percent (DCHP‑1 Online: 10,974 headwords). The Bank of Canadian English, which is the quotations database, comprises (2 Dec. 2016) 51,345 newly collected quotations in addition to the 24,753 quotations from DCHP‑1, for a total of 76,098. In total, 1239 additional meanings are included in the update, for which 8713 quotations were selected as evidence from the Bank of Canadian English (Dollinger, Brinton & Fee 2006-2016). The update increases the quotation count by just over 35%.
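The proportions in this paragraph can be retraced with a few lines of arithmetic. The sketch uses only the counts given above; the variable names are illustrative, and the headword increase is computed against the DCHP‑1 Online count, which is an assumption about the intended baseline.

```python
dchp1_headwords = 10_974
new_lexemes = 1_002
new_quotations = 51_345
dchp1_quotations = 24_753
selected_quotations = 8_713

def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to one decimal place."""
    return round(100 * part / whole, 1)

bank_total = new_quotations + dchp1_quotations
print(bank_total)                                      # 76098
print(percent(new_lexemes, dchp1_headwords))           # 9.1
print(percent(selected_quotations, dchp1_quotations))  # 35.2
```

On these figures the headword increase comes to roughly nine percent, in line with the rounded "about ten percent", and the quotation increase to just over 35 percent.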
The 1239 meanings comprise 1103 Canadianisms and 136 Non-Canadianisms, which makes almost 11 percent of all meanings, or more than one in ten, non-Canadian. This finding is an important corrective to previous knowledge. Among the 1103 Canadian meanings, the six types are distributed as follows (percent of Canadianisms in the update):
The update includes a total of 241 Images, 861 Frequency Charts, and 2502 references to secondary literature. We refer to a total of 405 sources, including dictionaries and encyclopedias. The Frequency Charts consist of 694 international charts using the top-level internet domains and 167 regional charts based on the 13 provincial and territorial Canadian government internet subdomains.
DCHP‑1 to a large degree focused on the historical dimension of words and meanings now obsolete: aboiteu, Digby chicken, splake or Yankee fix, Canadian terms no longer known to many Canadians. The overall goal of the update was to reflect changes since World War II, although we do include some earlier Canadianisms, including some that are revisions of entries in DCHP‑1. While aiming to build a case for the status of Canadian English lexis as a unique collection of terms and meanings, we also wanted to eliminate "false Canadianisms". Charles Boberg argues that the strongest case for the distinctiveness of Canadian English rests on vocabulary that excludes "objects or cultural phenomena found only or mostly in Canada" (Boberg 2010: 115). While we include such vocabulary in the update, e.g. to table (v.), take up (phr. v.), toque or tick, we also list words that are socially, culturally, politically, topographically and geographically tied to Canada, e.g. throat singing, Timbits, treaty Indian and two solitudes. Both kinds of Canadianisms are of interest to dictionary users.
The increase of around a thousand headwords may seem modest; however, there is quite an increase in detail and evidence in each of the new entries when compared to the legacy data from DCHP‑1. A comparison between the original entry for landed immigrant in DCHP‑1 and DCHP‑2 shows the difference visually (Figures 10a and 10b):
Figure 10a: Entry for landed immigrant in DCHP-1 (DCHP-1 Online)
Figure 10b: Entry for landed immigrant in DCHP-2
A comparison between versions a and b of Figure 10 shows that only the 1964 quotation and parts of the definition were taken over from DCHP‑1. The count of 1002 headwords was not an utterly arbitrary choice. As the Introduction to OED-2 put it, "Omission should not be equated with exclusion" (xxii), given funding and time constraints. In fact, the editorial files contain unfinished drafts for additional Canadian terms and meanings. The choice of 1002 headwords is therefore a symbolic nod of appreciation to DARE - the Dictionary of American Regional English - which offers the most precise regional dimension of any dictionary, and which is based on fieldwork study in 1002 American communities. DARE, the longest-running National Endowment for the Humanities project in history, and DCHP‑2, funded with a string of three grants from the Social Sciences and Humanities Research Council and a patchwork of small grants, are very different dictionaries in both resources and scope, yet both follow the principles of documenting historical semantic changes in the most precise and feasible way in each regional context (for a comparison between the two dictionaries, see Dollinger & von Schneidemesser 2011).
Readers will notice that encyclopedic content, that is, terms such as CANDU, Canadian Pacific Railway, or VQA, is found quite often in the DCHP‑2 update. In this respect, DCHP‑2 continues the trend set in DCHP‑1 that in turn followed the practice established in Noah Webster's lexicographical tradition. The 1933 edition of the Oxford English Dictionary (OED-1) speaks in this context of
an indefinite number of Proper or mere denotative names, outside the province of lexicography, yet touching it in thousands of points, at which these names, and still more the adjectives and verbs formed upon them, acquire more or less of connotative value. Here also limits more or less arbitrary must be assumed. (OED-1, General Explanations, xxvii)
In the Canadian context we decided to cast our net rather wide. We had good reason to do so, beyond following an established practice in the field, because what some consider encyclopedic content that is not of primary interest to lexicography often acquires a new dimension in the Canadian context, where a number of more generally-used Canadian terms have their origin in proper names, such as Genie (Award), Robertson screw, or Vancouver Special. Most of these terms are related to the area of onomastics, the study of proper nouns, which are one type of encyclopedic entry that refers to a specific public event or otherwise known fact that comes "with many complex associations" (Landau 2001: 212). Landau gives perhaps the best reason for the inclusion of encyclopedic entries:
It would be impossible to define such associations, since they differ from person to person, are often emotional, and depend on nonlinguistic cultural phenomena, but the central fact of the event is that it is shared by many people, and because of the importance and frequency of such allusive words in the language, dictionaries cannot ignore them. (Landau 2001: 212)
In a dictionary of Canadianisms, such events are obviously very important for collective memory. Examples of such uses include the meanings of Gretzky effect, Trudeaumania, and Sixties Scoop. No matter how one views Trudeaumania or the Gretzky effect, they are part of what makes Canada, Canada. Leaving such terms out would have resulted in an overly narrow, and indeed distorted view of the Canadian vocabulary.
At the same time, we did not include proper names per se, unless they have acquired metaphorical uses, such as Howe Street for Vancouver's banking and finance circle, and Vi-Co as a generic Saskatchewan term for chocolate milk, or unless they had a culturally salient dimension, such as Anik, Canada Pension Plan and Red Fife. As in DCHP‑1 (Avis 1967: xv), we exclude the names for Aboriginal peoples as headwords in general, except when they refer to other entities, e.g. Salish (for Salish apple) is included, while Salish for Salish people is not.
Armed with the typology and the methodological tools presented above, we aimed to navigate the complex task, in Avis's words (1983: 4), of preparing a dictionary of Canadian English that is "complicated by a great deal of unsettled and divided usage" of Canadian, American and British speakers, a complication that is further compounded the further one goes back in time. As these tools have only been applied to the newly written headwords and meanings, which were integrated with unaltered entries from DCHP‑1 Online, we opted for a colour scheme to distinguish clearly between entries in the two editions when they are linked by cross-references:
WHITE background = DCHP-2 update entry
YELLOW background = unchanged DCHP‑1 legacy entry
Figure 11: White and yellow background colours
DCHP-1 legacy entries are visually marked by a yellow background (here on the right, for demoiselle), which sets them apart from the new DCHP‑2 update content, which appears on a white background (here on the left, for denticare plan, with Frequency Chart omitted). In addition, legacy entries contain the label "DCHP-1 (pre-1967)" in the upper right corner, and the warning "THIS ENTRY MAY CONTAIN OUTDATED INFORMATION, TERMS and EXAMPLES" below it to clarify that we do not endorse outdated or discriminatory wording in DCHP‑1. For the new entries, the top right corner, inspired by OED‑3, shows the month and year the entry was last saved.
A third background colour is used to differentiate non-Canadian terms from the Canadianisms in the DCHP‑2 update. The colour red and the label "Non-Canadianism" in the top right corner mark this category.
Figure 12: Non-Canadianisms marked by RED background
For multicult, we found no evidence that it was a Canadianism (although it was labelled as such in COD-2), while the long form multiculturalism, to which this entry links, is indeed Canadian.
In entries for Canadianisms where one or more meanings are not Canadian, we used the dagger device, borrowed from DCHP-1, to mark meanings "which are not Canadian at all" (Avis 1967: xiii). Meanings labelled with a dagger are not Canadian, but are included because related meanings are. The entry giveaway in Figure 13, for instance, has two meanings. Meaning 1, 'something […] given away free of charge' is labelled 1† n. & adj., with the superscript dagger marking this meaning as non-Canadian, while Meaning 2 (only the definition shown below) 'loss of the puck […] resulting in goal' is a Canadianism of Type 4 - Culturally Significant.
Figure 13: Entry for giveaway with dagger in Meaning 1.
Following conventional lexicographic practice, homonyms, i.e. forms that are accidentally identical but not historically related, are listed separately. As a result of the integration of DCHP‑1 headwords, which use ( ) to show variants, we render superscript numbers in double round parentheses, (( )).
Therefore, noddy ((1)) and noddy ((2)) are equivalent to noddy¹ and noddy². Likewise blackberry ((1)) and blackberry ((2)) are unrelated forms rather than two meanings of one and the same lexeme.
We understand by spelling not just differences in the sequences of letters, e.g. colour vs. color, but also variation in upper or lower case, e.g. bi and bi vs. Bi and Bi, which is likewise considered a "spelling difference". Sometimes we include related forms that may be used as synonyms.
As anticipated by Lovell, the comparison with other dictionaries is important. Pratt (2004) was proven right that DARE would be the most valuable source of comparison, as are the older American dictionaries of national terms, such as DAE (Craigie and Hulbert 1938-44) and DA (Mathews 1951), as the example of Cowtown 'Calgary' shows. The OED‑3 is, of course, often the first stop in any comparative quest. To establish the historical component we operated with a set of dictionaries that were checked for all terms:
In addition to these, more specialist dictionaries were consulted where warranted, e.g. the Dictionary of Alaskan English (Tabbert 1991) for Northern terms. We conducted a contemporary comparison with the three general desk dictionaries in Canadian English:
In the interest of readability, we applied a shorthand system that allows the tracing of each meaning as Canadian in the literature. But rather than offering long lists (e.g. included in Gage-1 (1967), Gage-2 (1973), Gage-3 (1983), Gage-4 (1990) and Gage-5 (1997) and reprints), we generally list the first dictionary in a series that lists it. The information "Gage-3, which marks the meaning Cdn" in bilingual (2), means that Gage-3 is the first edition in that dictionary series that identifies the meaning as Canadian and that Gage-4 and Gage-5 mark it this way too. The purpose of this method is to give credit to the earliest dictionary source where available. These dictionaries vary widely in which words they label as Canadian, which shows the importance of checking all three (e.g. scrum (2) is only documented in the ITP Nelson, bloody-minded 'stubborn' only in COD-2; Mountie is marked "Cdn" only in Gage-1 and subsequent editions, while washroom 'public toilet' is not labelled "Cdn" in Gage-5 and ITP Nelson).
Where only the reference COD-2 is found, the reader can be certain that the COD series, starting in 1998, was the first dictionary and the only Canadian dictionary to document the meaning as Canadian. If a dictionary is not listed, it means that neither the term nor any of its meanings are marked as Canadian in any way. The comparison of the three main desk dictionaries of Canadian English reveals their strengths and weaknesses with respect to labelling of Canadianisms, which was not, of course, their major purpose.
DCHP-1 did not document many expressions and we document only a few more, including back of the net, hang up one's skates, or done like dinner. There are a total of 20 expressions in the DCHP‑2 update, with quotations ranging from the year 1900 to 2016.
DCHP-1 did not consider "terms for which there is only oral evidence", in order to comply maximally with OED principles of the day (Avis 1967: xii), which led to the exclusion of much interesting material. DCHP‑2 includes such terms, yet is limited to the extent that no systematic oral data collection was possible. The resources of spoken language that exist are data collections from sociolinguists or corpora such as the International Corpus of English (see ICE Project), among others, which are designed for the study of grammatical phenomena and are much too small for lexical searches of the kind we require. Spoken language attestations, of which we give some dozen examples, come in most cases from transcriptions of spontaneous speech or, occasionally, from audio or video recordings.
Spoken language, which provides some of the most interesting material, remains one of the challenges in lexicography. One example is Game 7, for which despite detailed searches, we do not have enough evidence to warrant listing a figurative meaning, which doubtless exists in speech and may indeed be widespread. We offered a Fist Note explaining the current state of knowledge. A written questionnaire survey (Dollinger 2015b) would be the best means to get to the bottom of the usage, yet attestations would need to be found by other means. Another example is positive anymore, for which we could only muster two very recent dated attestations of a receding construction that was more widespread earlier in the 20th century.
Since this is a dictionary of Canadian English, we have generally taken an Anglophone perspective on Canadian matters. Nonetheless we aim to give equal voice to conflicting opinions, especially with respect to terms that relate to the English-French and mainstream-Aboriginal dimensions. Whether one considers sovereignty (meaning 1) or separation (meaning 1) as the more appropriate term in the Canadian French context is a matter of political perspective. In the Aboriginal context sovereignty acquires a different meaning. We try to provide an even-handed account of these differences, sometimes by including quotations that reveal strong political differences. Based on the relevance for some political science concepts and their complex interpretations, some Word Stories tend to be longer (e.g. Meech Lake Accord) than others (e.g. Rest of Canada).
Generally, we link abbreviations, e.g. grow-op, to their long forms, e.g. grow operation, where the main information is offered. In some cases, we have entered an abbreviated form, e.g. MLA, as a full entry beside the long form, e.g. Member of the Legislative Assembly, in order to differentiate the uses of each form.
With evidence from the provincial Frequency Charts and the Bank of Canadian English, we were able to foreground the regional dimension in DCHP‑2. Some terms or meanings are more or less confined to provinces, such as the ones listed below:
British Columbia: P.E., dry grad
Alberta: parking stall, health care card
Manitoba: Tundra Buggy, Saskabush
Saskatchewan: bunny hug, coffee row
Ontario: Big Blue Machine, collector lane, take up, 'review answers'
Quebec: all-dressed, francize, two-and-a-half
Prince Edward Island: table cream, mainland (2)
New Brunswick: redemption centre
Nova Scotia: blueberry buckle, Tintamarre
Newfoundland: after* (perfect construction), screech-in
Yukon: skookum, bush party
Northwest Territories: charter community, parks officer
Nunavut: skidoo, qulliq*
(* not based on a Frequency Chart)
In rare cases where findings allow, we offer more narrow regional labels that go beyond the Frequency Charts:
Vancouver: Frissant*
Toronto & Vancouver: laneway house
Montreal: metro*
Cape Breton: Les Suêtes, mainland* (3)
Gaspé Peninsula: crawfish*
More frequent than attestations that are located in one province, a quite arbitrary regional unit, are patterns that transcend provincial boundaries. Some terms or meanings cross adjacent provincial boundaries, while others show quite discontinuous regional patterns. With the provincial Frequency Chart data, for instance, one can show that leg [lɛdʒ] 'legislature' is a term that is found on the Prairies and in the North, above all in Alberta, Manitoba, the Yukon, the Northwest Territories and Nunavut. Such patterns offer an additional layer of evidence, in some cases a corrective, to existing regional labels. Some examples include:
mainland Canada (Canada except Newfoundland): Newfie joke
In some cases we notice apparent parallel developments or Canadian influence in some Inner Circle Englishes, of which the Canadian-South African parallel appears to be the most striking, unexpected and unusual. A number of terms, e.g. day parole, parkade, or Gravol, show high frequencies in both countries. While the connection is obvious in at least one case, Truth and Reconciliation Commission - the Canadian TRC was inspired in part by the South African one - others defy such clear rationale. COD-2 noticed a parallel in at least one case, namely write, as in to write an exam, which we labelled Non-Canadian in our dictionary, but which was considered "Cdn & South Africa" in COD-2.
Other parallels are more easily explained. A distinct Commonwealth dimension can be found in "bath the baby", not bathe, the former form being more rare in the US and UK but widespread in the rest of the Commonwealth countries studied. The Irish connection is historically particularly strong in parts of the country, which explains perfective after; the Scottish connection explains kitchen party.
The treatment of Canada's most linguistically diverse regions has presented special problems since the dictionary's foundation in the 1950s. The problem is a result of long-established Canadian linguistic enclaves, which result in a highly uneven distribution of distinctly local terms in different parts of the country. While much of mainland Canada, above all the area from Ontario westwards, is comparatively homogeneous, other regions stand out as a result of very different settlement patterns. Newfoundland is by far the biggest such enclave (e.g. Clarke 2010), and the traditional dialects in the Ottawa Valley (Pringle and Padolsky 1983) and on Cape Breton Island are other widely known cases. Since DCHP-1, scholarly regional dictionaries on three varieties have appeared, with the Dictionary of Newfoundland English (DNE 1982, 1999), the Dictionary of Prince Edward Island English (Pratt 1988) and, most recently, the Dictionary of Cape Breton English (Davey and MacKinnon 2016). The former two played an important part in the writing of DCHP‑2, while the latter will certainly inform future work. As the oldest and most diverse traditional dialect region in Canada, Newfoundland was given special focus in DCHP‑2. As the roughly 5000 lexemes in DNE would have swamped DCHP‑2, we were confronted - as mainlanders - with the difficult task of selection from an existing list. The 132 Newfoundland meanings comprise slightly more than 10 percent of the overall content. We strove to select, with input from our colleagues at Memorial University, a sample that would add a Newfoundland dimension within reasonable limits in the national context. We hope that some of the concerns expressed in Story & Kirwin (1971), which highlighted the local perspective and data richness, could be alleviated as a result of the Newfoundland assistance.
We are aware that not all problems pertaining to traditional Newfoundland terms, including some identified in Dollinger (2015c), could be addressed due to practical limitations.
The OED, DNE and DARE must be mentioned as the three historical sources to which we owe the greatest debt: the OED for its sheer scope and for offering a broader framework that would otherwise have been difficult to access; the DNE for allowing us to write more informed entries on Newfoundland terms and, more than once, to refer to their unique spoken or written quotations; and DARE, our "big sister dictionary", which has offered us superb data from the American neighbour for all non-standard terms we were interested in, and which has helped us greatly to improve the entries that are shared. That the overlap between DARE and DCHP is not greater than about 7% of DCHP-2's headword count (Dollinger & von Schneidemesser 2011: 121) is the result of the different sampling methods and scopes of the two dictionaries. For DCHP-1, the word list focused on pronounced historical periods of Canadian interest, while DARE's word list started in a large-scale nation-wide fieldwork survey, the results of which were then traced back in time.
The excellent DARE data confronted us with a special problem when assigning the type of Canadianism as, more often than not, obviously Canadian meanings are first attested in US material. To that end, we considered in our interpretations the fact that the publishing industry was generally more developed in the US than in Canada, especially but not exclusively as far as the West is concerned. For that reason, often the earliest attestations of otherwise clearly Canadian words are found in the US.
Such cases are Canuck, skookum, and toque, the first and third of which are among the most widely recognized Canadian terms today, and for which we were able to offer a correction. Canuck, for instance, is first attested in 1835 by a traveller in Canada who reports on American usage. DARE lists this attestation, correctly, as American, while we list it as non-Canadian evidence of an older, otherwise lost meaning that would likely have applied to Canada as well. Skookum, a loan word from Chinook Jargon, the 19th-century contact language of the Canadian West Coast and US Pacific Northwest, has its earliest attestation in the US in 1847 in a glossary, and in an undated 19th century glossary in Canada published by T.N. Hibben & Co., which was British Columbia's first bookstore, founded in 1858. We label both meanings as Culturally Significant, given the role of Chinook in early British Columbia and the Yukon. "Origin" might be possible for meaning 2, yet we err on the side of caution.
Finally, DARE lists for toque its older spelling tuque, from 1856. Our Canadian antedate (from 1882 in DCHP‑1 to 1865) does not quite match the American source. In this case, tuque is referred to in the US text as the name for a riding hat "as the voyageurs call it", which we consider as solid evidence for the French Canadian connection. Thus, toque is labelled "Origin", with an explanation for an earlier American attestation for Canadian usage in DARE in the Word Story.
The discussion above suggests that the discrimination between Canadian and US origins may become moot the further one goes back in time and the less clearly a Canadian and American identity would have been linguistically expressed. Skookum, for instance, is as much (Pacific Northwestern) American as it is (West Coast) Canadian. After work on DCHP‑1 was completed, Avis writes in no uncertain terms of
the problem of identifying many terms as specifically "American" or "Canadian" [, which] is virtually impossible of solution. Needless to say, the editors of A Dictionary of Canadianisms [DCHP‑1] have struggled with this problem since the project began. In view of the difficulty and, perhaps, pointlessness of trying to identify many words in common use on this continent as being "American" or "Canadian," lexicographers compiling dictionaries of the English used in North America might be well advised to adopt the label North Americanism. (Avis 1967: xiii)
It is indeed unlikely that national identity, rather than local identity, would have been expressed in the earlier periods. Linguistic phenomena of nativization have been dated to the post-Confederation period (1867) based on social and political events (Schneider 2007: 242-244), which is roughly confirmed for Ontario on empirical evidence to around 1850 in urban regions and 1875 in rural ones (Dollinger 2008: 280-283). In the West, national identity formations would happen much later, however, with orientations to the US or the UK (exonormative stabilization in Schneider's model) lasting
much longer than until 1867 [as suggested by Schneider], at least to the 1930s or, in some cases the 1960s, when the first Canadian-made reference books came into wide circulation (the Gage Canadian Dictionaries, see DeWolf et al. 1997 for the latest edition). (Dollinger 2015b: 207)
Despite the problems outlined above relating to American evidence, our typology and data sources have given us considerably more security in our assessments than the editors of DCHP‑1 were afforded. As a result, we can confine the challenging cases, such as the ones listed, to a handful, all of which are discussed in the respective Word Stories. For cases that would warrant the label "North American" in a different framework, we maintain an assessment as either Canadian or American, not only for the sake of consistency but also to offer a corrective: if dictionaries of dominant language varieties can afford to list joint vocabulary as belonging to their variety, it should be permissible for the non-dominant variety to proceed likewise. The reference to the other variety with an assessment guides the reader to the full context.
The difficulty of cross-border usage is not easily resolved as long as linguistic projects stop at national borders. From a Canadian perspective it seems legitimate to question the relevance of the label Americanism, given the Canadian evidence for many North American words. For terms and meanings shared between the two countries, the North American vocabulary, such questioning would be a particularly important point. Given the scope of this task, such work would need to be undertaken collaboratively by specialist teams on both sides of the border.
There are two ways to solve this problem. Shared terms and meanings may be treated as Americanisms in American dictionaries and as Canadianisms in Canadian dictionaries, which is more in line with lexicographical tradition and is the practice adopted for DCHP‑2. Alternatively, they may be handled by a novel cross-border lexicography, which would have to start with a systematic investigation establishing the prevalence of each term and meaning across the North American continent. For select terms, a set of some 50 tokens, such an approach has been carried out using written online questionnaires (Boberg 2005), which seems the obvious methodology for gathering such widespread data. While the enormity of the task is apparent, recent border studies are offering tools for such approaches (e.g. Auer & Hinskens 1996, Auer 2005, 2014, Dollinger & von Schneidemesser 2011: 138-43, Dollinger 2016c, Swan 2016, Boberg 2016).
We trust that this introduction will help the interested reader in putting DCHP‑2 to the best use. While the first edition was, in its day, pioneering work, DCHP‑2 is the attempt to bring the principles established then onto a sounder empirical footing. DCHP‑1 was published in the year of the Canadian Centennial; DCHP‑2 will be released in open access in the year of Canada's 150th anniversary. Just as in 1967, when DCHP‑1 was published, we now hope "that this record is sufficiently extensive to constitute a scholarly and valuable contribution" (Avis 1967: xv).
Stefan Dollinger (Gothenburg, Sweden)
& Margery Fee (Vancouver, Canada)
December 2016
Auer, Peter and Frans Hinskens. 1996. The convergence and divergence of dialects in Europe. New and not so new developments in an old area. Sociolinguistica 10(1): 1-30.
Auer, Peter. 2005. The construction of linguistic borders and the linguistic construction of borders. In: Filppula, Markku, Juhani Klemola, Marjatta Palander and Esa Penttilä (eds.). 2005. Dialects Across Borders: Selected Papers from the 11th International Conference on Methods in Dialectology (Methods XI), Joensuu, August 2002, 3-30. Amsterdam: Benjamins.
Auer, Peter. 2014. Enregistering pluricentric German. In: Pluricentricity: Language Variation and Sociocognitive Dimensions, ed. by Augusto Soares Da Silva, 19-48. Berlin: De Gruyter.
Avis, Walter S., Robert J. Gregg and Matthew H. Scargill (eds.) 1967. Gage Senior Dictionary [later: Gage Canadian Dictionary]. 1st ed. Toronto: W. J. Gage.
Avis, Walter S. (editor-in-chief), Charles Crate, Patrick Drysdale, Douglas Leechman, Matthew H. Scargill and Charles J. Lovell (eds). 1967. A Dictionary of Canadianisms on Historical Principles. Toronto: Gage.
Avis, Walter S. 1967. Introduction. In: Avis et al. (eds.), xii-xv. Also in DCHP‑1 Online. http://dchp.ca/DCHP-1/pages/frontmatter (5 Dec. 2016)
Avis, Walter S. 1983. Canadian English in the North American context. Canadian Journal of Linguistics 28(1): 3-15.
Avis, Walter S., Patrick D. Drysdale, Robert J. Gregg, Victoria E. Neufeldt and Matthew H. Scargill (eds). 1983. Gage Canadian Dictionary. 3rd ed. Toronto: Gage.
Barber, Katherine and John Considine. 2010. Revising the Dictionary of Canadianisms: views from 2005. In Current Projects in Historical Lexicography, edited by John Considine, 141-149. Newcastle: Cambridge Scholars.
Barber, Katherine. 2004 [1998]. Canadian Oxford Dictionary. 2nd ed. [1st ed.]. Don Mills, ON: Oxford University Press.
Boberg, Charles. 2005. The North American Regional Vocabulary Survey: new variables and methods in the study of North American English. American Speech 80: 22-60.
Boberg, Charles. 2010. The English Language in Canada: Status, History and Comparative Analysis. Cambridge: Cambridge University Press.
Boberg, Charles. 2016. Newspaper dialectology: harnessing the power of the mass media to study Canadian English. American Speech 91(2): 109-138.
Cassidy, Frederick G. and Joan Houston Hall (eds.). 1985-2013. Dictionary of American Regional English. Volumes I - VI. Cambridge, MA: Belknap Press of Harvard University Press.
Chambers, J. K. 1994. An introduction to Dialect Topography. English World-Wide 15: 35-53.
Chambers, J. K. 2006. Canadian Raising retrospect and prospect. Canadian Journal of Linguistics 51(2 & 3): 105-118.
COD-1 = see Barber, Katherine (2004, 1998)
COD-2 = see Barber, Katherine (2004, 1998)
Craigie, William and James R. Hulbert (eds.). 1938-44. A Dictionary of American English on Historical Principles. 4 volumes. Chicago: University of Chicago Press.
DA = see Mathews (1951)
DAE = see Craigie and Hulbert (1938-44)
DARE = see Cassidy & Hall (1985-2013)
Davey, William J. and Richard P. MacKinnon (eds.). 2016. Dictionary of Cape Breton English. Toronto: University of Toronto Press.
Davies, Mark. 2013. Corpus of Global Web-Based English: 1.9 billion words from speakers in 20 countries. Available online at http://corpus.byu.edu/glowbe/ (5 Dec. 2016).
DCHP-1 = see Avis et al. (1967)
DCHP-1 Online = see Dollinger, Brinton and Fee (2013)
DeWolf, Gaelan Dodds, Robert J. Gregg, Barbara P. Harris, and Matthew H. Scargill (eds.). 1997. Gage Canadian Dictionary. 5th ed. Toronto: Gage.
DNE = see Story et al. (1982-1999)
Dollinger, Stefan (ed.-in-chief), Laurel Brinton and Margery Fee (eds.). 2006-2016. The Bank of Canadian English. Online database, 1505-2016. 2.7 million words.
Dollinger, Stefan (ed.-in-chief), Laurel J. Brinton and Margery Fee (eds). 2013. DCHP-1 Online: A Dictionary of Canadianisms on Historical Principles Online. Based on Walter S. Avis et al. (1967). (Online dictionary) http://dchp.ca/dchp1/. Vancouver, BC: University of British Columbia.
Dollinger, Stefan and Laurel J. Brinton. 2008. Canadian English lexis: historical and variationist perspectives. Anglistik: International Journal of English Studies. 19(2) Special Issue "Focus on Canadian English", edited by Matthias Meyer, 43-64.
Dollinger, Stefan and Luanne von Schneidemesser. 2011. Canadianism, Americanism, North Americanism? A comparison of DARE and DCHP. American Speech 86(2): 115-151.
Dollinger, Stefan. 2006. Towards a fully revised and extended edition of the Dictionary of Canadianisms on Historical Principles: background, challenges, prospects. Historical Sociolinguistics/Sociohistorical Linguistics 6. http://www.academia.edu/4591720/ (5 Dec. 2016)
Dollinger, Stefan. 2008. New-Dialect Formation in Canada: Evidence from the English Modal Auxiliaries. Amsterdam/Philadelphia: Benjamins.
Dollinger, Stefan. 2010. Software from the Bank of Canadian English as an open source tool for the dialectologist: ling.surf and its features. In: Joseph Wright's English Dialect Dictionary and Beyond: Studies in Late Modern English Dialectology, edited by Manfred Markus, Clive Upton and Reinhard Heuberger, 249-261. Berne: Lang.
Dollinger, Stefan. 2014. BC Linguistic Survey. Class project ENGL 489. Fall 2014, 1304 written questionnaire responses. University of British Columbia.
Dollinger, Stefan. 2015a. How to write a historical dictionary: a sketch of The Dictionary of Canadianisms on Historical Principles, Second Edition. Ozwords 24(2): 1-3 & 6 (October 2015). Canberra: Australian National Dictionary Centre. http://www.academia.edu/18967380/ (5 Dec. 2016)
Dollinger, Stefan. 2015b. The Written Questionnaire in Social Dialectology: History, Theory, Practice. Amsterdam/Philadelphia: Benjamins.
Dollinger, Stefan. 2015c. The Dictionary of Canadianisms on Historical Principles, Second Edition and regional variation: the complex case of Newfoundland. Regional Language Studies . . . Newfoundland 26: 9-20. http://www.academia.edu/16544684/ (6 Dec. 2016)
Dollinger, Stefan. 2016a. Googleology as smart lexicography: big, messy data for better regional labels. Dictionaries: Journal of the Dictionary Society of North America. 37: 60-98.
Dollinger, Stefan (ed.). 2016b. Lexicography and variation: big data via Google? Discussion results and summary of responses, 19 Feb. - mid-March 2016. https://www.academia.edu/25184026/ (5 Dec. 2016)
Dollinger, Stefan. 2016c. TAKE UP #9 as a semantic isogloss on the Canada-US border. World Englishes. Early view. DOI: 10.1111/weng.12212 (8 Dec. 2016)
Friend, David, Julia Keeler, Dan Liebman and Fraser Sutherland (eds.). 1997. ITP Nelson Canadian Dictionary of the English Language: An Encyclopedic Reference. Toronto: ITP Nelson.
Gage-1 = see Avis, Gregg and Scargill (eds.). (1967).
Gage-3 = see Avis et al. (eds.). (1983).
Gage-5 = see DeWolf et al. (1997).
Görlach, Manfred. 1990. The dictionary of transplanted varieties of languages: English. In Wörterbücher: Dictionaries: Dictionnaires: An International Encyclopedia of Lexicography, edited by Franz J. Hausmann, Oskar Reichmann, Herbert E. Wiegand and Ladislav Zgusta, II: 1475-1499. Berlin: de Gruyter.
Gove, Philip Babcock. 1961. Webster's Third New International Dictionary of the English Language. Springfield, MA: Merriam Webster.
Grimm, Jacob. 1854. [Vorrede]. In Deutsches Wörterbuch, Vol. I: A-Biermolke, edited by Jacob Grimm and Wilhelm Grimm, ii-lxvii. Leipzig: S. Hirzel.
Hoffman, Michol F. 2010. The role of social factors in the Canadian Vowel Shift: evidence from Toronto. American Speech 85(2): 121-140.
ICE Project. Coordinated by Gerald Nelson. http://ice-corpora.net/ICE/INDEX.HTM (5 Dec. 2016)
ITP Nelson = see Friend et al. (1997)
Kilgarriff, Adam. 2007. Googleology is bad science. Computational Linguistics 33: 147-151.
Landau, Sidney I. 2001. Dictionaries: the Art and Craft of Lexicography. 2nd ed. Cambridge: Cambridge University Press.
Lighter, J. E. (ed.). 1994, 1997. Random House Historical Dictionary of American Slang. 2 vols. New York: Random House.
Lovell, Charles J. 1956. Whys and hows of collecting for the Dictionary of Canadian English [DCHP]. II. Excerption of quotations. Canadian Journal of Linguistics 2(1): 7-33.
Mathews, Mitford (ed.). 1951. Dictionary of Americanisms on Historical Principles. Chicago: University of Chicago Press.
Murray, James A. H., Henry Bradley, W. A. Craigie and C. T. Onions (eds.). 1933. The Oxford English Dictionary: Being a Corrected Re-issue with an Introduction, Supplement, and Bibliography. Oxford: Clarendon Press.
OED-1 = see Murray et al. (eds.) (1933).
OED-2 = The Oxford English Dictionary. 2nd ed. 1989. Edited by John Simpson and E.S.C. Weiner. Oxford: Oxford University Press.
OED-3 = The Oxford English Dictionary, 3rd ed. 2000--. Edited by John Simpson and Michael Proffitt. www.oed.com
Orsman, Harry W. (ed.). 1997. The Dictionary of New Zealand English: A Dictionary of New Zealandisms on Historical Principles. Auckland: Oxford University Press.
Pickett, Joseph P. (ed.). 2011. American Heritage Dictionary of the English Language. 5th ed. Boston: Houghton Mifflin Harcourt.
Pomikálek, Jan, Miloš Jakubíček and Pavel Rychlý. 2012. Building a 70 billion word corpus of English from ClueWeb. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), edited by Nicoletta Calzolari et al., 502-506. Istanbul, Turkey: European Language Resources Association (ELRA).
Pratt, T. K. 2004. Review of "Dictionary of American Regional English, Vol. IV: P-Sk." Canadian Journal of Linguistics 49(1): 127-130.
Pratt, Terrence K. (ed.). 1988. Dictionary of Prince Edward Island English. Toronto: University of Toronto Press.
Pringle, Ian and Enoch Padolsky. 1983. The linguistic survey of the Ottawa Valley. American Speech 58(4): 327-344.
Ramson, W. S. (ed.). 1988. Australian National Dictionary: A Dictionary of Australianisms on Historical Principles. Melbourne: Oxford University Press.
Scargill, Matthew H. and Henry J. Warkentyne. 1972. The survey of Canadian English: a report. The English Quarterly. A Publication of the Canadian Council of Teachers of English 5(3, Fall): 47-104.
Schneider, Edgar W. 2007. Postcolonial English: Varieties around the World. Cambridge: Cambridge University Press.
Silva, Penny (ed.). 1996. A Dictionary of South African English on Historical Principles. Oxford: Oxford University Press.
Story, G. M. and William Kirwin. 1971. National dictionaries and regional homework. Regional Language Studies . . . Newfoundland 3: 19-22.
Story, G. M., W. J. Kirwin, and J. D. A. Widdowson (eds.). 1999. [1990, 1982]. Dictionary of Newfoundland English. 3rd ed. [2nd ed., 1st ed.]. Toronto: University of Toronto Press. http://www.heritage.nf.ca/dictionary/ (6 Dec. 2016)
Swan, Julia. 2016. Canadian English in the Pacific Northwest: a phonetic comparison of Vancouver, BC and Seattle, WA. In Proceedings of the 2016 Annual Conference of the Canadian Linguistic Association, edited by Lindsay Hracs, 1-15. https://t.co/0pp1QGQPNm (8 Dec. 2016).
Tabbert, Russell (ed.). 1991. Dictionary of Alaskan English. Juneau, AK: Denali.
Wright, Joseph (ed.). 1898-1905. The English Dialect Dictionary. London: Henry Frowde.