Phonetic Symbol Processing in LaTeX
Fukui Rei
Department of Asian and Pacific Linguistics, Institute of Cross-Cultural Studies, Faculty of Letters, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, TOKYO 113 Japan fkr@tooyoo.l.u-tokyo.ac.jp
Introduction
TIPA¹ is a system for processing IPA (International Phonetic Alphabet) symbols in LaTeX. It is based on TSIPA², but both the METAFONT source code and the LaTeX macros have been thoroughly rewritten, so that it can be considered a new system. Among the many features of TIPA, the following are new as compared with TSIPA or any other existing system for processing IPA symbols.
- A new 256 character encoding for phonetic symbols (T3), which includes all the symbols and diacritics found in the recent versions of IPA and some non-IPA symbols.
- Complete support of LaTeX2e.
- Roman, slanted, bold, bold extended and sans serif font styles.
- Easy input method in the IPA environment.
- Extended macros for accents and diacritics.³
- A flexible system of macros for tone letters.
- An optional package (vowel.sty) for drawing vowel diagrams.⁴
- A slightly modified set of fonts that go well with Times Roman and Helvetica fonts.
¹ TIPA stands for TeX IPA or Tokyo IPA. The primary ftp site for the latest version of TIPA is ftp://tooyoo.L.u-tokyo.ac.jp/pub/TeX/tipa, and it is also mirrored in the directory fonts/tipa of the CTAN archives.
² TSIPA was made in 1992 by Kobayashi Hajime, Fukui Rei and Shirakawa Shun. It is available from a CTAN archive. One problem with TSIPA was that symbols already included in OT1, T1 or Math fonts were excluded, because of the limitation of its 128 character encoding. As a result, a phonetic transcription often had to be composed of symbols from different fonts, which made automatic inter-word kerning impossible, and too many symbols had to be realized as macros.
³ These macros are now defined in a separate file called exaccent.sty so that the authors of other packages can make use of them. The idea of separating these macros from the others was suggested by Frank Mittelbach.
⁴ This package (vowel.sty) can be used independently of the TIPA package. It is documented separately in vowel.tex, so no further mention of it will be made here.
TIPA Encoding
Selection of symbols
The selection of TIPA phonetic symbols⁵ was based on the following works.
- Phonetic Symbol Guide [9] (henceforth abbreviated as PSG).
- The official IPA charts of the '49, '79, '89 and '93 versions.
- Recent articles published in JIPA⁶, such as "Report on the 1989 Kiel Convention" [6], "Further report on the 1989 Kiel Convention" [7], "Computer Codes for Phonetic Symbols" [3], "Council actions on revisions of the IPA" [8], etc.
- An unpublished paper by J. C. Wells: "Computer-coding the IPA: a proposed extension of SAMPA" [10].
- Popular textbooks on phonetics.
More specifically, TIPA contains all the symbols, including diacritics, defined in the '79, '89 and '93 versions of IPA. The '49 version of IPA, which is described in the Principles [5], contains too many obsolete symbols, so only those symbols that enjoyed some popularity, at least for some time or among some group of people, are included.
Besides IPA symbols, TIPA also contains symbols that are useful in the following areas of phonetics and linguistics.
- Symbols used in American phonetics.
- Symbols and accents used in the historical study of Indo-European languages.
- Symbols used in the phonetic description of languages in East Asia.
- Diacritics used in "extIPA Symbols for Disordered Speech" [4] and "VoQS (Voice Quality Symbols)" [1].
It should also be noted that TIPA includes all the necessary elements of tone letters, enabling
⁵ In the case of TSIPA, the selection of symbols was based on "Computer coding of the IPA: Supplementary Report" [2].
⁶ Journal of the International Phonetic Association.
all the theoretically possible combinations of the tone letter system. In recent publications of the International Phonetic Association, tone letters are admitted as an official way of representing tones, but their treatment is quite insufficient in that only a limited number of combinations is allowed. This is apparently because there has been no portable way of combining symbols that can be used across various computer environments. TeX's productive macro system is therefore an ideal tool for handling a system like tone letters.
In the process of writing the METAFONT source code for TIPA phonetic symbols there were many problems besides the selection of symbols. One such problem was that the exact shape of a symbol was sometimes unclear. For example, the shapes of symbols such as ʗ (Stretched C) and ʝ (Curly-tail J) differ according to the source. This is partly because the IPA has been continuously revised over the past few decades, and partly because different ways of computerizing phonetic symbols on different systems have resulted in a diversity of shapes. Although there is no definite answer to this problem yet, it seems to me that it is a privilege of those working with METAFONT to have a systematic way of controlling the shapes of phonetic symbols.
Encoding
The 256 character encoding of TIPA is now officially called the T3 encoding.⁷ In deciding this new encoding, care was taken to harmonize it with other existing encodings, especially with the T1 encoding. Ease of input was also taken into consideration, so that frequently used symbols can be input with a small number of keystrokes. Table 1 shows the layout of the T3 encoding.
The basic structure of the encoding found in the first half of the table (character codes 000-177) follows normal text encodings (ASCII, OT1 and T1): the division of this area into groups, such as the section for accents and diacritics, the section for punctuation marks, the section for numerals, and the sections for uppercase and lowercase letters, is basically the same as in these encodings. Note also that the T3 encoding contains not only phonetic symbols but also the usual punctuation marks that are used with phonetic symbols, and in such cases the same codes are assigned as in the normal
⁷ In a discussion with the LaTeX2e team it was suggested that the 128 character encoding used in WSUIPA would be referred to as the OT3 encoding.
Table 1: Layout of the T3 encoding
First half (character codes 000-177), in order: Accents and diacritics; Punctuation marks; Basic IPA symbols I (vowels); Diacritics, etc.; Basic IPA symbols II; Diacritics, etc.; Pct.; Basic IPA symbols III (lowercase letters); Diacr.
Second half (character codes 200-377), in order: Tone letters and other suprasegmentals; Old IPA, non-IPA symbols; Extended IPA symbols; Gmn.; Basic IPA symbols IV; Gmn.
Pct. = Punctuation marks, Diacr. = Diacritics, Gmn. = Symbols for Germanic languages.
text encodings. However, deciding which punctuation marks to include is a matter of trade-off. For example, : and ; might have been preserved in T3, but : has traditionally been used as a substitute for the length mark ː, so I decided to exclude : in favor of inputting the length mark with a single keystroke.
The encoding of the section for accents and diacritics is closely related to T1 in that the accents common to T1 and T3 have the same codes.
The sections for numerals and uppercase letters are filled with phonetic symbols that are used frequently in many languages, because numerals and uppercase letters are usually not used as phonetic symbols. The assignments made here serve as the shortcut characters, which are explained in the section entitled Ordinary phonetic symbols.
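The consequence of giving the colon's code to the length mark can be sketched as follows. This is only a minimal illustration, assuming the default setup without the safe option; the \* prefix used for the literal colon is the one described in the section on special macros.

```latex
\documentclass{article}
\usepackage{tipa}
\begin{document}
% Inside \textipa, the character ":" selects the length mark.
\textipa{i:}

% A literal colon inside the IPA environment needs the \* prefix,
% which takes the following character from the normal text font.
\textipa{i\*: a}
\end{document}
```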
As for the section for uppercase letters in the usual text encoding, a series of discussions among the members of the ling-tex mailing list revealed a fair amount of consensus on which symbol should be assigned to each code. For example, they were almost unanimous on assignments such as ɑ for A, β for B, ð for D, ʃ for S and θ for T. For more details, see table 2.
The encoding of the section for numerals was more difficult. One possibility was to assign symbols based on resemblance of shape. One can easily think of assignments such as ɜ for 3 and ɒ for 6. But resemblance of shape alone does not serve as a criterion for all the assignments. So I decided to assign basic vowel symbols to this section.⁸ Fortunately, resemblance of shape is to some extent maintained, as shown in table 2.
The encoding of the section for lowercase letters poses no problem, since they are all used as phonetic symbols. Only one symbol, namely g, needs some attention, because as a phonetic symbol its shape should be ɡ rather than g.⁹
The second half of the table (character codes 200-377) is divided into four sections. The first section is devoted to the elements of tone letters and other suprasegmental symbols. Among the remaining three sections, the last one (340-377) contains more basic symbols than the other two. This is a result of assigning the same character codes as the latin-1 (ISO8859-1) and T1 encodings to the symbols that TIPA shares with latin-1 and T1 encoded fonts.¹⁰ These are the cases of æ, ø, œ, ç and þ. Within each section, symbols are arranged largely in alphabetical order.

ASCII  :  ;  "  0  1  2  3  4  5  6  7  8  9
TIPA   ː  ˑ  ˈ  ʉ  ɨ  ʌ  ɜ  ɥ  ɐ  ɒ  ɤ  ɵ  ɘ
ASCII  @  A  B  C  D  E  F  G  H  I  J  K  L  M
TIPA   ə  ɑ  β  ɕ  ð  ɛ  ɸ  ɣ  ɦ  ɪ  ʝ  ʁ  ʎ  ɱ
ASCII  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  |
TIPA   ŋ  ɔ  ʔ  ʕ  ɾ  ʃ  θ  ʊ  ʋ  ɯ  χ  ʏ  ʒ  ǀ
Table 2: TIPA shortcut characters

For a table of the T3 encoding, see Appendix C.

TIPA fonts
This version of TIPA includes two families of IPA fonts, tipa and xipa. The former is for normal use with LaTeX, and the latter is intended to be used with times.sty (PSNFSS). Both have the same T3 encoding, as explained in the previous section.
tipa
    Roman: tipa8, tipa9, tipa10, tipa12, tipa17
    Slanted: tipasl8, tipasl9, tipasl10, tipasl12
    Bold extended: tipabx8, tipabx9, tipabx10, tipabx12
    Sans serif: tipass8, tipass9, tipass10, tipass12, tipass17
    Bold: tipab10
xipa
    Roman: xipa10
    Slanted: xipasl10
    Bold: xipab10
    Sans serif: xipass10
All these fonts are made with METAFONT, based on the Computer Modern font series. In the xipa series, parameters are adjusted so that the fonts look fine when used with Times Roman (xipa10, xipasl10, xipab10) and Helvetica (xipass10).

Usage
Declaration of TIPA package
In order to use TIPA, first declare the TIPA package in the preamble of a document.
    \documentclass{article}
    \usepackage{tipa}
Encoding options
The above declaration uses OT1 as the default text encoding. If you want to use TIPA symbols with T1, specify the option T1.
    \documentclass{article}
    \usepackage[T1]{tipa}
If you want to use a more complex combination of encodings, load the fontenc package yourself and specify the option noenc. In this case the option T3, which represents the TIPA encoding, must be included as an option to the fontenc package. For example, if you want to use TIPA and the University of Washington Cyrillic (OT2) with the T1 text encoding, the following declaration will do this.
⁸ This idea was influenced by the above-mentioned article by J. C. Wells [10].
⁹ But the alternative shape is preserved in another section and can be used as \textg.
¹⁰ This is based on a suggestion by Jörg Knappen.
    \documentclass{article}
    \usepackage[T3,OT2,T1]{fontenc}
    \usepackage[noenc]{tipa}
By default, TIPA loads the fontenc package internally, but the option noenc suppresses this.
Using TIPA with PSNFSS
In order to use TIPA with times.sty, load times.sty before the tipa package.
    \documentclass{article}
    \usepackage{times}
    \usepackage{tipa}
The font description files T3ptm.fd and T3phv.fd are automatically loaded by the above declaration.
Other options
TIPA can be extended with the options tone and extra. If you want to use the optional package for tone letters, add the tone option to the \usepackage command that loads the tipa package.
    \usepackage[tone]{tipa}
If you want to use diacritics for extIPA and VoQS, specify the extra option.
    \usepackage[extra]{tipa}
Finally, there is one more option called safe, which suppresses the definitions of some possibly dangerous TIPA commands.
    \usepackage[safe]{tipa}
More specifically, the following commands are suppressed by the safe option. The function of each command is explained later.
- \s (equivalent to \textsyllabic)
- \* (already defined in plain TeX)
- \|, \:, \;, \! (already defined in LaTeX)
Input Commands for Phonetic Symbols
Ordinary phonetic symbols
TIPA phonetic symbols can be input in the following two ways.
1. Input macro names in the normal text environment.
2. Input macro names or shortcut characters within the following groups or environment.
    \textipa{...}¹¹
    {\tipaencoding ...}
    \begin{IPA} ... \end{IPA}
(These groups and the environment will henceforth be referred to as the IPA environment.)
¹¹ I personally prefer a slightly shorter name like \ipa rather than \textipa, but this command was named after the general convention of LaTeX2e. The same can be said of all the symbol names beginning with \text.
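The two input methods and the three forms of the IPA environment can be put side by side in a small test document. The following is a minimal sketch, assuming the default options; it uses only shortcut characters inside the groups, and the transcription is the one used as the running example in this section.

```latex
\documentclass{article}
\usepackage{tipa}
\begin{document}
% 1. Macro names in the normal text environment
[\textsecstress\textepsilon kspl\textschwa\textprimstress
 ne\textsci\textesh\textschwa n]

% 2a. Shortcut characters inside \textipa
\textipa{[""Ekspl@"neIS@n]}

% 2b. The same transcription after the \tipaencoding declaration,
%     limited in scope by the surrounding group
{\tipaencoding [""Ekspl@"neIS@n]}

% 2c. The same transcription in the IPA environment
\begin{IPA}
[""Ekspl@"neIS@n]
\end{IPA}
\end{document}
```

All four paragraphs should show the same transcription; the last three differ from the first only in kerning.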
A shortcut character is a single character that is assigned to a specific phonetic symbol and that can be input directly from an ordinary keyboard. In TIPA fonts, the character codes for numerals and uppercase letters in the normal ASCII encoding are assigned to such shortcut characters, because numerals and uppercase letters are usually not used as phonetic symbols. Additional shortcut characters for symbols such as æ, œ, ø may also be used if you are using a T1 encoded font and an appropriate input system for it.
The following pair of examples shows the same phonetic transcription of an English word, input by the two methods mentioned above.
Input1: [\textsecstress\textepsilon kspl\textschwa\textprimstress ne\textsci\textesh\textschwa n]
Output1: [ˌɛkspləˈneɪʃən]
Input2: \textipa{[""Ekspl@"neIS@n]}
Output2: [ˌɛkspləˈneɪʃən]
It is apparent that inputting in the IPA environment is far easier than in the normal text environment. Moreover, although the outputs of the above examples look almost the same, they are not, strictly speaking, identical. This is because automatic kerning between symbols is enabled in the IPA environment, as illustrated by the following pair of examples.
Input1: v\textturnv v w\textsca w y\textturny y [\textesh]
Output1: vʌv wᴀw yʎy [ʃ]
Input2: \textipa{v2v w\textsca w yLy [S]}
Output2: vʌv wᴀw yʎy [ʃ]
Table 2 shows most of the shortcut characters together with the corresponding characters in the ASCII encoding.
Naming of phonetic symbols
Every TIPA phonetic symbol has a unique symbol name, such as Turned A, Hooktop B, Schwa.¹² Each symbol also has a corresponding control sequence name, such as \textturna, \texthtb, \textschwa. The name used as a control sequence is usually an abbreviated form of the corresponding symbol name with the prefix \text. The conventions used in the abbreviation are as follows.
- Suffixes and endings such as -ive, -al, -ed are omitted.
¹² The naming is based on the literature listed in the section entitled Selection of symbols. Users of TSIPA should be careful, because TIPA's naming is slightly modified from that of TSIPA.
- right and left are abbreviated to r and l, respectively.
- For small capital symbols, the prefix sc is added.
- A symbol with a hooktop is abbreviated as ht...
- A symbol with a curly-tail is abbreviated as ct...
- A crossed symbol is abbreviated as cr...
- A ligature is abbreviated as ...lig.
- For an old version of a symbol, the prefix O is added.
Note that the prefix O (old) should be given as an uppercase letter. Table 3 shows some examples of the correspondence between symbol names and control sequence names.

Symbol name            Macro name        Symbol
Turned A               \textturna        ɐ
Glottal Stop           \textglotstop     ʔ
Right-tail D           \textrtaild       ɖ
Small Capital G        \textscg          ɢ
Hooktop B              \texthtb          ɓ
Curly-tail C           \textctc          ɕ
Crossed H              \textcrh          ħ
Old L-Yogh Ligature    \textOlyoghlig
Beta                   \textbeta         β
Table 3: Naming of TIPA symbols

Ligatures
Just as opening and closing double quotation marks, the en dash, the em dash, and the fi and ff ligatures are produced in TeX by inputting ``, '', --, ---, fi, ff, two of the TIPA symbols, namely Secondary Stress and Double Pipe, as well as double quotation marks¹³, can be input as ligatures in the IPA environment.
Input: \textipa{" "" | || `` ''}
Output: ˈ ˌ ǀ ǁ “ ”
¹³ Although TIPA fonts do not include the symbols “ and ”, a negative kern is automatically inserted between ` and `, and between ' and ', so that the same results can be obtained as with the normal text font.

Special macros \*, \;, \: and \!
TIPA defines \*, \;, \: and \! as special macros in order to make it easy to input phonetic symbols that do not have a shortcut character as explained above. Before explaining how to use these macros, it is necessary to note that they are primarily intended for linguists, who usually do not care about things in math mode. They can be dangerous in that they override existing LaTeX commands used in math mode. So if you want to preserve the original meaning of these commands, declare the option safe in the preamble.
The macro \* is used in three different ways. First, when it is followed by one of the letters f, k, r, t or w, it produces a turned symbol.
Input: \textipa{\*f \*k \*r \*t \*w}
Output: turned f, ʞ, ɹ, ʇ, ʍ
Secondly, when it is followed by one of the letters j, n, h, l or z, it produces a frequently used symbol that otherwise has no easy input method.
Input: \textipa{\*j \*n \*h \*l \*z}
Output: ɟ ɲ ɧ ɬ ɮ
Thirdly, when it is followed by any other letter, that letter is taken from the default text font. This is useful in the IPA environment for selecting symbols temporarily from the normal text font.
Input: \textipa{\*A dOg, \*B k\ae{}t, ma\super{\*{214}}}
Output: A dɔɡ, B kæt, ma²¹⁴
The remaining macros \;, \: and \! are used to make small capital symbols, retroflex symbols, and implosives or clicks, respectively.
Input: \textipa{\;B \;E \;A \;H \;L \;R}
Output: ʙ ᴇ ᴀ ʜ ʟ ʀ
Input: \textipa{\:d \:l \:n \:r \:s \:z}
Output: ɖ ɭ ɳ ɽ ʂ ʐ
Input: \textipa{\!b \!d \!g \!j \!G \!o}
Output: ɓ ɗ ɠ ʄ ʛ ʘ

Punctuation marks
The following punctuation marks and text symbols that are normally included in the text encoding are also included in the T3 encoding, so that they can be input directly in the IPA environment.
Input: \textipa{! ' ( ) * + , - .\ / = ? [ ] `}
Output: ! ' ( ) * + , - . / = ? [ ] `
All other punctuation marks and text symbols that are not included in T3 need to be input with the prefix \*, explained in the previous section, when they appear in the IPA environment.
Input: \textipa{\*; \*: \*@ \*\# \*\$ \*\& \*\% \*\{ \*\}}
Output: ; : @ # $ & % { }

Accents and diacritics
Table 4 shows how to input accents and diacritics in TIPA, with some examples. Here again, there are two input methods, one for the normal text environment and the other for the IPA environment. In the IPA environment, most accents and diacritics can be input more easily than in the normal text environment, especially subscript versions of accents that are normally placed over a symbol and combined accents, as shown in the table.

Input in the normal     Input in the IPA    Output
text environment        environment
\'a                     \'a                 á
\"a                     \"a                 ä
\~a                     \~a                 ã
\r{a}                   \r{a}               å
\textsyllabic{m}        \s{m}               m̩
\textsubumlaut{a}       \"*a                a̤
\textsubtilde{a}        \~*a                a̰
\textsubring{a}         \r*a                ḁ
\textdotacute{e}        \.'e                é̇
\textgravedot{e}        \`.e                ė̀
\textacutemacron{a}     \'=a                ā́
\textcircumdot{a}       \^.a                ȧ̂
\texttildedot{a}        \~.a                ȧ̃
\textbrevemacron{a}     \u=a                ā̆
Table 4: Examples of inputting accents

As can be seen from the above examples, most of the accents that are normally placed over a symbol can be placed under a symbol by adding an * to the corresponding accent command in the IPA environment.
The advantage of the IPA environment is further exemplified by the all-purpose accent \|, which is used as a macro prefix to provide shortcut inputs for diacritics that otherwise have to be input with lengthy macro names. Table 5 shows examples of such accents. Note that the macro \| is also dangerous in that it is already defined as a math symbol in LaTeX. So if you want to preserve the original meaning of this macro, declare the safe option in the preamble.
Finally, examples of words with complex accents that are input in the IPA environment are shown below.
Input: \textipa{*\|c{k}\r*mt\'om *bhr\'=at\=er}
Output: *k̑m̥tóm *bhrā́tēr
For a full list of accents and diacritics, see Appendix A.
Superscript symbols
In the normal text environment, superscript symbols can be input with a macro called \textsuperscript, which was newly introduced in a recent version of LaTeX2e. This macro takes one argument, which can be either a symbol or a string of symbols, and it can be nested.
Input in the normal     Input in the IPA    Output
text environment        environment
\textsubbridge{t}       \|[t                t̪
\textinvsubbridge{t}    \|]t                t̺
\textsublhalfring{a}    \|(a                a̜
\textsubrhalfring{a}    \|)a                a̹
\textroundcap{k}        \|c{k}              k̑
\textsubplus{o}         \|+o                o̟
\textraising{e}         \|'e                e̝
\textlowering{e}        \|`e                e̞
\textadvancing{o}       \|<o                o̘
\textretracting{a}      \|>a                a̙
\textovercross{e}       \|x{e}              e̽
\textsubw{k}            \|w{k}              k̫
\textseagull{t}         \|m{t}              t̼
Table 5: Examples of the accent prefix \|
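For documents that also use \| in math mode, the long names in the left column of Table 5 remain available when the safe option is declared. The following is a hedged sketch of that situation, assuming that safe suppresses only the short forms listed in the section on options and leaves the \text... macros untouched.

```latex
\documentclass{article}
\usepackage[safe]{tipa}
\begin{document}
% With the safe option, \| keeps its usual meaning in math mode ...
$\| x \|$

% ... so the diacritics of Table 5 are written with their full names.
\textsubbridge{t} \textsubplus{o} \textroundcap{k}
\end{document}
```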
Since the name \textsuperscript is too long, TIPA provides an abbreviated form of this macro called \super.
Input1: t\textsuperscript h k\textsuperscript w a\textsuperscript{bc} a\textsuperscript{b\textsuperscript{c}}
Output1: tʰ kʷ aᵇᶜ aᵇᶜ (in the last item, c is set as a superscript of b)
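The same string can presumably be written with the short form inside the IPA environment. The sketch below assumes that \super accepts the same kinds of argument as \textsuperscript, including nesting, as the description above suggests.

```latex
\documentclass{article}
\usepackage{tipa}
\begin{document}
% Long form in the normal text environment
t\textsuperscript{h} k\textsuperscript{w} a\textsuperscript{bc}

% Short form in the IPA environment, with one nested superscript
\textipa{t\super h k\super w a\super{bc} a\super{b\super{c}}}
\end{document}
```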
These macros automatically select the correct size of superscript font, whatever the size of the text font.
Tone letters
TIPA provides a flexible system of macros for tone letters. A tone letter is produced by a macro called \tone, which takes one argument consisting of a string of digits ranging from 1 to 5. These digits denote pitch levels, 1 being the lowest and 5 the highest. Within this range, any combination is allowed, and there is no limit on the length of a combination.
As an example of the usage of the tone letter macro, the four tones of Chinese are shown below.
Input: \tone{55}ma mother, \tone{35}ma hemp, \tone{214}ma horse, \tone{51}ma scold
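A complete test file for tone letters might look like the following. It is a minimal sketch, assuming that the tone option described under Other options is what makes \tone available; the five-digit contour in the last line is an arbitrary illustration of a longer combination, not a tone of any particular language.

```latex
\documentclass{article}
\usepackage[tone]{tipa}
\begin{document}
% Level, rising, dipping and falling contours
\tone{55}ma, \tone{35}ma, \tone{214}ma, \tone{51}ma

% Longer combinations are written in the same way
\tone{15151}
\end{document}
```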
How easy is it to input phonetic symbols?
Let us briefly estimate how easy (or difficult) it is to input phonetic symbols with TIPA in terms of the number of keystrokes. The following table shows statistics for all the phonetic symbols that appear in the '93 version of the IPA chart (diacritics and symbols for suprasegmentals excluded). It is assumed that each symbol is input within the IPA environment and that the safe option is not specified.

keystrokes      number    examples
1               65        a, b, ə, ɑ, β, etc.
2               2
3               30
5               1
more than 5     7

As the table shows, about 92% of the symbols (97 out of 105) can be input within three keystrokes.

Changing font styles
This version of TIPA includes five styles of fonts, i.e. roman, slanted, bold, bold extended and sans serif. These styles can be switched in much the same way as with the normal text fonts (see table 6).

Roman            \textipa{f@"nEtIks}                       fəˈnɛtɪks
Slanted          \textipa{\slshape f@"nEtIks}              fəˈnɛtɪks
                 or \textipa{\textsl{f@"nEtIks}}           fəˈnɛtɪks
                 or \textsl{\textipa{f@"nEtIks}}           fəˈnɛtɪks
Bold extended    \textipa{\bfseries f@"nEtIks}             fəˈnɛtɪks
                 or \textipa{\textbf{f@"nEtIks}}           fəˈnɛtɪks
                 or \textbf{\textipa{f@"nEtIks}}           fəˈnɛtɪks
Sans serif       \textipa{\sffamily f@"nEtIks}             fəˈnɛtɪks
                 or \textipa{\textsf{f@"nEtIks}}           fəˈnɛtɪks
                 or \textsf{\textipa{f@"nEtIks}}           fəˈnɛtɪks
Table 6: Examples of font switching

The bold fonts are usually not used within the standard LaTeX classes, so if you want to use them, it is necessary to use the low-level font selection commands of LaTeX2e.
Input: {\fontseries{b}\selectfont abcdefg \textipa{ABCDEFG}}
Output: abcdefg ɑβɕðɛɸɣ (in bold)
Note also that slanting of TIPA symbols should work correctly even with combined accents and with symbols made up by macros.
Input: \textsl{\textipa{\'{\"{\u*{e}}}}}
Output: a slanted e with a breve below and two accents above
Input: \textsl{\textdoublebaresh}
Output: a slanted ʃ with a double bar (This symbol is composed by a macro.)

Internal commands
Some of the internal commands of TIPA are defined without the letter @ in order to allow a user to extend the capability of TIPA.
\ipabar
Some TIPA symbols, such as \textbarb and \textcrtwo, are defined by using an internal macro command \ipabar. This command is useful when you want to make barred or crossed symbols not defined in TIPA. It requires the following five parameters to control the position of the bar.
- #1 the symbol to be barred
- #2 the height of the bar (in dimen)
- #3 bar width
- #4 left kern added to the bar
- #5 right kern added to the bar
Parameters #3, #4 and #5 are given as scaling factors relative to the width of the symbol; a factor of 1 means that the bar has the same width as the symbol in question. For example, the following command defines a barred b whose bar is placed at a height of .5ex and is slightly wider than the letter b.
    % Barred B
    \newcommand\textbarb{%
      \ipabar{{\tipaencoding b}}%
        {.5ex}{1.1}{}{}}
Note that the parameters #4 and #5 can be left blank if their value is 0. The next example declares a barred c whose bar is a little more than half as wide as the letter c, with a kern of the same size at the right.
    % Barred C
    \newcommand\textbarc{%
      \ipabar{{\tipaencoding c}}%
        {.5ex}{.55}{}{.55}}
More complex examples of the \ipabar command are found in T3enc.def.
\tipaloweraccent, \tipaupperaccent
These two commands are used in the definitions of TIPA accents and diacritics. They are special forms of the commands \loweraccent and \upperaccent that are defined in exaccent.sty. The difference between the commands with the prefix tipa and the ones without it is that the former commands select
accents from a T3 encoded font, while the latter select them from the current text font. These commands take two parameters, the code of the accent (as a decimal, octal or hexadecimal number) and the symbol to be accented, as shown below.
Input: \tipaupperaccent{0}{a}
Output: à
Optionally, these commands can take an extra parameter to adjust the vertical position of the accent. Such an adjustment is sometimes necessary in the definition of a nested accent. The next example shows TIPA's definition of the Circumflex Dot Accent.
    % Circumflex Dot Accent
    \newcommand\textcircumdot[1]%
      {\tipaupperaccent[-.2ex]{2}%
        {\tipaupperaccent[-.1ex]{10}{#1}}}
This definition states that a dot accent is placed over a symbol, with the vertical distance between the symbol and the dot reduced by .1ex, and that a circumflex accent is placed over the dot, with the distance between the two accents reduced by .2ex.
If you want to make a combined accent not included in TIPA, you can do so fairly easily by using these two commands together with the optional parameter. For more examples of these commands, see tipa.sty and extraipa.sty.
\tipaLoweraccent, \tipaUpperaccent
These two commands differ from the two commands explained above in that the first parameter should be a symbol (or anything else, typically an \hbox) rather than the code of the accent. They are special cases of the commands \Loweraccent and \Upperaccent, and the difference between the two pairs of commands is the same as before. The next example turns a schwa into an accent.
    Input: \tipaUpperaccent[.2ex]%
             {\lower.8ex\hbox{%
               \textipa{\super@}}}{a}
    Output: a with a small schwa placed over it
Acknowledgments
First of all, many thanks are due to the co-authors of TSIPA, Kobayashi Hajime and Shirakawa Shun. Kobayashi Hajime was the main font designer of TSIPA. Shirakawa Shun worked very hard in deciding the encoding, checking the shapes of symbols and writing the Japanese version of the document. TIPA would have been impossible without TSIPA.
I would also like to thank Jörg Knappen, whose insightful comments helped greatly in many ways in the development of TIPA. I was also helped and encouraged by Christina Thiele, Martin Haase, Kirk Sullivan and many other members of the ling-tex mailing list. At the last stage of the development of TIPA, Frank Mittelbach gave me precious comments on how to incorporate various TIPA commands into the NFSS. I would also like to thank Barbara Beeton, who kindly read over the preliminary draft of this document and gave me useful comments.
References
[1] Martin J. Ball, John Esling, and Craig Dickson. VoQS: Voice Quality Symbols. 1994.
[2] John Esling. Computer coding of the IPA: Supplementary report. Journal of the International Phonetic Association, 20(1):22-26, 1990.
[3] John H. Esling and Harry Gaylord. Computer codes for phonetic symbols. Journal of the International Phonetic Association, 23(2):83-97, 1993.
[4] ICPLA. extIPA Symbols for Disordered Speech. 1994.
[5] IPA. The Principles of the International Phonetic Association, 1949.
[6] IPA. Report on the 1989 Kiel Convention. Journal of the International Phonetic Association, 19(2):67-80, 1989.
[7] IPA. Further report on the 1989 Kiel Convention. Journal of the International Phonetic Association, 20(2):22-24, 1990.
[8] IPA. Council actions on revisions of the IPA. Journal of the International Phonetic Association, 23(1):32-34, 1993.
[9] Geoffrey K. Pullum and William A. Ladusaw. Phonetic Symbol Guide. The University of Chicago Press, 1986.
[10] John C. Wells. Computer-coding the IPA: a proposed extension of SAMPA. Revised draft 1995 04 28, 1995.
Appendix
A  A List of TIPA Symbols
For each symbol the following information is shown: (1) the symbol, (2) the input method in the normal text environment (and, in parentheses, a shortcut method that can be used within the IPA environment), (3) the name of the symbol.
Vowels and Consonants
a    Lower-case A
\textturna (5)    Turned A
\textscripta (A)    Script A
\textturnscripta (6)    Turned Script A
\ae    Ash
\textsca (\;A)    Small Capital A¹⁵
\textturnv (2)    Turned V¹⁶
b    Lower-case B
\textsoftsign    Soft Sign
\texthardsign    Hard Sign
\texthtb (\!b)    Hooktop B
\textscb (\;B)    Small Capital B
\textcrb    Crossed B
\textbarb    Barred B
\textbeta (B)    Beta
c    Lower-case C
\textbarc    Barred C
\texthtc    Hooktop C
\v{c}    C Wedge
\c{c}    C Cedilla
\textctc (C)    Curly-tail C
\textstretchc    Stretched C¹⁷
d    Lower-case D
\textcrd    Crossed D
\textbard    Barred D
\texthtd (\!d)    Hooktop D
\textrtaild (\:d)    Right-tail D
\textctd    Curly-tail D
\textdzlig    D-Z Ligature
\textdctzlig    D-Curly-tail Z Ligature
\textdyoghlig    D-Yogh Ligature
\textctdctzlig    Curly-tail D-Curly-tail Z Ligature
\dh (D)    Eth
e    Lower-case E
\textschwa (@)    Schwa
\textrhookschwa    Right-hook Schwa
\textreve (9)    Reversed E
\textsce (\;E)    Small Capital E
\textepsilon (E)    Epsilon
\textcloseepsilon    Closed Epsilon
\textrevepsilon (3)    Reversed Epsilon
\textrhookrevepsilon    Right-hook Reversed Epsilon
\textcloserevepsilon    Closed Reversed Epsilon
f    Lower-case F
\textg (g)    Lower-case G
\textbarg    Barred G
\textcrg    Crossed G
\texthtg (\!g)    Hooktop G
g (\textg)    Text G
\textscg (\;G)    Small Capital G
\texthtscg (\!G)    Hooktop Small Capital G
\textgamma (G)    Gamma
\textbabygamma    Baby Gamma
\textramshorns (7)    Ram's Horns
h    Lower-case H
\texthvlig    H-V Ligature
\textcrh    Crossed H
\texthth (H)    Hooktop H
\texththeng    Hooktop Heng
\textturnh (4)    Turned H
\textsch (\;H)    Small Capital H
i    Lower-case I
\i    Undotted I
\textbari (1)    Barred I
\textiota    Iota
\textlhti    Left-hooktop I¹⁸
\textlhtlongi    Left-hooktop Long I
\textvibyi    Viby I¹⁹
\textraisevibyi    Raised Viby I
\textsci (I)    Small Capital I
j    Lower-case J
\j    Undotted J
\textctj (J)    Curly-tail J²⁰
\textscj (\;J)    Small Capital J
\v{\j}    J Wedge
\textbardotlessj    Barred Dotless J
\textObardotlessj    Old Barred Dotless J
\texthtbardotlessj (\!j)    Hooktop Barred Dotless J²¹
k    Lower-case K
\texthtk    Hooktop K
\textturnk (\*k)    Turned K
l    Lower-case L
15 This symbol is fairly common among Chinese phoneticians.
16 In PSG this symbol is called "Inverted V", but this is apparently a mistake.
17 The shape of this symbol differs according to the source. In PSG and recent articles in JIPA, it is stretched toward both the ascender and descender regions, and the whole shape looks like a thick staple. In the old days, however, it was stretched only toward the ascender, and the whole shape looked more like a stretched c.
18 The four symbols \textlhti, \textlhtlongi, \textvibyi and \textvibyy are mainly used among Chinese linguists. They are based on "det svenska landsmålsalfabetet" and were introduced to China by Bernhard Karlgren. The original shapes of these symbols were italic, as was always the case with "det svenska landsmålsalfabetet". It seems that the Chinese linguists who wanted to continue to use these symbols in IPA changed their shapes to upright.
19 I call this symbol "Viby I", based on the following description by Bernhard Karlgren: "Une voyelle très analogue à [...] se rencontre dans certains dial. suédois; on l'appelle 'i de Viby'." (Études sur la phonologie chinoise, 1915–26, p. 295)
20 In the official IPA charts of '89 and '93, this symbol has a dish serif on top of the stem, rather than the normal sloped serif found in the letter j. I found no reason why it should have a dish serif here, so I changed it to a normal sloped serif.
21 In PSG the shape of this symbol differs slightly. Here I followed the shape found in IPA '89 and '93.
\textltilde (\|~l)    L with Tilde
\textbarl    Barred L
\textbeltl    Belted L
\textrtaill (\:l)    Right-tail L
\textlyoghlig    L-Yogh Ligature
\textOlyoghlig    Old L-Yogh Ligature
\textscl (\;L)    Small Capital L
\textlambda    Lambda
\textcrlambda    Crossed Lambda
m    Lower-case M
\textltailm (M)    Left-tail M (at right)
\textturnm (W)    Turned M
\textturnmrleg    Turned M, Right Leg
n    Lower-case N
\textnrleg    N, Right Leg
\~n    N with Tilde
\textltailn    Left-tail N (at left)
\ng (N)    Eng
\textrtailn (\:n)    Right-tail N
\textctn    Curly-tail N
\textscn (\;N)    Small Capital N
o    Lower-case O
\textbullseye (\!o)    Bull's Eye
\textbaro (8)    Barred O
\o    Slashed O
\oe    O-E Ligature
\textscoelig (\OE)    Small Capital O-E Ligature
\textopeno (O)    Open O
\textturncelig    Turned C (Open O)-E Ligature
\textomega    Omega
\textscomega    Small Capital Omega
\textcloseomega    Closed Omega
p    Lower-case P
\textwynn    Wynn
\textthorn (\th)    Thorn
\texthtp    Hooktop P
\textphi (F)    Phi
q    Lower-case Q
\texthtq    Hooktop Q
\textscq (\;Q)    Small Capital Q [22]
r    Lower-case R
\textfishhookr (R)    Fish-hook R
\textlonglegr    Long-leg R
\textrtailr (\:r)    Right-tail R
\textturnr (\*r)    Turned R
\textturnrrtail (\:R)    Turned R, Right Tail
\textturnlonglegr    Turned Long-leg R
22 Suggested by Prof. S. Tsuchida for Austronesian languages in Taiwan. In PSG, "Female Sign" and "Uncrossed Female Sign" (pp. 110–111) are noted for pharyngeal stops, as proposed by Trager (1964). Also, I'm not sure about the difference between an epiglottal plosive and a pharyngeal stop.
\textscr (\;R)    Small Capital R
\textinvscr (K)    Inverted Small Capital R
s    Lower-case S
\v{s}    S Wedge
\textrtails (\:s)    Right-tail S (at left)
\textesh (S)    Esh
\textdoublebaresh    Double-barred Esh
\textctesh    Curly-tail Esh
t    Lower-case T
\texthtt    Hooktop T
\textlhookt    Left-hook T
\textrtailt (\:t)    Right-tail T
\texttctclig    T-Curly-tail C Ligature
\texttslig    T-S Ligature
\textteshlig    T-Esh Ligature
\textturnt (\*t)    Turned T
\textctt    Curly-tail T
\textcttctclig    Curly-tail T-Curly-tail C Ligature
\texttheta (T)    Theta
u    Lower-case U
\textbaru (0)    Barred U
\textupsilon (U)    Upsilon
\textscu (\;U)    Small Capital U
v    Lower-case V
\textscriptv (V)    Script V
w    Lower-case W
\textturnw (\*w)    Turned W
x    Lower-case X
\textchi (X)    Chi
y    Lower-case Y
\textturny (L)    Turned Y
\textscy (Y)    Small Capital Y
\textvibyy    Viby Y [23]
z    Lower-case Z
\textcommatailz    Comma-tail Z
\v{z}    Z Wedge
\textctz    Curly-tail Z
\textrevyogh    Reversed Yogh
\textrtailz (\:z)    Right-tail Z
\textyogh (Z)    Yogh
\textctyogh    Curly-tail Yogh
\textcrtwo    Crossed 2
\textglotstop (P)    Glottal Stop
\textraiseglotstop    Superscript Glottal Stop
\textbarglotstop    Barred Glottal Stop
\textinvglotstop    Inverted Glottal Stop
\textcrinvglotstop    Crossed Inverted Glottal Stop
\textrevglotstop (Q)    Reversed Glottal Stop
\textbarrevglotstop    Barred Reversed Glottal Stop
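As a minimal sketch of how the inputs tabulated above are typed, the following document sets the same six symbols twice: once with the long macro names and once with the shortcut characters given in parentheses. Two things are assumptions here rather than statements from this section: that the package is loaded under the name tipa, and that \textipa is the command that switches to the IPA input mode in which the shortcuts are valid. Both should be checked against the installed version.

```latex
\documentclass{article}
\usepackage{tipa} % assumed package name

\begin{document}
% Long macro names taken from the table above.
[\textschwa] [\textesh] [\textglotstop] [\textturnr] [\texthtb] [\textscb]

% The same symbols via the shortcuts listed in parentheses:
% @ = Schwa, S = Esh, P = Glottal Stop, \*r = Turned R,
% \!b = Hooktop B, \;B = Small Capital B.
% Assumption: \textipa is the command form of the IPA environment.
\textipa{[@] [S] [P] [\*r] [\!b] [\;B]}
\end{document}
```

If both lines print identical symbols, the shortcut characters are active; the long names are the safer choice wherever the shortcuts might clash with other packages.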
23
Suprasegmentals
\textprimstress (")    Vertical Stroke (Superior)
\textsecstress ("")    Vertical Stroke (Inferior)
\textlengthmark (:)    Length Mark
\texthalflength (;)    Half-length Mark
\textvertline    Vertical Line
\textdoublevertline    Double Vertical Line
\textbottomtiebar (\t*{})    Bottom Tie Bar
\textglobfall    Downward Diagonal Arrow
\textglobrise    Upward Diagonal Arrow
\textdownstep    Down Arrow [24]
\textupstep    Up Arrow
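The stress and length marks listed here can be tried in a short document like the one below. It is only a sketch: the package name tipa and the \textipa command are assumptions not stated in this section, and the transcription itself is an arbitrary illustration.

```latex
\documentclass{article}
\usepackage{tipa} % assumed package name

\begin{document}
% Shortcut input: "" = secondary stress, " = primary stress,
% : = length mark, ; = half-length mark.
\textipa{[""Ekspl@"neIS@n]} \quad \textipa{[a:] [a;]}

% The same transcription written with the long macro names.
[\textsecstress\textepsilon kspl\textschwa\textprimstress
 ne\textsci\textesh\textschwa n]
\end{document}
```

The long form is verbose, which is the practical argument for the shortcut characters when typing running transcriptions.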
Accents and Diacritics
\`e    Grave Accent
\'e    Acute Accent
\^e    Circumflex Accent
\~e    Tilde
\"e    Umlaut
\H{e}    Double Acute Accent
\r{e}    Ring
\v{e}    Wedge
\u{e}    Breve
\=e    Macron
\.e    Dot
\c{e}    Cedilla
\textpolhook{e} (\k{e})    Polish Hook (Ogonek Accent)
\textdoublegrave{e} (\H*e)    Double Grave Accent
\textsubgrave{e} (\`*e)    Subscript Grave Accent
\textsubacute{e} (\'*e)    Subscript Acute Accent
\textsubcircum{e} (\^*e)    Subscript Circumflex Accent
\textroundcap{g} (\|c{g})    Round Cap
\textacutemacron{a} (\'=a)    Acute Accent with Macron
\textvbaraccent{a}    Vertical Bar Accent
\textdoublevbaraccent{a}    Double Vertical Bar Accent
\textgravedot{e} (\`.e)    Grave Dot Accent
\textdotacute{e} (\'.e)    Dot Acute Accent
\textcircumdot{a} (\^.a)    Circumflex Dot Accent
24 The shapes of \textdownstep and \textupstep differ according to the source. Here I followed the shapes found in the recent IPA charts.
\texttildedot{a} (\~.a)    Tilde Dot Accent
\textbrevemacron{a} (\u=a)    Breve Macron Accent
\textringmacron{a} (\r=a)    Ring Macron Accent
\textacutewedge{s} (\v's)    Acute Wedge Accent
\textdotbreve{a}    Dot Breve Accent
\textsubbridge{t} (\|[t)    Subscript Bridge
\textinvsubbridge{d} (\|]d)    Inverted Subscript Bridge
\textsubsquare{n}    Subscript Square
\textsubrhalfring{o} (\|)o)    Subscript Right Half-ring [25]
\textsublhalfring{o} (\|(o)    Subscript Left Half-ring
\textsubw{k} (\|w{k})    Subscript W
\textoverw{g}    Over W
\textseagull{t} (\|m{t})    Seagull
\textovercross{e} (\|x{e})    Over-cross
\textsubplus{\textopeno} (\|+O)    Subscript Plus [26]
\textraising{\textepsilon} (\|'E)    Raising Sign
\textlowering{e} (\|`e)    Lowering Sign
\textadvancing{u} (\|<u)    Advancing Sign
\textretracting{\textschwa} (\|>@)    Retracting Sign
\textsubtilde{e} (\~*e)    Subscript Tilde
\textsubumlaut{e} (\"*e)    Subscript Umlaut
\textsubring{u} (\r*u)    Subscript Ring
\textsubwedge{e} (\v*e)    Subscript Wedge
\textsubbar{e} (\=*e)    Subscript Bar
\textsubdot{e} (\.*e)    Subscript Dot
\textsubarch{e}    Subscript Arch
\textsyllabic{m} (\s{m})    Syllabicity Mark
\textsuperimposetilde{t} (\|~{t})    Superimposed Tilde
t\textcorner    Corner
t\textopencorner    Open Corner
\textschwa\textrhoticity    Rhoticity
b\textceltpal    Celtic Palatalization Mark
k\textlptr    Left Pointer
k\textrptr    Right Pointer
p\textrectangle    Rectangle [27]
25 The diacritics \textsubrhalfring and \textsublhalfring can be placed after a symbol by inputting, for example, [e\textsubrhalfring{}].
26 Diacritics such as \textsubplus, \textraising, \textlowering, \textadvancing and \textretracting can be placed after a symbol by inputting, for example, [e\textsubplus{}].
27 This symbol is used among Japanese linguists as a diacritical symbol indicating no audible release (IPA \textcorner), because the symbol \textcorner is used to indicate pitch accent in Japanese.
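The diacritic macros and their shortcut forms can be compared side by side with a sketch like the following. It assumes the package name tipa and the \textipa command, neither of which is stated in this section, and it keeps everything inside \textipa on the further assumption that the starred accents and the \| prefix are only guaranteed to work there.

```latex
\documentclass{article}
\usepackage{tipa} % assumed package name

\begin{document}
% Long macro names from the table above.
\textipa{\textsubring{u} \textsyllabic{m} \textsubbridge{t} \textsubumlaut{e}}

% The corresponding shortcut forms listed in parentheses.
\textipa{\r*u \s{m} \|[t \"*e}

% A diacritic placed after the symbol by giving it an empty
% argument, as described in footnote 26.
\textipa{[e\textsubplus{}]}
\end{document}
```

The first two lines should produce the same output; the third shows the trailing placement rather than the default position under the letter.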
\texttoptiebar{gb} (\t{gb})    Top Tie Bar
'    Apostrophe
\textrevapostrophe    Reversed Apostrophe
.    Period
\texthooktop    Hooktop
\textrthook    Right Hook
\textpalhook    Palatalization Hook
p\textsuperscript{h} (p\super h)    Superscript H
k\textsuperscript{w} (k\super w)    Superscript W
t\textsuperscript{j} (t\super j)    Superscript J
t\textsuperscript{\textgamma} (t\super G)    Superscript Gamma
d\textsuperscript{\textrevglotstop} (d\super Q)    Superscript Reversed Glottal Stop
d\textsuperscript{n} (d\super n)    Superscript N
d\textsuperscript{l} (d\super l)    Superscript L

Tone letters
The tones illustrated here are only a representative sample of what is possible. For more details see the section entitled "Tone Letters" (page 107).
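A short sketch of the superscript, tie-bar and tone-letter inputs follows. The package name tipa, the \textipa command and the tone package option are assumptions: this section does not say how tone letters are enabled, so the option should be checked against the installed version. The syllable "ma" is an arbitrary carrier for the tone letters.

```latex
\documentclass{article}
\usepackage[tone]{tipa} % assumption: the tone option enables \tone

\begin{document}
% Superscripts: the long form and the \super shortcut.
\textipa{p\textsuperscript{h} p\super h k\super w}

% Top tie bar over two letters.
\textipa{\t{gb}}

% Tone letters attached to a syllable: 55, 51 and 454.
\textipa{ma\tone{55} ma\tone{51} ma\tone{454}}
\end{document}
```

Each digit in the argument of \tone names a pitch level, so longer contours are written simply by adding digits.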
\tone{55}    Extra High Tone
\tone{44}    High Tone
\tone{33}    Mid Tone
\tone{22}    Low Tone
\tone{11}    Extra Low Tone
\tone{51}    Falling Tone
\tone{15}    Rising Tone
\tone{45}    High Rising Tone
\tone{12}    Low Rising Tone
\tone{454}    High Rising Falling Tone

Diacritics for extIPA, VoQS
In order to use the diacritics listed in this section, it is necessary to specify the option extra in the preamble (see the section entitled "Other options" on page 105). Note also that some of the diacritics are defined by using symbols from fonts other than TIPA, so they may not look quite satisfactory and/or may not be slanted (e.g. \whistle{s}).

\spreadlips{s}    Left Right Arrow
\overbridge{v}    Overbridge
\bibridge{n}    Bibridge
\subdoublebar{t}    Subscript Double Bar
\subdoublevert{f}    Subscript Double Vertical Line
\subcorner{v}    Subscript Corner
\whistle{s}    Up Arrow
\sliding{\ipa{Ts}}    Right Arrow
\crtilde{m}    Crossed Tilde
\dottedtilde{a}    Dotted Tilde
\doubletilde{s}    Double Tilde
\partvoiceless{n}    Parenthesis + Ring
\inipartvoiceless{n}    Parenthesis + Ring
\finpartvoiceless{n}    Parenthesis + Ring
\partvoice{s}    Parenthesis + Subwedge
\inipartvoice{s}    Parenthesis + Subwedge
\finpartvoice{s}    Parenthesis + Subwedge
\sublptr{J}    Subscript Left Pointer
\subrptr{J}    Subscript Right Pointer

B Symbols not included in TIPA
There are about 40 symbols that appear in PSG but are not included or defined in TIPA (ordinary capital letters, Greek letters and punctuation marks excluded). Most of these symbols were proposed by someone but have never, or only rarely, been adopted by others. Some of them can be realized by writing appropriate macros, while others cannot be realized without resorting to METAFONT. This section discusses these problems by classifying such symbols into three categories, as shown below.
1. Symbols that can be realized at TeX's macro level and/or by using symbols from other fonts.
2. Symbols that can be imitated at TeX's macro level and/or by using symbols from other fonts (but may not look quite satisfactory).
3. Symbols that cannot be realized at all without creating a new font.

The following table shows the symbols that belong to the first category. For each symbol, an example of the input method is given. Note that barred or crossed symbols can easily be made with TIPA's \ipabar macro.

Script Lower-case F    {\itshape f}
Barred Small Capital I    \ipabar{\textsci}{.5ex}{1.1}{}{}
Barred J    \ipabar{j}{.5ex}{1.1}{}{}
Crossed K    \ipabar{k}{1.2ex}{.6}{}{.4}
Barred Open O    \ipabar{\textopeno}{.5ex}{.6}{.4}{}
Barred Small Capital Omega    \ipabar{\textscomega}{.5ex}{1.1}{}{}
Barred P    \ipabar{p}{.5ex}{1.1}{}{}
Half-barred U    \ipabar{u}{.5ex}{.5}{}{.5}
Barred Small Capital U    \ipabar{\textscu}{.5ex}{1.1}{}{}
Null Sign    $\emptyset$
Double Slash    /\kern-.25em/
Triple Slash    /\kern-.25em/\kern-.25em/
Pointer (Upward)    k\super{\tiny$\wedge$}
Pointer (Downward)    k\super{\tiny$\vee$}
Superscript Arrow    k\super{\super{$\leftarrow$}}

Symbols that belong to the second category are shown below. Note that slashed symbols can in fact easily be made by a macro. For example, a slashed b can be made by \ipaclap{b}{/}. Slashed symbols are not included in TIPA for two reasons: first, simply overlapping a symbol and a slash does not always result in a good shape, and secondly, it doesn't seem worthwhile to devise fine-tuned macros for symbols which were created essentially for typewriters.

Right-hook A
Slashed B
Slashed C
Slashed D
Small Capital Delta
Right-hook E
Right-hook Epsilon
Small Capital F
Female Sign
Uncrossed Female Sign
Right-hook Open O
Slashed U
Slashed W

And finally, symbols that cannot be realized at all are as follows.

Reversed Turned Script A
A-O Ligature
Inverted Small Capital A
Small Capital A-O Ligature
D with Upper-left Hook
Hooktop H with Rightward Tail
Heng
Hooktop J
Front-bar N
Inverted Lower-case Omega
Reversed Esh with Top Loop
T with Upper Left Hook
Turned Small Capital U
Bent-tail Yogh
Turned 2
Turned 3
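The \ipabar and \ipaclap inputs quoted in this appendix can be tried with a sketch like the one below. The argument values are copied unchanged from the text, and the option name extra is the one the text gives for the extIPA diacritics. The package name tipa and the \textipa wrapper are assumptions, and the comments on what each \ipabar argument does are a reading of the examples rather than a documented definition.

```latex
\documentclass{article}
\usepackage[extra]{tipa} % option named in the text for extIPA/VoQS diacritics

\begin{document}
% First category: barred and crossed symbols built with \ipabar.
% Presumed argument order: symbol, bar height, bar width factor,
% left kern, right kern (inferred from the examples, not confirmed).
\textipa{\ipabar{j}{.5ex}{1.1}{}{}} \quad
\textipa{\ipabar{k}{1.2ex}{.6}{}{.4}}

% Second category: a slashed b imitated by overlaying a slash.
\textipa{\ipaclap{b}{/}}

% One extIPA diacritic, available only with the extra option.
\textipa{\overbridge{v}}
\end{document}
```

Comparing the barred j with the slashed b illustrates the distinction drawn above: the bar can be positioned precisely, while the overlaid slash takes whatever shape the two glyphs happen to give.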