Browsing by Subject "Multilingual"

Computational modeling of politeness across diverse languages
(2023-04-26) Srinivasan, Anirudh; Choi, Eunsol
We study politeness phenomena in nine typologically diverse languages. Politeness is an important facet of communication and is sometimes argued to be culture-specific, yet existing computational linguistic work is limited to English. We create TyDiP, a dataset containing three-way politeness annotations for 500 examples in each language, totaling 4.5K examples. We evaluate how well multilingual models can identify politeness levels: they show a fairly robust zero-shot transfer ability, yet fall significantly short of estimated human accuracy. We further study mapping the English politeness strategy lexicon into nine languages via automatic translation and lexicon induction, analyzing whether each strategy's impact stays consistent across languages. Lastly, we empirically study the complicated relationship between formality and politeness through transfer experiments. We hope our dataset will support various research questions and applications, from evaluating multilingual models to constructing polite multilingual agents. The data and code are publicly available on GitHub (https://github.com/Genius1237/TyDiP) and HuggingFace (https://huggingface.co/datasets/Genius1237/TyDiP).
Examining the multilingual and multimodal resources of young Latino picturebook makers
(2013-08) Zapata, Maria Angelica; Roser, Nancy; Fránquiz, María E.
The purpose of this qualitative research was to better understand the multilingual and multimodal composition resources appropriated by students during a study of Latino children’s picturebooks within a predominantly Latino, third grade classroom. A conceptual framework guided by socio-cultural perspectives, a social semiotic theory of communication, and Composition 2.0 studies was employed to investigate the ways in which students remixed multilingual and multimodal composition resources and manifested identities in texts. This research was guided by both design-based and case study methods and drew upon constant-comparative, discourse, and visual discourse analytic methods to examine the data. Analysis was also located in the literature on identity and texts so as to better understand the socio-cultural histories and identities attached to the children’s picturebooks. Data collection was focused on both the multilingual and multimodal resources students appropriated to compose and the ways students orchestrated those resources during the classroom picturebook study. Analysis was structured by two interrelated strands. The first strand explores more broadly the composition resources in use during the classroom picturebook study, and the second analyzes explicitly the ways two focal students remixed composition resources within their picturebook productions and sedimented identities in texts. Three findings generated from the two related strands of analysis provided insights into the potential of a picturebook study as a viable multilingual and multimodal composition curriculum. First, in the context of the teacher and researcher co-designed curriculum and instruction, students appropriated literary, illustrated, material, and picturebook form resources from Latino children’s picturebooks in diverse ways. Second, in the act of picturebook making, students invoked other socio-cultural texts as mentors and remixed composition resources from diverse sources to craft their own picturebooks. Finally, students manifested aspects of their identities within the material worlds and languages reflected within their picturebooks. Together, these findings situate picturebook study and picturebook making as creative and intellectual acts for students. Moreover, this study features Latino children’s picturebooks as culturally responsive mentor texts. Several pedagogical implications related to composition instruction for young writers and diverse populations are also discussed.
Facebook as a multilingual communication site
(2013-05) Olsen, Carolyn Anne; Doty, Philip
As Facebook grows beyond a billion users (Zuckerberg, 2012), a decreasing percentage of those users are English-only speakers. Facebook provides a platform for multilingual conversation to occur, which requires that Facebook display non-Latin scripts. Because of the hegemony of English and the Latin alphabet on the Web, non-Latin scripts are often “ASCII-ized.” Displaying non-Latin scripts well facilitates communication for multilingual users and creates a place where they can explore their identity linguistically as they post on Facebook. This study examines what factors contribute to multilingual Facebook users making linguistic posting choices. Many have named Facebook as a successful multilingual Web site; thus, it is reasonable to expect that Facebook is an exemplar of multilingual social networking sites. This study is an examination and critique of Facebook’s multilingual translations. To address questions of how Facebook’s interface facilitates or impedes multilingual conversation, the researcher recruited twelve active, multilingual Facebook users to participate in individual interviews and a small focus group. Besides English, these users spoke and posted in the world’s four other most widely spoken languages: Chinese, Spanish, Arabic, and Hindi. The researcher found that multilingual Facebook users did not always have a choice of the language in which they posted. Users faced obstacles ranging from the Facebook app distorting script display to hardware bias limiting users’ text entry. Furthermore, participants’ linguistic presentation was not dichotomous between two languages; multilingual users and their friends are accustomed to operating in a multilingual space. The larger implication of these findings is that Facebook, despite pioneering massive translation projects, has not solved the problem of linguistic representation for social networking sites. Facebook’s solution is not scalable to less widely spoken languages because even languages with many millions of speakers, such as Spanish, have flawed implementations on Facebook.
Marfa: a culturally respectful Perso-Arabic and Latin multi-script typeface
(2017-08) Karimifar, Mohamad; Gorman, Carma; Walker, James
Globalization and the need for cross-cultural communication have created a need for multi-script typefaces suitable for both print and digital applications. However, there are far fewer Perso-Arabic typefaces in existence than Latin ones, and even fewer Perso-Arabic/Latin multi-script typefaces. Of the Perso-Arabic/Latin multi-script typefaces in existence, very few are visually unified enough to use for multilingual typesetting. And the very few visually unified multi-script typefaces that do exist usually achieve that visual unity by forcing the calligraphic forms of Perso-Arabic glyphs into the norms of Latin typography. With a few exceptions, multi-script Perso-Arabic/Latin typefaces have not been very respectful to Perso-Arabic calligraphic traditions: their forms are visible artifacts of western influence, and they therefore do not strike a particularly effective tone for cross-cultural communication. In response to the need for visually unified, less obviously “colonizing” Perso-Arabic/Latin multi-script typefaces, I have designed the Marfa type family. Marfa is a visually cohesive multi-script type family (which will eventually include eight different weights and styles) that respects the calligraphic origins of both Perso-Arabic and Latin writing systems. Marfa not only adds to the number of available visually cohesive Perso-Arabic/Latin multi-script typefaces—therefore giving typographers new, less overbearing “tones of voice” to work with—but also, by refusing to force Perso-Arabic characters into Latin typographic norms, provides a model for how to begin “decolonizing” other multi-script typefaces.