From ebb5505e9b55f126c75ec2818ce9a1a8121457d8 Mon Sep 17 00:00:00 2001
From: George Moroz
Date: Tue, 12 Dec 2023 10:31:12 +0300
Subject: [PATCH] add v6

---
 data.csv              |  8 ++++----
 visualizing_scripts.R | 10 ++++++++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/data.csv b/data.csv
index 18aadf7..2dfe719 100644
--- a/data.csv
+++ b/data.csv
@@ -1,7 +1,7 @@
 year,month,day,author,title,abstract
 2023,12,12,"Natalia Logvinova (HSE, ILI RAS)",Concord in Russian close appositional constructions: a quantitative study,"In this talk I will discuss case concord in Russian close appositional constructions, which manifests itself in optional case concord of the proper name (v rek-e Don-e/ v rek-e Don ‘in the river Don’). The study provides an in-depth corpus analysis of more than 15,000 examples, using a logistic regression statistical model to predict the choice between presence and absence of concord. The results indicate concord is most likely to occur in constructions with structurally simple and frequent proper names that exhibit adjectival properties and match the common noun in grammatical gender. Proper names with the Goal semantic role show concord with a higher probability than proper names with other roles. It is proposed that all relevant factors refer to frequency or convenience. A diachronic investigation shows that concord has become a much less preferred option over time. It is argued that concord is of low functional significance and suggest that this may explain the gradual loss of concord over time."
 2023,12,12,"Polina Nasledskova (HSE, IL RAS)",Topological relations in Kina Rutul,"Kina variety of Rutul (< Lezgic < East Caucasian) has several different means for describing spatial relations. In this study, I analyze the way topological relations are described in Kina Rutul by means of spatial cases, spatial adverbs and postpositions, and spatial verbal prefixes. The data for this study was collected in field in 2019 and is based on a questionnaire ""Topological relations picture series"" by Bowerman&Pederson (1992). This talk depicts my first attempt at analyzing the collected data and this work is still in progress. My main objective is to determine in which contexts the spatial meanings of case, adverb/postposition and verbal prefix are different and what aspects of topological relations each of these elements relate to."
-2023,12,5,"Sofia Oskolskaya (ILS RAS, HSE) in collaboration with Anna Smetina (IL RAS) and Natalya Stoynova (University of Hamburg)",Analysis of the Gorin Nanai texts from A. P. Putintseva’s text collection (1935-1936),"Gorin is the most northern Nanai dialect which is spread along the Gorin river, to the North from Komsomolsk-on-Amur. The ancestors of the Gorin speakers used to speak a Northern Tungusic language and shifted to Nanai about 150 years ago. Gorin Nanai is highly endangered and very poor-documented. A. P. Putintseva collected 18 notebooks of texts in Nanai during her work in 1935-1936. More than a half of the texts were recorded from Gorin Nanai speakers. Her manuscripts contain a lot of her own corrections. In our talk, we will focus on analysis of these corrections. We believe that some of them may reveal underdescribed dialectal features of Gorin Nanai."
+2023,12,5,"Sofia Oskolskaya (ILS RAS, HSE), Anna Smetina (IL RAS), Natalya Stoynova (University of Hamburg)",Analysis of the Gorin Nanai texts from A. P. Putintseva’s text collection (1935-1936),"Gorin is the most northern Nanai dialect which is spread along the Gorin river, to the North from Komsomolsk-on-Amur. The ancestors of the Gorin speakers used to speak a Northern Tungusic language and shifted to Nanai about 150 years ago. Gorin Nanai is highly endangered and very poor-documented. A. P. Putintseva collected 18 notebooks of texts in Nanai during her work in 1935-1936. More than a half of the texts were recorded from Gorin Nanai speakers. Her manuscripts contain a lot of her own corrections. In our talk, we will focus on analysis of these corrections. We believe that some of them may reveal underdescribed dialectal features of Gorin Nanai."
 2023,12,5,Andrey Chirkin (HSE),Reading group: Kilu von Prince (2019) Counterfactuality and past,"Many languages have past-and-counterfactuality markers such as English simple past. There have been various attempts to find a common definition for both uses, but I will argue in this paper that they all have problems with (a) ruling out unacceptable interpretations, or (b) accounting for the contrary-to-fact implicature of counterfactual conditionals, or (c) predicting the observed cross-linguistic variation, or a combination thereof. By combining insights from two basic lines of reasoning, I will propose a simple and transparent approach that solves all the observed problems and offers a new understanding of the concept of counterfactuality."
 2023,11,28,"Vladimir Plungian (MSU, IL RAS, RLI, HSE)",Quechua “restrictive” marker *=lla*: semantic and morphosyntactic properties,"“Restrictive” markers (like Latin solum, English only etc.) represent an important type of units involved in the organization of discourse: being almost pervasive in the world’s languages, they often have unobvious patterns of polysemy, as well as non-trivial morphosyntactic properties.
 
@@ -24,7 +24,7 @@ Tellings, Jos. 2014. Only and focus in Imbabura Quichua. Annual Meeting of the B
 "
 2023,11,28,"Ivan Osorgin (HSE), Konstantin Filatov (HSE, ILS RAS)",Parabible: a researcher’s tool for small-scale parallel Bible studies ,"This talk summarizes the recent progress in the Parabible project. The machine-readable Massively Parallel Bible Corpus of Mayers & Cysouw (2014) is specifically designed for large scale quantitative analysis of Scripture, especially for purposes of grammatical typology. However, using this database as is, seems to be quite inconvenient for small-scale qualitative research. Our main aim for creating the Parabible tool was to facilitate the use of the database for unsophisticated researchers. We will present the current state of affairs, as well as discuss future paths of development. "
 2023,11,21,Asya Alekseeva (HSE),Inclusive/exclusive distinction in personal pronouns in East Caucasian languages,"In this talk I will present the results upon my project in TALD on inclusive/exclusive distinction in personal pronouns. I will show the idioms where there is such a distinction and where there is none. Also the morphological relation between the pronouns will be taken into account: are the forms for 1PL related to ones for 1SG or, probably, for 2SG or 2PL? In addition, I will say a couple of words about the diachrony of personal pronouns systems in East Caucasian languages."
-2023,11,21,Nastia Ivanova (HSE),Question marking strategies in East Caucasian languages,"In East Caucasian languages, various strategies are used for coding questions. During the talk, we will discuss the data on question marking, which was collected for TALD. Both polar and content questions, as well as interrogative and indirect questions will be taken into accoun. Additionally, we will briefly address meditative questions (a distinct semantic type of (non-canonical) questions often posed in the absence of an addressee and within the speaker’s inner speech) and the issues encountered during the data collection process."
+2023,11,21,Anastasiya Ivanova (HSE),Question marking strategies in East Caucasian languages,"In East Caucasian languages, various strategies are used for coding questions. During the talk, we will discuss the data on question marking, which was collected for TALD. Both polar and content questions, as well as interrogative and indirect questions will be taken into accoun. Additionally, we will briefly address meditative questions (a distinct semantic type of (non-canonical) questions often posed in the absence of an addressee and within the speaker’s inner speech) and the issues encountered during the data collection process."
 2023,11,21,Natalia Koshelyuk (HSE),LingvoDoc as a system for documenting and analyzing languages,"In this talk I will present the LingvoDoc platform, a multifunctional linguistic system designed for compiling, analyzing and storing dictionaries and corpora of various languages and dialects. It was developed under the guidance of Yu. V. Normanskaya and programmers of ISP RAS in 2012 as one of the electronic libraries of endangered languages. But with time, it became possible to conduct phonological, morphological, lexical and other types of analysis of linguistic data using special tools installed on the LingvoDoc. During the talk, I will give an idea of what features this platform has, what options and tools are installed in the system and how else it can be useful to researchers."
 2023,11,14,"Nina Sumbatova (HSE, IL RAS), Svetlana Toldova (HSE)",Accessibility and morphological complexity: locative forms in Dargwa,"In many works, there is a discussion on the connection between certain sociolinguistic and even geographical characteristics of languages and the complexity of their phonological and/or morphological systems. This talk presents a case study where we check a possible connection of this type.
 
@@ -42,7 +42,7 @@ Comrie, Bernard. 2006. Transitivity pairs, markedness, and diachronic stability.
 
 Nichols, Johanna, David A. Peterson & Jonathan Barnes. 2004. Transitivizing and detransitivizing languages. Linguistic Typology 8(2). 149–211.
 "
-2023,10,24,"A. Alekseeva (HSE), N. Beklemishev (Universität Tübingen / Leibniz-Zentrum Allgemeine Sprachwissenschaft (ZAS)), M. Daniel (Collegium de Lyon / Laboratoire Dynamique du Langage), N. Dobrushina (CNRS), A. Ivanova (HSE), K. Filatov (HSE / ILS RAS), T. Maisak (HSE / IL RAS), M. Melenchenko (HSE), I. Netkachev (HSE), G. Moroz (HSE), I. Sadakov (HSE)",Atlas of Rutul dialects ,"The purpose of this talk is to show the Atlas of Rutul dialects, a database with visualized grammatical features for 12 Rutul idioms. Traditionally, five dialects of Rutul (Lezgic < Nakh-Dagestanian) are distinguished: Mukhad, Shinaz, Myukhrek, Ikhrek and Borch-Khnov, and so called “mixed” dialect for a number of villages (Ibragimov 2004). During our trip to South Dagestan in July 2022, we visited 12 Rutul villages. Everyone from our team gathered their questionnaire from at least two speakers in each village. The questionnaires were designed for different domains of Rutul phonology, morphology (nominal, verbal, pronominal, etc.), vocabulary and two discourse formulas. In this talk we will present some preliminary results of our work, and also we will talk about some difficulties and problems which arose during data processing. "
+2023,10,24,"Asya Alekseeva (HSE), Nikita Beklemishev (Universität Tübingen / Leibniz-Zentrum Allgemeine Sprachwissenschaft (ZAS)), Michael Daniel (Collegium de Lyon / Laboratoire Dynamique du Langage), Nina Dobrushina (CNRS), Anastasiya Ivanova (HSE), Konstantin Filatov (HSE / ILS RAS), Timur Maisak (HSE / IL RAS), Maksim Melenchenko (HSE), Ivan Netkachev (HSE), George Moroz (HSE), Ilya Sadakov (HSE)",Atlas of Rutul dialects ,"The purpose of this talk is to show the Atlas of Rutul dialects, a database with visualized grammatical features for 12 Rutul idioms. Traditionally, five dialects of Rutul (Lezgic < Nakh-Dagestanian) are distinguished: Mukhad, Shinaz, Myukhrek, Ikhrek and Borch-Khnov, and so called “mixed” dialect for a number of villages (Ibragimov 2004). During our trip to South Dagestan in July 2022, we visited 12 Rutul villages. Everyone from our team gathered their questionnaire from at least two speakers in each village. The questionnaires were designed for different domains of Rutul phonology, morphology (nominal, verbal, pronominal, etc.), vocabulary and two discourse formulas. In this talk we will present some preliminary results of our work, and also we will talk about some difficulties and problems which arose during data processing. "
 2023,10,24,"Polina Bychkova (University of Ljubljana), Daria Ryzhova, Polina Padalka, Aleksandra Martynenkova (HSE)",Multilingual Pragmaticon: towards the typology of pragmaticalization,"In this talk, we will present a database of discourse formulae ---` idiomatic multiword constructions serving as positive or negative answers to the previous utterance (cf. Be my guest or By no means). By the moment, the database contains more than 2000 items from 10 languages. We describe their pragmatic functions and reconstruct constructions they emerge from, classifying their source semantics. This way we attempt to reveal recurrent paths of pragmaticalization within the domain of 'yes' and 'no' answers."
 2023,10,17,"Anastasia Yakovleva, Natalia Koshelyuk, George Moroz (HSE)",Preposition drop in bilingual and standard speakers of Russian: A corpus-based study,"In this talk, we present our corpus-based study of preposition drop in the speech of Mari-Russian and Besermyan Russian bilinguals compared with the speech of monolinguals. On the basis of the data from the ConLab’s collection of spoken corpora (cf. http://lingconlab.ru/resources.html), we demonstrate that the preposition ‘v’ is omitted in the speech of bilinguals more often than in monolingual speech and propose some possible explanations for the variation across different bilingual speakers. We will also highlight some methodological problems of p-drop studies and discuss ways of solving such problems."
 2023,10,17,Aleksandra Martynenkova (HSE),Reading group: Judith Aissen (2023). Documenting topic and focus,"In the paper, four information structure relations are discussed: two types of topic (non-contrastive and contrastive) and two types of focus (information focus and contrastive focus). All four depend crucially on discourse context. Although topic and focus are sometimes viewed as complementary relations, they belong to distinct dimensions of information structure, with one (focus) having to do with the locus of new information in an utterance, and the other (topic) with the entity that the utterance is about. The following questions are specified in the article:
@@ -57,7 +57,7 @@ The utility of various techniques for documenting these relations is observed, i
 2023,10,10,Maria Starodubtseva (HSE),Distinction between transitive and intransitive imperatives in Daghestanian languages,"Formation of imperative and its connection to transitivity is relatively well studied for the Nakh-Daghestanian family at the level of individual languages, but no areal view is presented so far. The aim of this research is to describe the formal distinction between transitive and intransitive imperatives on the basis of Nakh-Daghestanian languages. A significant number of languages of the family show this distinction in morphology. Transitivity may also bear on the marking of the number of addresses. The chapter contains two interactive maps based on the data collected from the grammars. The current research has been carried out within the framework of the TALD project."
 2023,10,10,Vasiliy Zerzele (HSE),Plural marking of imperatives and prohibitives in the languages of Daghestan,"My study within the TALD (Typological Atlas of the Languages of Daghestan) project is concerned with plural marking on imperatives and prohibitives among the languages of Daghestan. I will discuss the types of plural marking (found among ~55% of Daghestanian languages) and the problems with classifying these markers. Then I will discuss the same classification issues with prohibitives, as well as look at examples of languages with a transitive split. Finally, I will summarize it by discussing the geographical distribution of these features within Daghestan."
 2023,10,10,"Chiara Naccarato (HSE), George Moroz (HSE)",Non-standard numeral constructions in L2 Russian: A corpus-based study,"In this talk we present our corpus-based study of numeral constructions in the Russian speech of bilinguals from different regions of Russia. Data from the ConLab's collection of spoken corpora of L2 Russian (cf. http://lingconlab.ru/resources.html) show that non-standard encoding of numeral constructions is not infrequent, e.g., četyre brat’ja instead of Standard Russian četyre brata. Variation is attested in all corpora, but to different extents (it is higher in Daghestan as compared to corpora from other regions), and generally much lower as compared to results obtained for other varieties of bilinguals' Russian (cf. Stoynova 2021 on Nanai and Ulcha Russian). The attested variation can only partly be explained as the result of pattern borrowing from the speakers' L1s, and correlations with the type of numeral involved (higher variation with paucals and collectives) would point to incomplete or non-standard L2 learning as a more viable hypothesis. For most of the corpora, variation seems to characterize the speech of few speakers-outliers, so we cannot extrapolate our conclusions to the whole population."
-2023,10,3,"S. V. Knyazev (RLI), G. Moroz (HSE), S. Dyachenko (RLI)",Корпус ПРуД,"В докладе речь пойдет о создаваемом нами корпусе ""Просодия русских диалектов"". Мы расскажем о его устройстве, а также о тех вопросах, решению которых, на наш взгляд, могут способствовать представленные в нем данные:
+2023,10,3,"Sergey V. Knyazev (RLI), George Moroz (HSE), Svetlana Dyachenko (RLI)",Корпус ПРуД,"В докладе речь пойдет о создаваемом нами корпусе ""Просодия русских диалектов"". Мы расскажем о его устройстве, а также о тех вопросах, решению которых, на наш взгляд, могут способствовать представленные в нем данные:
 
 Какова вариативность русских диалектных просодических систем?
 
diff --git a/visualizing_scripts.R b/visualizing_scripts.R
index 05c614f..9367f07 100644
--- a/visualizing_scripts.R
+++ b/visualizing_scripts.R
@@ -3,7 +3,7 @@ library(lubridate)
 library(tidytext)
 library(ggwordcloud)
 
-df <- readxl::read_xlsx("data.xlsx")
+df <- read_csv("data.csv")
 Sys.setlocale("LC_TIME", "en_US.UTF-8")
 df |>
   mutate(date = lubridate::make_date(year = 2020, month = month, day = day)) |>
@@ -67,9 +67,15 @@ df |>
                           word == "andi" ~ "Andi",
                           TRUE ~ word),
          n = log(n)) |>
-  View()
   ggplot(aes(label = word, size = n))+
   geom_text_wordcloud(rm_outside = TRUE, grid_margin = 2, seed = 42,
                       shape = "square",
                       max_grid_size = 138) +
   theme_minimal()
+
+
+# number of seminars
+df |>
+  mutate(date = lubridate::make_date(year = 2020, month = month, day = day)) |>
+  distinct(date) |>
+  nrow()