Building a student profile based on social network data
The goal of this paper is to define the specific features of GSOM SPbU undergraduate students based on an analysis of data from the students' accounts in the social network Vkontakte, in order to provide the School's administration with insights based on the results. The main objectives of the study are data collection, k-means clustering of students according to their interests, identification of the specific features of high-performing students, and identification of features of students' interests and communicative patterns in the context of their concentration, academic performance and year of study. The paper also gives a detailed review of existing approaches to social network data analysis and student profiling. The methodological basis of the study consists of user profiling and data mining methods such as descriptive statistics, statistical analysis, tf-idf and k-means clustering. The main findings are the differences in communicative behavior between HR and marketing students, who show entirely different levels of extraversion, and a list of specific differences between the interests of first-year and final-year students, which supports the hypothesis that studying at GSOM influences students' interests. Key features of high-performing students were also identified, such as an interest in professional and academic topics and a difference in the number of friends and subscribers in the social network. Based on the results, managerial recommendations are given, such as introducing additional courses, engaging with students through short videos, and running user-generated content contests to promote the School. Prospects for further research are also outlined.
Table of contents
Introduction
CHAPTER 1. THEORETICAL BACKGROUND OF STUDENT PROFILING
1.1. Prerequisites for student profiling in terms of trends in modern education
1.2. Student profiling and social network research approaches
1.3. Data mining techniques in social networks analysis
1.4. Summary of Chapter 1
CHAPTER 2. APPLYING CHOSEN METHODS FOR STUDENT PROFILING IN ONLINE SOCIAL NETWORKS
2.1. Data collection and first steps of data processing
2.2. Hypotheses testing and results
2.3. Summary of Chapter 2
Conclusion
Main findings
Managerial implications of the results and proposals
Prospects for future research
List of references
Appendix. Code.
Nowadays the vast majority of the world's population uses information and communication technologies in day-to-day activities. People leave digital footprints, which are then collected and analyzed. Companies spend enormous sums on digital marketing in order to offer the best possible deals to their audience. In higher education, universities are on the business side, whereas enrollees and students are on the customer side. It is no secret that universities aim to attract not only the most gifted students, but also those who would be a perfect fit for their programs. This requires an understanding of which marketing strategies to use, which, in turn, raises questions about the unique characteristics of the audience. Since young people (the so-called Generation Z), including prospective and current university students, are even more likely to engage in online social networks, the data that can be extracted from their profile pages could be of great help in this task.
The Graduate School of Management of St. Petersburg State University (GSOM SPbU), the only Russian business school that is in the top 95 of the best European schools in the Financial Times ranking and that holds the prestigious international accreditations AMBA and EQUIS, approached us with a problem to solve. Over 700 students are enrolled in the undergraduate programs of GSOM. Unsurprisingly, most of them are registered in various online social networks, including the Russian social network Vkontakte, which the School administration uses as one of its channels for feedback and urgent announcements. According to GSOM's hypothesis, an analysis of the data from students' profiles can become a source of valuable insights about their interests and preferences. Moreover, some dependencies could be traced by combining this data with information about students' academic progress. This may help to draw noteworthy conclusions about which categories of students are better prepared to study at GSOM and how to attract them.
At the meeting with the company representative (Vitaly V. Mishuchkov, Director of the Bachelor and Master Programmes Office), the key questions to be answered were discussed:
1. What are the typical interests of current GSOM bachelor students?
2. What is their typical behavior in online social networks (how many friends and subscribers they have, what their interests are, and which communities they follow)?
3. Do they tend to take part in GSOM-related activities in online social networks (communities, conversations, etc.)?
4. Are there any features that distinguish students with better academic progress from those with lower results?
The emphasis on undergraduate students is due to their very nature: most students enter university directly from school, and not all of them are confident in their choice. Master's students, in contrast, are older and have a clearer idea of what they want. Therefore, it is more important for GSOM to understand the peculiarities of bachelor students. Moreover, the sample of the latter is 3-4 times larger than that of graduate students (whose enrollment is smaller and whose studies last two years instead of four), which makes the results less biased and more reliable.
The project goal, therefore, was set as follows: to define the specific features of GSOM undergraduate students based on an analysis of data from their profiles in the online social network Vkontakte, in order to provide the School's administration with insights based on these features.
To reach this goal, we set the following project objectives:
1. Develop a project plan.
2. Receive the information on the academic progress of students from the Client company coordinator (CC).
3. Obtain students’ profiles data from the online social network Vkontakte.
4. Build an image of an undergraduate student based on available data.
5. Classify bachelor students based on the data from Vkontakte.
6. Investigate students’ specific features based on different dimensions.
7. Develop proposals to GSOM administration based on analysis of these features.
The relevance of this project rests on two pillars. The first is the impact of the rapid development of information technologies on the transparency and accessibility of data that people post on the Internet. The second is the no less rapid change in approaches to modern education, including the concept of Education as a Service, and the drive of educational organizations not to fall behind in this race.
The object of the study is an image of a student, based on information from online social networks, which allows classifying students according to the collected data. The subject of the study is the data of undergraduate students of the Graduate School of Management at St. Petersburg State University, which they post on online social networks, as well as information about their academic performance.
The theoretical significance of the thesis lies in answering the question: is it possible to build a picture of current students by profiling them in online social networks? The practical significance of the project lies in the opportunity for any educational organization in Russia, and even abroad, to implement the model (with appropriate adjustments).
The thesis consists of a title page, a statement on the independent nature of the work, an annotation, a table of contents, an introduction, the main part, a conclusion, a reference list and appendices. The study itself consisted of two major parts, which are reflected in the organization of the chapters. The first part was a theoretical overview of the related topics: modern trends in education, approaches to social network analysis and student profiling, and the data mining tools used for these purposes. Its results are presented in Chapter 1, which also gives a broad justification of the chosen approach and methodology. The second part of the study is the practical implementation of the methodology from the first chapter. Chapter 2 presents the two stages of this practical part. The first is primary data analysis, including data collection, exploratory data analysis and an attempt to build a k-means-based classifier on the whole set of students. The second consists of testing a set of hypotheses within certain dimensions describing students (year of study, academic progress and concentration).
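As an illustration of clustering students by interests with k-means, the following minimal sketch weights interest terms with tf-idf (one of the methods listed in the abstract) and then groups the resulting vectors. It is not the code from the Appendix. The interest lists are invented toy data, the function names `tfidf_vectors`, `sq_dist` and `kmeans` are hypothetical, the smoothed idf is only one common variant, and a deterministic farthest-point initialization is assumed in place of the usual random one so that the result is reproducible.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into L2-normalized tf-idf vectors."""
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (one common variant; an assumption, not the thesis's formula).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vec = [counts[t] / length * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain k-means with a deterministic farthest-point initialization."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from its nearest already chosen centroid.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: one list of interest keywords per student.
interests = [
    "marketing branding advertising marketing".split(),
    "advertising branding design".split(),
    "finance investing stocks finance".split(),
    "stocks investing economics".split(),
]
vocab, vectors = tfidf_vectors(interests)
labels, centroids = kmeans(vectors, k=2)
print(labels)  # the two marketing-like and the two finance-like students pair up
```

On real profile data the vocabulary would be far larger and sparser, so a library implementation with several random restarts would likely be preferable; the sketch only shows how the two steps fit together.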
As the group of researchers consisted of two people, the responsibilities were distributed as follows:
− Sergey Babushkin was responsible for the global and managerial aspects, including communication with the customer and the academic adviser and setting goals and frameworks; for the theoretical parts covering the overview of modern educational trends, positioning the research within the GSOM strategy and other related theoretical issues; in the practical part, for clustering students based on their academic progress, labeling students according to their concentrations, and formulating and testing some hypotheses regarding field of study and interests; and for elaborating the final recommendations for the GSOM administration.
− Marina Talianskaia was mainly responsible for the more specific and technical issues: the theoretical overview of existing approaches to profiling, social network analysis and data mining techniques; practical issues such as data collection from VK, NLP, exploratory data analysis, and formulating and testing some hypotheses, mainly regarding communicative patterns and the specific features of well-performing students; and formulating the final findings and the prospects for further research.