Psychometrics and Data Science with R and Python

Feb. 9, 2024

Lexicon-Based Sentiment Analysis Using R

In this post, we will uncover the power of lexicon-based sentiment analysis using R. I demonstrate how to harness the capabilities of lexicons like NRC and Bing to decipher the emotional pulse of your text data. With practical examples, you'll gain the skills to analyze sentiment scores and extract valuable insights from your textual data sets. (12 min read)

Jan. 4, 2024

Okan Bulut

Introduction to Psychometric Network Analysis

psychometrics

network

correlation

Psychometric network analysis (PNA), also known as network psychometrics, has emerged as a new framework to analyze the relationships among observed variables to better understand the underlying structure of psychological phenomena. Psychometric network models are particularly useful when dealing with complex and multidimensional data. In this post, I provide a brief introduction to psychometric network models, particularly Gaussian graphical models (GGMs), and demonstrate how to estimate them using R. (7 min read)

May 2, 2022

Sevilay Kilmen

Okan Bulut

Text Vectorization Using Python: Word2Vec

data science

text mining

text vectorization

natural language processing

python

In the first two parts of this series, we demonstrated how to convert text into a numerical representation (i.e., text vectorization) using the term-document matrix and term frequency-inverse document frequency (TF-IDF) approaches. In the last part of the series, we focus on a more advanced approach, Word2Vec, that can capture the meaning and association of words within a text. First, we will briefly explain how Word2Vec works and then demonstrate how to use Word2Vec in Python. (7 min read)

Jan. 16, 2022

Sevilay Kilmen

Okan Bulut

Text Vectorization Using Python: TF-IDF

data science

text mining

text vectorization

natural language processing

python

In the first part of this text vectorization series, we demonstrated how to transform textual data into a term-document matrix. Although this approach is fairly easy to use, it fails to consider the impact of words occurring frequently across the documents. In the second part of the series, we will focus on term frequency-inverse document frequency (TF-IDF), which can reduce the weight of common words while emphasizing unique words that are more important for each document. First, we will explain how TF-IDF can adjust the weights of the words based on their frequency in the documents and then demonstrate the use of TF-IDF in Python. (9 min read)
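
As a rough illustration of the weighting idea described in this summary, the sketch below computes the plain textbook TF-IDF (relative term frequency times the natural log of the inverse document frequency) using only the Python standard library. The three toy documents and the helper name `tf_idf` are assumptions made for this example and are not taken from the post; libraries such as scikit-learn use smoothed variants of the formula, so their values will differ.

```python
import math
from collections import Counter

# Toy corpus (hypothetical, for illustration only)
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]

tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: number of documents that contain each term
df = Counter(term for tokens in tokenized for term in set(tokens))

def tf_idf(tokens):
    """Return {term: weight} for one document using unsmoothed TF-IDF."""
    counts = Counter(tokens)
    total = len(tokens)
    return {
        term: (count / total) * math.log(n_docs / df[term])
        for term, count in counts.items()
    }

weights = [tf_idf(tokens) for tokens in tokenized]

# Print the terms of the first document, highest weight first
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:>4}  {weight:.3f}")
```

In this toy corpus, "the" occurs in every document, so its weight is zero, while "mat" occurs in only one document and receives the largest weight in the first document.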

April 8, 2021

Jinnie Shin

Okan Bulut

Text Vectorization Using Python: Term-Document Matrix

data science

text mining

natural language processing

Text vectorization is an important step in preprocessing and preparing textual data for advanced analyses of text mining and natural language processing (NLP). With text vectorization, raw text can be transformed into a numerical representation. In this three-part series, we will demonstrate different text vectorization techniques using Python. The first part focuses on the term-document matrix. (8 min read)
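
To make the idea of a numerical representation concrete, here is a minimal sketch of a term-document matrix built with the Python standard library. The two example sentences and the variable names (`vocab`, `matrix`) are assumptions for illustration and do not come from the post; the rows are terms and the columns are documents.

```python
from collections import Counter

# Toy documents (hypothetical, for illustration only)
docs = [
    "text mining turns text into numbers",
    "numbers feed machine learning models",
]

# Lowercase and split on whitespace; real pipelines usually do more cleaning
tokenized = [doc.lower().split() for doc in docs]

# Sorted vocabulary gives a stable row order
vocab = sorted({term for tokens in tokenized for term in tokens})

# Count each term once per document, then arrange counts as rows of terms
counts = [Counter(tokens) for tokens in tokenized]
matrix = [[c[term] for c in counts] for term in vocab]

for term, row in zip(vocab, matrix):
    print(f"{term:>10}  {row}")
```

Each row shows how often one term occurs in each document, so "text" has a count of 2 in the first document and 0 in the second.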

March 23, 2021

Okan Bulut

Seyma Nur Yildirim-Erbasli

Visualizing Machine Learning Models

machine learning

classification

data visualization

Data visualization plays an important role when evaluating the performance of machine learning models. In this post, we demonstrate how to use the **DALEX** package for visualizing machine learning models effectively. Visualizations with **DALEX** can facilitate the comparison of machine learning models and help researchers understand which model works better and why. (10 min read)

March 4, 2021

Okan Bulut

5 Ways to Effectively Visualize Survey Data

survey

data visualization

When presented visually, survey results become much more interesting than some numbers squeezed into a boring table. Data visualizations can help your audience view and understand key insights in the results. There are many data visualization tools to present survey results visually, including bar charts, pie charts, and line charts. In this post, I demonstrate 5 alternative ways to visualize survey results. (13 min read)

Feb. 20, 2021

Okan Bulut

Building a Computerized Adaptive Version of Psychological Scales

digital assessments

psychological scales

CAT

Computerized adaptive testing (CAT) is a sophisticated methodology to create measurement instruments that are highly accurate and efficient. In this post, I explain how to evaluate the feasibility of creating a computerized adaptive version of a psychological instrument. (12 min read)

Feb. 12, 2021

Okan Bulut

Guher Gorgun

A Polytomous Scoring Approach Based on Item Response Time

psychometrics

response time

IRT

In this post, we introduce a polytomous scoring approach based on the optimal use of item response time. This approach provides an easy and practical way to deal with not-reached items in low-stakes assessments. First, we describe how the polytomous scoring approach works and then demonstrate how to implement this approach using R. (12 min read)

Jan. 19, 2021

Okan Bulut

Hatice Cigdem Bulut

How to Shorten a Measurement Instrument Automatically (Part II)

psychometrics

machine learning

optimization

Do you think there are too many questions on your survey? Are you worried that participants may get tired of responding to the questions in the middle of the survey? In this two-part series, I demonstrate how to shorten measurement instruments such as surveys automatically in R. The second part focuses on the use of two optimization algorithms (genetic algorithm and ant colony optimization) for reducing the number of questions in surveys and similar instruments. (13 min read)

Jan. 4, 2021

Okan Bulut

How to Shorten a Measurement Instrument Automatically (Part I)

psychometrics

machine learning

test development

Reducing the number of items in an existing measurement instrument (e.g., tests, surveys, questionnaires) is almost as tedious as creating a new instrument. Going through all of the items one by one and choosing the appropriate ones based on personal judgment could be highly laborious and inefficient. In this two-part series, I will demonstrate how to shorten a measurement instrument automatically in R. The first part focuses on the use of automated test assembly and recursive feature elimination for the automatic selection of items. (9 min read)

Dec. 21, 2020

Okan Bulut

Testing for Measurement Invariance in R

psychometrics

factor analysis

measurement invariance

Researchers conduct measurement invariance analysis to ensure that the interpretations of latent construct(s) being measured with their measurement instruments (e.g., scales, surveys, and questionnaires) are valid across subgroups of a target population or multiple time points. In this post, I demonstrate how to test for measurement invariance (i.e., configural, metric, scalar, and strict invariance) of an instrument using R. (11 min read)

Dec. 14, 2020

Okan Bulut

Explanatory IRT Models in R

psychometrics

IRT

eirm

Explanatory item response modeling (EIRM) is a flexible framework that allows researchers to model both person and item characteristics as predictors together. In this post, I demonstrate how to use the `eirm` package to estimate explanatory IRT models in R. (4 min read)
