Qualitative analysis of social media trace data concerning online peer support for adolescent sexting

Authors: Heidi Hartikainen, Afsaneh Razi, Pamela Wisniewski

Please cite as: Hartikainen, H., Razi, A. & Wisniewski, P. (2022): Qualitative analysis of social media trace data concerning online peer support for adolescent sexting. In S. Kotilainen (Ed.), Methods in practice: Studying children and youth online (chapter 6). Retrieved DD Month YYYY, from, doi:

Video: Talking abstract - Qualitative analysis of social media trace data concerning online peer support for adolescent sexting

Adolescents use the internet to seek support concerning sex because it is accessible, interactive, and allows for anonymity. In this chapter, we discuss qualitative analysis of a social media dataset from a peer support platform catering to adolescents and young adults. The licensed dataset included over 5 million posts, 15 million comments, and metadata. The dates of posts and comments ranged from 2011 to 2017. We identify challenges concerning 1) sensitive, potentially triggering data, 2) scoping the dataset for analysis, and 3) working with a geographically dispersed team analyzing posts and comment threads regarding adolescent sexting.

Our research protocol was evaluated by the university’s Institutional Review Board, which determined the dataset exempt from human subjects review because personally identifiable information had been removed. Because the dataset included sensitive topics like sex and self-harm, the research team still completed training for working with human subjects. When reporting findings (see Hartikainen et al., 2021a, 2021b; Razi et al., 2020), we anonymized and paraphrased quotes to ensure they are untraceable. Data was disguised by removing quotes and pseudonyms and by introducing fictitious details that did not change the context (Bruckman, 2002). As the data was potentially disturbing, the research team was encouraged to take breaks and discuss any concerns.

Preparing the dataset for qualitative analysis was challenging due to its large size and unstructured nature. To scope the data, we ran a query to identify posts a) by adolescents aged 13-17 and b) containing online and sexual terms. We started from teen social media and sexual jargon (Bissel, 2021) and iteratively added terms while reading through posts. This allowed us to downsize the set to 0.2% of the posts in the dataset.
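The scoping query described above can be illustrated with a minimal Python sketch. It is not the query we ran: the `Post` fields, the two short term lists, and the rule that a post must contain at least one term from each list are all assumptions made for illustration. In practice the term lists would be much longer and would grow as posts are read.

```python
import re
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record layout; the real dataset schema is not shown here.
    post_id: int
    author_age: int
    text: str


# Illustrative seed terms only (assumed, not the study's actual lists).
ONLINE_TERMS = {"snapchat", "dm", "online", "texted"}
SEXUAL_TERMS = {"sext", "sexting", "nude", "nudes"}


def contains_any(text, terms):
    """Return True if any whole-word token of `text` is in `terms`."""
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    return not tokens.isdisjoint(terms)


def scope_posts(posts, min_age=13, max_age=17):
    """Keep posts by authors in the age range that mention at least one
    online term and at least one sexual term (an assumed reading of the
    scoping criteria)."""
    return [
        p for p in posts
        if min_age <= p.author_age <= max_age
        and contains_any(p.text, ONLINE_TERMS)
        and contains_any(p.text, SEXUAL_TERMS)
    ]


sample = [
    Post(1, 15, "He asked me for nudes on Snapchat"),
    Post(2, 19, "He asked me for nudes on Snapchat"),
    Post(3, 14, "My online friend is really nice"),
    Post(4, 16, "I read an article about sexting"),
]
scoped_ids = [p.post_id for p in scope_posts(sample)]
print(scoped_ids)
```

Matching on whole-word tokens rather than substrings keeps short terms such as "dm" from matching inside unrelated words, at the cost of missing misspellings and slang variants. This is one reason the keyword list needs to be revised iteratively.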

We coded posts for relevancy in pairs. A post was relevant if 1) it sought support 2) involving an online sexual experience. We divided support seeking into direct support seeking (asking for help) and indirect support seeking (hinting at problems) (Barbee & Cunningham, 1995). Later we scoped the dataset further to posts where the sexual experience involved 3) someone the poster knew (Hartikainen et al., 2021a, 2021b), as we found adolescents have more difficulty rejecting sexual solicitations from known others (Razi et al., 2020).

When conducting qualitative analysis, we coded data in three phases: 1) posts, with codes emerging from the data, 2) peer comments, with a codebook based on a classification of social support (Cutrona & Suhr, 1992), and 3) poster replies, with codes emerging from the data. Afterwards we used axial coding (Glen, 2014) to merge similar codes, group codes by theme, and identify patterns. We calculated inter-rater reliability to check that the quality of annotation was acceptable for all codes (Glen, 2014), and prepared a narrative synthesis illustrating results.
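As an illustration of the inter-rater reliability check, the sketch below computes Cohen's kappa for two coders who each assigned one label per item. It assumes kappa was the statistic used, as the cited reference (Glen, 2014) suggests. The function name and the example labels are hypothetical and do not come from our data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each labeled the same items once."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty item list.")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # and this sketch reports full agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical relevancy codes (1 = relevant, 0 = not relevant).
coder_1 = [1, 1, 0, 0]
coder_2 = [1, 0, 0, 0]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))  # 0.5
```

Here the coders agree on three of four items (0.75), chance agreement is 0.5, and kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5. Computing the value per code, rather than once for the whole codebook, shows which code definitions need further discussion.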

We observed a decrease in inter-rater reliability after Covid-19 restrictions. We had previously worked in the same space and discussed any issues as they came up, and while we continued to chat online, fewer discussions emerged organically. This led to less uniform coding. Problems might have been avoided by virtual working meetings, where coders discuss in real time. Another challenge was that digital trace data is not structured to answer research questions the same way as, for example, interview data. This makes it challenging to synthesize. Codebooks with clear definitions and example cases, perhaps based on established frameworks, help prevent disagreement, as does having coders complete parts of the annotation together.

Lessons learned

  • Researchers should complete training for working with human subjects and submit the research protocol to the institution’s IRB for evaluation.

  • Automatic approaches like query searches with relevant keywords help scope the dataset to a feasible size for qualitative analysis.

  • To make sure the process is valid and robust, code for relevancy and use an iterative process for selecting keywords to search.

  • Social media trace data is unstructured, and coding it is labor-intensive and time-consuming. A clear codebook and joint discussion ease the process.

  • When coding, take care of the mental health of the research team by encouraging members to take breaks and voice concerns.

  • If coding in geographically dispersed teams, arranging online coding sessions and discussing issues as they arise helps maintain inter-rater reliability (IRR).

  • When reporting findings, disguise and anonymize quotations so they are not traceable. 

In the end, while challenging, we found analyzing digital trace data especially valuable for topics adolescents might not be comfortable talking about, as it provides a researcher-independent glimpse into the topic. Because datasets like this contain historical data, they provide more than a snapshot in time and could be used to study, for example, self-presentation over time and during life transitions.


Dr. Wisniewski’s research on adolescent online safety is partially supported by the U.S. National Science Foundation grants IIP-1827700 and IIS-1844881 and by the William T. Grant Foundation grant #187941. Opinions, findings, conclusions or recommendations expressed are those of the authors and do not necessarily reflect the views of research sponsors. 

Download the full handbook here: PDF.

  1. Barbee, A. P., & Cunningham, M. R. (1995). An experimental approach to social support communications: Interactive coping in close relationships. Annals of the International Communication Association, 18(1), 381–413.

  2. Bissel, J. (2021). 2021 Teen Slang Meanings Every Parent Should Know. Bark.

  3. Bruckman, A. (2002). Studying the amateur artist: A perspective on disguising data collected in human subjects research on the Internet. Ethics and Information Technology, 4(3), 217–231.

  4. Cutrona, C. E., & Suhr, J. A. (1992). Controllability of Stressful Events and Satisfaction With Spouse Support Behaviors. Communication Research, 19(2), 154–174.

  5. Glen, S. (2014). Cohen’s Kappa Statistic. Statistics How To.

  6. Hartikainen, H., Razi, A., & Wisniewski, P. (2021a). Safe Sexting: The Advice and Support Adolescents Receive from Peers regarding Online Sexual Risks. Proc. CSCW ’21, 5(CSCW1), 42:1-42:31.

  7. Hartikainen, H., Razi, A., & Wisniewski, P. (2021b). ‘If You Care About Me, You’ll Send Me a Pic’ - Examining the Role of Peer Pressure in Adolescent Sexting. CSCW ’21 Companion, 6 pages.

  8. Razi, A., Badillo-Urquiola, K., & Wisniewski, P. (2020, April 25). Let’s Talk about Sext: How Adolescents Seek Support and Advice about Their Online Sexual Experiences. Proc. CHI ’20. 1-13.
