WebVSD: Discovering Visual Stimuli on Dynamic Web Interfaces
Usability analysis on the Web traditionally assumes that each page view is a single, static stimulus. Modern Web interfaces break this assumption: menus expand, carousels rotate, and content fades in response to user interaction. When multiple users encounter different dynamic states of the same page, aligning their interaction data (eye gaze paths, mouse traces, clicks) for aggregate analysis becomes difficult.
WebVSD addresses this with a semi-automated framework for visual stimuli discovery. Video recordings of user sessions are split into stimulus shots, i.e., contiguous sequences of frames that a human observer would consider visually almost indistinguishable. These shots are then clustered across users into visual stimuli that serve as shared reference images for analysis. Both the splitting and the merging are driven by a visual change classifier trained on 53 computer vision features. Edge-based, signal-based, SIFT, and pixel-value features proved most informative, whereas optical flow and text recognition contributed little. Evaluation on 48 user sessions across 12 real-world websites shows that the discovered visual stimuli reduce the pixel volume a usability expert must inspect to under 4% of the original video recordings while correctly representing over 92% of frames on average. Case studies and a survey of external usability experts confirm that the method integrates naturally into existing analysis workflows.
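The splitting step can be illustrated with a minimal sketch. It is not the authors' implementation: the trained classifier over 53 features is replaced here by a hypothetical stand-in, `is_visual_change`, that thresholds the mean absolute pixel difference, and each frame is compared against the first frame of the current shot, which is an assumption made to avoid gradual drift. The names, the threshold value, and the flattened grayscale frame representation are all illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[int]  # flattened grayscale pixel values in 0..255 (assumed format)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def is_visual_change(a: Frame, b: Frame, threshold: float = 10.0) -> bool:
    """Hypothetical stand-in for the trained visual change classifier."""
    return mean_abs_diff(a, b) > threshold


def split_into_shots(
    frames: List[Frame],
    changed: Callable[[Frame, Frame], bool] = is_visual_change,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end inclusive, of contiguous shots.

    A new shot starts whenever the current frame is judged visually
    different from the first frame of the shot in progress.
    """
    if not frames:
        return []
    shots: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(frames)):
        if changed(frames[start], frames[i]):
            shots.append((start, i - 1))
            start = i
    shots.append((start, len(frames) - 1))
    return shots


# Four tiny 2x2 frames: two dark ones followed by two bright ones.
frames = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [200, 200, 200, 200],
    [201, 200, 200, 200],
]
shots = split_into_shots(frames)
```

In this toy input the first two frames fall into one shot and the last two into another. Any callable with the same signature can be passed as `changed`, so a learned classifier could be substituted without touching the splitting loop.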
Steering Stereotypes: Inference-Time Intervention on Social Group Descriptions
Large language models increasingly generate content at Web scale (news summaries, product descriptions, chatbot responses) that may reflect and propagate harmful stereotypes about social groups. A central question in LLM interpretability is whether these stereotypes are encoded in a structured, manipulable way.
Using the Agency-Beliefs-Communion (ABC) model from social psychology, which defines 16 stereotype dimensions via antonym pairs (e.g., powerless-powerful, insincere-sincere), the authors construct a probing dataset and apply mass-mean probing to attention head outputs in Mistral-7B-Instruct and Llama-2-7B-chat. The result: all 16 stereotype dimensions are linearly encoded across both models. This linear structure enables a lightweight inference-time intervention: adding a scaled direction vector to selected attention heads produces controlled shifts in how LLMs describe social groups, steering outputs between the positive and negative poles of each dimension without modifying model weights. The approach offers a transparent, computationally inexpensive alternative to retraining or fine-tuning for stereotype mitigation, particularly relevant for Web-scale content generation systems.
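As a rough sketch of the idea, mass-mean probing is commonly formulated as taking the difference between the mean activation of the two classes, and the intervention adds a scaled copy of that direction to an activation. The code below assumes this common formulation and works on plain lists of floats. It does not reproduce the authors' head selection, normalization, or scaling, and the function names and the `alpha` parameter are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of a non-empty collection of equal-length vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def mass_mean_direction(
    positive: Sequence[Sequence[float]],
    negative: Sequence[Sequence[float]],
) -> Vector:
    """Difference of class means: mean(positive pole) minus mean(negative pole)."""
    mu_pos = mean_vector(positive)
    mu_neg = mean_vector(negative)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(activation: Sequence[float], direction: Sequence[float], alpha: float) -> Vector:
    """Shift an activation along the direction; the sign of alpha picks the pole."""
    return [a + alpha * d for a, d in zip(activation, direction)]


# Toy 2-dimensional "attention head outputs" for one antonym pair (made-up numbers).
positive_pole = [[1.0, 2.0], [3.0, 4.0]]
negative_pole = [[0.0, 0.0], [2.0, 2.0]]

direction = mass_mean_direction(positive_pole, negative_pole)
toward_positive = steer([0.0, 0.0], direction, alpha=0.5)
toward_negative = steer([0.0, 0.0], direction, alpha=-2.0)
```

Because the shift is applied to activations at inference time, the model weights stay unchanged, which is what makes this kind of intervention inexpensive compared with fine-tuning.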
On the Meaning of the Web as an Object of Study
The third contribution takes a step back to ask a foundational question: what does it mean to study "the Web" in 2026? Drawing on Herbert Simon's theory of artificial systems, the paper distinguishes technological tools, machines, and environments, and argues that the Web has completed a transition from a focused technological object to a universal digital environment. This very success has fragmented its academic community. The Web Conference, once focused on core Web technologies, now receives thousands of submissions from adjacent fields like AI and NLP, many of which neither require nor meaningfully engage with core Web concepts.
The paper identifies four interconnected pressures driving this situation: an academic tragedy of the commons (prestige attracts off-topic submissions, eroding disciplinary identity), the environmental nature of the Web itself (as a universal space, virtually anything can be framed as Web-relevant), a crisis of data access (platform restrictions on social media datasets push researchers toward adjacent topics), and the disruptive force of AI (LLMs and autonomous agents make nearly every topic seem tangentially Web-related). The authors conclude that a fundamental community discussion is needed to define what it means to study the Web now that it has become the universal infrastructure for global digital activity.
References
[1] Raphael Menges, Steffen Staab, Christoph Schaefer, Tina Walber, and Chandan Kumar. 2025. What Did My Users Experience? Discovering Visual Stimuli on Graphical User Interfaces of the Web. ACM Transactions on the Web 19, 2, Article 11. https://doi.org/10.1145/3715881 (Journal publication presented in the journal track of WebSci 2026)
[2] Farane Jalali Farahani and Steffen Staab. 2026. Steering Stereotypes: Inference-Time Intervention on Social Group Descriptions Generated by Large Language Models. In 18th ACM Web Science Conference (WebSci Companion’26), May 26–29, 2026, Braunschweig, Germany. ACM, New York, NY, USA. https://doi.org/10.1145/3795513.3807434 (https://drive.google.com/file/d/1wsolN8-8B3tA314AgICBnpFmlG15FMAL/view?usp=sharing)
[3] Claudio Gutierrez and Daniel Hernández. 2026. On the Meaning of the Web as an Object of Study. In 18th ACM Web Science Conference (WebSci Companion’26), May 26–29, 2026, Braunschweig, Germany. ACM, New York, NY, USA. https://doi.org/10.1145/3795513.3807425 (https://arxiv.org/abs/2604.12756)