KIT, Information Systems II, Environmental & Sustainable Information Systems (ESIS)

An Empirical Study of LLM-Assisted Reward Specification for Multi-Agent Reinforcement Learning in Electricity Markets

Type: Master's Thesis
Date: Open
Supervisor: Julius Grams

Background

Reinforcement learning is increasingly used to simulate strategic bidding in electricity markets, whether to test market designs before deployment or to study how actors behave under new regulation. The practical bottleneck in this work is reward design. Specifying a useful reward function requires both deep market knowledge and hands-on experience with RL methodology, a rare combination of skills that currently limits who can apply Multi-Agent Reinforcement Learning (MARL) to market analysis.
Recent work shows that Large Language Models can support reward design from natural-language descriptions in several ways: as reward proxies (Kwon et al. 2023), as generators of executable reward code (Eureka, Text2Reward), or in iterative collaboration with human feedback (REvolve 2024). Existing evaluations, however, focus almost exclusively on technical performance metrics in robotics and game environments. Whether LLM support can substitute for or complement domain expertise has not been systematically investigated, and no such study exists for the electricity-market MARL context.

What You Will Do

You will work directly with ASSUME, an open-source MARL framework for agent-based electricity-market simulation, and build a reward specification system that translates natural-language inputs into executable reward functions for heterogeneous market agents. The prototype will then be evaluated in a user study comparing how experts and non-experts approach reward design tasks with and without LLM support.
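To make the idea of a reward specification system more concrete, the following is a minimal Python sketch of one possible design. It is not part of ASSUME and not a fixed requirement of the thesis. It assumes that an LLM has already turned a natural-language request into a small JSON specification, which is then validated and compiled into a reward function. The term names (`profit`, `unmet_demand`, `emissions`), the `RewardSpec` class, and the helper functions are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical per-step quantities an agent could be scored on.
# These names are assumptions for this sketch, not ASSUME identifiers.
ALLOWED_TERMS = {"profit", "unmet_demand", "emissions"}


@dataclass(frozen=True)
class RewardSpec:
    """Weighted sum of named terms, divided by a positive scale."""
    weights: Dict[str, float]
    scale: float = 1.0


def parse_spec(raw_json: str) -> RewardSpec:
    """Validate a JSON specification (e.g. produced by an LLM)."""
    data = json.loads(raw_json)
    weights = {str(k): float(v) for k, v in data["weights"].items()}
    unknown = set(weights) - ALLOWED_TERMS
    if unknown:
        raise ValueError(f"unknown reward terms: {sorted(unknown)}")
    scale = float(data.get("scale", 1.0))
    if scale <= 0:
        raise ValueError("scale must be positive")
    return RewardSpec(weights, scale)


def compile_reward(spec: RewardSpec) -> Callable[[Dict[str, float]], float]:
    """Turn a validated specification into an executable reward function."""
    def reward(step: Dict[str, float]) -> float:
        # Terms missing from the step record contribute zero.
        total = sum(w * step.get(name, 0.0) for name, w in spec.weights.items())
        return total / spec.scale
    return reward


# Hand-written stand-in for an LLM response to a request such as
# "maximise profit but penalise emissions".
llm_output = '{"weights": {"profit": 1.0, "emissions": -0.5}, "scale": 100.0}'
reward_fn = compile_reward(parse_spec(llm_output))
value = reward_fn({"profit": 250.0, "emissions": 40.0})
print(value)
```

Restricting the LLM to a declarative specification rather than free-form code is only one design option. It trades expressiveness for easier validation, and comparing it against direct code generation could itself be part of the thesis.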

Expected Contributions

Methodological: First systematic test of whether LLM support reduces the expertise requirement in MARL reward design.
Technical: A working prototype for stakeholder-specifiable reward design on ASSUME.
Empirical: Quantitative and qualitative evidence on democratization effects in human–LLM collaboration for technically complex design tasks.

Requirements

Solid Python and machine learning skills, ideally with prior exposure to reinforcement learning.
Willingness to learn the fundamentals of electricity-market design.
Interest in empirical user research with a mixed-methods design.
Initiative in recruiting study participants (with support from the chair).

Application

Please send a cover letter and transcript of records by email to julius.grams@kit.edu.