Work Package 1:

Machine Learning to optimize performance of advanced synchrotron and FEL light-sources

Annika Eichler
 

Q&A with Dr Annika Eichler, HIR3X WP lead on “Machine learning to optimize performance of advanced light sources”

Q: At the start of the project, what aspects of machine learning for the operation were most in need of improvement?

A: When I entered the project I was still very new to the world of particle accelerators. I think at the time, it helped to get an outsider’s perspective on ways to improve things. During the project, I saw that setting up many of the accelerator components involves a lot of manual tuning, even though there are already many optimizers in the control room. We rely on the operators’ experience to know which knobs to turn and when, to optimize the performance, and to generally keep things running smoothly.

Exploiting machine learning for optimization, as the title suggests, was key to the project. As there is no single solution that solves everything, the different tasks of the WP concentrated on different aspects, such as lasers, free-electron lasers, and photon diagnostics.

Still, even with machine learning it is impossible to address all aspects. We started with optimization for specific problems. The final goal would then be to develop a hierarchical strategy that ties these developments together. This can support the operators in tuning the accelerators more accurately and efficiently.

This means that researchers can get more beam time for their experiments.

In addition to developing optimization strategies, we also investigated virtual diagnostics. Some measurements take a lot of time and need multiple shots, for example. Others are destructive, such as a screen that is inserted into the beam pipe and provides diagnostics about the beam but stops it. Having better (virtual) diagnostics in turn helps to optimize the accelerator and the beam properties.
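As a rough illustration of the virtual-diagnostics idea: a surrogate model is trained to predict a destructive measurement from signals that are available non-destructively on every shot. The sketch below uses synthetic stand-in data and a generic regressor; it is not the project's actual model.

```python
# Illustrative only: a virtual diagnostic as a supervised surrogate model.
# Synthetic stand-in data; in practice the inputs would be non-destructive
# readings (e.g. BPMs, RF settings) and the target a screen measurement.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                             # hypothetical machine readings
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=2000)   # hypothetical beam size

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# Once trained, the model predicts the "destructive" measurement without
# inserting the screen, so it can run continuously during operation.
print("R^2 on held-out shots:", model.score(X_test, y_test))
```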

Another aspect of your work in the project was using Large Language Models to tune particle accelerators. What potential do you think LLMs have for particle accelerator tuning in the future?

This was more of a fun side project; we really just wanted to know if it could work! In our research, the LLM tried to tune an accelerator subsystem using only natural-language instructions from the operator, and we compared its performance to modern optimization algorithms that we had developed as well.
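To give a flavour of this setup, here is a schematic sketch of such an LLM-in-the-loop tuning cycle. The `query_llm` stub, the prompt format, and the quadrupole names are hypothetical placeholders, not the actual system used in the project.

```python
# A schematic sketch of LLM-in-the-loop tuning, not the project's actual code.
import json
import random

def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; here it just perturbs the settings."""
    state = json.loads(prompt.split("STATE:")[1])
    proposal = {k: v + random.uniform(-0.1, 0.1) for k, v in state.items()}
    return json.dumps(proposal)

def measure_objective(settings: dict) -> float:
    """Stand-in for a beam measurement, e.g. beam size on a screen."""
    return sum((v - 0.5) ** 2 for v in settings.values())

settings = {"q1_k1": 0.0, "q2_k1": 0.0}  # hypothetical quadrupole strengths
best = (measure_objective(settings), settings)
for step in range(20):
    # The operator's goal is expressed in natural language; the current
    # machine state and objective are appended for the model to react to.
    prompt = ("Tune the quadrupoles to minimise the beam size. "
              f"Objective: {best[0]:.4f}. STATE:" + json.dumps(settings))
    settings = json.loads(query_llm(prompt))
    loss = measure_objective(settings)
    if loss < best[0]:
        best = (loss, settings)
print("best objective:", best[0])
```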

We were surprised and impressed that it actually worked, even if its results were far from as good as those of more advanced techniques. I think it should not be used as a numerical optimizer, as this is not what these models are made for; there are much stronger optimizers that can be applied and that deliver much better results.

I think it has potential as an aid for the operators of particle accelerators. For example, LLMs can help in case of problems and failures, summarize the knowledge and experience that exists in the form of logbooks, and also support optimization by proposing what to tackle when, and maybe whom to ask.

What results from your work in HIR3X would you like to explore in future research?

There is no solution that fits everything, so I think it will also be valuable for the community to understand which solutions fit which problems. We investigated, for example, “reinforcement learning” and “Bayesian optimization.” If you only want to tune something very rarely, then you should not use reinforcement learning, because developing it for the specific problem takes a lot of effort; you are better off using Bayesian optimization. But if you face the same problem over and over again, then training a machine learning algorithm with reinforcement learning might do the job better.
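To illustrate why Bayesian optimization suits rarely-repeated tuning tasks: it needs no training phase and builds its model from the few measurements taken during tuning itself. Below is a minimal sketch using a Gaussian-process surrogate with a lower-confidence-bound acquisition; the one-dimensional toy objective stands in for an expensive beam measurement and is not from the project.

```python
# Minimal Bayesian-optimization sketch for a rarely-repeated tuning task.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):                        # toy stand-in: minimum near x = 0.7
    return (x - 0.7) ** 2 + 0.01 * np.random.randn()

X = list(np.random.uniform(0, 1, 3))     # a few random initial measurements
y = [objective(x) for x in X]
candidates = np.linspace(0, 1, 200)

for _ in range(15):
    # Refit the surrogate to all measurements taken so far.
    gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4, normalize_y=True)
    gp.fit(np.array(X).reshape(-1, 1), y)
    mu, sigma = gp.predict(candidates.reshape(-1, 1), return_std=True)
    # Lower confidence bound: trade off exploitation (mu) and exploration (sigma).
    x_next = candidates[np.argmin(mu - 1.0 * sigma)]
    X.append(x_next)
    y.append(objective(x_next))

print("best setting:", X[int(np.argmin(y))], "objective:", min(y))
```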

I think one important lesson from this project is that getting enough machine time and data is difficult for very data-hungry machine learning methods like reinforcement learning. Therefore we developed a user-friendly simulation tool called Cheetah for ultrafast simulations, to be able to collect large datasets quickly. This simulation tool reduces computation times by orders of magnitude compared to conventional physics simulations, at the cost of some accuracy. But here too, there is no single simulation tool for everything: Cheetah is not meant for physics discovery, but for optimization and control.
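Cheetah is open source; a minimal usage sketch might look like the following, where the beamline, element parameters, and beam size are purely illustrative and the exact call signatures may differ between Cheetah versions.

```python
import torch
from cheetah import Drift, ParticleBeam, Quadrupole, Segment

# A hypothetical two-element beamline; lengths and strengths are arbitrary.
segment = Segment(elements=[
    Quadrupole(length=torch.tensor(0.1), k1=torch.tensor(4.0)),
    Drift(length=torch.tensor(1.0)),
])

# A Gaussian particle beam with an assumed 100 um horizontal size.
incoming = ParticleBeam.from_parameters(
    num_particles=10_000, sigma_x=torch.tensor(100e-6)
)

outgoing = segment.track(incoming)  # one fast differentiable tracking call
print(outgoing.sigma_x)
```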


Q&A with Gesa Goetzke

Q: What is your background, and how did you get into the HIR3X project?

A: I studied physics with a focus on machine learning. This is always an interesting question for me, because although I am completing a physics PhD, I work in a field between physics and computer science.

During my physics bachelor’s, I had a student assistant job at the Fraunhofer Institute, where I started to work on some machine learning projects. For my master’s thesis, I wanted to apply machine learning to a specific detector. But since I was the first one in that working group working on machine learning, I was missing a lot of practical knowledge.

I then went to a machine learning school where I met my now-supervisor, who suggested I do a PhD so that my work would be embedded in and better connected to their machine learning groups.

I went to DESY because I wanted to go to a larger institute to make better connections, and three years ago I applied for a position researching machine learning at accelerators, which turned out to be the HIR3X project.

Current Work in Machine Learning:

I'm working in the photon diagnostics group, where I help with pulse length analysis. I try to improve the pulse length diagnostics using machine learning for the different accelerators in the project, including LCLS at SLAC and our own soft X-ray free-electron laser FLASH here at DESY.

I could try to explain the details of the two projects, but basically both of them are based on finding patterns in their data sets to help us with our analysis. And this is really something that is ideal for machine learning. In my case, I work with unsupervised machine learning, so I do not need simulations or labels to help with the analysis. This is really beneficial for accelerators, where it's sometimes hard to produce simulations that are very accurate.

Why so?

There's this term called the “simulation gap” - when you work with simulated data, your machine learning algorithm may get nice results, but it may struggle when dealing with real-world data. If you also train on real data, then you do not have this gap between simulation and reality.
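As a generic illustration of this label-free approach (not the actual analysis pipeline), an unsupervised method can discover pulse-shape populations directly in measured traces, with no simulations or labels involved:

```python
# Generic illustration of label-free pattern finding in measured traces.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 200)
# Synthetic stand-in data: two pulse-shape populations, unknown to the model.
short_pulses = np.exp(-(t / 0.1) ** 2) + 0.05 * rng.normal(size=(300, 200))
long_pulses = np.exp(-(t / 0.4) ** 2) + 0.05 * rng.normal(size=(300, 200))
traces = np.vstack([short_pulses, long_pulses])

embedded = PCA(n_components=5).fit_transform(traces)   # compress each trace
labels = KMeans(n_clusters=2, n_init=10).fit_predict(embedded)
print("cluster sizes:", np.bincount(labels))           # recovered populations
```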

What challenges come with this?

Early in my research career, I thought that if you go to a big research centre, everything is organised with building blocks that work together. What surprised me after arriving is that there is no ‘one-button solution’ for data analysis & diagnostics. Oftentimes people rely on an expert who knows exactly how a detector works.

I believe that for visiting researchers, more aspects of data analysis should be ready-to-go and more consistent. Ideally, you could push a button to have a script generate basic information about your experiment.

That’s what I’m working on in HIR3X; I think it's possible to go further in this direction. You could have one network that works for a specific setup. That would be my dream for the future. It won't happen during my PhD, but in HIR3X we’re pushing the boundaries for specific setups and getting reliable evaluations without relying on experts.

 

As an early-career researcher, what opportunities did HIR3X offer you?

One opportunity was being able to visit multiple international research facilities. It was really great that I could see so many sites; I was in the USA at SLAC, in Trieste and Frascati in Italy, and at Paris-Saclay in France, and I could go to machine learning schools and workshops.

I was also surprised how open everything was. While at FLASH, I could just go to the accelerator to see what the users were doing at the time.

I also experienced significant freedom in my research; there was a lot of space to explore everything. They also allowed me to go to California for three months, which was partly the aim of the research project: for early-career researchers to gain international experience.

But other than that, I really appreciated the open structure of the research. You could go to one research group to learn a framework, another to learn how to use a certain tool, look into this library, and so on. We had a lot of freedom to choose our own path.

What’s next for you?

I’m aiming to finish my PhD, and then I think it's quite likely that I will try to get a postdoc position afterwards. I think it would be really nice to do a larger comparison between different methods for pulse diagnostics, because there's been a lot of discussion recently within this research area. For the specific detector that I'm working on, there are now multiple ideas on how to get the desired pulse profiles and pulse lengths, and some of them use machine learning.

For future research, it would be interesting to compare these different pulse diagnostic methods. The different researchers could run their diagnostics on one shared data set and see what comes out.

After the FLASH upgrade there will be more opportunities to compare diagnostic tools. For example, the diagnostic tool I'm working on is good at looking at longer pulse lengths, around 30 to several hundred femtoseconds. Another diagnostic is really good at looking at shorter pulse lengths, around five to 20 femtoseconds. You could run both diagnostics on a pulse of around 25 fs to compare how they work. This is what I would love to see.

It's really important for me to not just get a result from a network and consider the job done. I really want to be very confident in it. That’s why I would love to compare these emerging methods.