Another Deep Dive Into AI Autosegmentation


How about we take another deep dive into AI autosegmentation? What do you say?

We have a new C3RO dataset to work with, featuring CT images for a GYN/female pelvis case. However, because the physician contours collected for this dataset were primarily for target volumes, not organs-at-risk (OAR), I say we do these sessions with a different focus. Let’s use this dataset anyway, but overlay the AI autosegmentation results and also build and visualize 3D consensus maps for each submitted AI structure.
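For the wonks, here is a minimal sketch of what a consensus map can look like in code. It assumes each submitted structure has already been rasterized onto the same CT voxel grid and is stored as a set of (z, y, x) voxel indices. The function names and the three vendor structures are hypothetical and only for illustration; this is not the software used for C3RO.

```python
from collections import Counter


def consensus_map(structures):
    """Return {voxel: fraction of submissions that include that voxel}."""
    if not structures:
        raise ValueError("need at least one structure")
    counts = Counter()
    for voxels in structures:
        counts.update(set(voxels))  # each submission votes at most once per voxel
    n = len(structures)
    return {voxel: count / n for voxel, count in counts.items()}


def isoagreement_volume(agreement, level):
    """Voxels included by at least `level` (a fraction from 0 to 1) of the submissions."""
    return {voxel for voxel, fraction in agreement.items() if fraction >= level}


# Three hypothetical AI outputs for one structure, as (z, y, x) voxel indices.
vendor_a = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
vendor_b = {(0, 0, 0), (0, 0, 1)}
vendor_c = {(0, 0, 0), (0, 1, 1)}

agreement = consensus_map([vendor_a, vendor_b, vendor_c])
majority = isoagreement_volume(agreement, 0.5)   # voxels chosen by at least half
unanimous = isoagreement_volume(agreement, 1.0)  # voxels chosen by everyone
```

Thresholding the map at different agreement levels gives nested volumes, which is what makes the variation from one model to another easy to see in 3D.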

After all, it is very useful (and quite interesting) to understand how much variation we see from one AI model to another. Understanding variation is key to optimizing quality.

As before, for any vendor that “opts in” and submits their AI-generated anatomical structures, I’ll arrange a web-based interview and publish it as a podcast where we will discuss, well, whatever comes up!

Email me if you want to submit data and participate in an interview. (Note: You can opt to do the former without the latter, if you wish, but the interviews are a lot of fun.)...

Consensus Contours and AI Autosegmentation: Video Podcasts!


I’ve released a series of podcasts covering consensus contouring studies and interviews with AI vendors discussing AI validation and comparing “man vs. machine.”

Thanks in particular to Kevin Tierney (Radformation), Josh, Jon, and Carter (Limbus AI), and Mark Gooding (Mirada Medical) for their excellent and interesting interviews on their commercial AI engines.

The entire playlist can be found on YouTube by clicking this link.

Individual videos are as follows:



AI Autosegmentation Podcasts: Invitation and Info


Dear AI Autosegmentation Vendors,

I am reaching out to any/all vendors to invite you to take part in periodic, non-funded podcasts. The forum will allow you to introduce yourselves, show your wares, and share your thoughts about how the radiation therapy industry can best validate AI autosegmentation engines/models.

The conversations will center around the imaging datasets and population of radiation oncologists’ manual segmentations that are generated by the non-commercial, not-for-profit project called “C3RO” (Contouring Collaborative for Consensus in Radiation Oncology).

Please see the details below. Email me (Ben Nelms) directly if you are interested and would like to get on the schedule.


Ben Nelms


An unbiased podcast/interview series focusing on AI autosegmentation of human anatomy, specifically to use as an input to radiation treatment planning.


Goal 1. Elevate the conversation about how best to validate AI outputs, both in the short and long term.

Goal 2. Get to know AI vendors in a casual, scientific, and “non-salesy” forum.

Goal 3. Show cool, real-time results from (1) populations of human experts and (2) AI engines.

Goal 4. Generate some pretty great ideas about how to build gold standards of human anatomy segmentation to use for both education of clinicians as well as validation of AI software.


Host. Ben Nelms, Ph.D. (Canis Lupus LLC)

Guests. Representative(s) from any/all willing AI vendors and research groups who specialize in anatomy autosegmentation

Note: Conversations (likely all of them, or at least the initial ones) will be one vendor at a time. This helps ensure equal airtime and no distractions.


The conversation flow will be casual, with an underlying structure to cover some, if not all, of the following topics.

Intro. Your background, and what brought you to this field? (Optional) How does your group currently do validation of your AI outputs? Is it quantitative, qualitative, or both? (Required)

Data / Results. Generate contours – ideally in real time right before or in the early minutes of the meeting – for the imageset in question. Compare your automated outputs to (1) the population Isoagreement clouds and (2) various expert contours (if available), per structure....
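One way to picture the comparison against the population clouds is sketched below. It assumes the agreement map is a dictionary from voxel index to the fraction of observers who included that voxel, and it uses the Dice similarity coefficient as the overlap score. Dice is a common choice for this kind of comparison, but it is my assumption here rather than the metric prescribed for the podcasts, and all names and numbers are hypothetical.

```python
def dice(a, b):
    """Dice similarity coefficient between two voxel sets (1.0 means identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def agreement_profile(ai_voxels, agreement, levels=(0.25, 0.5, 0.75, 1.0)):
    """Score one AI structure against the population cloud at each agreement level."""
    profile = {}
    for level in levels:
        cloud = {voxel for voxel, fraction in agreement.items() if fraction >= level}
        profile[level] = dice(ai_voxels, cloud)
    return profile


# Hypothetical population agreement map and one hypothetical AI structure.
agreement = {(0, 0, 0): 1.0, (0, 0, 1): 0.75, (0, 1, 0): 0.5, (0, 1, 1): 0.25}
ai_structure = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}

profile = agreement_profile(ai_structure, agreement)
best_level = max(profile, key=profile.get)  # agreement level the AI output matches best
```

In this toy case the AI structure matches the 50% cloud exactly, so it behaves like a "majority vote" observer rather than a conservative or generous one.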

Come Wonk with Me: Digging into C3RO Data


The Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is picking up steam. We recently reached a checkpoint after our first ~quarter year – sessions on three different body sites – and asked ourselves, how is it going?

Are our goals clear? Will achieving our goals be impactful in a tangible way? Are our methods sound? Are we providing value to the radiation oncologists participating in the program as well as the industry as a whole? Are we squeezing the goodness of knowledge out of this mysterious fruit? And are we having any fun?

Then we asked our participants a bunch of questions, too, as an electronic survey. Two of the main messages that came out of the survey were these:

[ 1 ]  People really hunger for detailed “How I created my anatomical contours and why” explanations by invited expert panelists. Rather than try to go fast and cover lots of material and many regions of interest, slow down and talk about them, and debate them, in greater detail.

[ 2 ]  People also are interested in the population statistics, the performance of the “wisdom of the crowd,” and how it relates to potentially deriving or vetting gold standards based on consensus calculations.

Well, you spoke, and we listened! So, we are going to start doing two parallel and complementary tracks in terms of podcasts.

The first track will be hosted by radiation oncologists with radiation oncologists as panelists. The main focus here will be the “how” and “why” of experts’ contours, and hopefully some healthy discussion and negotiation of observed differences.

The second track will be led by yours truly, and we’re going to get unabashedly wonky and nerdy about it. I’m going to try to get interested AI vendors, researchers, and other people who deep-think about these things as my panelists. We’ll talk a lot about statistical variation, what we can learn from it, the challenges it poses, and how to potentially tease out great wisdom from the crowd and get to one of the holy grails of modern radiation therapy: building standard datasets against which AI autosegmentation can be measured and potentially validated....

Opportunity for AI Autosegmentation Vendors


Over the past four years, I have had the pleasure to get to know and collaborate with TROG, an organization based in Australia and New Zealand that conducts clinical trial research involving radiotherapy. TROG have used ProKnow software to run plan studies for SRS brain treatment as well as SBRT spine, pancreas, and liver cases. They have also done contouring workshops across multiple body sites.

For the 2022 TROG meeting –  and in conjunction with the annual ASMIRT meeting – TROG is taking it up a notch! For one, they are planning an exciting and important treatment plan study using an unprecedented experimental design that will focus on optimizing a plan based on lung function. Also, and as the main topic of this post, they will be doing a very interesting contouring workshop.

The 2022 contouring workshop is particularly exciting to me because it will do the following: (1) explore method(s) to build consensus across a group of expert physicians, (2) measure and visualize the variation in contouring across a large population of professional treatment planners and anatomists, (3) study population consensus vs. expert consensus, and (4) collect results and measure the accuracy of artificial intelligence (AI) based auto-segmentation engines vs. the gold standard and population on the whole.

TROG is inviting all AI/auto-segmentation vendors to participate. This is a great way to test your engines and showcase your results. Whether you are an established vendor, a startup company with works in progress, or an academic software research group, I encourage you to participate. You can contact TROG’s Radiation Therapy Manager, Alisha Moore, to get involved.

This is not my project specifically (other than helping them with design and implementation), but I think studies like this are of utmost importance so I wanted to help TROG cast a wide net and maximize involvement. There’s nothing to lose and everything to gain. After all, we cannot prove (or improve) what we do not measure!


Top 10 Lessons Learned after 10 Years of Plan Studies


A couple of weeks ago, I gave a webinar for the “Best of QADS” series put on by Sun Nuclear Corporation. The webinar was called “The Top 10 Lessons Learned after 10 Years of International Plan Studies.” The presentation (including the live Q&A session) was recorded. You can watch it by visiting the following link.

Click here to register and watch that presentation. (It’s free, of course.)...

Invariably Variable: The Art of Contouring



An object at rest tends to stay at rest…
Newton’s First Law

I’ll never forget the reactions of those two physicians, those many years ago. Or, at least I won’t let myself. Not yet at least.

Allow me a moment to retell.

It was about seven years ago and I had just given a talk at a regional meeting in the Midwest. This particular audience was made up of medical dosimetrists and radiation therapists, with a smattering of medical physicists and radiation oncologists. My topic: “Variation in Anatomical Contouring.”

One of my first slides was a clumsy cartoon I had sketched together in PowerPoint. It showed a horse (labelled “treatment planning”) pulling a train of carts, each labelled with a specific technology dependent on the preceding one. And under the horse, representing the road on which the horse and all carts depended, was written one big, bold word: CONTOURING. I found that old cartoon and I’ve reproduced it in Figure 1, below.

Figure 1. My slide (circa 2010) used to say, essentially, “We can talk about the cart and the horse all we want, but let’s not forget the condition of the road…”

My simple argument was that if you don’t get your anatomy volumes defined correctly – both for targets and critical organs – then everything else downstream suffers. Or, following the horse-and-cart metaphor, inaccurate contours make for a really bumpy ride. All the benefits of the elegant technology of radiation therapy – inverse planning and dose optimization, dose calculation, DVH and other plan metrics, image-guidance, and precision delivery – don’t even matter if your patient anatomy blueprint is wrong in the first place. The anatomy contours are the original “design input” to the personalized medicine that is radiation oncology. Get that wrong, and you’re in trouble.

For the talk, I showed some preliminary data on inter-observer anatomical contours over a range of critical organs. These were controlled experiments where all clinicians were given the same CT images, and the variation I was seeing in some organs was shocking. While there was not much variation for some organs like the brain or lung which are easily seen as clearly defined pixel regions, there was very large variation for many other organs like the parotid, sub-mandibular glands, brainstem, larynx, and even the spinal cord (!)....

Choose Your Own Adventure


Here is a talk I did for the annual AAMD meeting (Atlanta, GA) on June 12, 2016. The response I got and the line of people inspired to tell me “their stories” was a big lift.

Sometimes you just have to let it all hang out. Enjoy.


2016 AAMD Contouring Workshop


Dr. Aaron Kusano and I will be leading this year’s AAMD Contouring Workshop. This will be our third workshop of this kind in a row, and we believe this year’s agenda is the best. We have incorporated the most frequent request (namely, for more “hands on” contouring time) while keeping the instruction and testing methods that have been successful the past two years.

Below is a short conversation with both of us that provides a nice summary of what we’re doing this year and why. We hope you sign up and join the workshop! It’s a volunteer effort on our part, and our only motivation is our passion and yours.


It’s Here: ProKnow


I am proud to be collaborating with a multi-institutional team to build a system called “ProKnow.” ProKnow stands for “Profound Knowledge,” a term that many of you will recognize from Deming’s famous “System of Profound Knowledge” to improve quality and pride in workmanship across any company, team, or project.

ProKnow will allow the worldwide community to study, and ultimately improve, the standard of care in radiation oncology. We have powerful analytical modules to help you: ensure accurate anatomy contouring, quantify and study plan quality metrics, identify best practices, and ultimately correlate your methods and modalities with patient outcomes.

Here is the link to our cloud-based system: Take a look!...

2016’s AAMD Contouring Workshop to be the best yet


Pretty soon it will be June and the annual AAMD meeting in Atlanta will be one hot ticket!

We’re pretty proud of the educational agenda we’ve put together for this year’s AAMD Contouring Workshop. It includes elements and strategies that have succeeded in past workshops, combined with pretty much all the suggestions for improvement we have received from the past two years. Firstly, we’re allowing more time so that we maximize the “hands on” skills practice and interaction with each other and the experts. Secondly, the workflow of the workshop is tailor-made to optimize learning. You will be contouring critical organs (thanks to MIM for donating workstations and their impressive contouring software), then hearing the incomparable Dr. Aaron Kusano give interactive lectures on each organ, then you will receive immediate feedback on your individual contours vs. the “gold standard” contours as scored by the StructSure software (thanks to Standard Imaging), and finally you will go back to the workstations to directly compare your contours to the gold standards and ask Dr. Kusano questions in real time during your review.

That’s the gist of it. It is going to be great and will set the bar even higher on the value of the workshop.

Here is a link to the AAMD page: AAMD 2016 Workshops...

Project Icarus Alive and Well In “PlanIQ” Product


For closure on the “Project Icarus” thread, I’d like to make it clear that the project is alive and well, but is now living and breathing in the real world as the “Feasibility Analysis” (pre-planning and post-planning) tools in the PlanIQ software product now owned by Sun Nuclear Corporation.

For more information or to request a demo or research license, just contact your Sun Nuclear sales representative....

Testing the Accuracy of Your DVH Calculations


As an industry, we’ve spent decades talking about testing and improving the accuracy of 3D dose calculations. We’ve invented countless products – both hardware (e.g. dosimeters) and software – to help us along the way, and many AAPM Task Groups have published reports of various flavors and angles on this topic. None of this is too surprising because accurate dose calcs, especially in and around complex tissue shapes and varying densities, are not an easy thing.

3D dose calculation results are primarily used – in conjunction with 3D anatomy contouring – to produce dose volume histograms (DVH) for each target and organ-at-risk. The irony is that for a long time (again, decades), as an industry we’ve forgotten that assembling DVH curves itself requires calculations, and not all of these are created equal. How are anatomy volumes simulated and at what resolution? How are anatomy superior and inferior “end caps” modeled and what is assumed to happen in between axial slices? How does anatomy interplay with an orthogonal 3D dose grid, and what are the effects of that dose grid’s spatial resolution?

DVH calculations are not standard, folks. Not by a far cry.
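To make that concrete, here is a small sketch (not any TPS vendor’s algorithm, and not the method from our paper) of a cumulative DVH computed from per-voxel doses. The doses and weights are made-up numbers. The only thing that changes between the two calls is how dose-grid voxels that straddle the contour boundary are counted, which is one of the modeling choices raised above.

```python
def cumulative_dvh(doses_gy, weights, dose_levels_gy):
    """Fraction of the structure volume receiving at least each dose level."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("structure has no volume")
    return [
        sum(w for d, w in zip(doses_gy, weights) if d >= level) / total
        for level in dose_levels_gy
    ]


# Hypothetical structure covering four dose-grid voxels.
# The last two voxels are only partly inside the contour.
doses = [10.0, 20.0, 30.0, 40.0]
partial = [1.0, 1.0, 0.5, 0.25]  # weight each voxel by the fraction inside the contour
binary = [1.0, 1.0, 1.0, 0.0]    # all-or-nothing rounding of the same voxels

levels = [0.0, 25.0, 35.0]
dvh_partial = cumulative_dvh(doses, partial, levels)
dvh_binary = cumulative_dvh(doses, binary, levels)
```

Same dose grid, same contour, and yet the volume receiving at least 25 Gy comes out near 27% with fractional weighting and 33% with all-or-nothing voxels. On a small structure in a steep dose gradient, differences like this can matter.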

Together with Vladimir Feygelman, Ph.D. and other scientists at Moffitt Cancer Center (Tampa, FL), we’ve done some really important work studying DVH accuracy and recommending standardized datasets and methods to validate your TPS software. We’ve published our first work in Medical Physics (August, 2015), and you can link to it here.

I hope you can take some time to read our new publication. If you’re a physician, physicist, or dosimetrist (or, just as important, if you’re a TPS software vendor) it will be well worth your time....

AAPM Plan Challenge: SBRT Lung


The success and interest level in the “Plan Challenge” has now permeated to the AAPM, where we will be integrating an AAPM-dedicated Plan Challenge into a 2-hour special session on Plan Quality at the 2014 Spring Clinical Meeting.

In this Plan Challenge, we offer our first hypofractionated lung case. Participation is encouraged from physicists and dosimetrists alike, both domestic and international. Results will be presented for the first time on March 15, 2014 in Denver, CO, but we will repeat the results via live webinars and/or videos posted online after the AAPM spring meeting.

Here are some key links regarding the 2014 AAPM Plan Challenge:

– Link to register (which will kick off the ROR portal that will steer you through the process)
– Link to a recorded meeting going over the anatomy, objectives, and other interesting discussions on the particular Plan Quality Algorithm to be used
– For Quality Reports users, here is a link to a downloadable protocol which you can import directly into Quality Reports 1.1 or later, to allow self-scoring. Use the protocol labeled “AAPM Plan Challenge.”

Spread the word!...

New QA Publication is “Most Read” Article in Medical Physics 40(11), November 2013


Thanks to my colleagues and co-authors, our newest publication was deemed “Editor’s Pick” for the November issue of the Medical Physics journal.

The paper is called “Evaluating IMRT and VMAT dose accuracy: Practical examples of failure to detect systematic errors when applying a commonly used metric and action levels” and can be found in Medical Physics, volume 40(11) which was published in November of 2013.

You can link to the abstract and download the paper here....

Quality Reports [EMR]® 1.1 Released


Thanks to all the users and research partners who provided great input and validation of the many enhancements in this new version. Quality Reports now makes it even easier to produce customized and comprehensive documentation of radiation plan quality, and further improves the automation of protocol analysis for meaningful use EMR.

The full release summary of changes and new installers are now available for download via the secure downloads portal (see the “Download” link in the banner).


New Publication: Interplay of Target Motion and Dynamic VMAT Delivery for SBRT


Together with Moffitt Cancer Center (Tampa, FL) and the team of Vladimir Feygelman, Ph.D., we continue our ongoing scientific studies with a new publication called “Experimentally studied dynamic dose interplay does not meaningfully affect target dose in VMAT SBRT lung treatments,” just published in Medical Physics 40(9). We are proud to say this paper was selected as one of the “Editor’s Picks” for this issue (which, among other things, means anybody can download the PDF of the article for free – a really nice feature offered by the Medical Physics journal).

Prior to this work, we had been building up our knowledge and toolset for analyzing 4D doses, specifically for volumetric modulated arc therapy (VMAT). We applied these to clinical SBRT plans (hypo-fractionated with 10 Gy/fraction, 5 total fractions), and at first we were surprised by our results. Then, as we set about understanding the interplay phenomenon for hypo-fractionated VMAT, it started to make more and more intuitive sense why we didn’t see interplay effects. First, the high dose per fraction ultimately allows for many cycles of breathing motion during each daily delivery, which tends to average out the interplay effect. Second, the VMAT segments tend to be less complex at any given time than dynamic IMRT segments (which we know are susceptible to interplay, at least per fraction), also working to drive down interplay. And finally, one of our VMAT optimization strategies was to pump up the dose at the PTV periphery, i.e. purposely induce dose heterogeneity, and the benefits of this for a moving target are evident in this paper....

New Publication: Bio-Models to Assess Plan Robustness / Dose QA


It has been my pleasure to work with my colleagues Wolfgang Tomé and Heming Zhen on a multi-phased study on the methods and metrics of patient-specific Dose QA.

Recently, we published our third paper on this topic called “On the use of biomathematical models in patient-specific IMRT dose QA,” which was published in Medical Physics 40(7). This paper looked at how to use biological model-based DVH reduction methods to analyze dose changes observed during per-patient dose QA, but more importantly raised the idea of using bio-models to assess the “robustness” of highly conformal plans (IMRT and VMAT). That is, wouldn’t it be useful to quantify how susceptible these plans are (or would be) to TPS or delivery errors? Moreover, wouldn’t this make sense to do as part of optimizing the plan vs. after? It’s a common sense approach to quality....
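As an illustration of what a “DVH reduction” does, the sketch below computes the generalized equivalent uniform dose (gEUD), one widely used way to collapse a DVH into a single number. This excerpt does not say which models the paper uses, so treat gEUD, the parameter value, and the dose numbers here as assumptions for illustration only.

```python
def geud(doses_gy, volumes, a):
    """Generalized equivalent uniform dose: (sum_i v_i * D_i**a) ** (1/a).

    `volumes` are relative bin volumes (normalized here to sum to 1).
    A negative `a` emphasizes cold spots (typical for targets); a large
    positive `a` emphasizes hot spots (typical for serial organs).
    Doses must be positive when `a` is negative.
    """
    if a == 0:
        raise ValueError("a must be non-zero")
    total = sum(volumes)
    if total <= 0:
        raise ValueError("structure has no volume")
    return sum((v / total) * d ** a for d, v in zip(doses_gy, volumes)) ** (1.0 / a)


# Hypothetical target DVH bins: planned dose vs. dose after a simulated delivery error.
volumes = [1.0, 1.0, 1.0]
planned = [60.0, 60.0, 60.0]
perturbed = [60.0, 57.0, 54.0]

a_target = -10  # assumed parameter, chosen only to illustrate cold-spot sensitivity
geud_planned = geud(planned, volumes, a_target)
geud_perturbed = geud(perturbed, volumes, a_target)
relative_change = (geud_perturbed - geud_planned) / geud_planned
```

Running the same reduction on the planned dose and on doses perturbed by plausible errors gives a simple robustness score: the smaller the change, the less susceptible the plan is to that class of error.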