The one and only Unicorn Primary Endpoint

The perfect primary endpoint for your study

Jan 29, 2023

Image from Wikipedia commons

You can find the Spanish version below

From the research question emanates the Primary Endpoint, and from it the whole study: how it will be done, its quality, how many patients you will need, and the interest it will arouse in experts and potential readers. It may even determine whether the study can be finished at all.

The Unicorn Primary Endpoint is one that is clinically relevant, that allows results with a small sample of patients, and that will be accepted by the scientific community as valid.

The Unicorn Primary Endpoint does not exist.

Let me tell you about my experience with primary endpoints and how out of the frustration of a weak primary endpoint came the WATERLAND clinical trial.

In 2011, after reading an open-label randomized clinical trial by Bechien Wu suggesting that patients with acute pancreatitis treated with lactated Ringer's solution had less inflammation than those receiving saline, I made a very important decision.

I decided to do my first randomized clinical trial: a replication of B. Wu's study, but with a triple-blind design (blinded for the patient, the treating physician, and the statistician analyzing the data).

In 2011 I had no experience in multicenter studies, so I could only count on my own hospital. That severely limited my ability to recruit patients.

I manage 140 patients with pancreatitis per year, but a clinical trial has many restrictions: strict inclusion and exclusion criteria, and some patients do not want to enter the trial. When it came to choosing the primary endpoint, I chose the same one as in B. Wu's study, the systemic inflammatory response syndrome (SIRS) criteria, with C-reactive protein (CRP) blood levels as a secondary endpoint.

It took me a long time (5 years) to design the study, look for funding, and solve internal problems such as changing the way of prescribing fluid therapy in the computer system of my hospital. It was complex and difficult, but we succeeded.

The study (just 40 patients) showed that lactated Ringer's solution was associated with a trend towards fewer SIRS criteria at 48h and significantly lower CRP levels at 48 and 72h.

I had a hard time publishing it because, despite being triple-blind, it was considered a confirmatory study, but in 2018 it was accepted by one of my favorite journals, United European Gastroenterology Journal.

I thought that this subject was closed. I decided to end the line of research on fluid therapy in acute pancreatitis because I thought I had reached the limit of my possibilities. There were two studies (Bechien’s and mine) showing that patients with pancreatitis treated with lactated Ringer's solution had less inflammation, enough to make it the fluid of choice in this disease.

I was wrong.

I soon became frustrated when I read the American Gastroenterological Association (AGA) clinical practice guidelines, which considered inflammation not to be a meaningful variable. They stated that “there is also no RCT evidence that any particular type of fluid therapy (eg, lactated Ringer’s) reduces the risk of mortality or persistent single or multiple organ failure”. CRP or SIRS criteria were not enough to convince the experts.

What was the point of my trial, the RINFIS study? Because of a weak endpoint, all the effort had been wasted.

I was also wrong.

For one thing, it helped ignite debate around the research question. It provided a piece of the puzzle, though not a definitive one. It also taught me a lot. First, a clinical trial can be done without the help of the pharmaceutical industry in a small hospital on the Spanish coast. Second, you must choose a primary endpoint that is relevant, or the effort will not be recognized by the scientific community. I promised myself never again to do a clinical trial based on a weak surrogate endpoint that is not directly related to clinically relevant variables. I also clearly understood that single-center clinical research does not generally change clinical practice; it only allows hypothesis generation.

The primary endpoint is the one that you consider to be the backbone of your study, the one that most adequately answers your research question, the One Ring of your study, one endpoint to rule them all.

The sample size of the study is calculated from the primary endpoint. A dichotomous endpoint (only two possibilities, yes or no, e.g., mortality) generally needs a large sample size if the proportion of patients with the event in the control arm is low. For example, mortality in acute pancreatitis is, fortunately, about 2%. For a study in which treatment lowers mortality to 1% (a 50% drop), you need almost 5,000 patients. That is a lot. A lot.
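To make the arithmetic concrete, here is a minimal sketch of the classic normal-approximation formula for comparing two proportions. The function name, the two-sided alpha of 0.05, and the 80% power are my assumptions for illustration; the post does not say which settings or software were used, and other choices (higher power, a continuity correction, allowance for dropouts) push the number up.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_two_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect p_control vs p_treatment.

    Uses the unpooled normal-approximation formula; alpha is two-sided.
    The default alpha and power are illustrative assumptions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Mortality of 2% in the control arm, lowered to 1% by the treatment.
per_arm = n_per_arm_two_proportions(0.02, 0.01)
print(f"per arm: {per_arm}, total: {2 * per_arm}")
```

Under these assumptions the total comes out in the mid four thousands, the same order of magnitude as the "almost 5,000" above. Halving an event that is already rare is what makes the denominator so small and the trial so large.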

Quantitative variables, on the other hand, allow smaller sample sizes, since each patient provides a wider range of values than a dichotomous variable, which only provides a "yes" or a "no". That is why CRP is valuable in clinical studies: it is easier to detect changes in the level of inflammation. If, in a study with 40 patients, the mean CRP is 40 mg/l in one treatment arm and 5 mg/l in the other, the difference is probably significant.
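The same kind of back-of-the-envelope calculation can be sketched for a quantitative endpoint. The standard deviation of 40 mg/l below is a purely hypothetical value chosen for illustration (the post gives no measure of spread), as are the function name, the two-sided alpha of 0.05, and the 80% power.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_two_means(mean_a, mean_b, sd, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect a difference between two means.

    Normal-approximation formula assuming the same standard deviation (sd)
    in both arms; alpha is two-sided. The defaults are illustrative assumptions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    delta = mean_a - mean_b
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Mean CRP of 40 mg/l vs 5 mg/l; the SD of 40 mg/l is a hypothetical value.
per_arm = n_per_arm_two_means(40, 5, sd=40)
print(f"per arm: {per_arm}, total: {2 * per_arm}")
```

With that assumed spread, a few dozen patients in total are enough, compared with thousands for a rare dichotomous event. A larger standard deviation or a smaller expected difference would raise the number quickly, so the result is only as good as the assumed values.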

But CRP is just a number in a person's blood test. It does not directly imply that the patient lives or dies, suffers, or is prevented from functioning normally. It is a number. It is related to inflammation, but it is not part of the pathophysiologic chain that produces suffering or mortality in the patient.

CRP is not relevant per se.

Death is relevant, as is a longer hospital stay or pain intensity. The presence of SIRS criteria is irrelevant to the patient, to your hospital director, and to the experts writing the guidelines for acute pancreatitis. Kidney failure that requires hemodialysis, however, is relevant, but it is, again, a dichotomous and infrequent variable.

The unicorn endpoint is elusive; it probably does not exist.

After the RINFIS study, I started doing multicenter studies (ATLANTIS, PAN-PROMISE). I learned to motivate my colleagues and to join forces with pancreatic nerds like me to reach clinically relevant endpoints. In WATERFALL, with 18 centers, we compared aggressive versus moderate fluid therapy in acute pancreatitis. We chose as our endpoint a clinically relevant variable: the incidence of moderate to severe pancreatitis. Patients clearly prefer to have mild pancreatitis, because severity has direct implications for their suffering, for the time they will be hospitalized, and for their chances of dying. The physician also prefers it to be mild, and so does the director of your center. Everybody likes meaningful clinical endpoints.

With all that experience it was time to go back to the beginning.

The WATERLAND study is an open-label international clinical trial. We will compare fluid therapy based on lactated Ringer's solution with saline. Its strength: the number of interested centers, more than 119, which allows us to try to address a clinically relevant endpoint, again the frequency of moderate to severe pancreatitis. A patient with mild pancreatitis has an excellent clinical evolution, with lower morbidity and zero mortality, but moderate to severe acute pancreatitis entails suffering, longer hospital stay, greater need for invasive treatments and higher mortality.

When choosing the primary endpoint for your study, simply ask yourself whether that variable is directly important to the patient, to the attending physician, and to the health care manager. If so, it is a good endpoint.

Finally, I wanted to show you the logo of our research group ERICA (international league against biliary-pancreatic diseases). We received it this week and I love it; do you like it? (Thanks Sarai Llamas!). ERICA is an international network of enthusiastic healthcare professionals eager to improve the treatment of biliary and pancreatic diseases through ambitious studies of direct benefit to the patient. If you want to participate in WATERLAND you can sign up through this link.


Versión en español:

El Endpoint Primario Unicornio

De la pregunta de investigación emana el Endpoint Primario, y de este, todo el estudio: cómo se hará, su calidad, cuántos pacientes necesitarás, el interés que despertará en los expertos y potenciales lectores e incluso puede determinar que sea imposible terminarlo.

El Endpoint Primario Unicornio es aquel que es clínicamente relevante, que permite resultados con una muestra pequeña de pacientes, y que será aceptado por la comunidad científica como válido.

El Endpoint Primario Unicornio no existe.

Déjame que te cuente mi experiencia con los endpoints primarios y cómo de la frustración de un endpoint primario débil surgió el ensayo clínico WATERLAND.

En 2011, tras leer un ensayo clínico aleatorizado abierto de Bechien Wu que sugería que los pacientes con pancreatitis aguda tratados con solución de Ringer lactato tenían menor inflamación que aquellos que recibían suero salino, tomé una decisión muy importante.

Decidí hacer mi primer ensayo clínico aleatorizado. Replicar el estudio de B. Wu pero con un diseño triple ciego (cegado para el paciente, médico que atiende al paciente y al estadístico que analiza los datos).

En 2011 no tenía experiencia en estudios multicéntricos, solo podía contar con mi hospital. Eso limitaba mucho mi capacidad de reclutar pacientes.

Yo veo 140 pancreatitis al año, pero un ensayo clínico tiene muchas restricciones, criterios de inclusión y exclusión estrictos, y algunos pacientes no desean entrar en el ensayo. A la hora de elegir el endpoint primario, me decidí por los mismos que en el estudio de B. Wu, los criterios de síndrome de respuesta inflamatoria sistémica (SRIS), y como secundario los niveles de proteína C reactiva (PCR).

Me costó mucho (5 años) diseñar el estudio, buscar financiación, arreglar problemas internos como por ejemplo cambiar la forma de pautar fluidoterapia en el sistema informático de mi hospital. Fue complejo y difícil, pero lo conseguimos.

El estudio (solo 40 pacientes) mostró que la solución de Ringer lactato se asociaba a una tendencia a menor número de criterios SRIS a las 48h y de forma significativa a menores niveles de PCR a las 48 y 72h.

Me costó publicarlo, ya que, a pesar de ser triple ciego, era considerado un estudio de confirmación, pero fue aceptado en una de mis revistas favoritas, United European Gastroenterology journal en 2018.

Pensé que este tema estaba zanjado. Decidí terminar la línea de investigación en fluidoterapia en pancreatitis aguda, porque pensaba que había llegado al límite de mis posibilidades. Había dos estudios que mostraban que los pacientes con pancreatitis tratados con solución de Ringer lactato tenían menor inflamación, era suficiente para que fuera el fluido de elección en esta enfermedad.

Me equivocaba.

Pronto sentí frustración al leer las guías de práctica clínica de la American Gastroenterological Association (AGA), que consideraron que la inflamación no era una variable relevante. Afirmaron que "no existen pruebas de ensayos clínicos aleatorizados de que un tipo concreto de fluidoterapia (p. ej., Ringer lactato) reduzca el riesgo de mortalidad o fallo orgánico único o múltiple persistente". La PCR o los criterios SRIS no eran suficientes para convencer a los expertos.

¿Para qué había servido el esfuerzo del estudio RINFIS? Por culpa de un endpoint débil todo el esfuerzo había sido inútil.

También me equivocaba.

Por un lado, sirvió para encender el debate en torno a la pregunta de investigación. Aportó una pieza, aunque no definitiva. Me sirvió para aprender, y aprendí mucho. En primer lugar, que un ensayo clínico se puede hacer sin la ayuda de la industria farmacéutica en un pequeño hospital de la costa española. En segundo lugar, que hay que elegir un endpoint primario que sea relevante, o el esfuerzo no será reconocido por la comunidad científica. Me prometí no volver a hacer un ensayo clínico basado en un endpoint subrogado débil, no directamente relacionado con variables clínicamente relevantes. También entendí claramente que la investigación clínica basada en un solo centro no cambia en general la práctica clínica; solo permite generar hipótesis.

El endpoint primario es el que consideras como eje central de tu estudio, el que responde de forma más adecuada a tu pregunta de investigación, el anillo único de tu estudio, un endpoint para gobernar a todos los demás.

Con el endpoint primario se calcula el tamaño muestral del estudio. Un endpoint dicotómico (solo dos posibilidades, sí o no, por ejemplo, mortalidad) necesita en general un gran tamaño muestral si la proporción de pacientes con el endpoint en el brazo control (número de eventos) es baja. Por ejemplo, la mortalidad en pancreatitis aguda, afortunadamente, es de aproximadamente un 2%. Para un estudio en el que el tratamiento baje la mortalidad a un 1% (una bajada del 50%) se necesitan casi 5.000 pacientes. Eso es mucho. Muchísimo.

Las variables que, por el contrario, son cuantitativas, permiten tamaños muestrales menores, ya que cada paciente aporta un mayor rango de valores que una variable dicotómica, que solo aporta un “sí” o un “no”. Por eso la PCR es valiosa en estudios clínicos, es más fácil detectar cambios en el nivel de inflamación. Si en un estudio con 40 pacientes en un brazo de tratamiento la media de PCR es de 40 mg/l y en el otro de 5 mg/l, probablemente las diferencias sean significativas.

Pero la PCR es solo un número en una analítica de un ciudadano. No implica directamente que el paciente viva o muera, sufra o esté impedido para funcionar con normalidad. Un número. Se relaciona con inflamación, pero no es parte de la cadena fisiopatológica que produce sufrimiento o mortalidad en el paciente.

La PCR no es relevante per se.

La muerte sí es relevante, o el estar más tiempo ingresado, o la intensidad del dolor. La presencia de criterios SRIS le da igual al paciente, al gerente de tu hospital o al experto que hace las guías. El que te falle el riñón y te tengan que hacer hemodiálisis sin embargo sí, pero es, de nuevo, una variable dicotómica y poco frecuente.

El Endpoint Unicornio es esquivo, probablemente no exista.

Tras el estudio RINFIS comencé a hacer estudios multicéntricos (ATLANTIS, PAN-PROMISE), aprendí a motivar a mis compañeros y unir fuerzas con frikis pancreáticos como yo para poder alcanzar endpoints que fueran clínicamente relevantes. En WATERFALL, con 18 centros, comparamos fluidoterapia agresiva frente a moderada en pancreatitis aguda. Elegimos como endpoint una variable clínicamente relevante: la incidencia de pancreatitis moderada a grave. El paciente prefiere claramente tener una pancreatitis leve, tiene implicaciones directas en su sufrimiento, en el tiempo que estará ingresado, en sus probabilidades de fallecer. El médico también prefiere que sea leve, y el gerente de tu hospital.

Con toda esa experiencia era el momento de volver al principio.

El estudio WATERLAND es un ensayo clínico abierto internacional. Su fortaleza: el número de centros interesados, más de 119, lo que nos permite intentar abordar un endpoint clínicamente relevante, de nuevo la frecuencia de pancreatitis moderada a grave. Un paciente con pancreatitis leve tiene un curso clínico excelente, con menor morbilidad y nula mortalidad, pero la pancreatitis aguda moderada a grave implica sufrimiento, aumento de tiempo de ingreso hospitalario, mayor necesidad de tratamientos invasivos y mayor mortalidad.

Cuando elijas el endpoint principal para tu estudio, simplemente pregúntate si esa variable es importante directamente para el paciente, para el médico que lo atiende y para el gestor sanitario. Si es así, es un buen endpoint.

Por último, quería mostraros el logotipo de nuestro grupo de investigación ERICA. Lo hemos recibido esta semana y me encanta; ¿os gusta? (¡gracias Sarai Llamas!) ERICA es una red internacional de profesionales sanitarios entusiastas deseosos de mejorar el tratamiento de las enfermedades biliares y pancreáticas mediante estudios ambiciosos, directamente beneficiosos para el paciente. Si quieres participar en WATERLAND puedes apuntarte mediante este link.
