A Guide to Literature Reviews

Importance of a good literature review.

A literature review is not only a summary of key sources; it has an organizational pattern that combines both summary and synthesis, often within specific conceptual categories. A summary is a recap of the important information in a source, whereas a synthesis is a reorganization, or reshuffling, of that information in a way that informs how you plan to investigate a research problem. The analytical features of a literature review might:

  • Give a new interpretation of old material or combine new with old interpretations,
  • Trace the intellectual progression of the field, including major debates,
  • Depending on the situation, evaluate the sources and advise the reader on the most pertinent or relevant research, or
  • Usually in the conclusion of a literature review, identify where gaps exist in how a problem has been researched to date.

The purpose of a literature review is to:

  • Place each work in the context of its contribution to understanding the research problem being studied.
  • Describe the relationship of each work to the others under consideration.
  • Identify new ways to interpret prior research.
  • Reveal any gaps that exist in the literature.
  • Resolve conflicts amongst seemingly contradictory previous studies.
  • Identify areas of prior scholarship to prevent duplication of effort.
  • Point the way in fulfilling a need for additional research.
  • Locate your own research within the context of existing literature [very important].
  • Last Updated: Jul 3, 2024 3:13 PM
  • URL: https://libguides.mcmaster.ca/litreview

University of North Florida

Conducting a Literature Review

Benefits of conducting a literature review.

While there may be many reasons for conducting a literature review, the following are four key outcomes of doing the review.

Assessment of the current state of research on a topic. This is probably the most obvious value of the literature review. Once a researcher has determined an area to work with for a research project, a search of relevant information sources will help determine what is already known about the topic and how extensively the topic has already been researched.

Identification of the experts on a particular topic. One of the additional benefits derived from doing the literature review is that it will quickly reveal which researchers have written the most on a particular topic and are, therefore, probably the experts on the topic. Someone who has written twenty articles on a topic or on related topics is more than likely more knowledgeable than someone who has written a single article. This same writer will likely turn up as a reference in most of the other articles written on the same topic. From the number of articles written by the author and the number of times the writer has been cited by other authors, a researcher will be able to assume that the particular author is an expert in the area and, thus, a key resource for consultation in the current research to be undertaken.
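
The counting described above can be sketched in a few lines of Python. This is only an illustration: the records, author names, and citation counts are invented, and the `rank_authors` helper is hypothetical rather than part of any citation tool. It ranks authors first by the number of articles they have written on the topic and then by how often those articles have been cited.

```python
from collections import Counter

# Hypothetical bibliography records gathered during a search.
# Author names and citation counts are invented for illustration.
records = [
    {"authors": ["Author A", "Author B"], "cited_by": 40},
    {"authors": ["Author A"], "cited_by": 25},
    {"authors": ["Author C"], "cited_by": 5},
    {"authors": ["Author A", "Author C"], "cited_by": 10},
]

def rank_authors(records):
    """Return (author, article_count, total_citations) tuples,
    sorted by article count, then citations, then name."""
    articles = Counter()
    citations = Counter()
    for record in records:
        for author in record["authors"]:
            articles[author] += 1
            citations[author] += record["cited_by"]
    return sorted(
        ((name, articles[name], citations[name]) for name in articles),
        key=lambda row: (-row[1], -row[2], row[0]),
    )

ranking = rank_authors(records)
for name, n_articles, n_citations in ranking:
    print(f"{name}: {n_articles} articles, {n_citations} citations")
```

A ranking like this is only a starting point; a high count suggests whom to read first, not whose conclusions to accept.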

Identification of key questions about a topic that need further research. In many cases a researcher may discover new angles that need further exploration by reviewing what has already been written on a topic. For example, research may suggest that listening to music while studying might lead to better retention of ideas, but the research might not have assessed whether a particular style of music is more beneficial than another. A researcher who is interested in pursuing this topic would then do well to follow up existing studies with a new study, based on previous research, that tries to identify which styles of music are most beneficial to retention.

Determination of methodologies used in past studies of the same or similar topics. It is often useful to review the types of studies that previous researchers have launched as a means of determining what approaches might be of most benefit in further developing a topic. By the same token, a review of previously conducted studies might lend itself to researchers determining a new angle for approaching research.

Upon completion of the literature review, a researcher should have a solid foundation of knowledge in the area and a good feel for the direction any new research should take. Should any additional questions arise during the course of the research, the researcher will know which experts to consult in order to quickly clear up those questions.

  • Last Updated: Aug 16, 2024 10:00 AM
  • URL: https://libguides.unf.edu/litreview

UConn Library

Literature Review: The What, Why and How-to Guide — Introduction

What are Literature Reviews?

So, what is a literature review? "A literature review is an account of what has been published on a topic by accredited scholars and researchers. In writing the literature review, your purpose is to convey to your reader what knowledge and ideas have been established on a topic, and what their strengths and weaknesses are. As a piece of writing, the literature review must be defined by a guiding concept (e.g., your research objective, the problem or issue you are discussing, or your argumentative thesis). It is not just a descriptive list of the material available, or a set of summaries." Taylor, D. The literature review: A few tips on conducting it. University of Toronto Health Sciences Writing Centre.

Goals of Literature Reviews

What are the goals of creating a Literature Review? A literature review could be written to accomplish different aims:

  • To develop a theory or evaluate an existing theory
  • To summarize the historical or existing state of a research topic
  • To identify a problem in a field of research

Baumeister, R. F., & Leary, M. R. (1997). Writing narrative literature reviews. Review of General Psychology, 1(3), 311-320.

What kinds of writing require a Literature Review?

  • A research paper assigned in a course
  • A thesis or dissertation
  • A grant proposal
  • An article intended for publication in a journal

All these instances require you to collect what has been written about your research topic so that you can demonstrate how your own research sheds new light on the topic.

Types of Literature Reviews

What kinds of literature reviews are written?

Narrative review: The purpose of this type of review is to describe the current state of the research on a specific topic and to offer a critical analysis of the literature reviewed. Studies are grouped by research or theoretical categories, and themes and trends, strengths and weaknesses, and gaps are identified. The review ends with a conclusion section that summarizes the findings regarding the state of the research on the topic and the gaps identified and, if applicable, explains how the author's research will address those gaps and expand knowledge on the topic reviewed.

  • Example: Predictors and Outcomes of U.S. Quality Maternity Leave: A Review and Conceptual Framework: 10.1177/08948453211037398

Systematic review: "The authors of a systematic review use a specific procedure to search the research literature, select the studies to include in their review, and critically evaluate the studies they find." (p. 139). Nelson, L. K. (2013). Research in Communication Sciences and Disorders. Plural Publishing.

  • Example: The effect of leave policies on increasing fertility: a systematic review: 10.1057/s41599-022-01270-w

Meta-analysis: "Meta-analysis is a method of reviewing research findings in a quantitative fashion by transforming the data from individual studies into what is called an effect size and then pooling and analyzing this information. The basic goal in meta-analysis is to explain why different outcomes have occurred in different studies." (p. 197). Roberts, M. C., & Ilardi, S. S. (2003). Handbook of Research Methods in Clinical Psychology. Blackwell Publishing.

  • Example: Employment Instability and Fertility in Europe: A Meta-Analysis: 10.1215/00703370-9164737

Meta-synthesis: "Qualitative meta-synthesis is a type of qualitative study that uses as data the findings from other qualitative studies linked by the same or related topic." (p. 312). Zimmer, L. (2006). Qualitative meta-synthesis: A question of dialoguing with texts. Journal of Advanced Nursing, 53(3), 311-318.

  • Example: Women’s perspectives on career successes and barriers: A qualitative meta-synthesis: 10.1177/05390184221113735

Literature Reviews in the Health Sciences

  • UConn Health subject guide on systematic reviews: an explanation of the different review types used in health sciences literature, as well as tools to help you find the right review type
  • Last Updated: Sep 21, 2022 2:16 PM
  • URL: https://guides.lib.uconn.edu/literaturereview

Research Methods

Literature Review

  • What is a Literature Review?
  • What is NOT a Literature Review?
  • Purposes of a Literature Review
  • Types of Literature Reviews
  • Literature Reviews vs. Systematic Reviews
  • Systematic vs. Meta-Analysis

A literature review is a comprehensive survey of the works published in a particular field of study or line of research, usually over a specific period of time, in the form of an in-depth, critical bibliographic essay or annotated list in which attention is drawn to the most significant works.

Also, we can define a literature review as the collected body of scholarly works related to a topic:

  • Summarizes and analyzes previous research relevant to a topic
  • Includes scholarly books and articles published in academic journals
  • Can be a specific scholarly paper or a section in a research paper

The objective of a literature review is to find previously published scholarly works relevant to a specific topic, in order to:

  • Help gather ideas or information
  • Keep up to date in current trends and findings
  • Help develop new questions

A literature review is important because it:

  • Explains the background of research on a topic.
  • Demonstrates why a topic is significant to a subject area.
  • Helps focus your own research questions or problems.
  • Discovers relationships between research studies/ideas.
  • Suggests unexplored ideas or populations.
  • Identifies major themes, concepts, and researchers on a topic.
  • Tests assumptions; may help counter preconceived ideas and remove unconscious bias.
  • Identifies critical gaps, points of disagreement, or potentially flawed methodology or theoretical approaches.
  • Indicates potential directions for future research.

All content in this section is from Literature Review Research from Old Dominion University 

Keep in mind that a literature review is NOT:

Not an essay

Not an annotated bibliography in which you summarize each article that you have reviewed. A literature review goes beyond basic summarizing to focus on the critical analysis of the reviewed works and their relationship to your research question.

Not a research paper where you select resources to support one side of an issue versus another. A lit review should explain and consider all sides of an argument in order to avoid bias, and areas of agreement and disagreement should be highlighted.

A literature review serves several purposes. For example, it

  • provides thorough knowledge of previous studies; introduces seminal works.
  • helps focus one’s own research topic.
  • identifies a conceptual framework for one’s own research questions or problems; indicates potential directions for future research.
  • suggests previously unused or underused methodologies, designs, quantitative and qualitative strategies.
  • identifies gaps in previous studies; identifies flawed methodologies and/or theoretical approaches; avoids replication of mistakes.
  • helps the researcher avoid repetition of earlier research.
  • suggests unexplored populations.
  • determines whether past studies agree or disagree; identifies controversy in the literature.
  • tests assumptions; may help counter preconceived ideas and remove unconscious bias.

As Kennedy (2007) notes*, it is important to think of knowledge in a given field as consisting of three layers. First, there are the primary studies that researchers conduct and publish. Second are the reviews of those studies that summarize and offer new interpretations built from and often extending beyond the original studies. Third, there are the perceptions, conclusions, opinions, and interpretations that are shared informally and that become part of the lore of the field. In composing a literature review, it is important to note that it is often this third layer of knowledge that is cited as "true" even though it often has only a loose relationship to the primary studies and secondary literature reviews.

Given this, while literature reviews are designed to provide an overview and synthesis of pertinent sources you have explored, there are several approaches to how they can be done, depending upon the type of analysis underpinning your study. Listed below are definitions of types of literature reviews:

Argumentative Review: This form examines literature selectively in order to support or refute an argument, deeply embedded assumption, or philosophical problem already established in the literature. The purpose is to develop a body of literature that establishes a contrarian viewpoint. Given the value-laden nature of some social science research [e.g., educational reform; immigration control], argumentative approaches to analyzing the literature can be a legitimate and important form of discourse. However, note that they can also introduce problems of bias when they are used to make summary claims of the sort found in systematic reviews.

Integrative Review: Considered a form of research that reviews, critiques, and synthesizes representative literature on a topic in an integrated way such that new frameworks and perspectives on the topic are generated. The body of literature includes all studies that address related or identical hypotheses. A well-done integrative review meets the same standards as primary research in regard to clarity, rigor, and replication.

Historical Review: Few things rest in isolation from historical precedent. Historical reviews are focused on examining research throughout a period of time, often starting with the first time an issue, concept, theory, or phenomenon emerged in the literature, then tracing its evolution within the scholarship of a discipline. The purpose is to place research in a historical context to show familiarity with state-of-the-art developments and to identify the likely directions for future research.

Methodological Review: A review does not always focus on what someone said [content], but on how they said it [method of analysis]. This approach provides a framework of understanding at different levels (i.e., those of theory, substantive fields, research approaches, and data collection and analysis techniques). It enables researchers to draw on a wide variety of knowledge, ranging from the conceptual level to practical documents for use in fieldwork, in the areas of ontological and epistemological consideration, quantitative and qualitative integration, sampling, interviewing, data collection, and data analysis. It also helps highlight many ethical issues that researchers should be aware of and consider throughout a study.

Systematic Review: This form consists of an overview of existing evidence pertinent to a clearly formulated research question, which uses pre-specified and standardized methods to identify and critically appraise relevant research, and to collect, report, and analyse data from the studies that are included in the review. Typically it focuses on a very specific empirical question, often posed in a cause-and-effect form, such as "To what extent does A contribute to B?"

Theoretical Review: The purpose of this form is to concretely examine the corpus of theory that has accumulated in regard to an issue, concept, theory, or phenomenon. The theoretical literature review helps establish what theories already exist, the relationships between them, and to what degree the existing theories have been investigated, and it helps develop new hypotheses to be tested. Often this form is used to help establish a lack of appropriate theories or reveal that current theories are inadequate for explaining new or emerging research problems. The unit of analysis can focus on a theoretical concept or a whole theory or framework.

* Kennedy, Mary M. "Defining a Literature." Educational Researcher 36 (April 2007): 139-147.

All content in this section is from The Literature Review, created by Dr. Robert Labaree (USC).

Robinson, P. and Lowe, J. (2015), Literature reviews vs systematic reviews. Australian and New Zealand Journal of Public Health, 39: 103. doi: 10.1111/1753-6405.12393

What's in the name? The difference between a Systematic Review and a Literature Review, and why it matters. By Lynn Kysh from University of Southern California

Diagram for "What's in the name? The difference between a Systematic Review and a Literature Review, and why it matters"

Systematic review or meta-analysis?

A systematic review answers a defined research question by collecting and summarizing all empirical evidence that fits pre-specified eligibility criteria.

A meta-analysis is the use of statistical methods to summarize the results of these studies.

Systematic reviews, just like other research articles, can be of varying quality. They are a significant piece of work (the Centre for Reviews and Dissemination at York estimates that a team will take 9-24 months), and to be useful to other researchers and practitioners they should have:

  • clearly stated objectives with pre-defined eligibility criteria for studies
  • explicit, reproducible methodology
  • a systematic search that attempts to identify all studies
  • assessment of the validity of the findings of the included studies (e.g. risk of bias)
  • systematic presentation, and synthesis, of the characteristics and findings of the included studies
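
To make the idea of pre-defined eligibility criteria and a reproducible method concrete, here is a minimal Python sketch of a screening step. Everything in it is an assumption made for illustration: the `Study` fields, the three criteria, and the candidate studies are invented, and a real review would take its criteria from a registered protocol. The point it shows is that the rules are written down before screening and that every exclusion is recorded with its reason.

```python
from dataclasses import dataclass

@dataclass
class Study:
    title: str
    year: int
    design: str
    peer_reviewed: bool

# Hypothetical eligibility criteria, fixed before screening begins.
CRITERIA = {
    "published 2010 or later": lambda s: s.year >= 2010,
    "randomized or cohort design": lambda s: s.design in {"randomized", "cohort"},
    "peer reviewed": lambda s: s.peer_reviewed,
}

def screen(studies):
    """Split studies into included titles and (title, failed criteria) pairs."""
    included, excluded = [], []
    for study in studies:
        failed = [name for name, rule in CRITERIA.items() if not rule(study)]
        if failed:
            excluded.append((study.title, failed))
        else:
            included.append(study.title)
    return included, excluded

# Invented candidate studies returned by a search.
candidates = [
    Study("Study 1", 2015, "randomized", True),
    Study("Study 2", 2005, "cohort", True),
    Study("Study 3", 2018, "case report", False),
]

included, excluded = screen(candidates)
print("Included:", included)
for title, reasons in excluded:
    print(f"Excluded {title}: {', '.join(reasons)}")
```

Keeping the reasons for exclusion alongside each study is what lets another team repeat the screening and arrive at the same set of included studies.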

Not all systematic reviews contain meta-analysis. 

Meta-analysis is the use of statistical methods to summarize the results of independent studies. By combining information from all relevant studies, meta-analysis can provide more precise estimates of the effects of health care than those derived from the individual studies included within a review. More information on meta-analyses can be found in the Cochrane Handbook, Chapter 9.

A meta-analysis goes beyond critique and integration and conducts secondary statistical analysis on the outcomes of similar studies. It is a systematic review that uses quantitative methods to synthesize and summarize the results.
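
As a rough illustration of how pooling can yield a more precise estimate than any single study, the Python sketch below applies fixed-effect inverse-variance weighting, one common pooling approach. The guide itself does not prescribe a particular method, and the effect sizes and variances here are invented, so treat this as a sketch under those assumptions rather than a recommended analysis.

```python
import math

def pool_fixed_effect(effects, variances):
    """Pool study effect sizes with inverse-variance weights.

    Each study is weighted by 1 / variance, so more precise studies
    count for more. Returns the pooled effect and its standard error.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error

# Invented effect sizes and variances for three hypothetical studies.
effects = [0.2, 0.5, 0.3]
variances = [0.04, 0.01, 0.02]

pooled, se = pool_fixed_effect(effects, variances)
print(f"Pooled effect: {pooled:.3f}, standard error: {se:.3f}")
```

With these numbers the pooled standard error is smaller than the standard error of the most precise individual study, which is the sense in which the combined estimate is more precise. A fixed-effect model assumes the studies estimate one common effect; when that assumption is doubtful, a random-effects model is usually considered instead.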

An advantage of a meta-analysis is the ability to be completely objective in evaluating research findings. Not all topics, however, have sufficient research evidence to allow a meta-analysis to be conducted. In that case, an integrative review is an appropriate strategy.

Some of the content in this section is from Systematic reviews and meta-analyses: step by step guide created by Kate McAllister.

  • Last Updated: Jul 15, 2024 10:34 AM
  • URL: https://guides.lib.udel.edu/researchmethods

Libraries | Research Guides

Literature Reviews: What is a Literature Review?

A literature review is a review and synthesis of existing research on a topic or research question. A literature review is meant to analyze the scholarly literature, make connections across writings and identify strengths, weaknesses, trends, and missing conversations. A literature review should address different aspects of a topic as it relates to your research question. A literature review goes beyond a description or summary of the literature you have read.

Learning more about how to do a literature review

  • Sage Research Methods Core: SAGE Research Methods supports research at all levels by providing material to guide users through every step of the research process. SAGE Research Methods is the ultimate methods library with more than 1000 books, reference works, journal articles, and instructional videos by world-leading academics from across the social sciences, including the largest collection of qualitative methods books available online from any scholarly publisher. (Publisher description)

  • Last Updated: Jul 8, 2024 11:22 AM
  • URL: https://libguides.northwestern.edu/literaturereviews

Conducting a Literature Review: Why Do a Literature Review?

Besides the obvious reason for students -- because it is assigned! -- a literature review helps you explore the research that has come before you, to see how your research question has (or has not) already been addressed.

You identify:

  • core research in the field
  • experts in the subject area
  • methodology you may want to use (or avoid)
  • gaps in knowledge -- or where your research would fit in

It Also Helps You:

  • Publish and share your findings
  • Justify requests for grants and other funding
  • Identify best practices to inform practice
  • Set wider context for a program evaluation
  • Compile information to support community organizing

  • Last Updated: Apr 25, 2024 1:10 PM
  • URL: https://guides.lib.berkeley.edu/litreview

USC Upstate Library

Literature Review: Purpose of a Literature Review

The purpose of a literature review is to:

  • Provide a foundation of knowledge on a topic
  • Identify areas of prior scholarship to prevent duplication and give credit to other researchers
  • Identify inconsistencies: gaps in research, conflicts in previous studies, open questions left from other research
  • Identify the need for additional research (justifying your research)
  • Identify the relationship of works in the context of their contribution to the topic and other works
  • Place your own research within the context of existing literature, making a case for why further study is needed.

Videos & Tutorials

VIDEO: What is the role of a literature review in research? What's it mean to "review" the literature? Get the big picture of what to expect as part of the process. This video is published under a Creative Commons 3.0 BY-NC-SA US license. License, credits, and contact information can be found here: https://www.lib.ncsu.edu/tutorials/litreview/

  • Last Updated: Sep 10, 2024 11:32 AM
  • URL: https://uscupstate.libguides.com/Literature_Review

www.howandwhat.net

Advantages and disadvantages of literature review

This comprehensive article explores some of the advantages and disadvantages of literature review in research. Reviewing relevant literature is a key area in research, and indeed, it is a research activity in itself. It helps researchers investigate a particular topic in detail. However, it has some limitations as well.

What is literature review?

In order to understand the advantages and disadvantages of literature review, it is important to understand what a literature review is and how it differs from other methods of research. According to Jones and Gratton (2009) a literature review essentially consists of critically reading, evaluating, and organising existing literature on a topic to assess the state of knowledge in the area. It is sometimes called critical review.

A literature review is a select analysis of existing research which is relevant to a researcher’s selected topic, showing how it relates to their investigation. It explains and justifies how their investigation may help answer some of the questions or gaps in the chosen area of study (University of Reading, 2022).

A literature review is a term used in the field of research to describe a systematic and methodical investigation of the relevant literature on a particular topic. In other words, it is an analysis of existing research on a topic in order to identify any relevant studies and draw conclusions about the topic.

A literature review is not the same as a bibliography or a database search. Rather than simply listing references to sources of information, a literature review involves critically evaluating and summarizing existing research on a topic. As such, it is a much more detailed and complex process than simply searching databases and websites, and it requires a lot of effort and skills.

Advantages of literature review

Information synthesis

A literature review is a very thorough and methodical exercise. It can be used to synthesize information and draw conclusions about a particular topic. Through a careful evaluation and critical summarization, researchers can draw a clear and comprehensive picture of the chosen topic.

Familiarity with the current knowledge

According to the University of Illinois (2022), literature reviews allow researchers to gain familiarity with the existing knowledge in their selected field, as well as the boundaries and limitations of that field.

Creation of new body of knowledge

One of the key advantages of a literature review is that it creates a new body of knowledge. Through careful evaluation and critical summarisation, researchers can create a new body of knowledge and enrich the field of study.

Answers to a range of questions

Literature reviews help researchers analyse the existing body of knowledge to determine the answers to a range of questions concerning a particular subject.

Disadvantages of literature review

Time consuming

As a literature review involves collecting and evaluating research and summarizing the findings, it requires a significant amount of time. To conduct a comprehensive review, researchers need to read many different articles and analyse a lot of data. This means that their review will take a long time to complete.

Lack of quality sources  

Researchers are expected to use a wide variety of sources of information to present a comprehensive review. However, it may sometimes be challenging for them to identify the quality sources because of the huge number of sources available in their chosen field. The opposite problem can also arise: there may be little past empirical work to draw on, particularly if the selected topic is an unpopular one.

Descriptive writing

One of the major disadvantages of literature reviews is that, instead of offering critical appreciation, some researchers end up developing reviews that are mostly descriptive. Their reviews are often more like summaries of the work of other writers and lack criticality. It is worth noting that a review must go beyond describing the literature.

Key features of literature review

Clear organisation

A literature review is typically a very critical and thorough process. Universities usually recommend a particular structure for students to follow when developing their reviews. Like all other academic writing, a review starts with an introduction and ends with a conclusion. Between the beginning and the end, researchers present the main body of the review containing the critical discussion of sources.

No obvious bias

A key feature of a literature review is that it should be unbiased and objective. However, it should be mentioned that researchers may sometimes be influenced by their own opinions of the world.

Proper citation

One of the key features of a literature review is that it must be properly cited. Researchers should include all the sources that they have used for information. They must cite those sources in the text and provide a reference list at the end, in line with a recognized referencing system such as Harvard.

To conclude this article, it can be said that a literature review is a type of research that seeks to examine and summarise existing research on a particular topic. It is an essential part of a dissertation/thesis. However, it is not easy for an inexperienced researcher to handle. It also requires a lot of time and patience.

Last update: 08 May 2022

References:

Jones, I., & Gratton, C. (2009) Research Methods for Sports Studies, 2nd edition, London: Routledge

University of Illinois (2022) Literature review, available at: https://www.uis.edu/learning-hub/writing-resources/handouts/learning-hub/literature-review (accessed 08 May 2022)

University of Reading (2022) Literature reviews, available at: https://libguides.reading.ac.uk/literaturereview/starting (accessed 07 May 2022)

Author: M Rahman

M Rahman writes extensively online and offline with an emphasis on business management, marketing, and tourism. He is a lecturer in Management and Marketing. He holds an MSc in Tourism & Hospitality from the University of Sunderland. He also graduated from Leeds Metropolitan University with a BA in Business & Management Studies and completed a DTLLS (Diploma in Teaching in the Life-Long Learning Sector) at London South Bank University.

Literature Review - what is a Literature Review, why it is important and how it is done

Evaluating Literature Reviews and Sources

Tips to evaluate sources

A good literature review evaluates a wide variety of sources (academic articles, scholarly books, government/NGO reports). It also evaluates literature reviews that study similar topics. This page offers you a list of resources and tips on how to evaluate the sources that you may use to write your review.

  • A Closer Look at Evaluating Literature Reviews Excerpt from the book chapter, “Evaluating Introductions and Literature Reviews” in Fred Pyrczak’s Evaluating Research in Academic Journals: A Practical Guide to Realistic Evaluation (Chapters 4 and 5). This PDF lists questions and tips for evaluating "Introductions" and "Literature Reviews". The first part focuses on introductions; the part on literature reviews starts on page 10 of the PDF (page 37 in the text).
  • Tips for Evaluating Sources (Print vs. Internet Sources) Excellent page that will guide you on what to ask to determine if your source is a reliable one. Check the other topics in the guide: Evaluating Bibliographic Citations and Evaluation During Reading on the left side menu.

To be able to write a good Literature Review, you need to be able to read critically. Below are some tips that will help you evaluate the sources for your paper.

Reading critically (summary from How to Read Academic Texts Critically)

  • Who is the author? What is his/her standing in the field?
  • What is the author’s purpose? To offer advice, make practical suggestions, solve a specific problem, to critique or clarify?
  • Note the experts in the field: are there specific names/labs that are frequently cited?
  • Pay attention to methodology: is it sound? What testing procedures, subjects, and materials were used?
  • Note conflicting theories, methodologies and results. Are there any assumptions being made by most/some researchers?
  • Theories: have they evolved over time?
  • Evaluate and synthesize the findings and conclusions. How does this study contribute to your project?

Useful links:

  • How to Read a Paper (University of Waterloo, Canada) This is an excellent paper that teaches you how to read an academic paper and how to decide whether it is something to set aside or something to read deeply. It offers good advice for organizing your literature for the Literature Review or just for class reading.

Criteria to evaluate sources:

  • Authority: Who is the author? What are his/her credentials? What university is he/she affiliated with? What is his/her area of expertise?
  • Usefulness: How does this source relate to your topic? How current or relevant is it to your topic?
  • Reliability: Does the information come from a reliable, trusted source such as an academic journal?

Useful site - Critically Analyzing Information Sources (Cornell University Library)

  • Last Updated: Jul 3, 2024 10:56 AM
  • URL: https://lit.libguides.com/Literature-Review

The Library, Technological University of the Shannon: Midwest

Research Method

Literature Review – Types Writing Guide and Examples

Literature Review

Definition:

A literature review is a comprehensive and critical analysis of the existing literature on a particular topic or research question. It involves identifying, evaluating, and synthesizing relevant literature, including scholarly articles, books, and other sources, to provide a summary and critical assessment of what is known about the topic.

Types of Literature Review

Types of Literature Review are as follows:

  • Narrative literature review: This type of review involves a comprehensive summary and critical analysis of the available literature on a particular topic or research question. It is often used as an introductory section of a research paper.
  • Systematic literature review: This is a rigorous and structured review that follows a pre-defined protocol to identify, evaluate, and synthesize all relevant studies on a specific research question. It is often used in evidence-based practice.
  • Meta-analysis: This is a quantitative review that uses statistical methods to combine data from multiple studies to derive a summary effect size. It provides a more precise estimate of the overall effect than any individual study.
  • Scoping review: This is a preliminary review that aims to map the existing literature on a broad topic area to identify research gaps and areas for further investigation.
  • Critical literature review: This type of review evaluates the strengths and weaknesses of the existing literature on a particular topic or research question. It aims to provide a critical analysis of the literature and identify areas where further research is needed.
  • Conceptual literature review: This review synthesizes and integrates theories and concepts from multiple sources to provide a new perspective on a particular topic. It aims to provide a theoretical framework for understanding a particular research question.
  • Rapid literature review: This is a quick review that provides a snapshot of the current state of knowledge on a specific research question or topic. It is often used when time and resources are limited.
  • Thematic literature review: This review identifies and analyzes common themes and patterns across a body of literature on a particular topic. It aims to provide a comprehensive overview of the literature and identify key themes and concepts.
  • Realist literature review: This review is often used in social science research and aims to identify how and why certain interventions work in certain contexts. It takes into account the context and complexities of real-world situations.
  • State-of-the-art literature review: This type of review provides an overview of the current state of knowledge in a particular field, highlighting the most recent and relevant research. It is often used in fields where knowledge is rapidly evolving, such as technology or medicine.
  • Integrative literature review: This type of review synthesizes and integrates findings from multiple studies on a particular topic to identify patterns, themes, and gaps in the literature. It aims to provide a comprehensive understanding of the current state of knowledge on a particular topic.
  • Umbrella literature review: This is a review of existing reviews. It compiles and compares evidence from multiple systematic reviews and meta-analyses on a broad topic, giving a high-level overview of what those reviews have found and where they agree or disagree.
  • Historical literature review: This type of review examines the historical development of research on a particular topic or research question. It aims to provide a historical context for understanding the current state of knowledge on a particular topic.
  • Problem-oriented literature review: This review focuses on a specific problem or issue and examines the literature to identify potential solutions or interventions. It aims to provide practical recommendations for addressing a particular problem or issue.
  • Mixed-methods literature review: This type of review combines quantitative and qualitative methods to synthesize and analyze the available literature on a particular topic. It aims to provide a more comprehensive understanding of the research question by combining different types of evidence.
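
The meta-analysis entry in the list above mentions combining data from multiple studies into a summary effect size. The text does not name a method, so the sketch below uses one common choice, fixed-effect inverse-variance pooling, purely as an illustration. The function name and the effect sizes and standard errors are made-up values, not results from real studies.

```python
import math


def pooled_effect(effects, std_errors):
    """Fixed-effect inverse-variance pooling (illustrative sketch only).

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns the pooled estimate and its standard error.
    """
    if not effects or len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return estimate, pooled_se


# Hypothetical effect sizes and standard errors from three imaginary studies.
effects = [0.30, 0.10, 0.20]
std_errors = [0.10, 0.20, 0.10]

estimate, pooled_se = pooled_effect(effects, std_errors)
print(f"pooled effect = {estimate:.3f}, standard error = {pooled_se:.3f}")
```

In this toy data the pooled standard error is smaller than the standard error of any single study, which is the sense in which a pooled estimate can be more precise. Real meta-analyses often use random-effects models and also assess heterogeneity, which this sketch leaves out.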

Parts of Literature Review

Parts of a literature review are as follows:

Introduction

The introduction of a literature review typically provides background information on the research topic and why it is important. It outlines the objectives of the review, the research question or hypothesis, and the scope of the review.

Literature Search

This section outlines the search strategy and databases used to identify relevant literature. The search terms used, inclusion and exclusion criteria, and any limitations of the search are described.
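
As a rough illustration of how inclusion and exclusion criteria might be written down and applied consistently, the sketch below screens a small list of records. The record fields, the criteria (publication year, peer review, language), and the sample entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    year: int
    peer_reviewed: bool
    language: str


def meets_criteria(record, min_year=2015, languages=("en",)):
    """Return (included, reason) for one record under hypothetical criteria."""
    if record.year < min_year:
        return False, f"published before {min_year}"
    if not record.peer_reviewed:
        return False, "not peer reviewed"
    if record.language not in languages:
        return False, "language not covered"
    return True, "meets all criteria"


# Made-up records standing in for search results.
records = [
    Record("Hypothetical study A", 2019, True, "en"),
    Record("Hypothetical study B", 2009, True, "en"),
    Record("Hypothetical report C", 2021, False, "en"),
]

# Keep the reason for every decision so exclusions can be reported.
decisions = {r.title: meets_criteria(r) for r in records}
included = [r.title for r in records if decisions[r.title][0]]
print(included)
```

Recording a reason for each exclusion makes it easier to describe the search limitations when writing this section.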

Literature Analysis

The literature analysis is the main body of the literature review. This section summarizes and synthesizes the literature that is relevant to the research question or hypothesis. The review should be organized thematically, chronologically, or by methodology, depending on the research objectives.

Critical Evaluation

Critical evaluation involves assessing the quality and validity of the literature. This includes evaluating the reliability and validity of the studies reviewed, the methodology used, and the strength of the evidence.

Conclusion

The conclusion of the literature review should summarize the main findings, identify any gaps in the literature, and suggest areas for future research. It should also reiterate the importance of the research question or hypothesis and the contribution of the literature review to the overall research project.

References

The references list includes all the sources cited in the literature review, and follows a specific referencing style (e.g., APA, MLA, Harvard).

How to Write a Literature Review

Here are some steps to follow when writing a literature review:

  • Define your research question or topic: Before starting your literature review, it is essential to define your research question or topic. This will help you identify relevant literature and determine the scope of your review.
  • Conduct a comprehensive search: Use databases and search engines to find relevant literature. Look for peer-reviewed articles, books, and other academic sources that are relevant to your research question or topic.
  • Evaluate the sources: Once you have found potential sources, evaluate them critically to determine their relevance, credibility, and quality. Look for recent publications, reputable authors, and reliable sources of data and evidence.
  • Organize your sources: Group the sources by theme, method, or research question. This will help you identify similarities and differences among the literature, and provide a structure for your literature review.
  • Analyze and synthesize the literature: Analyze each source in depth, identifying the key findings, methodologies, and conclusions. Then, synthesize the information from the sources, identifying patterns and themes in the literature.
  • Write the literature review: Start with an introduction that provides an overview of the topic and the purpose of the literature review. Then, organize the literature according to your chosen structure, and analyze and synthesize the sources. Finally, provide a conclusion that summarizes the key findings of the literature review, identifies gaps in knowledge, and suggests areas for future research.
  • Edit and proofread: Once you have written your literature review, edit and proofread it carefully to ensure that it is well-organized, clear, and concise.

Examples of Literature Review

Here’s an example of how a literature review can be conducted for a thesis on the topic of “The Impact of Social Media on Teenagers’ Mental Health”:

  • Start by identifying the key terms related to your research topic. In this case, the key terms are “social media,” “teenagers,” and “mental health.”
  • Use academic databases like Google Scholar, JSTOR, or PubMed to search for relevant articles, books, and other publications. Use these keywords in your search to narrow down your results.
  • Evaluate the sources you find to determine if they are relevant to your research question. You may want to consider the publication date, author’s credentials, and the journal or book publisher.
  • Begin reading and taking notes on each source, paying attention to key findings, methodologies used, and any gaps in the research.
  • Organize your findings into themes or categories. For example, you might categorize your sources into those that examine the impact of social media on self-esteem, those that explore the effects of cyberbullying, and those that investigate the relationship between social media use and depression.
  • Synthesize your findings by summarizing the key themes and highlighting any gaps or inconsistencies in the research. Identify areas where further research is needed.
  • Use your literature review to inform your research questions and hypotheses for your thesis.
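
The first two steps above (identifying key terms, then searching with them) can be sketched as a small helper that builds a Boolean search string. The synonym lists are hypothetical additions to the key terms named in the text, and the AND/OR syntax with quoted phrases is a common convention rather than a rule; each database has its own search syntax, so check its help pages before relying on this.

```python
def build_query(concepts):
    """Join synonyms with OR inside each concept, then join concepts with AND.

    Multi-word terms are wrapped in double quotes so they are searched as phrases.
    """
    groups = []
    for synonyms in concepts:
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Key terms from the example topic, each paired with a hypothetical synonym.
concepts = [
    ["social media", "social networking"],
    ["teenagers", "adolescents"],
    ["mental health", "depression"],
]

query = build_query(concepts)
print(query)
```

Grouping synonyms with OR widens each concept, while joining the concepts with AND narrows the results to sources that touch on all three.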

For example, after conducting a literature review on the impact of social media on teenagers’ mental health, a thesis might look like this:

“Using a mixed-methods approach, this study aims to investigate the relationship between social media use and mental health outcomes in teenagers. Specifically, the study will examine the effects of cyberbullying, social comparison, and excessive social media use on self-esteem, anxiety, and depression. Through an analysis of survey data and qualitative interviews with teenagers, the study will provide insight into the complex relationship between social media use and mental health outcomes, and identify strategies for promoting positive mental health outcomes in young people.”

Reference: Smith, J., Jones, M., & Lee, S. (2019). The effects of social media use on adolescent mental health: A systematic review. Journal of Adolescent Health, 65(2), 154-165. doi:10.1016/j.jadohealth.2019.03.024

Reference Example: Author, A. A., Author, B. B., & Author, C. C. (Year). Title of article. Title of Journal, volume number(issue number), page range. doi:0000000/000000000000 or URL
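
The reference pattern above can be sketched as a small formatting function. This only illustrates the pattern shown in the text and is not a full implementation of any citation style; the function name, its parameters, and the sample values are hypothetical placeholders.

```python
def format_reference(authors, year, title, journal, volume, issue, pages, doi=None):
    """Assemble a journal-article reference following the pattern in the text."""
    if len(authors) == 1:
        author_text = authors[0]
    elif len(authors) == 2:
        author_text = " & ".join(authors)
    else:
        # Three or more authors: commas between names, "&" before the last one.
        author_text = ", ".join(authors[:-1]) + ", & " + authors[-1]
    reference = f"{author_text} ({year}). {title}. {journal}, {volume}({issue}), {pages}."
    if doi:
        reference += f" doi:{doi}"
    return reference


# Placeholder values only; this is not a real publication.
example = format_reference(
    authors=["Author, A. A.", "Author, B. B.", "Author, C. C."],
    year=2020,
    title="Title of article",
    journal="Title of Journal",
    volume=12,
    issue=3,
    pages="45-67",
)
print(example)
```

In practice a citation manager handles these details, including the many edge cases that each style guide defines.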

Applications of Literature Review

Some applications of literature reviews in different fields:

  • Social Sciences: In social sciences, literature reviews are used to identify gaps in existing research, to develop research questions, and to provide a theoretical framework for research. Literature reviews are commonly used in fields such as sociology, psychology, anthropology, and political science.
  • Natural Sciences: In natural sciences, literature reviews are used to summarize and evaluate the current state of knowledge in a particular field or subfield. Literature reviews can help researchers identify areas where more research is needed and provide insights into the latest developments in a particular field. Fields such as biology, chemistry, and physics commonly use literature reviews.
  • Health Sciences: In health sciences, literature reviews are used to evaluate the effectiveness of treatments, identify best practices, and determine areas where more research is needed. Literature reviews are commonly used in fields such as medicine, nursing, and public health.
  • Humanities: In humanities, literature reviews are used to identify gaps in existing knowledge, develop new interpretations of texts or cultural artifacts, and provide a theoretical framework for research. Literature reviews are commonly used in fields such as history, literary studies, and philosophy.

Role of Literature Review in Research

Here are some applications of literature review in research:

  • Identifying Research Gaps: Literature review helps researchers identify gaps in existing research and literature related to their research question. This allows them to develop new research questions and hypotheses to fill those gaps.
  • Developing Theoretical Framework: Literature review helps researchers develop a theoretical framework for their research. By analyzing and synthesizing existing literature, researchers can identify the key concepts, theories, and models that are relevant to their research.
  • Selecting Research Methods: Literature review helps researchers select appropriate research methods and techniques based on previous research. It also helps researchers to identify potential biases or limitations of certain methods and techniques.
  • Data Collection and Analysis: Literature review helps researchers in data collection and analysis by providing a foundation for the development of data collection instruments and methods. It also helps researchers to identify relevant data sources and potential data analysis techniques.
  • Communicating Results: Literature review helps researchers to communicate their results effectively by providing a context for their research. It also helps to justify the significance of their findings in relation to existing research and literature.

Purpose of Literature Review

Some of the specific purposes of a literature review are as follows:

  • To provide context: A literature review helps to provide context for your research by situating it within the broader body of literature on the topic.
  • To identify gaps and inconsistencies: A literature review helps to identify areas where further research is needed or where there are inconsistencies in the existing literature.
  • To synthesize information: A literature review helps to synthesize the information from multiple sources and present a coherent and comprehensive picture of the current state of knowledge on the topic.
  • To identify key concepts and theories: A literature review helps to identify key concepts and theories that are relevant to your research question and provide a theoretical framework for your study.
  • To inform research design: A literature review can inform the design of your research study by identifying appropriate research methods, data sources, and research questions.

Characteristics of Literature Review

Some characteristics of a literature review are as follows:

  • Identifying gaps in knowledge: A literature review helps to identify gaps in the existing knowledge and research on a specific topic or research question. By analyzing and synthesizing the literature, you can identify areas where further research is needed and where new insights can be gained.
  • Establishing the significance of your research: A literature review helps to establish the significance of your own research by placing it in the context of existing research. By demonstrating the relevance of your research to the existing literature, you can establish its importance and value.
  • Informing research design and methodology: A literature review helps to inform research design and methodology by identifying the most appropriate research methods, techniques, and instruments. By reviewing the literature, you can identify the strengths and limitations of different research methods and techniques, and select the most appropriate ones for your own research.
  • Supporting arguments and claims: A literature review provides evidence to support arguments and claims made in academic writing. By citing and analyzing the literature, you can provide a solid foundation for your own arguments and claims.
  • Identifying potential collaborators and mentors: A literature review can help identify potential collaborators and mentors by identifying researchers and practitioners who are working on related topics or using similar methods. By building relationships with these individuals, you can gain valuable insights and support for your own research and practice.
  • Keeping up-to-date with the latest research: A literature review helps to keep you up-to-date with the latest research on a specific topic or research question. By regularly reviewing the literature, you can stay informed about the latest findings and developments in your field.

Advantages of Literature Review

There are several advantages to conducting a literature review as part of a research project, including:

  • Establishing the significance of the research: A literature review helps to establish the significance of the research by demonstrating the gap or problem in the existing literature that the study aims to address.
  • Identifying key concepts and theories: A literature review can help to identify key concepts and theories that are relevant to the research question, and provide a theoretical framework for the study.
  • Supporting the research methodology: A literature review can inform the research methodology by identifying appropriate research methods, data sources, and research questions.
  • Providing a comprehensive overview of the literature: A literature review provides a comprehensive overview of the current state of knowledge on a topic, allowing the researcher to identify key themes, debates, and areas of agreement or disagreement.
  • Identifying potential research questions: A literature review can help to identify potential research questions and areas for further investigation.
  • Avoiding duplication of research: A literature review can help to avoid duplication of research by identifying what has already been done on a topic, and what remains to be done.
  • Enhancing the credibility of the research: A literature review helps to enhance the credibility of the research by demonstrating the researcher’s knowledge of the existing literature and their ability to situate their research within a broader context.

Limitations of Literature Review

Limitations of Literature Review are as follows:

  • Limited scope: Literature reviews can only cover the existing literature on a particular topic, which may be limited in scope or depth.
  • Publication bias: Literature reviews may be influenced by publication bias, which occurs when researchers are more likely to publish positive results than negative ones. This can lead to an incomplete or biased picture of the literature.
  • Quality of sources: The quality of the literature reviewed can vary widely, and not all sources may be reliable or valid.
  • Time-limited: Literature reviews can become quickly outdated as new research is published, making it difficult to keep up with the latest developments in a field.
  • Subjective interpretation: Literature reviews can be subjective, and the interpretation of the findings can vary depending on the researcher’s perspective or bias.
  • Lack of original data: Literature reviews do not generate new data, but rather rely on the analysis of existing studies.
  • Risk of plagiarism: It is important to ensure that literature reviews do not inadvertently contain plagiarism, which can occur when researchers use the work of others without proper attribution.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Frequently asked questions

What is the purpose of a literature review?

There are several reasons to conduct a literature review at the beginning of a research project:

  • To familiarize yourself with the current state of knowledge on your topic
  • To ensure that you’re not just repeating what others have already done
  • To identify gaps in knowledge and unresolved problems that your research can address
  • To develop your theoretical framework and methodology
  • To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

Frequently asked questions: Academic writing

A rhetorical tautology is the repetition of an idea or concept using different words.

Rhetorical tautologies occur when additional words are used to convey a meaning that has already been expressed or implied. For example, the phrase “armed gunman” is a tautology because a “gunman” is by definition “armed.”

A logical tautology is a statement that is always true because it includes all logical possibilities.

Logical tautologies often take the form of “either/or” statements (e.g., “It will rain, or it will not rain”) or employ circular reasoning (e.g., “she is untrustworthy because she can’t be trusted”).

You may have seen both “appendices” and “appendixes” as pluralizations of “appendix.” Either spelling can be used, but “appendices” is more common (including in APA Style). Consistency is key here: make sure you use the same spelling throughout your paper.

The purpose of a lab report is to demonstrate your understanding of the scientific method with a hands-on lab experiment. Course instructors will often provide you with an experimental design and procedure. Your task is to write up how you actually performed the experiment and evaluate the outcome.

In contrast, a research paper requires you to independently develop an original argument. It involves more in-depth research and interpretation of sources and data.

A lab report is usually shorter than a research paper.

The sections of a lab report can vary between scientific fields and course requirements, but it usually contains the following:

  • Title: expresses the topic of your study
  • Abstract: summarizes your research aims, methods, results, and conclusions
  • Introduction: establishes the context needed to understand the topic
  • Method: describes the materials and procedures used in the experiment
  • Results: reports all descriptive and inferential statistical analyses
  • Discussion: interprets and evaluates results and identifies limitations
  • Conclusion: sums up the main findings of your experiment
  • References: list of all sources cited using a specific style (e.g. APA)
  • Appendices: contains lengthy materials, procedures, tables or figures

A lab report conveys the aim, methods, results, and conclusions of a scientific experiment. Lab reports are commonly assigned in science, technology, engineering, and mathematics (STEM) fields.

The abstract is the very last thing you write. You should only write it after your research is complete, so that you can accurately summarize the entirety of your thesis, dissertation or research paper.

If you’ve gone over the word limit set for your assignment, shorten your sentences and cut repetition and redundancy during the editing process. If you use a lot of long quotes, consider shortening them to just the essentials.

If you need to remove a lot of words, you may have to cut certain passages. Remember that everything in the text should be there to support your argument; look for any information that’s not essential to your point and remove it.

To make this process easier and faster, you can use a paraphrasing tool. With this tool, you can rewrite your text to make it simpler and shorter. If that’s not enough, you can copy-paste your paraphrased text into the summarizer. This tool will distill your text to its core message.

Revising, proofreading, and editing are different stages of the writing process.

  • Revising is making structural and logical changes to your text—reformulating arguments and reordering information.
  • Editing refers to making more local changes to things like sentence structure and phrasing to make sure your meaning is conveyed clearly and concisely.
  • Proofreading involves looking at the text closely, line by line, to spot any typos and issues with consistency and correct them.

The literature review usually comes near the beginning of your thesis or dissertation. After the introduction, it grounds your research in a scholarly field and leads directly to your theoretical framework or methodology.

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

Avoid citing sources in your abstract. There are two reasons for this:

  • The abstract should focus on your original research, not on the work of others.
  • The abstract should be self-contained and fully understandable without reference to other sources.

There are some circumstances where you might need to mention other sources in an abstract: for example, if your research responds directly to another study or focuses on the work of a single theorist. In general, though, don’t include citations unless absolutely necessary.

An abstract is a concise summary of an academic text (such as a journal article or dissertation). It serves two main purposes:

  • To help potential readers determine the relevance of your paper for their own research.
  • To communicate your key findings to those who don’t have time to read the whole paper.

Abstracts are often indexed along with keywords on academic databases, so they make your work more easily findable. Since the abstract is the first thing any reader sees, it’s important that it clearly and accurately summarizes the contents of your paper.

In a scientific paper, the methodology always comes after the introduction and before the results, discussion and conclusion. The same basic structure also applies to a thesis, dissertation, or research proposal.

Depending on the length and type of document, you might also include a literature review or theoretical framework before the methodology.

Whether you’re publishing a blog, submitting a research paper, or even just writing an important email, there are a few techniques you can use to make sure it’s error-free:

  • Take a break: Set your work aside for at least a few hours so that you can look at it with fresh eyes.
  • Proofread a printout: Staring at a screen for too long can cause fatigue – sit down with a pen and paper to check the final version.
  • Use digital shortcuts: Take note of any recurring mistakes (for example, misspelling a particular word, switching between US and UK English, or inconsistently capitalizing a term), and use Find and Replace to fix them throughout the document.

If you want to be confident that an important text is error-free, it might be worth choosing a professional proofreading service instead.

Editing and proofreading are different steps in the process of revising a text.

Editing comes first, and can involve major changes to content, structure and language. The first stages of editing are often done by authors themselves, while a professional editor makes the final improvements to grammar and style (for example, by improving sentence structure and word choice).

Proofreading is the final stage of checking a text before it is published or shared. It focuses on correcting minor errors and inconsistencies (for example, in punctuation and capitalization). Proofreaders often also check for formatting issues, especially in print publishing.

The cost of proofreading depends on the type and length of text, the turnaround time, and the level of services required. Most proofreading companies charge per word or page, while freelancers sometimes charge an hourly rate.

For proofreading alone, which involves only basic corrections of typos and formatting mistakes, you might pay as little as $0.01 per word, but in many cases, your text will also require some level of editing, which costs slightly more.

It’s often possible to purchase combined proofreading and editing services and calculate the price in advance based on your requirements.

There are many different routes to becoming a professional proofreader or editor. The necessary qualifications depend on the field – to be an academic or scientific proofreader, for example, you will need at least a university degree in a relevant subject.

For most proofreading jobs, experience and demonstrated skills are more important than specific qualifications. Often your skills will be tested as part of the application process.

To learn practical proofreading skills, you can choose to take a course with a professional organization such as the Society for Editors and Proofreaders. Alternatively, you can apply to companies that offer specialized on-the-job training programmes, such as the Scribbr Academy.

Our team helps students graduate by offering:

  • A world-class citation generator
  • Plagiarism Checker software powered by Turnitin
  • Innovative Citation Checker software
  • Professional proofreading services
  • Over 300 helpful articles about academic writing, citing sources, plagiarism, and more

Scribbr specializes in editing study-related documents. We proofread:

  • PhD dissertations
  • Research proposals
  • Personal statements
  • Admission essays
  • Motivation letters
  • Reflection papers
  • Journal articles
  • Capstone projects

Scribbr’s Plagiarism Checker is powered by elements of Turnitin’s Similarity Checker, namely the plagiarism detection software and the Internet Archive and Premium Scholarly Publications content databases.

The add-on AI detector is powered by Scribbr’s proprietary software.

The Scribbr Citation Generator is developed using the open-source Citation Style Language (CSL) project and Frank Bennett’s citeproc-js. It’s the same technology used by dozens of other popular citation tools, including Mendeley and Zotero.

You can find all the citation styles and locales used in the Scribbr Citation Generator in our publicly accessible repository on GitHub.
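To make the idea of style-driven citation formatting more concrete, here is a minimal sketch of a bibliographic record in the CSL-JSON shape that CSL processors consume, followed by a hand-written formatter. The formatter is a hypothetical stand-in for illustration only: it is not citeproc-js and does not implement any real CSL style. The record reuses a book reference that appears elsewhere in this document.

```python
# A record in CSL-JSON form (field names follow the CSL-JSON schema).
item = {
    "type": "book",
    "title": ("Writing literature reviews: A guide for students "
              "of the social and behavioral sciences"),
    "author": [{"family": "Galvan", "given": "J. L."}],
    "issued": {"date-parts": [[2013]]},
    "publisher": "Pyrczak",
    "publisher-place": "Glendale, CA",
}

def format_book(entry: dict) -> str:
    """Hypothetical author-date formatter; a real tool applies a CSL style instead."""
    authors = " & ".join(f"{a['family']}, {a['given']}" for a in entry["author"])
    year = entry["issued"]["date-parts"][0][0]
    return (f"{authors} ({year}). {entry['title']}. "
            f"{entry['publisher-place']}: {entry['publisher']}.")

print(format_book(item))
```

The point of the separation is that the record stays the same while the style changes, which is why one data format can serve many citation styles.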


Literature Review Basics


The Literature

The Literature refers to the collection of scholarly writings on a topic. This includes peer-reviewed articles, books, dissertations and conference papers.

  • When reviewing the literature, be sure to include major works as well as studies that respond to major works. You will want to focus on primary sources, though secondary sources can be valuable as well.

Primary Sources

The term primary source is used broadly to cover all sources that are original. Primary sources provide first-hand information that is closest to the object of study. Primary sources vary by discipline.

  • In the natural and social sciences, original reports of research found in academic journals detailing the methodology used in the research, in-depth descriptions, and discussions of the findings are considered primary sources of information.
  • Other common examples of primary sources include speeches, letters, diaries, autobiographies, interviews, official reports, court records, artifacts, photographs, and drawings.  

Galvan, J. L. (2013). Writing literature reviews: A guide for students of the social and behavioral sciences. Glendale, CA: Pyrczak.

Secondary Sources

A secondary source is a source that provides non-original or secondhand data or information. 

  • Secondary sources are written about primary sources.
  • Research summaries reported in textbooks, magazines, and newspapers are considered secondary sources. They typically provide global descriptions of results with few details on the methodology. Other examples of secondary sources include biographies and critical studies of an author's work.

Secondary Source. (2005). In W. Paul Vogt (Ed.), Dictionary of Statistics & Methodology (3rd ed., p. 291). Thousand Oaks, CA: SAGE Publications, Inc.

Weidenborner, S., & Caruso, D. (1997). Writing research papers: A guide to the process. New York: St. Martin's Press.

More Examples of Primary and Secondary Sources

 
Primary Source | Secondary Source
Original artwork | Article critiquing the piece of art
Diary of an immigrant from Vietnam | Book on various writings of Vietnamese immigrants
Poem | Article on a particular genre of poetry
Treaty | Essay on Native American land rights
Report of an original experiment | Review of several studies on the same topic
Video of a performance | Biography of a playwright
  • Last Updated: Jul 30, 2024 9:59 AM
  • URL: https://laverne.libguides.com/litreviews

How to Use AI for Literature Review (2024): Complete 7 Step Guide for Researchers

Jc Chaithanya


Literature reviews are a crucial yet time-consuming part of academic research.

With the advent of artificial intelligence (AI), researchers now have tools that can significantly streamline this process. This guide explores how AI can be effectively utilized to enhance and accelerate literature reviews.

We'll cover the following key aspects:

  • The role of AI in literature reviews
  • Specific AI tools designed for academic research, such as Elephas
  • Best practices for integrating AI into your research workflow
  • Potential benefits and limitations of AI-assisted literature reviews

Whether you're a seasoned researcher or a student embarking on your first major project, this guide will provide practical insights on leveraging AI to improve your literature review process.

So let's get started.

What is a Literature Review?

When researchers start a new project, they don't just jump in blindly. They first look at what others have already figured out. That's where a literature review comes in handy.

It's like piecing together a puzzle. You gather all the bits of information from different studies, articles, and books. Then you start to see the big picture. What do we know so far? Where are the gaps? Are there any hot debates going on?

For researchers, a literature review pays off in several ways:

  • It connects their work to the bigger conversation in their field
  • It stops them from reinventing the wheel
  • It helps them spot new angles to explore
  • It shows they've done their homework

By digging into existing research, scholars can push knowledge forward. They're not starting from scratch, but building on what's already there. Plus, a good literature review is super helpful for other researchers too.

It gives them a quick way to catch up on a topic without having to read through piles of separate studies. This saves time and helps guide future research efforts.

Different Types of Literature Review

Literature reviews can take on various forms depending on their purpose and approach. Below are some of the most popular types of literature reviews:

Narrative Review: This review gives a broad summary of existing studies on a topic but doesn't adhere to a rigid structure. It's often used to provide general insights without analyzing the specifics deeply.

Systematic Review: A systematic review follows a well-defined method to gather, assess, and interpret all available research on a particular question, aiming to reduce bias and provide a more accurate picture of the subject.

Meta-Analysis: This method uses statistical techniques to combine findings from several studies. The goal is to derive a stronger conclusion by merging the data and providing more robust results.

Scoping Review: A scoping review maps out the main ideas and gaps in a field of research, helping to identify where more studies are needed and suggesting potential directions for future research.

Critical Review: This type of review critically examines the strengths and weaknesses of existing research, offering new perspectives or challenging previously accepted theories.

Each type offers different insights, so the right choice depends on your research objective and will shape the direction of further inquiry.
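To illustrate what "combining findings" can mean in a meta-analysis, the sketch below pools effect estimates with fixed-effect inverse-variance weighting, which is one common approach among several. The three studies and their numbers are invented purely for illustration.

```python
import math

def pool_fixed_effect(effects, std_errors):
    """Inverse-variance weighted mean effect and its standard error."""
    # Each study is weighted by the inverse of its variance (1 / SE^2),
    # so more precise studies count for more.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)

# Hypothetical effect sizes and standard errors from three studies.
effects = [0.30, 0.50, 0.40]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, SE = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which merging data gives a stronger conclusion.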

Step-by-Step Guide on How to Use AI for Literature Review

Using AI for literature review can significantly streamline your research process. Let's explore two methods: a multi-tool approach and using Elephas, an all-in-one assistant.

Using Multiple AI Tools for Literature Review

1. Identify Your Research Topic and Keywords

The first step is defining your research area. Use Perplexity AI for topic exploration and generating research questions. This AI tool helps you uncover new angles for research topics by analyzing vast amounts of data quickly.

2. Search for Relevant Articles

Start your literature search by heading to Elicit.org. Export the articles based on categories like abstract, author, title, and publication date. You can also use other AI-powered search tools like Semantic Scholar or Google Scholar to broaden your sources.

3. Generate Summaries and Key Themes with GPT-4o

After gathering your articles, use GPT-4o to analyze the abstracts and generate key themes. Input the abstracts with prompts like, “Please summarize the key themes from these articles.” This step saves hours of manual reading and gives you a thematic overview.

4. Draft an Initial Literature Review with Copy.ai

Use Copy.ai to create a first draft of your literature review. Its AI-powered writing features allow you to generate sections of the review quickly and in a structured format. Copy.ai can assist in writing specific sections, such as background or methodology, based on the keywords and themes you provide.

5. Refine with Smart Writing Tools

Use AI tools like Jasper or Writesonic to refine your literature review. These tools help to paraphrase content, improve readability, and adapt the tone to meet academic standards. The rewrites from these tools can help make the content more engaging and coherent.

6. Organise References Using Reference Managers

As you finalise your draft, integrate reference management tools like Mendeley or Zotero. These tools can store and organise all the references cited in your paper, and AI integration allows for easy reference generation.

7. Use Perplexity AI for Final Checks

Before submission, use Perplexity AI again to check for any gaps in the research or identify potential new areas to explore. It can provide suggestions based on the latest publications and research trends.
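Steps 2 and 3 above can be sketched as a small script that reads an exported article list and assembles the summarization prompt. This is only a sketch under stated assumptions: the column names `title` and `abstract` are guesses about the export format, the sample rows are placeholders, and actually sending the prompt to a model is left out because that depends on the tool you use.

```python
import csv
import io

PROMPT_HEADER = "Please summarize the key themes from these articles."

def build_prompt(csv_text: str, max_articles: int = 20) -> str:
    """Combine titles and abstracts from a CSV export into one prompt."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))[:max_articles]
    parts = [PROMPT_HEADER, ""]
    for i, row in enumerate(rows, start=1):
        parts.append(f"[{i}] {row['title'].strip()}")
        parts.append(row["abstract"].strip())
        parts.append("")
    return "\n".join(parts).rstrip()

# Placeholder export; a real one would come from your search tool.
sample_export = (
    "title,abstract\n"
    "Example study A,Placeholder abstract for the first article.\n"
    "Example study B,Placeholder abstract for the second article.\n"
)

print(build_prompt(sample_export))
```

Numbering each abstract makes it easier to trace a theme in the model's answer back to the article it came from, which helps with the quality checks discussed later.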

Using Elephas for Literature Review

Elephas offers an all-in-one solution for conducting literature reviews. Here’s how you can use it:

1. Search the Web with Elephas

Elephas’ web search feature allows you to find relevant articles directly within the tool. Simply input your research terms, and it will pull up related papers, articles, and sources for you to analyse.

2. Analyse Key Themes

Using integrated AI models like GPT-4 or Claude, you can analyse the abstracts or summaries of these papers to identify key themes.

3. Generate a Literature Review

With Elephas’ Smart Write feature, you can create a well-structured literature review in just a few prompts. It pulls in the key themes and drafts a coherent review, ensuring that all relevant abstracts are referenced accurately.

4. Organise with Super Brain

Elephas’ Super Brain feature helps you manage the knowledge from the papers, documents, and research you’re using. It organises and categorises the data for easy access during the writing process.

5. Refine and Customise Tone

Elephas allows you to refine the review by using its multiple writing modes (Zinsser, Friendly, Professional, or Viral Mode). You can ensure that the literature review matches your preferred tone and style.

6. Manage References and Citations

With the help of Super Brain, you can manage references and citations within the text, simplifying the process of creating a bibliography.

By using Elephas, you can significantly speed up the literature review process while maintaining high quality. It’s a comprehensive all-in-one AI tool, making it one of the best solutions for conducting literature reviews.

Advantages of Using AI for Literature Review

AI is changing the game for researchers tackling literature reviews. Now, we've got smart tools that can do a lot of the heavy lifting for us. Let's explore some advantages of using AI for literature review. 

Time Efficiency: AI dramatically speeds up the review process. It can analyze thousands of articles in minutes, a task that would take humans days or weeks to complete.

Comprehensive Coverage: AI can thoroughly scan vast databases, ensuring no relevant study slips through the cracks.

Pattern Recognition: AI literature tools excel at identifying trends and connections across multiple studies, often spotting insights that humans might overlook.

Bias Reduction: AI approaches each piece of literature objectively, helping to minimize human biases that can creep into manual reviews.

Multilingual Capabilities: Language barriers become less of an issue. AI can process and analyze research in multiple languages, broadening the scope of reviews.

Data Visualization: Many AI tools can generate clear, insightful visualizations of complex data, making it easier to grasp key findings at a glance.

Continuous Updating: In rapidly evolving fields, AI can keep literature reviews current by continuously incorporating newly published research.

While AI brings these impressive benefits to the table, it's important to remember that relying on it too heavily, at the expense of your own judgment, is not a wise move.

The ideal approach is to combine AI with your critical thinking and domain knowledge. As AI technology continues to advance, its role in streamlining and improving literature reviews is only set to grow, opening up exciting new possibilities for more comprehensive and efficient research processes.

Manual Literature Review vs AI Literature Review

In the world of research, literature reviews play a crucial role. They help researchers understand what's already known about a topic and identify gaps in knowledge. Today, we're seeing a shift in how these reviews are conducted, with AI tools coming in and helping researchers reduce their overall workload. But which is actually better: a manual literature review or an AI-assisted one?

Manual Literature Review

Manual literature reviews have been the standard for a long time. Here's what they typically involve:

Researchers spend hours reading through papers and articles

They take detailed notes on each source

Key themes and patterns are identified through careful analysis

Connections between different studies are made based on the researcher's expertise

This method has its strengths. It allows for deep understanding and critical thinking. Researchers can pick up on subtle nuances that might be important. However, it's also very time-consuming and can be limited by the researcher's ability to process large amounts of information.

AI Literature Review

AI-assisted literature reviews are changing the game. Here's how they work:

AI tools can quickly scan thousands of articles

Key themes and patterns are automatically extracted

They use advanced algorithms to identify relevant studies

Connections between studies are made based on data analysis

The speed and efficiency of AI reviews are impressive. They can process far more information than a human could in the same amount of time. This means researchers can get a broader view of their field quickly. AI tools are also great at spotting trends and connections that humans might miss.

Comparing the Two Approaches

When we look at manual and AI literature reviews side by side, we see some interesting differences:

Time efficiency: AI is much faster, potentially saving weeks of work

Scope: AI can cover a broader range of sources

Depth of analysis: Manual reviews often provide deeper insights

Bias: AI can help reduce human bias, but may have its own algorithmic biases

Flexibility: Manual reviews can adapt more easily to unique research needs

Language: AI can work across multiple languages, expanding the scope of research

It's important to note that AI isn't perfect. It might miss context or nuances that a human would catch. That's why many researchers are now using a hybrid approach. 

They use AI to do the initial heavy lifting, then apply their own expertise to refine and interpret the results.

In the end, whether manual or AI-assisted, the goal of a literature review remains the same: to build a solid foundation for new research and contribute to the advancement of knowledge in the field.

Common Concerns and Misconceptions About AI in Literature Review

There are many concerns and misconceptions about using AI in literature reviews. It's natural to have doubts about new technology, especially when it comes to something as crucial as research.

One big worry is that AI might replace human researchers. But that's not really the case. AI is a powerful tool, but it can't match the critical thinking and deep understanding that humans bring to the table, at least for now. It's more of a helper than a replacement.

Accuracy Concerns: Some worry that AI might misinterpret or miss important information. Modern AI tools are often quite accurate, but they do need proper setup and oversight to perform at their best.

Over-reliance on Technology: Some worry researchers might become too dependent on AI, losing their own analytical skills. In reality, AI frees up time for deeper analysis and creative thinking.

Data Privacy Issues: Concerns about data security and privacy are valid. It's crucial to use AI tools that adhere to strict data protection standards.

Limited to Quantitative Analysis: Many think AI can only handle numbers and statistics. Actually, advanced AI can process qualitative data too, including complex text analysis.

High Costs: While some AI tools can be expensive, many affordable options exist. The efficiency gains often outweigh the initial investment.

Complexity of Use: There's a belief that AI tools are too complicated for the average researcher. In fact, many are designed with user-friendly interfaces.

Things to Keep in Mind While Using AI for Literature Review

When using AI for literature reviews, there are several key points to keep in mind. Overlooking any of these considerations can get you into trouble.

Quality Control: While using AI, you need to always double-check the results it generates. Take the time to review the selected articles and ensure they're truly relevant to your research.

Ethical Considerations: The use of AI in academic work is still a hot topic. Be mindful of ethical concerns, particularly around plagiarism and AI-generated content. Make sure your work is original and properly cited. 

Stay Updated: Keep an eye on the latest developments in AI tools for literature reviews. Knowing what's new in the AI market and what’s outdated will help you make the most of these tools.

Define Clear Parameters: Be specific about your research questions, keywords, and inclusion criteria. The more precise your input, the more relevant your results will be.

Understand AI Limitations: AI is great at processing large amounts of data, but it might miss nuances or context that a human would catch. 

Maintain a Critical Perspective: Don't accept AI-generated summaries or analyses at face value. Apply your critical thinking skills. Question the results, look for potential biases, and consider alternative interpretations.

Document Your Process: Keep detailed records of how you used AI in your review. Note which tools you used, what parameters you set, and how you verified the results. This transparency is vital for the credibility of your work.

These tips may seem generic, but many researchers still overlook them when using AI to write or revise a literature review. Using AI is not wrong; what matters is finding the right balance between technological assistance and human expertise.
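The "Document Your Process" tip above can be made concrete with a simple structured log. The sketch below records which tool was used, for what, with which parameters, and how the output was verified. The record fields and the tool name are hypothetical; adapt them to whatever your institution or journal expects.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class AIUsageRecord:
    """One entry in a log of how an AI tool was used during the review."""
    tool: str
    purpose: str
    parameters: dict = field(default_factory=dict)
    verification: str = ""

log = [
    AIUsageRecord(
        tool="example-search-tool",  # hypothetical tool name
        purpose="initial article search",
        parameters={"keywords": ["literature review", "AI"], "years": "2018-2024"},
        verification="Manually screened every returned abstract for relevance.",
    ),
]

# Serialize the log so it can be kept alongside the review or shared on request.
print(json.dumps([asdict(r) for r in log], indent=2))
```

Keeping the log in a plain, serializable format means it can be attached to a methods section or supplementary file without extra work.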

Conclusion

You know, it's pretty amazing how AI is shaking things up in the world of research. If you're knee-deep in literature reviews, learning to use AI could be a game-changer for you. It's like having a super-smart assistant who never gets tired and can spot connections you might miss.

There are a bunch of ways to go about it - you could mix and match different AI tools, or go for an all-in-one solution. The trick is finding what clicks for you. Just remember, AI is incredibly helpful, but it has its limits. You've still got to bring your expertise to the table.

As AI keeps evolving, it's opening up new possibilities for research. Who knows what breakthroughs we might see? So, getting comfortable with AI for literature reviews now could really set you up for the future. 

It's an exciting time to be a researcher, that's for sure!

1. Can GPT-4 do a literature review?

Yes, GPT-4 can help with a literature review by summarizing research papers, analyzing content, and identifying key themes. It speeds up the process by offering relevant insights from sources, but human expertise is still needed to ensure accuracy and a comprehensive understanding.

2. Can you use AI to do a literature review?

Yes, AI can assist in conducting a literature review by automating tasks such as summarizing research papers, analyzing large amounts of data, and highlighting important findings. This aids in streamlining the review process, although human judgment is essential for interpreting and validating the results effectively.

3. What is the AI tool to summarize the literature review?

AI tools like Elephas, designed for summarizing literature reviews, help streamline the process by providing features such as offline support, multiple language models, and web search integration. These tools can quickly summarize key insights and trends across academic papers and other research sources.


Deep learning for surgical workflow analysis: a survey of progresses, limitations, and trends

  • Open access
  • Published: 16 September 2024
  • Volume 57, article number 291 (2024)


Yunlong Li, Zijian Zhao, Renbo Li & Feng Li

Automatic surgical workflow analysis, which aims to recognize the ongoing surgical events in videos, is fundamental for developing context-aware computer-assisted systems. This paper reviews representative surgical workflow recognition algorithms based on deep learning, outlining their merits, limitations, and future research directions. The literature survey was performed on three large bibliographic databases, covering 67 publications, which were comparatively analyzed in terms of spatial feature modeling, spatio-temporal feature modeling, input pre-processing, regularization and post-processing algorithms, as well as learning strategies. Common public datasets and evaluation metrics for surgical workflow recognition are then described in detail. Finally, we discuss all literature from different perspectives, and point out the challenges, possible solutions and future trends. The need for more diverse and larger datasets, the potential of unsupervised and semi-supervised learning approaches, comprehensive and equitable metrics, establishing complete regulatory and data standards, and interoperability will be key challenges in translating models to clinical operating rooms. We also propose that surgical activity anticipation and employing large language models as training assistants are interesting research directions in surgical workflow analysis.


1 Introduction

Modern surgery is increasingly focused on safety, effectiveness, and cost-efficiency (Bharathan et al. 2013). The quality of surgery in traditional operating rooms (ORs) depends strongly on human factors, such as what surgeons can observe with their own eyes and how well they cooperate. In this case, the surgical experience, skill proficiency, and adaptability of surgeons play a decisive role in the success or failure of surgery. In contrast, in state-of-the-art ORs, surgical data science merges with clinical surgery to provide support for surgeons and better surgical care for patients (Maier-Hein et al. 2017). At the same time, due to the progress of information collection and storage systems, massive clinical surgical data can be collected and used for data mining. Therefore, coordinating different data types is an emerging research trend in state-of-the-art ORs (Rodrigues et al. 2019).

Computer-assisted surgery (CAS) and robot-assisted intervention have flourished over the past few years, providing surgeons with the necessary assistance to address some complex surgical scenarios (Twinanda et al. 2018). With the advent of minimally invasive surgery (MIS), intra-operative imaging has become a vital part of surgical and therapeutic guidance, partially compensating for the lack of information typical of MIS (Maier-Hein et al. 2017; Zaffino et al. 2020). CAS systems based on intra-operative imaging data are among the most common systems because image data, such as laparoscopic surgical images, are often readily available in the clinic.

Surgical workflow recognition is a challenging task in CAS that aims to recognize the events occurring in a given surgical video stream (Twinanda et al. 2016). Online surgical workflow recognition allows real-time monitoring of the surgical process and early warning of intra-operative risks. Offline surgical workflow recognition can be used to automatically generate surgical reports, which can help surgeons conduct surgical assessments and further normative training (Blum et al. 2010; Padoy et al. 2012). The key distinction between the two settings is whether future information is taken into account: offline recognition can use it, whereas online recognition cannot. From coarse-grained to fine-grained, a surgery can be described by surgical procedure, phases, steps, activities, and content (Mascagni et al. 2022). Surgical phases, the highest-level term except surgical procedure in ORs, describe a set of fundamental surgical aims to complete the surgical procedure, such as anesthesia, cutting, or suturing. Surgical steps refer to a series of surgical activities performed to accomplish a surgical phase, such as drug injection, drainage tube placement, or irrigation. Surgical activities are formalized as action triplets, i.e., <instrument, verb, target>. A triplet includes the instrument being manipulated, the verb describing the activity at stake, and the anatomy or surgical materials (e.g., compress and endobag) being targeted. It describes the intraoperative instrument-tissue interaction comprehensively. For instance, <grasper, retract, gallbladder> means that a grasper is retracting the gallbladder. This paper presents an overview of surgical workflow recognition at different levels, including phases, steps, and activities. Note that we only focus on surgical activities defined as action triplets.
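As a small illustration of the action triplet formalism described above, the following sketch represents a triplet as a typed record. The class name is our own choice for illustration and does not come from any dataset or library; the example values are the ones used in the text.

```python
from typing import NamedTuple

class ActionTriplet(NamedTuple):
    """A surgical activity expressed as <instrument, verb, target>."""
    instrument: str
    verb: str
    target: str

    def __str__(self) -> str:
        return f"<{self.instrument}, {self.verb}, {self.target}>"

# The example from the text: a grasper is retracting the gallbladder.
triplet = ActionTriplet("grasper", "retract", "gallbladder")
print(triplet)  # <grasper, retract, gallbladder>
```

Because each component is a separate field, a recognition model can be scored on the instrument, verb, and target individually as well as on the full triplet.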

In early studies, manually designed descriptors were used to extract spatial features for surgical workflow recognition, such as intensity gradients (Blum et al. 2010) or a combination of color, shape, and texture (Lalys et al. 2012). However, because these features are designed empirically, they are prone to feature loss or redundancy. On the other hand, dynamic time warping (DTW) (Padoy et al. 2012; Lalys et al. 2012; Blum et al. 2010), conditional random field (CRF) (Quellec et al. 2014; DiPietro et al. 2016; Charriere et al. 2016), and hidden Markov models (HMM) (Padoy et al. 2012; Lalys et al. 2012; DiPietro et al. 2016) have been widely used in modeling temporal features. Still, the results of these models are limited, and the models need to be adapted for automatic real-time surgical workflow recognition. Due to the increased hardware computing power and a large number of proposed advanced neural network models, deep learning has achieved superior results in computer vision compared to traditional methods. It has dominated and achieved state-of-the-art results in surgical workflow recognition.

Several reviews have summarized the algorithms for surgical workflow analysis. Padoy (2019) described how machine and deep learning techniques could be used to analyze the activity that occurred in surgical videos, along with two potential clinical applications. Nevertheless, he did not expand on the existing models and methods. Birkhoff et al. (2021) provided a comprehensive review of current AI applications inside modern ORs, including but not limited to surgical workflow recognition. Garrow et al. (2020) reviewed the models and datasets for automated surgical phase recognition up to 2019, but the methods they reviewed were all based on supervised learning and were largely based on traditional machine learning algorithms. Moreover, Nwoye et al. (2023) presented a challenge report to review the methods for surgical action triplet recognition in the CholecTriplet2021 benchmark challenge, without including the published research articles. Demir et al. (2023) reviewed surgical workflow analysis algorithms from 2018 to July 2022, but barely covered surgical action triplet recognition. They also did not summarize regularization and post-processing methods.

This survey aims to provide systematic and comparative reports of the latest deep learning techniques in surgical workflow analysis (i.e., phase, step, and activity recognition), so as to keep researchers abreast of the latest trends and research findings in the field. Our main contributions are as follows:

Based on the recent relevant publications, we review the deep learning-based algorithms of surgical phase, step and activity recognition retrieved from Jan 2018 to April 2024. Compared with some existing review papers, this is the first time that the published papers on surgical triplet recognition have been systematically investigated and analyzed.

We categorize and analyze the above deep learning-based algorithms and approaches from five perspectives: spatial feature modeling, spatio-temporal feature modeling, input pre-processing, regularization and post-processing, and learning strategy, focusing on their remarkable points, advantages, limitations, and compatibility. To the best of our knowledge, it is the first time that regularization and post-processing methods in surgical workflow analysis have been systematically reviewed.

We list common large-scale datasets and evaluation metrics for surgical workflow recognition and give specific details about them.

We summarize the bottlenecks encountered in the development of the field and provide an outlook on future research prospects to help researchers for further studies in this regard.

The rest of this paper is organized as follows: Sect. 2 describes our retrieval methodology. Section 3 analyzes the reviewed papers from different perspectives. Section 4 summarizes the common evaluation metrics and public datasets. Section 5 discusses the strengths and limitations of the reviewed papers, and points out challenges and possible directions in surgical workflow analysis. Section 6 concludes the paper.

2 Retrieval methodology

We conducted the bibliographic search on IEEE Xplore, PubMed, and ScienceDirect. We mainly used the following search terms: surgical workflow analysis, surgical workflow recognition, surgical phase recognition, surgical step recognition, surgical activity recognition, and surgical action triplet recognition. The term “recognition” could also be replaced with “classification” for retrieval, since the classification of surgical workflows is usually referred to as recognition. The terms were combined using the logical operator “OR” and searched across all metadata (i.e., title, author, abstract, etc.). We set the retrieval time range from Jan 2018 to Apr 2024 to ensure that we could obtain the most recent and relevant findings while disregarding outdated ones.
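The search strategy described above can be sketched as a short script that expands each term with its "classification" variant and joins everything with OR. This is an illustrative reconstruction, not the authors' actual query strings: the exact syntax differs between IEEE Xplore, PubMed, and ScienceDirect, and the date range would be set separately in each database's interface.

```python
BASE_TERMS = [
    "surgical workflow analysis",
    "surgical workflow recognition",
    "surgical phase recognition",
    "surgical step recognition",
    "surgical activity recognition",
    "surgical action triplet recognition",
]

def expand(term: str) -> list[str]:
    """Add a 'classification' variant for terms ending in 'recognition'."""
    if term.endswith("recognition"):
        return [term, term[: -len("recognition")] + "classification"]
    return [term]

def build_query(terms: list[str]) -> str:
    """Quote each phrase and join all variants with OR."""
    variants = [v for t in terms for v in expand(t)]
    return " OR ".join(f'"{v}"' for v in variants)

query = build_query(BASE_TERMS)
print(query)
```

Quoting each phrase keeps multi-word terms together, so a database that supports phrase search does not match the individual words separately.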

Fig. 1 The general process of literature research

After searching the above publication databases, we removed duplicates from the retrieved papers. Then, in the literature screening stage, we screened the remaining papers according to the following objective criteria: (1) the paper must be written in English; (2) the paper must be a research article; (3) the datasets used must be real or simulated human surgery data, so papers that used animal surgery data or synthetic data were excluded; (4) the algorithms must be based on deep learning; (5) the algorithms must perform surgical phase, step, or activity recognition. Note that we focus only on surgical activities defined as action triplets. In addition, a supplementary manual search was performed using the “References” and “Cited by” sections, following the same criteria. The overall selection process is illustrated in Fig. 1. Finally, we identified a total of 67 deep learning-based publications for review. Summary information on these publications is listed in Table 1.
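The deduplication and screening steps can be sketched as a simple filter. This is only an illustration of the five criteria: the `Paper` record, its field names, and the title-based duplicate check are hypothetical, and the actual screening was performed manually by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    title: str
    language: str              # e.g. "en"
    is_research_article: bool
    data_source: str           # "human", "simulated_human", "animal" or "synthetic"
    uses_deep_learning: bool
    task: str                  # "phase", "step", "activity" or something else


ALLOWED_DATA = {"human", "simulated_human"}
ALLOWED_TASKS = {"phase", "step", "activity"}


def deduplicate(papers):
    """Keep the first record for each case-insensitive title."""
    seen, unique = set(), []
    for paper in papers:
        key = paper.title.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(paper)
    return unique


def passes_screening(paper: Paper) -> bool:
    """Apply criteria (1) to (5) from the text."""
    return (
        paper.language == "en"
        and paper.is_research_article
        and paper.data_source in ALLOWED_DATA
        and paper.uses_deep_learning
        and paper.task in ALLOWED_TASKS
    )


def screen(papers):
    """Remove duplicates first, then keep only papers meeting all criteria."""
    return [paper for paper in deduplicate(papers) if passes_screening(paper)]
```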

3 Models and algorithms based on deep learning

In the surgical workflow recognition task, the input data is mainly in the form of video. Therefore, not only the spatial visual information of a single frame but also the dependencies across frames in the temporal dimension are considered in the modeling process. At the same time, some modeling approaches may employ various preprocessing or post-processing methods or regularization mechanisms to improve performance.

This section provides a comprehensive review of spatial feature modeling methods, spatio-temporal feature modeling methods, input preprocessing approaches, and some regularization and post-processing methods in surgical workflow recognition. Finally, from the perspective of learning strategies, we discuss how various learning strategies can be used to address the scarcity of labeled data. Note that spatial feature modeling (Sect. 3.1) refers to articles that used only the spatial features of a single frame to predict surgical workflows, while spatio-temporal feature modeling (Sect. 3.2) refers to articles that considered both spatial relationships and temporal dependencies. In short, the key difference between the two paradigms is whether the temporal patterns contained in the data are exploited.

3.1 Spatial feature modeling

In recent years, convolutional neural networks (CNNs) have proven effective in feature extraction and have shown strong performance in image classification and object detection (Li et al. 2022b). In contrast to manual feature extractors based on domain prior knowledge, CNNs can automatically learn semantic features at different levels from the input data. As CNN layers are stacked, the receptive field gradually increases, and the semantic information of the learned features becomes richer. This property of CNNs has led to their rapid popularity in computer vision and their widespread use in spatial feature modeling for surgical workflow recognition.
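The growth of the receptive field with depth can be made concrete with the standard recurrence for stacked convolution or pooling layers. The sketch below is a generic calculation and is not tied to any network in the reviewed papers; the function name and the `(kernel_size, stride)` layer description are chosen for this example.

```python
def receptive_field(layers):
    """Receptive field after each layer of a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride) pairs, first layer first.
    Returns the receptive field size (in input pixels) after every layer.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance between adjacent outputs, measured in input pixels
    sizes = []
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
        sizes.append(rf)
    return sizes
```

For example, three stacked 3x3 convolutions with stride 1 see 3, 5, and 7 input pixels per side, while striding makes the field grow faster with each additional layer.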

In earlier research, Twinanda et al. (2016) proposed EndoNet, which applied AlexNet to feature extraction and achieved state-of-the-art (SOTA) results on phase recognition and tool presence detection. However, the network structure of EndoNet is shallow, leaving much room for performance improvement. Extracting discriminative features is vital for process recognition tasks, and network depth is critical in determining CNN performance. Qi et al. (2019) employed a deeper network structure and compared the effect of different network depths on the model’s performance. They found that an overly deep network can be computationally expensive and more difficult to optimize, so they chose ResNet-50 as a compromise. Among the 67 reviewed papers, 58% adopted ResNet, which is widely used in deep learning, as a feature extractor. Nwoye et al. (2020) were the first to use a fully convolutional neural network to recognize surgical action triplets, each consisting of an instrument, a verb, and a target. They utilized the class activation map of instruments to assist in the detection of verbs and targets, and finally modeled the associations between detected components through a novel 3D interaction space.

Although CNN-based feature extractors achieve excellent performance, they do not learn global and long-range semantic interactions well due to the inductive bias inherent in convolution. Recently, some studies have attempted to use Transformers to model spatial features. For example, Li et al. (2022a) proposed a DETR-like approach for surgical action recognition, where multi-head self-attention learned complex relationships among surgical triplets, and multi-head cross-attention captured dependencies between image features and each surgical triplet. Since the number of triplet classes that could be modeled by the 3D interaction space far exceeded the number of valid triplet classes, the model of Nwoye et al. (2020) was difficult to train. Later, Nwoye et al. (2022) recognized verbs by channel attention and targets by position attention, based on the assumption that verbs were mainly affected by the type of instruments, while targets were often determined by the position of instruments. To solve the association problem, they designed Multi-Head Mixed Attention (MHMA), which combined multiple self-attention and cross-attention heads to capture associations between instruments, verbs, and targets efficiently.

In addition to the above methods based on CNNs and Transformers, other architectures and algorithms have also been explored. To tackle the long-tail distribution and complex associations in action triplet recognition, Xi et al. (2022) constructed a classification forest consisting of three classification trees, whose parent nodes were instrument, verb, and target, respectively, to calibrate the logits of triplet classes. In addition, they introduced a graph convolutional network to model the complex dependencies among the triplet classes effectively and achieved state-of-the-art performance. Feng et al. (2024) proposed a three-stage framework to infer surgical phases based on spatial geometric attributes. Specifically, they first extracted the geometric properties from multi-object segmentation, then calculated the relative position between instruments and corneas, and finally predicted surgical phases through a reasoning algorithm defined on prior geometric properties.

In summary, these approaches employed CNNs, Transformers, GCNs, or geometric attributes to recognize surgical workflows based solely on frame-level features. However, temporal cues in surgical videos remained unexploited. Subsequently, we will review the approaches that integrated temporal information.

3.2 Spatio-temporal feature modeling

One of the most critical challenges in automatically identifying surgical workflows from videos is that complex surgical scenes usually have limited inter-phase variance but high intra-phase variance, as shown in Fig. 2. Additionally, the fast motion of the camera and the gas produced during surgery result in severe scene blurring, adding to the difficulty of recognition. Several studies (Twinanda et al. 2016; Qi et al. 2019; Sánchez-Matilla et al. 2022) have used only frame-wise information for surgical workflow recognition, but the results are unsatisfactory, and there is still much room for improvement. Gao et al. (2021) reported that ResNet’s predictions contained a large number of jumps owing to the lack of temporal information. Therefore, it is challenging to perform accurate workflow recognition using only the visual features of a single frame. Since a surgical procedure evolves over time, the phase, step, or action at the current timestamp naturally depends on adjacent frames. Accurate workflow recognition needs to leverage temporal relationships and effectively capture sequential dynamics. Most studies first perform spatial feature extraction and then mine temporal dependencies, as shown in Fig. 3.

Fig. 2 Panels a and b reflect lower inter-phase variance and higher intra-phase variance, respectively

Fig. 3 The mainstream pipeline in the literature reviewed. The dashed boxes indicate optional information

The following subsections review the papers that employed spatio-temporal feature modeling from two major aspects. First, we review how spatial feature extraction was performed in these papers (Sect. 3.2.1). Second, we describe various approaches to temporal dependency modeling, such as recurrent neural networks (RNNs), temporal convolutional networks (TCNs), nonlocal operations, self- or cross-attention, and others (Sects. 3.2.2–3.2.5). The statistics of temporal components used in the literature we reviewed are shown in Fig. 4.

Fig. 4 Statistics of temporal components used in the literature we reviewed

3.2.1 Spatial feature extraction

For surgical workflow analysis based on spatio-temporal feature modeling, spatial feature extraction is the cornerstone of constructing an effective recognition model. Although consecutive frames of surgical videos may be highly similar visually, accurately extracting the spatial features of each frame is crucial for understanding the temporal dynamics of the surgery. This section focuses on how spatial features are extracted within a spatio-temporal framework to enhance surgical workflow recognition.

Given that different surgical phases might have highly similar visual features, which might adversely affect the prediction of the frames, Fang et al. ( 2022 ) fine-tuned the pre-trained YOLOv3 detector to extract the patches of pupils and surgical instruments in frames, a method whose core spirit is similar to the region proposal network (RPN) in Faster R-CNN (Ren et al. 2017 ). Then, they performed further feature extraction on these patches with fine-grained local features and the original video frames with global information to generate global–local information that is beneficial to the recognition of hard frames. Their ablation study proved that the global–local feature extraction could mitigate temporal and spatial variations and obtain more robust temporal information. Gao et al. ( 2021 ) trained the model using a two-step strategy, where the spatial feature extractor was first trained by frame-wise supervision and then the trained and fixed one was utilized to generate temporal embeddings.

Pan et al. (2022) proposed learning highly expressive representations with a Swin Transformer. Chen et al. (2022) fine-tuned the pre-trained PCPVT (Chu et al. 2021) as their spatial transformer to produce a latent representation of each time step and model spatial relationships globally. Zhang et al. (2021a) conducted a comparative experiment using different designs of feature extraction networks and action segmentation networks, i.e., fully convolutional networks, fully transformer networks, and hybrid networks. Their experimental results showed that the overall performance was best when the fully convolutional network R(2+1)D, a spatio-temporal video model that captures spatial and temporal features simultaneously, was used as the feature extraction network.

In summary, the above methods perform the initial feature extraction with a Transformer, a 3D CNN, or an object detector. Unlike CNNs, Transformers capture global spatial relationships directly. However, in the literature we reviewed, a CNN backbone (e.g., ResNet) remains the most common spatial feature extractor.

3.2.2 Recurrent neural networks (RNNs)

A recurrent neural network (RNN), whose output depends not only on the input at the current time step but also on previous states, is a classical network structure for processing sequential data. Among the 67 studies in Table 1, 26 used RNN-based temporal components, with the majority being standard LSTMs. Nakawala et al. (2018) proposed Deep-Onto, which combined deep learning models composed of InceptionV3 and LSTM with ontology and production rules to recognize phases, actions, and instruments. Mondal et al. (2019) utilized a BiLSTM to encapsulate long-term dependencies in both the past and the future, which restricted the method to offline recognition. Ban et al. (2021) used the memory vectors of all past moments to calculate sufficient statistics of the past. These sufficient statistical features were then concatenated with CNN features into enhanced features and fed into an LSTM for prediction. Shi et al. (2022) employed IndyLSTM as part of a temporal model to capture single spatial-temporal information independently and achieved 89.8% accuracy at 23.3 FPS.

In summary, given the inherent suitability of RNNs for capturing the sequential nature of surgery, 26 works used RNN-based temporal components to capture temporal dependencies, with the majority being standard LSTMs. Nevertheless, RNNs have some evident drawbacks that constrain their potential. Their sequential dependency restricts parallel computing. Moreover, because clinical surgical videos often span tens of minutes or even several hours, the time cost of model training becomes prohibitively high.

3.2.3 Temporal convolutional networks (TCNs)

A temporal convolutional network (TCN) is a specialized structure designed for processing sequential data (Lea et al. 2016a). It aims to capture long-term temporal contexts efficiently by expanding the receptive field through dilated convolutions rather than by increasing network depth excessively.
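To illustrate how dilation widens the temporal receptive field without adding many layers, the following minimal sketch implements a single-channel causal dilated convolution in plain Python. It is a didactic example, not the architecture of any reviewed model: the function names, the zero padding before the first frame, and the single-channel setting are assumptions made here.

```python
def causal_dilated_conv(x, weights, dilation):
    """1D causal convolution over a sequence of scalars.

    weights[0] multiplies the current sample and weights[i] the sample
    i * dilation steps in the past; samples before the start count as zero.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def tcn_receptive_field(kernel_size, dilations):
    """Number of time steps seen by one output of the stacked layers."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With kernel size 3 and dilations 1, 2, 4, and 8, four layers already cover 31 time steps, whereas four undilated layers of the same kernel size would cover only 9.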

Czempiel et al. (2020) introduced multi-stage TCNs (MS-TCNs) to surgical phase recognition for the first time and compared them with LSTM-based networks. They also explored different numbers of TCN stages and reported that adding just one TCN stage could significantly improve performance, but adding three TCN stages could lead to overfitting. Ramesh et al. (2021) proposed MTMS-TCN, built on TeCNO (Czempiel et al. 2020), which performed only phase recognition. Their model jointly recognized two complementary levels, phase and step, and outperformed TeCNO in both individual and joint recognition tasks. The step-phase accuracy was very close to the step accuracy, indicating that the model could benefit from inherent hierarchical relationships. At the same time, they observed experimentally that a multi-stage TCN did not provide a significant improvement over a single-stage TCN. Therefore, they adopted a single-stage TCN in their subsequent work (Ramesh et al. 2023b). Golany et al. (2022) employed an approach similar to that of Czempiel et al. (2020) to classify laparoscopic cholecystectomy into five complexity levels and recognize the corresponding phases and adverse events. Zhang et al. (2021c) utilized a four-stage acausal TCN to extract global temporal features for offline surgical workflow recognition. Inspired by Inception, Jin et al. (2021) used temporal convolution with multiscale kernels to encode long-range features at different temporal scales and tolerate variability in temporal duration. They compared the effects of different lengths of supporting features and found that performance deteriorated when the length was overly long. This might be because excessive long-range information introduced too much irrelevant noise and changed the background. Park et al. (2023) introduced the positional encoding technique into MS-TCN for the first time and improved the accuracy by 1.31% compared to the baseline.

The above-mentioned papers demonstrate that TCN has been a powerful tool for surgical workflow analysis, particularly phase and step recognition. Compared to RNN, TCN can compute in parallel, making it more attractive for real-time applications. However, the number of TCN stages and the dilation rate may need to be carefully optimized to avoid overfitting and introducing excessive noise (Czempiel et al. 2020 ).

3.2.4 Nonlocal and self-attention or cross-attention mechanism

RNNs and TCNs are progressive operations, i.e., they deal with local regions in the time dimension, so long-term associations can be captured only by applying them repeatedly. Nonlocal operations can directly compute the relationship between any two positions, which can be temporal, spatial, or spatio-temporal (Wang et al. 2018). The nonlocal operation is defined as follows:
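In the generic formulation of Wang et al. (2018), written here with that paper's symbols because no notation has been fixed elsewhere in this review, the operation reads:

```latex
y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j)
```

Here $i$ is the index of the output position (in space, time, or space-time), $j$ enumerates all possible positions, $x$ is the input signal and $y$ the output of the same size, $f$ computes a scalar affinity between positions $i$ and $j$, $g$ computes a representation of the input at position $j$, and $\mathcal{C}(x)$ is a normalization factor.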

Since nonlocal operations can globally consider the multiscale features of past moments, Jin et al. ( 2021 ) created a nonlocal bank operator to integrate temporal clues, i.e., to enhance the current spatio-temporal features by leveraging the information stored in the memory bank. Shi et al. ( 2022 ) inserted a nonlocal block at the top of the network to enable the network to enhance the uniqueness of features. They achieved a 2.2% improvement in accuracy compared to the model without a nonlocal block. This demonstrated that non-local blocks could capture cross-frame dependencies among all frames and make effective use of remote temporal information. Ding et al. ( 2020 ) embedded nonlocal blocks into 3D ResNet-18 to learn richer temporal features and achieved promising results on the MICCAI 2016 Workflow Challenge dataset. Shi et al. ( 2020 ) employed a nonlocal block to capture the long-term dependency of frames within each clip and selected video clips with less nonlocal intra-clip dependency for annotation queries in active learning mode.

Self-attention can be viewed as a form of the nonlocal mean (Wang et al. 2018 ). Transformer is a deep neural network based on self-attention and cross-attention mechanisms and shows extraordinary capabilities in visual feature representation. Transformers explicitly model all pairwise interactions between elements in a sequence, which helps to capture cross-frame dependencies and preserve essential features in an ultralong sequence.
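The pairwise nature of self-attention can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: queries, keys, and values are the raw frame features with no learned projections, there is a single head, and the optional causal mask (each frame attends only to itself and earlier frames) is included because online recognition cannot use future frames. The function names are chosen for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames, causal=False):
    """Scaled dot-product self-attention over a list of feature vectors.

    Returns (outputs, weights), where weights[t][j] is how much frame t
    attends to frame j. With causal=True, weights[t][j] is 0 for j > t.
    """
    dim = len(frames[0])
    outputs, weights = [], []
    for t, query in enumerate(frames):
        last = t + 1 if causal else len(frames)
        scores = [
            sum(q * k for q, k in zip(query, frames[j])) / math.sqrt(dim)
            for j in range(last)
        ]
        attn = softmax(scores) + [0.0] * (len(frames) - last)
        fused = [sum(a * frames[j][d] for j, a in enumerate(attn)) for d in range(dim)]
        outputs.append(fused)
        weights.append(attn)
    return outputs, weights
```

Every output is a weighted average over all admissible frames, so a dependency between two distant frames is modeled in one step rather than through repeated local operations.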

In the process of temporal feature extraction, some visual features may be lost. Gao et al. (2021) employed a Transformer to perform cross-layer aggregation of spatial and temporal embeddings, similar to a residual connection, to compensate for the loss of fine-grained spatial information and achieved 90.3 ± 7.1% accuracy on the Cholec80 dataset. In their ablation study, the network without a Transformer reached at most 88.6 ± 7.8% accuracy, whereas introducing the Transformer with temporal embeddings as keys or queries yielded at least 89.1 ± 7.8%. This demonstrated the effectiveness of the Transformer in temporal modeling. However, the spatial feature extractor trained using the two-step strategy of Gao et al. (2021), mentioned in Sect. 3.2.1, might be suboptimal because it relies only on frame-level supervision. To overcome this limitation, Liu et al. (2023a) enhanced the first step of this two-step strategy to train a temporally-rich spatial feature extractor, i.e., they introduced temporal supervision during training. The temporal component (a Transformer with a causal mask) was then removed, and the frozen extractor was used for subsequent training. Visualizing the features through principal component analysis (PCA), they observed that the frame features embedded by the temporally-rich spatial feature extractor were more discriminative. With similar motivations, Chen et al. (2023b) devised a Multi-Scale Surgical Temporal Action (MS-STA) module that progressively performed temporal difference operations to extract multi-scale features and inserted it into the backbone, allowing the backbone to capture spatio-temporal information at the computational cost of a 2D network. In contrast to the method of Liu et al. (2023a), they retained the MS-STA module in the second step of training.

Czempiel et al. (2021) extended the Transformer architecture with attention regularization that forced the model to focus on higher-quality CNN features. Zhang et al. (2022) created a sequence-to-sequence formulation for phase recognition, with different configurations (time-synchronous or time-shift), architectures (LSTM or Transformer), and learning strategies. The model with a Transformer outperformed the one with an LSTM in most cases when other settings were equal. Ding and Li (2022) used a Transformer to capture the intrinsic connection between segment-wise and frame-wise features, so that high-level segment information could correct erroneous frame-wise predictions. Similarly, Zhang et al. (2024a) extracted temporal information at the segment level and frame level from the entire video through fast and slow paths, and then merged and refined them to produce offline predictions. Chen et al. (2022) used different types of Transformers to decompose the spatial and temporal dimensions of the surgery. To handle the varying durations of phases, they designed a dual pyramid pattern as a temporal Transformer to capture multi-scale contexts. Zheng et al. (2022) proposed MT-ViT to recognize tool presence and episodes in bronchoscopy videos. The tool branch utilized an MLP to capture spatial information, while the other branch employed Transformer encoders to model spatio-temporal interactions between patch embeddings. The global receptive field of the Transformer enabled their model to capture fast tool motion effectively, reducing training time by 85% compared to TeCNO (Czempiel et al. 2020).

While numerous works aggregated temporal context, they were usually limited to a single level. However, this strategy may be insufficient or even counterproductive. For example, for frames that can be accurately classified by spatial features alone, excessive fusion of temporal information may degrade the prediction. To address this issue, Yue et al. (2023) proposed an Adaptive Multi-Level Context Aggregation (AMCA) module that adaptively fused frame-specific spatial features, frame-level temporal context, and stage-level temporal context for each frame. Zhang et al. (2023) improved the encoder and decoder blocks of the Cross-Enhancement Transformer (CETNet) with dilated causal convolution to prevent future information leakage. At the same time, inspired by FPN, they fused the output features of the encoder and decoder layers in a bottom-up manner to better integrate global and local information. Tao et al. (2023) proposed a latent space-constrained transformer with two branches, known as LAST, to comprehensively capture the underlying semantic structure of surgical procedures through a variational autoencoder (VAE) and encourage the model’s output to adhere to the learned statistical distributions.

Although the aforementioned works have demonstrated the capability of self-attention in capturing temporal relationships, its quadratic time and memory complexity limits the processing of long videos. Liu et al. (2023a) utilized a more efficient self-attention called ProbSparse, which reduced both complexities from quadratic to log-linear order, to aggregate global temporal information. Later, they proposed a novel key-recorder, which used key pooling to record the key events that had appeared, with O(1) time complexity (Liu et al. 2023b).
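The gap between quadratic and log-linear growth is easy to underestimate for long videos, so the following back-of-the-envelope sketch counts query-key products for both regimes. It is a rough cost model only: it assumes that about `factor * ln(L)` queries attend to all `L` keys and ignores the cost of selecting those queries, so it should not be read as an implementation of ProbSparse. The function names and the `factor` parameter are hypothetical.

```python
import math


def full_attention_pairs(num_frames):
    """Query-key products when every frame attends to every frame."""
    return num_frames * num_frames


def log_linear_pairs(num_frames, factor=1):
    """Query-key products when only about factor * ln(L) queries stay active."""
    active = math.ceil(factor * math.log(num_frames))
    active = min(num_frames, max(1, active))
    return active * num_frames
```

For a one-hour video sampled at 1 fps (3600 frames, a hypothetical setting), full attention needs 12,960,000 products, whereas this simplified log-linear model needs 32,400.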

Both the nonlocal operation and the Transformer excel at overcoming the constraints of local context and capturing comprehensive information when analyzing surgical videos. From the review above and Fig. 4b, it can be seen that the Transformer is more popular than the nonlocal operation, thanks to its generality, high parallel computing efficiency, and mature community ecosystem. Nevertheless, the vanilla Transformer is challenged by high computational complexity and reliance on extensive data, although optimization techniques such as sparse attention (Liu et al. 2023a) can reduce its computational cost.

3.2.5 Other architectures or algorithms

Hard frames are frames that possess similar visual characteristics but belong to different phases. According to the experimental results of Yi and Jiang (2019), training on a mix of hard frames and simple frames, i.e., on raw video, reduces the model’s performance. They proposed separating hard frames from simple frames and then recognizing the two types of frames separately. Yi et al. (2023) reported that simply applying a multi-stage structure to surgical workflow recognition with end-to-end training significantly curtails its refinement capability. They proposed a non-end-to-end training strategy that trained the predictor stage and the refinement stage separately. The predictor stage was trained on raw video data. Two kinds of disturbed sequences, designed to simulate the predictor stage’s imperfect predictions, were used to train the refinement stage. When a TCN was employed in the refinement stage, the accuracy of the end-to-end strategy was 89.8 ± 6.6%, while the non-end-to-end strategy improved it to 92.8 ± 5.0%. Chen et al. (2023a) decomposed the surgical triplet recognition task across five sub-networks, jointly optimizing three different classification problems. Specifically, the first sub-network performed a numerical supplementary task that predicted the presence and number of the three components, the second learned the association between them, and the other three utilized CAGAM (Nwoye et al. 2022) to predict the three components, respectively. Furthermore, the authors proposed a hierarchical training schedule that divided training into multiple stages to avoid distraction from the key task caused by optimizing multiple auxiliary objectives simultaneously.

Kadkhodamohammadi et al. (2022) utilized graph neural networks (GNNs) to integrate temporal information, where each frame was a node in the graph and edges defined the temporal connections between nodes. Their approach took the temporal order into account through positional encoding. Their ablation experiments showed that when the temporal component was their proposed GNN-based network, the model’s overall performance improved by 2% and approximately 3% over using LSTM and TCN, respectively. Considering that Nwoye et al. (2022) only utilized frame-level information and did not exploit temporal information, Sharma et al. (2022) extended it by designing a plug-and-play Temporal Attention Module (TAM), which computed attention weights for the features of each frame and fused them into the final frame. Pradeep and Sinha (2021) proposed ST-EFRNet, which follows an encoder-decoder-encoder architecture with a fully convolutional network as the decoder. Both encoders enforce learning on the feature space, whereas the decoder learns on the spatio-temporal space.

Some studies attempted to improve recognition performance by adding other auxiliary information. Considering that edge information is the most essential feature of an image due to its invariant property, Qi et al. (2019) used different edge detection operators to generate edge information and trained on it jointly with the original frames. Sánchez-Matilla et al. (2022) compared the effect of different backbones and sets of annotations (namely, phase, scene segment, and instrument presence) on phase estimation performance. They concluded that a data-centric approach that merges data from other sources could improve the performance of workflow recognition. In clinical surgery, surgeons often operate with specific tools in a given surgical phase. To exploit this correlation between tool presence and phase, Mondal et al. (2019) and Jin et al. (2020) performed multi-task learning so that the two tasks benefited each other. Zisimopoulos et al. (2018) used ResNet to generate tool features or binary presence, which were then fed into an RNN for phase recognition. They reported that tool features, beyond binary presence, supplied discriminative information for the LSTM, especially at phase transitions. In addition, motion cues from low-level gestures could be used as a supplement to visual information.

In conclusion, the aforementioned literature has demonstrated the novelty and practicality in the directions of training strategy, lightweight temporal attention, modeling temporal dependencies using GNN, and auxiliary information. Yi and Jiang ( 2019 ), Yi et al. ( 2023 ) and Chen et al. ( 2023a ) enhanced recognition performance by optimizing training strategy, but these algorithms introduced additional manual operations that made the training process more complicated. GNN (Kadkhodamohammadi et al. 2022 ) and lightweight attention module (Sharma et al. 2022 ) have fewer parameters compared to vanilla Transformer, which is crucial in surgical workflow recognition with limited data. Adding auxiliary information can improve the performance but relies on additional annotations (Sánchez-Matilla et al. 2022 ) or image processing (Qi et al. 2019 ).

3.3 Input preprocessing

In the literature reviewed, downsampling of raw video to reduce redundancy and data augmentation were widely used as data preprocessing methods. Lee et al. (2024) proposed a novel undersampling method using short clips with adaptive temporal subsampling to mitigate biased learning caused by class imbalance. With the exception of Xi et al. (2022) and Xi et al. (2023), all of the literature used data augmentation techniques to augment the training data. Furthermore, Ramesh et al. (2023a) proposed an automatic data augmentation method for surgical video data, which incorporated the temporal dimension and was parameterized by only three hyperparameters.
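Uniform temporal downsampling, the simplest of these preprocessing steps, can be sketched as an index selection. The frame rates in the usage note are hypothetical examples rather than values reported by the reviewed papers, and the function name is chosen for this illustration; integer frame rates are assumed.

```python
def downsample_indices(num_frames, source_fps, target_fps):
    """Indices of the frames kept when reducing source_fps to target_fps.

    Both frame rates are assumed to be positive integers.
    """
    if not 0 < target_fps <= source_fps:
        raise ValueError("target_fps must be in (0, source_fps]")
    indices = []
    i = 0
    # The i-th kept frame is the one closest below i * source_fps / target_fps.
    while (i * source_fps) // target_fps < num_frames:
        indices.append((i * source_fps) // target_fps)
        i += 1
    return indices
```

For instance, reducing a 100-frame clip recorded at 25 fps to 1 fps keeps frames 0, 25, 50, and 75, which removes most of the near-duplicate consecutive frames.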

3.4 Regularization and post-processing

The introduction of regularization and post-processing can improve the performance of the model to a certain extent. In this subsection, we systematically review the regularization and post-processing strategies used in the 67 articles to give researchers a clear picture of these techniques.

In clinical surgery, the surgeon usually follows a designated workflow and order to perform the operation. Based on prior knowledge of the sequence of surgical phases, three works (Jin et al. 2018; Zhang et al. 2021b; Pan et al. 2022) utilized analog filtering algorithms to correct the initial predictions. Zia et al. (2018) and Mondal et al. (2019) employed a median filter as a post-processing step to ensure that the predictions of frames within a sliding window were consistent. Czempiel et al. (2020) calculated class weights using median frequency balancing to mitigate the imbalance between phases. Li et al. (2022a) devised two post-processing modules, i.e., a weighted attention module and a valid triplet decoder, to enhance the probability variance among the predicted triplets. The former module measured the importance of each preliminary triplet prediction and its components, which was then utilized by the latter module for the final triplet predictions.
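Two of the techniques above lend themselves to a short illustration: sliding-window median filtering of frame-wise phase predictions and median frequency balancing of class weights. The sketch below is a simplified version under stated assumptions: phases are encoded as integers in procedural order, the window is truncated at the sequence borders, and class frequencies are computed from frame counts over the whole label list. The function names are chosen for this example and do not come from the cited works.

```python
from collections import Counter
from statistics import median


def median_filter(labels, window=5):
    """Replace each label by the median of its sliding window (odd size)."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for t in range(len(labels)):
        lo, hi = max(0, t - half), min(len(labels), t + half + 1)
        neighbourhood = sorted(labels[lo:hi])
        # For truncated (even-length) windows this picks the upper median.
        smoothed.append(neighbourhood[len(neighbourhood) // 2])
    return smoothed


def median_frequency_weights(labels):
    """Class weight = median class frequency / frequency of that class.

    Frame counts are used directly, since dividing every count by the
    same total would cancel out in the ratio.
    """
    counts = Counter(labels)
    med = median(counts.values())
    return {cls: med / count for cls, count in counts.items()}
```

Applied to the noisy sequence `[0, 0, 0, 1, 0, 0, 1, 1, 1, 2, 1, 1]` with a window of 3, the filter removes both isolated jumps and returns six frames of phase 0 followed by six frames of phase 1. The weighting assigns rare phases a weight above 1 and frequent phases a weight below 1.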

Shi et al. ( 2021 ) added two regularizations, namely, spatial and temporal perturbations, to the unlabeled dataset, feeding them into the teacher and student models, respectively. Since the predictions of the two models should be consistent, the models were forced to learn the rich motion cues in the videos by minimizing the consistency loss. Ding and Li ( 2022 ) extracted segment-level and frame-level information to improve the recognition of hard frames by regularizing the predictions of frames and their corresponding segments through a semantic consistency loss. Xia and Jia ( 2021 ) applied spatio-temporal features to recognize two different granularities and diminished the false predictions of ambiguous frames by their interaction. Specifically, they expanded the distance between ambiguous sequences belonging to different steps and closed the distance between those belonging to the same steps. Chen et al. ( 2023b ) proposed the Dual-classifier Sequence Regularization (DSR) to improve network training. Specifically, at the early sequences, the task classifier was regularized by the frame-wise auxiliary classifier, because the early sequences with a limited quantity of past frames in the task classifier provided insufficient temporal knowledge. In turn, the task classifier with rich spatio-temporal information regularized the auxiliary classifier at the late sequences. Park et al. ( 2023 ) proposed Moment Loss to penalize undesirable phase transitions and prevent over-segmentation, which significantly improved the performance of the model. Unlike TeCNO (Czempiel et al. 2020 ), which added regularization loss to force each stage to output as perfect a result as possible, Yue et al. ( 2023 ) designed a refinement loss for their cascade structure. It retained prediction errors in the early stages intentionally, so that the later stages could learn how to extract high-quality context by correcting the errors. 
To incorporate the crucial temporal information embedded in phase transitions, Liu et al. ( 2023a ) projected transition frames onto a phase transition map by a one-dimensional asymmetric Gaussian kernel to provide phase transition-aware supervision for the model. Although global features synthesized all class knowledge, representative features of each triplet class were key to improving performance. Li et al. ( 2023 ) proposed a multi-label mutual channel loss to force each sub-branch to extract class-level local discriminative and diverse features.

The aforementioned studies incorporated regularization and post-processing techniques from various perspectives to help the model make more rational decisions, such as incorporating domain-specific prior knowledge, ensuring consistency, and employing innovative loss functions. Calibration with predefined surgical orders (Jin et al. 2018 ; Zhang et al. 2021b ; Pan et al. 2022 ) is more suitable for phase and step recognition, which are more procedural in nature. However, the scalability of this approach is limited. Consistency includes many aspects, such as temporal consistency, semantic consistency, and predictive consistency. Temporal coherence, which is an aspect of temporal consistency (Zia et al. 2018 ; Mondal et al. 2019 ), serves to prevent over-segmentation. Semantic consistency (Ding and Li 2022 ) calibrates prediction errors of ambiguous frames with the help of semantically consistent information from other levels. Predictive consistency (Shi et al. 2021 ) forces the model to produce consistent outputs under disturbances, thereby obtaining a more robust model. However, temporal coherence and calibration with predefined surgical orders may not be suitable for fine-grained action triplet recognition due to its multi-label and irregular characteristics. Furthermore, devising novel and appropriate loss functions (Li et al. 2023 ; Park et al. 2023 ; Yamlahi et al. 2023 ) can enhance the quality of intermediate features and improve recognition performance.

3.5 Learning strategy

Supervised learning was the most frequently used learning strategy in the reviewed studies (55 of 67 papers). However, supervised learning paradigms require sufficient amounts of labeled data for training. The process of annotating large-scale datasets is time-consuming, repetitive, and tedious. In particular, for surgery videos, annotations need to be provided by medical experts with highly specialized knowledge. However, unlabeled surgery videos are much easier to access. Consequently, there is an urgent need to explore alternative learning paradigms that can leverage a larger number of unlabeled datasets.

Yu et al. ( 2018 ) proposed a teacher-student model in which the teacher model was trained on the ground-truth-annotated data and inferred synthetic labels for unlabeled data. The student model was trained jointly using the ground-truth-annotated data and data with generated synthetic labels. Shi et al. ( 2021 ) first used visual and temporal consistency to generate pseudo-labels that were encoded with rich motion pre-knowledge and were more reliable than traditional labels. Next, the data with artificial labels and pseudo-labels were mixed for supervised training. Ding et al. ( 2023 ) introduced timestamp supervision for phase recognition, reducing annotation time by 74% and achieving performance comparable to full annotation. Meanwhile, they proposed a technique known as Uncertainty-Aware Temporal Diffusion (UATD) to generate trustworthy pseudo labels for training, which could also be used as a plug-and-play method to eliminate ambiguous labels at phase transitions and improve the performance of existing phase recognition methods. Considering that coarse-grained phase labels were easier to annotate, Ramesh et al. ( 2023b ) utilized them as weakly supervised signals to improve the performance of step recognition. The core components of the weakly supervised branch were the step-stage mapping matrix and dependency loss, while the supervised branch supervised step prediction with step labels. Their experimental findings demonstrated that results comparable to a baseline trained on the complete step-annotated dataset could be achieved even when the model was trained with only 50% step-annotated and 50% phase-annotated videos. Thus, the dependence on fully-supervised data was alleviated.

Chen et al. ( 2018 ) trained an unsupervised generative adversarial network (GAN) on a large amount of unlabeled data and then used the pre-trained discriminator as a spatial model. They performed self-supervised pre-training of the temporal components by sorting two given frames into the correct temporal order. Funke et al. ( 2018 ) used a Slow Feature Analysis (SFA)-based approach for self-supervised pre-training and compared the effects of employing three different loss functions, namely contrastive loss, ranking loss, and 1st and 2nd order contrastive loss. They found that the largest accuracy improvement (10.7%) happened when using 1st and 2nd order contrastive loss, especially when using only 20 labeled videos. Considering that the remaining time of the surgery can be obtained directly from the time-stamp of the video, Yengera et al. ( 2018 ) used prediction of the remaining surgery duration as a self-supervised pre-training task. Hirsch et al. ( 2023 ) collected a large unlabeled dataset and investigated the effectiveness of a leading self-supervised learning (SSL) framework, namely Masked Siamese Networks (MSNs), for cholecystectomy phase recognition and optical polyp characterization. The framework demonstrated strong generalization after SSL pre-training, achieving performance with just 50% of the labeled data comparable to that of a baseline trained on the whole labeled datasets. Their ablation study showed that the performance of the model after secondary training was positively correlated with both the scale of the pre-training dataset and the model size. Ramesh et al. ( 2023c ) presented a comprehensive benchmark analysis of four state-of-the-art SSL methodologies in the surgical domain. The authors conducted extensive experiments on Cholec80 and analyzed the impact of various hyperparameter settings, thereby providing valuable insights for domain transfer for SSL methods. Furthermore, the authors evaluated the generalization of MoCo v2 on five distinct surgical datasets using the recommended hyperparameters derived from their experiments, showing excellent performance on all datasets.

Additionally, Kassem et al. ( 2023 ) successfully combined federated learning and self-supervised learning for phase recognition, which guaranteed data privacy and exploited videos without annotations. Neimark et al. ( 2021 ) preliminarily explored the effect of transfer learning on four different laparoscopic procedures and pre-trained their proposed Time-Series Adaptive Network using self-supervised learning on sequence sorting tasks. Eckhoff et al. ( 2023 ) investigated the impact of transfer learning with limited data and observed that employing transfer learning between two relevant upper gastrointestinal procedures could yield expected accuracy, particularly in phases with high overlap. Furthermore, co-training on both procedures could enhance the recognition accuracy of specific phases.

Recently, some researchers have attempted to introduce knowledge distillation into surgical workflow analysis. Zhang et al. ( 2024b ) first introduced self-knowledge distillation into surgical phase recognition, which added no complexity yet improved performance significantly. They explored the impact of reducing training data on the model and observed that even if the scale of training data was reduced by 50%, the model trained with the self-distillation mechanism still achieved performance comparable to that of the same model trained on the complete dataset. This observation demonstrated that self-knowledge distillation could leverage training data effectively, offering a novel insight for mitigating the requirement for large-scale labeled datasets. Yamlahi et al. ( 2023 ) applied self-distillation to surgical action triplet recognition. Unlike Zhang et al. ( 2024b ), who utilized hard labels, soft labels and feature similarity to train the student model, Yamlahi et al. ( 2023 ) only used noisy soft labels generated by the teacher model to train the student model. In their ablation experiment, the addition of self-distillation increased mAP by 3.8% over the baseline, demonstrating that soft labels were more suitable for surgical triplet recognition than hard labels. Gui et al. ( 2024 ) proposed a multi-teacher distillation framework to alleviate the over-learning of the features of predominant triplet classes. They trained three independent teacher models on three sub-task labels (i.e., instrument, verb, target), which were less imbalanced, and subsequently conducted both feature-level and prediction-level distillation to the student model. Their approach achieved significant enhancements in the recognition of minor component classes.
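As a rough illustration of how soft labels from a teacher can supervise a student, the sketch below computes a temperature-softened Kullback-Leibler divergence for a single-label task such as phase recognition. It shows the generic distillation loss only and is not the formulation of Zhang et al. ( 2024b ), Yamlahi et al. ( 2023 ) or Gui et al. ( 2024 ); the function names and the temperature value are assumptions introduced for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class distributions.

    The temperature (hypothetical default of 2.0) flattens both
    distributions so that the student also learns from the teacher's
    relative confidence in the non-predicted classes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )


# The loss is zero when the student reproduces the teacher and positive otherwise.
print(soft_label_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(soft_label_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]))
```

For multi-label triplet recognition, a per-class binary cross-entropy against the teacher's sigmoid outputs would be the analogous choice.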

4 Evaluation metrics and benchmark datasets

In this section, we discuss available evaluation metrics, as well as their application to common datasets used to train and test surgical workflow recognition models.

4.1 Evaluation metrics

Evaluating model performance is an integral part of the pipeline for machine learning, which is beneficial to measure the strengths of proposed algorithms and facilitates comparisons with peers. Confusion matrices and color-coded ribbon illustrations are commonly used to evaluate the model’s predictive performance qualitatively. Common quantitative evaluation metrics used in surgical phase or step recognition include Accuracy ( Acc ), Precision ( PR ), Recall ( RE ), F1 score ( \(F_1\) ), Jaccard score ( JA ), segmental edit score, and segmental F1 score. Average precision (AP) and mean AP (mAP) are commonly used to evaluate the performance of surgical action triplet recognition (Nwoye and Padoy 2022 ).

4.1.1 Frame-level metrics

Acc is defined as the percentage of frames correctly classified in the overall video and is computed at the video level. In contrast, PR , RE , \(F_1\) and JA are calculated for every phase category and then averaged over all the categories to get the corresponding values for the entire video. Denoting Pred and GT as the prediction set and ground truth set of a phase, respectively, the above five metrics are expressed as follows:
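Under the standard set-based definitions, and writing \(|\cdot|\) for the number of frames in a set, the five metrics can be written as below. The symbols \(N_{correct}\) and \(N_{total}\) (the numbers of correctly classified frames and of all frames in the video) are notation assumed here for illustration.

\[
\begin{aligned}
Acc &= \frac{N_{correct}}{N_{total}}, \\
PR &= \frac{|GT \cap Pred|}{|Pred|}, \qquad RE = \frac{|GT \cap Pred|}{|GT|}, \\
F_1 &= \frac{2 \cdot PR \cdot RE}{PR + RE}, \qquad JA = \frac{|GT \cap Pred|}{|GT \cup Pred|}.
\end{aligned}
\]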

4.1.2 Segment-level metrics

Segment-level metrics have also been used for evaluating the performance of surgical workflow recognition (Zhang et al. 2021c ). Segmental F1 score (Lea et al. 2016b ) measures the overlap between predicted and ground-truth segments, thus penalizing over-segmentation. Segmental edit score (Lea et al. 2016b ) evaluates the temporal order of predictions, but allows for small temporal offsets between the ground truth and prediction.

If a sequence of frame-level labels is \(F_{gt}=\{AABBBCC\}\) (where A , B , and C are different labels), then the corresponding segment-level label of this sequence is \(S_{gt} = \{ABC\}\) . Similarly, the segment-level prediction \(S_p\) can be defined. Subsequently, the unnormalized edit score can be calculated by an edit distance \(E_d\) . Then, the unnormalized edit score is normalized using the maximum of \(L_{gt}\) and \(L_p\) , where \(L_{gt}\) and \(L_p\) are the lengths of \(S_{gt}\) and \(S_p\) . The formulation of segmental edit score E can be defined as follows:
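Following the description above, and assuming the commonly used percentage scaling, the segmental edit score can be written as

\[
E = \left(1 - \frac{E_d(S_{gt}, S_p)}{\max(L_{gt}, L_p)}\right) \times 100 .
\]

A minimal sketch of this computation is given below. It assumes that \(E_d\) is the Levenshtein distance with unit costs; the function names are hypothetical.

```python
from itertools import groupby


def to_segments(frame_labels):
    """Collapse runs of identical frame labels: 'AABBBCC' -> ['A', 'B', 'C']."""
    return [label for label, _ in groupby(frame_labels)]


def levenshtein(a, b):
    """Edit distance with unit cost for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def segmental_edit_score(gt_frames, pred_frames):
    """Normalized segmental edit score on a 0-100 scale."""
    s_gt, s_p = to_segments(gt_frames), to_segments(pred_frames)
    longest = max(len(s_gt), len(s_p))
    if longest == 0:
        return 100.0
    return (1 - levenshtein(s_gt, s_p) / longest) * 100


# One misclassified frame splits phase B, so S_p = ABABC versus S_gt = ABC.
print(segmental_edit_score("AABBBCC", "AABABCC"))
```

In this example a single wrong frame yields an edit distance of 2 over a maximum length of 5, i.e. a score of 60, although six of the seven frames are correct.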

Segmental F1 score is calculated with segment-level precision ( \(P_s\) ) and recall ( \(R_s\) ). \(P_s\) and \(R_s\) are computed using true positives, false positives and false negatives, which are obtained by computing temporal Intersection over Union (IoU) between predicted and ground-truth segments and comparing with a threshold k . Segmental F1 score at a certain threshold k can be formulated as follows:
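In the notation above, the score at threshold k takes the usual harmonic-mean form

\[
F1@k = \frac{2 \cdot P_s \cdot R_s}{P_s + R_s} .
\]

The sketch below illustrates one common way to obtain the true positives, false positives and false negatives: each predicted segment is matched to the same-label ground-truth segment with the highest IoU, and each ground-truth segment can be matched at most once. This matching rule and the function names are assumptions for illustration, not a definitive reference implementation.

```python
from itertools import groupby


def segments(frame_labels):
    """Return (label, start, end) runs, with end exclusive."""
    out, start = [], 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        out.append((label, start, start + length))
        start += length
    return out


def segmental_f1(gt_frames, pred_frames, k=0.5):
    """Segmental F1 at IoU threshold k (between 0 and 1)."""
    gt, pred = segments(gt_frames), segments(pred_frames)
    matched = [False] * len(gt)
    tp = fp = 0
    for label, p_start, p_end in pred:
        best_iou, best_idx = 0.0, -1
        for idx, (g_label, g_start, g_end) in enumerate(gt):
            if g_label != label:
                continue
            inter = max(0, min(p_end, g_end) - max(p_start, g_start))
            union = max(p_end, g_end) - min(p_start, g_start)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= k and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1  # no sufficiently overlapping, unmatched ground-truth segment
    fn = matched.count(False)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A single wrong frame fragments phase B into three short predicted segments.
print(segmental_f1("AABBBCC", "AABABCC", k=0.5))
```

Here two of five predicted segments are true positives and one ground-truth segment is missed, so \(P_s = 2/5\), \(R_s = 2/3\) and the score is 0.5, which shows how over-segmentation is penalized even when frame-level accuracy is high.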

4.1.3 Metrics for action triplet recognition

AP can be measured as the area under the precision-recall curve, and mAP can be obtained by averaging the category APs. The per-category AP is first computed across all frames in a given video, and the category AP is then computed by averaging the per-category APs across all videos. Finally, mAP is obtained by averaging the APs of the N categories. The formulas are expressed as follows:
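The two averaging steps described above can be written as follows, where \(AP_c^{(v)}\) denotes the area under the precision-recall curve of category c in video v and \(N_v\) the number of videos; this notation is assumed here for illustration.

\[
AP_c = \frac{1}{N_v} \sum_{v=1}^{N_v} AP_c^{(v)}, \qquad
mAP = \frac{1}{N} \sum_{c=1}^{N} AP_c .
\]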

More specifically, \(mAP_I\) , \(mAP_V\) and \(mAP_T\) are used to evaluate component recognition performance of triplets, while \(mAP_{IV}\) , \(mAP_{IT}\) and \(mAP_{IVT}\) are used to denote the joint recognition performance of instrument-verb, instrument-target and instrument-verb-target.

4.2 Benchmark datasets

The quantity and quality of data determine the ceiling of machine learning. Therefore, it is essential to establish high-quality benchmark datasets. Next, we will look at seven common datasets used in surgical workflow recognition. M2CAI16 and Cholec80 are the two most common benchmark datasets based on laparoscopic surgery. Among the literature we reviewed, 38 articles conducted experiments on these two datasets. The CATARACTS and Cataract-101 datasets are two cataract surgery datasets that have higher inter-class similarity because the microscopic camera only focuses on a limited field of the eye, resulting in a nearly fixed background. CholecT50 is used for fine-grained action recognition in laparoscopic surgery, focusing on instrument-tissue interactions as opposed to simple phase recognition datasets. HeiCo and HeiChole are two recently released laparoscopic surgery datasets with richer annotation information. Their details are summarized in Table 2 and Sects. 4.2.1 – 4.2.7 .

4.2.1 M2CAI16

The M2CAI16 dataset consisted of two sub-datasets: (1) m2cai16-workflow for surgical workflow recognition (Twinanda et al. 2016 ; Stauder et al. 2016 ) and (2) m2cai16-tool for surgical tool detection (Twinanda et al. 2016 ). The first dataset contained 41 laparoscopic cholecystectomy videos recorded at 25 fps with a resolution of \(1920 \times 1080\) . The annotated videos had eight phase classes: Trocar Placement, Preparation, Calot’s Triangle Dissection, Clipping and Cutting of Cystic Duct and Artery, Gallbladder Dissection, Gallbladder Packaging, Cleaning and Coagulation, and Gallbladder Retraction. The second dataset, m2cai16-tool, consisted of 15 videos of cholecystectomy procedures recorded at 25 fps. Every frame in m2cai16-tool was labeled with the binary presence of tools. Some state-of-the-art results on the M2CAI16 dataset are listed in Table 3 .

4.2.2 Cholec80

The Cholec80 dataset (Twinanda et al. 2016 ) consisted of 80 cholecystectomy videos captured at 25 fps with a \(1920 \times 1080\) resolution. This dataset contained two types of annotations, phase and tool presence. The phases were defined as Preparation, Calot Triangle dissection, Clipping and Cutting, Gallbladder Dissection, Gallbladder Packaging, Cleaning and Coagulation, and Gallbladder Retraction. The categories of the tools were Grasper, Bipolar, Hook, Clipper, Scissors, Irrigator, and Specimen Bag. It is noteworthy that since tools are not always obvious in the videos, they will be defined as present as long as more than half of the tool tip is visible. Some state-of-the-art results on the Cholec80 dataset are listed in Table  4 .

4.2.3 CATARACTS

This dataset has 50 videos of phacoemulsification cataract surgeries performed at Brest University Hospital, France (Al Hajj et al. 2019 ). The resolution of the videos was \(1920 \times 1080\) , and the frame rate was approximately 30 fps. The videos had a duration of 10 min and 56 s on average. In this dataset, a total of 21 tools were annotated for usage. Later, Zisimopoulos et al. ( 2018 ) manually labeled the CATARACTS dataset to generate phase annotations, including 14 classes: Access the Anterior Chamber (AAC): Sideport Incision, AAC: Mainport Incision, Implantable Contact Lenses (ICL): Inject Viscoelastic, ICL: Removal of Lens, Phacoemulsification (PE): Inject Viscoelastic, PE: Capsulorhexis, PE: Hydrodissection of Lens, PE: Phacoemulsification, PE: Removal of Soft Lens Matter, Inserting of the Intraocular Lens (IIL): Inject Viscoelastic, IIL: Intraocular Lens Insertion, IIL: Aspiration of Viscoelastic, IIL: Wound Closure, and IIL: Wound Closure with Suture.

4.2.4 Cataract-101

The dataset contains videos of 101 cataract surgeries performed by four different surgeons (Schoeffmann et al. 2018 ). Videos were captured at 25 fps and had a resolution of \(720 \times 540\) . The following phase classes of cataract surgery were used for this dataset: Incision, Viscous Agent Injection, Rhexis, Hydrodissection, Phacoemulsification, Irrigation and Aspiration, Capsule Polishing, Lens Implant Setting-Up, Viscous Agent Removal, Tonifying, and Antibiotics.

4.2.5 CholecT50

CholecT50 consists of 50 endoscopic videos of laparoscopic cholecystectomy, 45 from the Cholec80 dataset and 5 from an in-house dataset of the same surgical procedure (Nwoye et al. 2022 ). The providers downsampled the videos to 1 fps to generate 100.86K frames and annotated 161K triplet instances. A triplet is represented by <instrument, verb, target>, where there are 6 classes of instrument, 10 classes of verbs, 15 classes of targets, and 100 classes of triplets. Moreover, bounding boxes over the instrument tips were annotated for 5 videos. Phase label for each frame was also provided.

4.2.6 HeiCo

The dataset includes 30 laparoscopic videos and corresponding sensor data from three types of laparoscopic surgery, i.e., proctocolectomy, rectal resection, and sigmoid resection (Maier-Hein et al. 2020 ). The videos were recorded in the operating room at a resolution of \(1920 \times 1080\) , and then downsampled to \(960 \times 540\) . All frames were annotated with surgical phases, and over 10,000 frames were annotated with the presence of surgical instruments and corresponding instance segmentation masks. The defined phases are as follows: General Preparation and Orientation in The Abdomen, Dissection of Lymph Nodes and Blood Vessels En Bloc, Retroperitoneal Preparation towards Lower Pancreatic Border, Retroperitoneal Preparation of Duodenum and Pancreatic Head, Mobilization of the Sigmoid Colon and Descending Colon, Mobilization of Splenic Flexure, Mobilization of Transverse Colon, Mobilization of Ascending Colon, Dissection and Resection of the Rectum, Extra-abdominal Preparation of Anastomosis, Intra-abdominal Preparation of Anastomosis, Creation of Stoma, Finalization of Operation, Exception. Note that not all of the phases are included in each surgical procedure. For example, proctocolectomy does not include Retroperitoneal Preparation of Duodenum and Pancreatic Head and Exception. The inclusion relationship between surgical procedures and phases can be found in Table 3 of Maier-Hein et al. ( 2020 ).

4.2.7 HeiChole

The dataset comprised 33 laparoscopic cholecystectomy videos annotated with 7 phases, 4 actions, and 21 tools (Wagner et al. 2023 ). The definition of the phases is identical to that of Cholec80 (see Sect. 4.2.2 ). The videos were recorded at three different surgical centers, with a total duration of 22 h. Among them, 15 videos were recorded at the University Hospital Heidelberg with a 2D camera at a resolution of \(960 \times 540\) pixels and 25 fps. Another 15 videos were recorded at Salem Hospital, and the remaining 3 videos were recorded at the GRN-hospital Sinsheim. Except for 3 videos recorded at a resolution of \(720 \times 576\) and 25 fps at Salem Hospital, the videos from these two centers were captured at a resolution of \(1920 \times 1080\) and 50 fps.

5 Discussion, challenges and development

In this section, we will discuss the literature that has been reviewed. Sections 5.1 – 5.5 will cover the input data, modeling paradigms and network architectures, post-processing and regularization, learning strategies, and evaluation metrics used in the literature, respectively. We will also highlight the challenges associated with these approaches and suggest possible solutions. Additionally, in Sect. 5.6 , we will discuss the challenges of integrating surgical workflow algorithms into clinical systems. Finally, in Sect. 5.7 , we will focus on research trends in surgical workflow analysis, which involve the anticipation of future activities and large language model assistants. A concise summary of the results is presented in Table 5 .

5.1 Impacts of data

5.1.1 Data types and generalization

As can be seen from the second column of Table 1 , Cholec80 is the most frequently used dataset, followed by the M2CAI16 dataset, both of which consist of laparoscopic cholecystectomy videos. Some state-of-the-art results on the two datasets are listed in Tables 3 and 4 . In the studies we reviewed, all studies except Xia and Jia ( 2021 ) and Eckhoff et al. ( 2023 ) were developed and evaluated on a single type of procedure rather than a synthesis of different types. However, general medical AI algorithms are expected to be generic across different types of surgical procedures, not just for a specific type. Therefore, it is necessary to construct a large-scale dataset integrating multiple types for surgical workflow analysis. An existing but suboptimal example is the HeiCo dataset mentioned in Sect. 4.2.6 , which contains three types of laparoscopic surgery, namely proctocolectomy, rectal resection, and sigmoid resection. Unfortunately, HeiCo provided only 30 videos in total, which was not even half of the 80 videos that were provided by Cholec80.

Another problem caused by data type is model generalization. Good generalization means that the model can match or even exceed its training performance when deployed in real-world scenarios. However, different surgeons have different operating styles and levels of experience, and surgical instruments and video recording equipment vary from one medical center to another. These uncertainties lead to differences in the probability distributions between the dataset and deployment scenarios, which challenges the generalization capability of the model. Kirtac et al. ( 2022 ) revealed a model selection bias between a single public dataset and real-world data. Bar et al. ( 2020 ) explored the impact of data scale and of data from different medical centers on the algorithm’s generalization. Their experiments showed that the most cost-effective improvement could be achieved when training with more than 100 but fewer than 1000 videos. Furthermore, despite the data imbalance across centers, the model was able to maintain stable performance on test data from different centers by using a large and diverse training dataset. However, there was still a slight bias toward the centers with more training data.

5.1.2 Data modalities

In addition, the introduction of data from other modalities can be used to improve model performance. There is a specific correlation between the different modalities, as they are descriptions of the same object from different perspectives. Through the complementary fusion of data across modalities, the algorithm can enhance the effectiveness and robustness of data mining tasks. In the studies we reviewed, all work except Xi et al. ( 2023 ) modeled only the visual modality. Xi et al. ( 2023 ) transformed surgical triplet recognition into multi-modal video reasoning. Language information helped model training in two places: (1) a pre-defined template was employed to generate text descriptions for each triplet class, which were then fed into CLIP to generate triplet prompt features; (2) the BLIP model was employed to produce captions for each frame, which were then used to generate initial caption features by CLIP. Other modalities, such as kinematic data, if available, can also be considered for model training.

5.1.3 Data pre-processing

Usually, providers of public datasets have preprocessed the data as much as possible (such as data cleaning, annotation, normalization, etc.), so that users can use the dataset for analysis and modeling directly.

In the annotation stage, the definition of phases or steps is an aspect that requires attention. Apart from the segments that can be clearly represented by professional surgical terms, there may be some special segments, such as transition segments and out-of-body segments. Nevertheless, surgical phase definitions from public datasets are usually continuous. For instance, in Cholec80, a transition segment or out-of-body segment is still annotated as one of the 7 phases. This approach, while simplifying the annotation process, inadvertently injects noise into model training. Therefore, these special segments should not be annotated crudely. The inherent ambiguity of transitions can be mitigated through consensus-building methods, such as expert panel discussions or averaging multiple personal annotations (Demir et al. 2023 ). To deal with these segments uniformly, Zhang et al. ( 2021b ) proposed to annotate them with a “Not a Phase” label. This intermediate label might effectively alleviate the noise caused by forcing these segments into predefined phases.

Data preprocessing, particularly data augmentation that can enhance generalization, is another crucial aspect. Data augmentations for natural static images, such as rotations, random cropping and color jitter, were widely used in the literature we reviewed to mitigate the challenge posed by limited surgical data. However, applying them directly to surgical video may result in suboptimal performance, because they may generate some irrelevant or illogical frames (Garcea et al. 2023 ) and do not take temporal dependence constraints and domain-specific knowledge into account at all. Additionally, deep generative models like GANs, VAEs, and diffusion models have been explored as an emerging data augmentation technique in medical imaging (Garcea et al. 2023 ). Although they have higher costs than the traditional data augmentation methods mentioned above, they can generate data that are more diverse and consistent with the real distribution. However, to the best of our knowledge, this approach remains unexplored in surgical workflow analysis. Finally, it is worth noting that whatever data augmentation method is used, it must comply with ethical and privacy standards.

5.2 Modeling paradigms and network architectures

Spatial feature modeling has advantages in computational efficiency and simplicity. However, when spatial configuration alone is not sufficient to eliminate ambiguity, spatial feature modeling may result in a large number of false positives. This methodology may be more suitable for static tasks, such as surgical tool detection (Song et al. 2024 ) and anatomical structure segmentation (den Boer et al. 2023 ).

Surgical video is a consecutive representation of moving visual images. Compared with a static frame, it has an extra temporal axis, so it contains rich temporal information that can be utilized. Spatio-temporal feature modeling can fully utilize temporal information to provide a more comprehensive understanding of the surgical workflow, but it increases computational complexity and is sensitive to the length and quality of sequential data (Jin et al. 2021 ). As can be seen from Fig.  4 a, most studies tend to choose spatio-temporal feature modeling as this paradigm is more general.

Whether spatial feature modeling or spatio-temporal feature modeling, the majority of literature tended to employ CNN as the primary spatial feature extractor. However, the limitation of CNN lies in its inherent inductive bias. For deep spatial semantics, Transformer has been chosen by recent papers due to its capacity for learning global interactions. This is particularly evident in surgical action recognition (Li et al. 2022a ; Nwoye et al. 2022 ; Sharma et al. 2022 ), where the Transformer could capture instrument-tissue interactions better.

For temporal components, Transformer and TCN have gradually replaced RNN as depicted in Fig. 4 . Unlike RNN, TCN does not have to build memories and supports parallel computing. Additionally, RNN may exhibit long-term memory decay, which is detrimental for longer surgical videos. In contrast, TCN achieves a large receptive field through layer stacking, while vanilla Transformer and non-local block can capture long-term dependencies directly. Furthermore, a unique advantage of Transformer is that it can realize unified modeling of spatial and temporal, visual and text data (Xi et al. 2023 ), which lays the foundation for multi-modal fusion. Although vanilla Transformer has quadratic complexity, its demand for computational resources and abundant surgical samples can be mitigated through sparse attention (Liu et al. 2023a ) or other tricks. In addition, GNN has fewer parameters than vanilla Transformer and is an option to avoid overfitting in cases of limited samples.
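To make the remark about layer stacking concrete, the sketch below computes the receptive field of a stack of dilated one-dimensional convolutions, assuming one convolution per layer with stride 1. The kernel size of 3 and the doubling dilation schedule are illustrative assumptions and are not taken from any particular reviewed model.

```python
def tcn_receptive_field(kernel_size, dilations):
    """Receptive field (in frames) of stacked dilated 1D convolutions.

    Each layer with dilation d widens the receptive field by
    (kernel_size - 1) * d frames; stride 1 is assumed throughout.
    """
    return 1 + (kernel_size - 1) * sum(dilations)


# Ten layers with dilations 1, 2, 4, ..., 512 versus ten undilated layers.
doubling = [2 ** layer for layer in range(10)]
print(tcn_receptive_field(3, doubling))   # 2047
print(tcn_receptive_field(3, [1] * 10))   # 21
```

With doubling dilations the receptive field grows exponentially with depth, whereas without dilation it grows only linearly, which is why dilated stacks can cover long videos with few layers.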

5.3 Regularization and post-processing

The incorporation of domain-specific prior knowledge for post-processing or regularization can be customized according to specific task requirements to help the model adapt better to specific scenarios and conditions. For instance, incorrect predictions can be calibrated according to the order of surgical phases to improve the performance of the model (Jin et al. 2018 ; Zhang et al. 2021b ; Pan et al. 2022 ). In addition, the use of temporal consistency to force the output to be consistent within a small temporal window can also be advantageous for surgical phase or step recognition (Zia et al. 2018 ; Mondal et al. 2019 ). However, these two common methods may not be suitable for the fine-grained task of surgical action triplet recognition, because its action transitions are irregular. Meanwhile, an excessive focus on consistency may result in delayed responses to real-time changes. For instance, Feng et al. ( 2024 ) found that the incorrect predictions in Ding and Li ( 2022 ) persisted for a longer period due to hierarchical consistency. This means that the model may not be able to adapt to the new phase in time when unexpected events arise during surgery.

As the work summarized in Sect. 3.4 shows, integrating post-processing or regularization techniques can benefit model training. However, they may necessitate additional computational resources and storage space. Especially in real-time applications, complex post-processing could result in increased latency and impact the user experience. At the same time, as algorithms and datasets are updated, the original post-processing and regularization strategies may need to be adjusted frequently, which will also increase the complexity and cost of model maintenance.

5.4 Options of learning strategies

Supervised learning is still one of the most frequently used learning paradigms in machine learning. However, its reliance on large amounts of labeled data limits its performance in medical artificial intelligence. When attempting to improve the performance of a model by deepening the network, insufficient training data will lead to overfitting of the model, resulting in performance degradation (Yi et al. 2023 ). Because annotating surgical videos is tedious and costly and needs to be carried out by experts with a relevant professional background, surgical workflow datasets are much smaller than natural image datasets such as ImageNet.

In recent years, some research has been undertaken using non-supervised learning for surgical workflow analysis. The percentage of studies we reviewed that used non-supervised learning strategies is approximately 16%, which is much lower than that of studies using supervised learning. Two papers (Bodenstedt et al. 2019 ; Shi et al. 2020 ) used active learning to iteratively select the most informative unlabeled data for annotation. Chen et al. ( 2018 ), Funke et al. ( 2018 ), Yengera et al. ( 2018 ), Neimark et al. ( 2021 ), Kassem et al. ( 2023 ), Eckhoff et al. ( 2023 ), Hirsch et al. ( 2023 ) and Ramesh et al. ( 2023c ) employed self-supervised learning to pretrain on large amounts of unlabeled data to obtain strong representations and facilitate downstream tasks with few labels. These methods alleviate the scarcity of annotated data and achieve results comparable to those of supervised learning. Also, the application of knowledge distillation for reducing data requirements has been investigated and demonstrated to be feasible by Zhang et al. ( 2024b ), Yamlahi et al. ( 2023 ) and Gui et al. ( 2024 ).

Multi-task learning has been proven effective in computer vision. In the studies we reviewed, when additional annotations such as tool presence, other granularities, or action triplets are available, multi-task learning with related tasks as auxiliary tasks allows features from different tasks to benefit from each other. These studies usually embed the feature representations of multiple tasks into a shared semantic space and then extract task-specific representations through task-specific layers, an approach known as hard sharing. Hard sharing is simple to implement and suitable for strongly correlated tasks, but it often performs poorly on weakly correlated tasks (Zhang and Yang 2022). Tool usage is the more common supporting information in surgical phase or step recognition. Although the extra annotations for tool presence add to the workload, they require only a little expertise to complete. Given that surgical triplet labels contain three components, multi-task learning has become the mainstream approach for surgical action triplet recognition. It is worth noting that blindly adopting the same feature processing strategy for different subtasks may lead to suboptimal results, i.e., task-specific processing is required. For example, considering that the temporal dependency of phase recognition may differ from that of tool usage detection, Tao et al. (2023) designed banded causal masks with task-specific widths for the two tasks.
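To make the idea of a banded causal mask concrete, the following sketch builds such a mask for two tasks with different widths. It is an illustration only and not the implementation of Tao et al. (2023): the function name, the sequence length, and the two widths are assumptions chosen for the example.

```python
import numpy as np

def banded_causal_mask(seq_len: int, width: int) -> np.ndarray:
    """Boolean attention mask; entry (i, j) is True if frame i may attend to frame j.

    Causal: frame i never sees future frames (j > i).
    Banded: frame i sees only itself and the `width - 1` frames before it.
    """
    idx = np.arange(seq_len)
    offset = idx[:, None] - idx[None, :]  # i - j
    return (offset >= 0) & (offset < width)

# Hypothetical widths: a longer temporal context for one task than for the other.
phase_mask = banded_causal_mask(seq_len=6, width=4)
tool_mask = banded_causal_mask(seq_len=6, width=2)
print(phase_mask.astype(int))
print(tool_mask.astype(int))
```

Each task would then apply its own mask inside the attention operation, so that the amount of past context is chosen per task rather than shared.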

In addition, although the methods based on deep learning have achieved remarkable performance, they are not a panacea. For instance, in cases where the dataset is small, the features are evident, or better interpretability is required, deep learning may not be suitable. In such situations, traditional machine learning strategies may be considered.

5.5 Evaluation metrics

In the field of surgical workflow analysis, accuracy, precision and recall are the most commonly used evaluation metrics. However, these metrics, along with the others mentioned in Sect. 4.1.1, evaluate performance only at the frame level and give inadequate consideration to other aspects, such as class imbalance and temporal information. They therefore fall short of the requirements of actual clinical application.

In terms of class imbalance, a model tends to learn majority classes better than minority classes. Even if the minority classes contain more mispredictions, these reduce the final results only slightly. If a model with high overall accuracy but poor performance in recognizing the minority phases is applied in the clinic, it may lead to medical accidents, because every phase must be recognized correctly in actual application. To alleviate the problems caused by class imbalance, common approaches are to use a specialized loss function or data sampling strategy, such as weighted cross-entropy loss (Czempiel et al. 2020; Fang et al. 2022), focal loss, or the Synthetic Minority Oversampling Technique (SMOTE) (Zhang et al. 2021b). These approaches help mitigate the negative impact of class imbalance during model training. In the model selection stage, a more clinically applicable model can be picked by calculating the accuracy of each class individually (Jin et al. 2018; Pan et al. 2022; Shi et al. 2022) or by paying more attention to the F1 score and balanced accuracy (Brodersen et al. 2010), which reflect performance under class imbalance.
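A small worked example shows why overall accuracy can hide poor minority-class performance. The sketch below computes per-class recall, balanced accuracy (taken here as the mean of per-class recall), and inverse-frequency class weights of the kind often used with weighted cross-entropy. The label sequences are hypothetical, and the weighting formula is one common choice rather than the one used in the cited papers.

```python
import numpy as np

def per_class_recall(y_true, y_pred, num_classes):
    """Fraction of frames of each class that were predicted correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = np.full(num_classes, np.nan)  # NaN marks classes absent from y_true
    for c in range(num_classes):
        in_class = y_true == c
        if in_class.any():
            recalls[c] = np.mean(y_pred[in_class] == c)
    return recalls

def balanced_accuracy(y_true, y_pred, num_classes):
    """Mean of per-class recall over the classes that occur in y_true."""
    return float(np.nanmean(per_class_recall(y_true, y_pred, num_classes)))

def inverse_frequency_weights(y_true, num_classes):
    """Class weights proportional to 1 / class frequency (illustrative choice)."""
    counts = np.bincount(np.asarray(y_true), minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * np.maximum(counts, 1.0))

# Hypothetical video: nine frames of a long phase (0), one frame of a short phase (1).
y_true = [0] * 9 + [1]
y_pred = [0] * 10  # the model never predicts the short phase

accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
print(accuracy)                                  # overall accuracy looks high
print(per_class_recall(y_true, y_pred, 2))       # the short phase is never recognized
print(balanced_accuracy(y_true, y_pred, 2))      # balanced accuracy exposes this
print(inverse_frequency_weights(y_true, 2))      # rare class gets a larger weight
```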

Most of the literature we reviewed incorporated temporal modeling. In video data, temporal order and temporal smoothness are two common aspects of temporal consistency that need to be taken into consideration. For example, in the Cholec80 dataset, the ‘Preparation’ phase cannot occur after the ‘Calot Triangle Dissection’ phase, so a model that predicts these phases in the opposite temporal order may be suboptimal. As for temporal smoothness, most phases last relatively long, and a robust model should identify the correct phase class steadily and continuously over this period. However, most of the literature we reviewed used only frame-level metrics for evaluation and ignored the segment-level ones in Sect. 4.1.2. Moreover, to the best of our knowledge, quantitative evaluation of the temporal performance of surgical action triplet recognition is also missing. Although temporal order and temporal smoothness can be shown to a certain extent by the color-coded ribbon illustrations in these papers, such qualitative visualization is not accurate enough.
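The two aspects of temporal consistency can be quantified with simple diagnostics, sketched below. These are illustrative helpers and not the segment-level metrics of Sect. 4.1.2. The sketch assumes a strictly linear workflow in which phase indices follow the expected order, and the label sequences are hypothetical.

```python
from itertools import groupby

def to_segments(labels):
    """Collapse a frame-level label sequence into (label, length) runs."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(labels)]

def count_order_violations(labels):
    """Count transitions that jump back to an earlier phase index.

    Assumes phase indices increase with the expected workflow order.
    """
    phases = [label for label, _ in to_segments(labels)]
    return sum(1 for prev, cur in zip(phases, phases[1:]) if cur < prev)

def over_segmentation(pred, truth):
    """Number of extra predicted segments relative to the ground truth."""
    return len(to_segments(pred)) - len(to_segments(truth))

truth = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
pred = [0, 0, 1, 0, 1, 1, 1, 2, 1, 2]

frame_accuracy = sum(t == p for t, p in zip(truth, pred)) / len(truth)
print(frame_accuracy)                  # frame-level view
print(to_segments(pred))               # fragmented prediction
print(count_order_violations(pred))    # backward jumps in the predicted workflow
print(over_segmentation(pred, truth))  # extra segments caused by flickering
```

In this toy case the frame-level accuracy alone does not reveal that the prediction flickers between phases and twice jumps back to an earlier phase.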

To address this weakness in evaluation metrics, Dergachyova et al. (2016) proposed three new metrics and an error estimation method. These metrics are more informative and more representative of real application-based requirements. Studying confidence intervals for accuracy and other metrics is another way to evaluate the usability of a model quantitatively (Guo et al. 2023). In short, the community needs more innovative metrics that evaluate relevant aspects beyond frame-level performance in order to meet practical clinical scenarios.
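One way to attach a confidence interval to a reported accuracy is a percentile bootstrap over per-video scores, sketched below. The choice of the bootstrap, the resampling unit (whole videos), the number of resamples, and the example scores are all assumptions made for illustration and are not prescribed by the cited work.

```python
import numpy as np

def bootstrap_ci(video_scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean per-video score."""
    scores = np.asarray(video_scores, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample videos with replacement; one row of indices per bootstrap replicate.
    idx = rng.integers(0, len(scores), size=(n_resamples, len(scores)))
    means = scores[idx].mean(axis=1)
    lower, upper = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(scores.mean()), float(lower), float(upper)

# Hypothetical per-video accuracies on a test set of eight videos.
video_acc = [0.91, 0.88, 0.95, 0.79, 0.93, 0.86, 0.90, 0.84]
mean_acc, ci_low, ci_high = bootstrap_ci(video_acc)
print(f"accuracy {mean_acc:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

Reporting the interval alongside the mean makes it easier to judge whether a difference between two models exceeds the variation across videos.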

Furthermore, there are some potential inconsistencies in evaluation metrics that lead to unfair comparisons. Relaxed metrics, which do not evaluate prediction errors within a 10-second window around phase boundaries, were widely used in the literature we reviewed, for example by Gao et al. (2021) and Jin et al. (2021) [more papers can be found in Funke et al. (2023)]. Unfortunately, most authors did not declare the use of relaxed metrics in their papers. Additionally, the mean and standard deviation may also be calculated differently. For example, Czempiel et al. (2020) computed them by averaging fivefold cross-validation results, while most studies averaged video-wise results. Researchers may overlook these inconsistencies and compare performance unfairly. Recently, these inconsistencies on the Cholec80 benchmark have been discussed in detail by Funke et al. (2023) and should be taken seriously by researchers.
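The effect of a relaxed boundary can be seen in a simplified sketch. It is not the evaluation script used on Cholec80, which applies additional rules; here a prediction within `window` frames of a ground-truth boundary is simply accepted if it equals either of the two phases meeting at that boundary. The function names, the window given in frames, and the label sequences are assumptions.

```python
import numpy as np

def strict_accuracy(truth, pred):
    """Plain frame-level accuracy."""
    truth, pred = np.asarray(truth), np.asarray(pred)
    return float(np.mean(truth == pred))

def relaxed_accuracy(truth, pred, window):
    """Accuracy that forgives confusions between neighbouring phases near a boundary.

    Simplified illustration: within `window` frames on either side of a
    ground-truth boundary, predicting either adjacent phase counts as correct.
    """
    truth, pred = np.asarray(truth), np.asarray(pred)
    correct = truth == pred
    # Index of the first frame of each new ground-truth phase.
    boundaries = np.flatnonzero(truth[1:] != truth[:-1]) + 1
    for b in boundaries:
        start, stop = max(0, b - window), min(len(truth), b + window)
        neighbours = (truth[b - 1], truth[b])
        correct[start:stop] |= np.isin(pred[start:stop], neighbours)
    return float(np.mean(correct))

truth = [0] * 5 + [1] * 5
pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]  # errors at frames 1, 4 and 6

print(strict_accuracy(truth, pred))              # all three errors count
print(relaxed_accuracy(truth, pred, window=2))   # errors at frames 4 and 6 are forgiven
```

Because the two numbers differ for the same prediction, results reported with and without relaxation are not directly comparable.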

5.6 Challenges of deployment to clinical systems

Laparoscopic cholecystectomy is one of the most common surgical procedures and has low complexity and robust standardization, so most of the literature has attempted workflow recognition based on it. Even so, these techniques have hardly been translated into clinical applications. This is because many gaps and challenges remain in integrating new laboratory technology into complex existing clinical systems.

An important barrier to the promotion of AI technologies in healthcare is the lack of relevant standards, including data standards and regulatory and ethical standards. For example, Cholec80 was curated deliberately (Kirtac et al. 2022), i.e., phases that occurred irregularly were excluded, so it presents a linear workflow. However, there is no consensus on whether this subjective approach to data processing should be widely adopted. It widens the gap with real-world surgery and has a negative impact on generalization. In terms of regulatory standards, AI algorithms in the United States must be certified by the Food and Drug Administration (FDA) before they can be used clinically, and the approved algorithms mainly target radiology for early diagnosis (Topol 2019). However, surgical workflow analysis algorithms involve more complex scenarios as well as privacy and ethical issues, and therefore require more robust policies and guidelines to determine whether to approve them for clinical use. The relevant agencies must also strike a delicate and difficult balance between protecting patients and promoting innovation: over-regulation may stifle innovation and progress in this area, while insufficient regulation may pose a risk to patient safety (He et al. 2019). Therefore, they need to promote the implementation of AI technologies in healthcare by clarifying, as soon as possible, the requirements for data privacy, algorithm transparency and interpretability, safety assessment and effectiveness verification.

Another major challenge in bringing algorithms into clinical use is interoperability. The HIMSS dictionary defines interoperability in healthcare as the ability of health information systems to work together within and across organizational boundaries to advance the health status of, and the effective delivery of healthcare for, individuals and communities (Information and Society 2017). If a surgical workflow analysis algorithm is applied only to post-operative scenarios, such as skill assessment, the interoperability requirements may be lower. However, when the algorithm is used online during surgery, its practical effectiveness and security will be severely limited if no effort is made to optimize interoperability. First, advances in assistive systems have brought a wealth of data, but these data are often isolated and fragmented. For data interoperability, the algorithm needs to be integrated seamlessly with the existing surgical system so that it can exchange information and feedback with medical devices and clinical information systems. In addition, data should be shareable and reusable across different algorithms and devices to improve the efficiency of data exchange. In robotic surgery, data and functions are distributed over various technical actuators. Vision algorithms may need to be combined with other sensors to provide feedback signals to each actuator in the control system so that all parts of the robot work in coordination, thereby achieving physical interoperability as well. Moreover, the vocabulary used for class definition is rarely standardized and reproducible, leading to poor interoperability of data (Timoh et al. 2023). For example, CATARACTS and Cataract-101 do not use the same phase definitions. This may explain why there are currently few successful cases in the field of surgical data.

In addition, the lack of manufacturer-independent interoperability is one of the reasons that hinders the deployment of collaborative assistive systems (Kasparick et al. 2018 ). The algorithm may be deployed on devices produced by different manufacturers, so standard interfaces and protocols for medical devices, such as OpenICE, are formulated and followed to ensure the compatibility of algorithms on different platforms. Since interoperability allows the various modules in the system to communicate fully, algorithms have the opportunity to access other sensor signals apart from visual information. Therefore, investigating multimodal algorithms, like DiPietro et al. ( 2016 ) and Sarikaya et al. ( 2018 ), allows the benefits of interoperability to be fully exploited.

When selecting a trained model for deployment in a clinical setting, it is important to consider more than just accuracy. Relying solely on accuracy can lead to issues, as discussed in the previous section. It is therefore worth exploring methods for choosing a model that is better suited to clinical environments. Furthermore, engineers may need to optimize inference efficiency to meet real-time demands. A recent study (Mascagni et al. 2023) presented an early-stage clinical evaluation of the feasibility of deploying a deep neural network toolkit (SurgFlow) in the OR, assessing its malfunction rate and clinical value. The authors used TensorRT to optimize several computationally expensive deep networks to meet the computational requirements of the OR. It is also crucial to develop a user-friendly interface and provide educational training for successful implementation. In conclusion, implementing surgical workflow algorithms is a complex systems engineering task that necessitates collaboration among engineers, clinicians, and domain experts. Ultimately, this will enable advancements in modern operating rooms.

5.7 Interesting areas of future development

5.7.1 Surgical activity anticipation

Most existing research in surgical workflow analysis has focused on real-time recognition of the current or previous state. This is more useful for postoperative assessment, generation of operative reports, and education and training of surgeons.

However, in clinical surgery, the surgical plan is not fixed and must be adjusted as needed according to the patient’s specific condition and the operation’s progress. Many inexperienced surgeons may need help to update the surgical plan quickly and flexibly based on the information provided by the CAS system. To adequately warn of and prevent intra-operative adverse events, the CAS system must be able to anticipate future activities. Earlier works only reported predicting some simple factors, such as remaining surgery duration (Twinanda et al. 2018) and instrument usage [104]. Surgical activity prediction offers excellent potential for early warning of surgical complications, reducing surgical errors, and enhancing patient safety (Maier-Hein et al. 2017). Recently, there has been a gradual rise in research on surgical workflow anticipation. Ban et al. (2022) proposed a network that used a GAN to jointly predict future surgical phases and phase transitions. They also quantified the subjective plausibility of the predictions through a survey of surgeons to verify the model’s predictions. Wang et al. (2022) proposed a network suitable for the real-time prediction of surgical phases in an Internet of Medical Things (IoMT) environment to guide the next stage of surgery. Yuan et al. (2022) modeled anticipation as a real-time remaining time regression problem.

5.7.2 Employing large language model as training assistant

Leveraging the power of large language models (LLMs) to facilitate surgical workflow analysis may be an interesting direction. Recently, LLMs like ChatGPT have made revolutionary breakthroughs in natural language comprehension and generation. Despite these advances, the success of LLMs cannot be extended effortlessly to surgical workflow analysis tasks because of the inherent disparities between language and vision. One potential idea is to let Visual-Language Models (VLMs) serve as a bridge between LLMs and surgical workflow recognition. This idea has been explored by Xi et al. (2023). The authors transformed surgical triplet recognition into visual prompt generation from large-scale VLMs and explicitly decomposed the task into a series of video reasoning processes. Nevertheless, since both BLIP and CLIP contain no medical knowledge, the authors calibrated caption features with the assistance of the BioMed Language Model [BioMedLM, Bolton et al. (2024)] to ensure semantic consistency between visual frame features and frame caption features. Their ablation experiments demonstrated a significant improvement in training when large language models embedded with extensive medical knowledge were incorporated.

An obvious shortcoming of this work (Xi et al. 2023) is that BioMedLM is not an LLM specific to the surgical domain. Hence, an intuitive idea is to employ an LLM specific to the surgical domain. For instance, Bombieri et al. (2023b) released a novel LLM fine-tuned on annotated surgical procedural texts (Bombieri et al. 2023a) and demonstrated strong performance on multiple downstream language tasks (Bombieri et al. 2023c). They also released a publicly accessible language dataset for procedural surgical workflow analysis built from textbooks (Bombieri et al. 2023a). This dataset was annotated using their proposed Robotic-Surgery Procedural Framebank with rich content, such as actions, instruments, the anatomical part being targeted, and spatial and temporal relationships. This implies that researchers can also use this dataset to fine-tune powerful medical LLMs that are not specific to surgery, such as Med-PaLM (Singhal et al. 2023) and BioMedLM (Bolton et al. 2024).

Until large vision models for surgery become available, using knowledgeable LLMs as training assistants can be a worthwhile option.

6 Conclusion

This survey lists and analyzes representative state-of-the-art deep learning-based surgical workflow recognition algorithms published from Jan 2018 to Apr 2024. It is evident that deep learning plays a preeminent role in surgical workflow analysis. The scarcity of labeled data and the lack of diversity in procedure types are the main factors that hinder model performance and generalization. Public datasets with extensive data and diverse procedures are required in the future. Additional annotations or modalities may improve model performance, while unreasonable label definitions and data augmentation will introduce noise. Spatio-temporal feature modeling was employed by most of the papers we reviewed. The mainstream approach to spatial feature extraction was to use CNNs, with a minority using 3D CNNs or Transformers. Temporal dependencies were modeled in various ways, including RNNs, TCNs, non-local operations, Transformers, and others. Some regularization mechanisms or post-processing algorithms based on prior knowledge have also been incorporated to improve recognition performance, but complex post-processing may not be desirable in real-time applications. Since annotating surgical videos is time-consuming and requires specialized knowledge from seasoned experts, many algorithms based on active learning, self-supervised learning or knowledge distillation have been explored to alleviate the shortage of labeled data. Accuracy, precision, and recall, which capture class imbalance and temporal performance poorly, were commonly reported for phase and step recognition, while segment-level metrics were frequently overlooked despite their significance for temporal performance. Furthermore, the calculation of metrics in many papers lacks transparency, leading to invalid comparisons. Challenges such as insufficient regulatory standards and interoperability are impeding the integration of surgical workflow algorithms into clinical systems. Moreover, surgical activity anticipation and the use of LLMs as training assistants are interesting directions for future work that are instrumental in building more robust CAS systems.

Data availability

No datasets were generated or analysed during the current study.

References

Al Hajj H, Lamard M, Conze PH, Roychowdhury S, Hu X, Maršalkaitė G, Zisimopoulos O, Dedmari MA, Zhao F, Prellberg J, Sahu M, Galdran A, Araújo T, Vo DM, Panda C, Dahiya N, Kondo S, Bian Z, Vahdat A, Bialopetravičius J, Flouty E, Qiu C, Dill S, Mukhopadhyay A, Costa P, Aresta G, Ramamurthy S, Lee SW, Campilho A, Zachow S, Xia S, Conjeti S, Stoyanov D, Armaitis J, Heng PA, Macready WG, Cochener B, Quellec G (2019) CATARACTS: challenge on automatic tool annotation for cataRACT surgery. Med Image Anal 52:24–41. https://doi.org/10.1016/j.media.2018.11.008

Ban Y, Rosman G, Ward T, Hashimoto D, Kondo T, Iwaki H, Meireles O, Rus D (2021) Aggregating long-term context for learning laparoscopic and robot-assisted surgical workflows. In: 2021 IEEE international conference on robotics and automation (ICRA). pp 14531–14538

Ban Y, Rosman G, Eckhoff JA, Ward TM, Hashimoto DA, Kondo T, Iwaki H, Meireles OR, Rus D (2022) Supr-Gan: surgical prediction GAN for event anticipation in laparoscopic and robotic surgery. IEEE Robot Autom Lett 7(2):5741–5748. https://doi.org/10.1109/LRA.2022.3156856

Bar O, Neimark D, Zohar M, Hager G, Girshick R, Fried G, Wolf T, Asselmann D (2020) Impact of data on generalization of AI for surgical intelligence applications. Sci Rep. https://doi.org/10.1038/s41598-020-79173-6

Bharathan R, Aggarwal R, Darzi A (2013) Operating room of the future. Best Pract Res Clin Obstet Gynaecol 27(3):311–322. https://doi.org/10.1016/j.bpobgyn.2012.11.003

Birkhoff D, Dalen ASH, Schijven M (2021) A review on the current applications of artificial intelligence in the operating room. Surg Innov 28:611–619. https://doi.org/10.1177/1553350621996961

Blum T, Feußner H, Navab N (2010) Modeling and segmentation of surgical workflow from laparoscopic video. In: Medical image computing and computer-assisted intervention: MICCAI ... International conference on medical image computing and computer-assisted intervention, vol 13. pp 400–407. https://doi.org/10.1007/978-3-642-15711-0_50

Bodenstedt S, Rivoir D, Jenke A, Wagner M, Breucha M, Müller B, Mees S, Weitz J, Speidel S (2019) Active learning using deep Bayesian networks for surgical workflow analysis. Int J Comput Assist Radiol Surg 14:1079–1087. https://doi.org/10.1007/s11548-019-01963-9

Bolton E, Venigalla A, Yasunaga M, Hall D, Xiong B, Lee T, Daneshjou R, Frankle J, Liang P, Carbin M, Manning CD (2024) BioMedLM: a 2.7B parameter language model trained on biomedical text. http://arxiv.org/abs/2403.18421

Bombieri M, Rospocher M, Ponzetto S, Fiorini P (2023a) The robotic-surgery propositional bank. Lang Resour Eval. https://doi.org/10.1007/s10579-023-09668-x

Bombieri M, Rospocher M, Ponzetto S, Fiorini P (2023b) SurgicBERTa: a pre-trained language model for procedural surgical language. Int J Data Sci Anal 18:1–13. https://doi.org/10.1007/s41060-023-00433-5

Bombieri M, Rospocher M, Ponzetto SP, Fiorini P (2023c) Machine understanding surgical actions from intervention procedure textbooks. Comput Biol Med 152:106415. https://doi.org/10.1016/j.compbiomed.2022.106415

Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010) The balanced accuracy and its posterior distribution. In: 2010 20th international conference on pattern recognition. pp 3121–3124

Charriere K, Quellec G, Lamard M, Martiano D, Cazuguel G, Coatrieux G, Cochener B (2016) Real-time multilevel sequencing of cataract surgery videos. In: 2016 14th international workshop on content-based multimedia indexing (CBMI). pp 1–6

Chen Y, Sun Q, Zhong K (2018) Semi-supervised spatio-temporal CNN for recognition of surgical workflow. EURASIP J Image Video Process 2018:1–9. https://doi.org/10.1186/s13640-018-0316-4

Chen HB, Li Z, Fu P, Ni ZL, Bian GB (2022) Spatio-temporal causal transformer for multi-grained surgical phase recognition. In: 2022 44th annual international conference of the IEEE Engineering in Medicine & Biology Society (EMBC). pp 1663–1666

Chen Y, He S, Jin Y, Qin J (2023a) Surgical activity triplet recognition via triplet disentanglement. In: Greenspan H, Madabhushi A, Mousavi P, Salcudean S, Duncan J, Syeda-Mahmood T, Taylor R (eds) Medical image computing and computer assisted intervention—MICCAI 2023. Springer Nature Switzerland, Cham, pp 451–461

Chen Z, Zhai Y, Zhang J, Wang J (2023b) Surgical temporal action-aware network with sequence regularization for phase recognition. In: 2023 IEEE international conference on bioinformatics and biomedicine (BIBM). pp 1836–1841

Chu X, Tian Z, Wang Y, Zhang B, Ren H, Wei X, Xia H, Shen C (2021) Twins: revisiting the design of spatial attention in vision transformers. In: Ranzato M, Beygelzimer A, Dauphin Y, Liang P, Vaughan JW (eds) Advances in neural information processing systems, vol 34. Curran Associates Inc, pp 9355–9366

Czempiel T, Paschali M, Keicher M, Simson W, Feussner H, Kim ST, Navab N (2020) Tecno: Surgical phase recognition with multi-stage temporal convolutional networks. In: Medical image computing and computer assisted intervention—MICCAI 2020: 23rd international conference, Lima, Peru, October 4-8, 2020, proceedings, part III. Springer-Verlag, Berlin, Heidelberg, pp 343–352

Czempiel T, Paschali M, Ostler D, Kim ST, Busam B, Navab N (2021) Opera: attention-regularized transformers for surgical phase recognition. In: Medical image computing and computer assisted intervention—MICCAI 2021: 24th international conference, Strasbourg, France, September 27-October 1, 2021, proceedings, part IV. Springer-Verlag, Berlin, Heidelberg, pp 604–614

Demir KC, Schieber H, Weise T, Roth D, May M, Maier A, Yang SH (2023) Deep learning in surgical workflow analysis: a review of phase and step recognition. IEEE J Biomed Health Inform 27(11):5405–5417. https://doi.org/10.1109/JBHI.2023.3311628

den Boer R, Jaspers T, de Jongh C, Pluim J, Sommen F, Boers T, Hillegersberg R, Eijnatten M, Ruurda J (2023) Deep learning-based recognition of key anatomical structures during robot-assisted minimally invasive esophagectomy. Surg Endosc 37:1–12. https://doi.org/10.1007/s00464-023-09990-z

Dergachyova O, Bouget D, Huaulmé A, Morandi X, Jannin P (2016) Automatic data-driven real-time segmentation and recognition of surgical workflow. Int J Comput Assist Radiol Surg 11:1081–1089

Ding X, Li X (2022) Exploring segment-level semantics for online phase recognition from surgical videos. IEEE Trans Med Imaging 41(11):3309–3319. https://doi.org/10.1109/TMI.2022.3182995

Ding Y, Fan J, Pang K, Li H, Fu T, Song H, Chen L, Yang J (2020) Surgical workflow recognition using two-stream mixed convolution network. In: 2020 3rd international conference on advanced electronic materials, computers and software engineering (AEMCSE). pp 264–269

Ding X, Yan X, Wang Z, Zhao W, Zhuang J, Xu X, Li X (2023) Less is more: surgical phase recognition from timestamp supervision. IEEE Trans Med Imaging 42(6):1897–1910. https://doi.org/10.1109/TMI.2023.3242980

DiPietro RS, Stauder R, Kayis E, Schneider A, Kranzfelder M, Feußner H, Hager G, Navab N (2015) Automated surgical-phase recognition using rapidly-deployable sensors. In: Proceedings of modeling and monitoring of computer assisted interventions workshop in conjunction with medical image computing and computer assisted interventions

Eckhoff J, Ban Y, Rosman G, Müller D, Hashimoto D, Witkowski E, Babic B, Rus D, Bruns C, Fuchs H, Meireles O (2023) TEsoNet: knowledge transfer in surgical phase recognition from laparoscopic sleeve gastrectomy to the laparoscopic part of Ivor-Lewis esophagectomy. Surg Endosc 37:1–14. https://doi.org/10.1007/s00464-023-09971-2

Fang L, Mou L, Gu Y, Hu Y, Chen B, Chen X, Wang Y, Liu J, Zhao Y (2022) Global-local multi-stage temporal convolutional network for cataract surgery phase recognition. BioMed Eng OnLine. https://doi.org/10.1186/s12938-022-01048-w

Feng X, Zhang X, Shi X, Li L, Wang S (2024) ST-ITEF: spatio-temporal intraoperative task estimating framework to recognize surgical phase and predict instrument path based on multi-object tracking in keratoplasty. Med Image Anal 91:103026. https://doi.org/10.1016/j.media.2023.103026

Funke I, Jenke A, Mees ST, Weitz J, Speidel S, Bodenstedt S (2018) Temporal coherence-based self-supervised learning for laparoscopic workflow analysis. In: Stoyanov D, Taylor Z, Sarikaya D, McLeod J, González Ballester MA, Codella NC, Martel A, Maier-Hein L, Malpani A, Zenati MA, De Ribaupierre S, Xiongbiao L, Collins T, Reichl T, Drechsler K, Erdt M, Linguraru MG, Oyarzun Laura C, Shekhar R, Wesarg S, Celebi ME, Dana K, Halpern A (eds) OR 2.0 context-aware operating theaters, computer assisted robotic endoscopy, clinical image-based procedures, and skin image analysis. Springer International Publishing, Cham, pp 85–93

Funke I, Rivoir D, Speidel S (2023) Metrics matter in surgical phase recognition. http://arxiv.org/abs/2305.13961

Gao X, Jin Y, Long Y, Dou Q, Heng PA (2021) Trans-SVNet: accurate phase recognition from surgical videos via hybrid embedding aggregation transformer. In: International conference on medical image computing and computer-assisted intervention. pp 593–603

Garcea F, Serra A, Lamberti F, Morra L (2023) Data augmentation for medical imaging: a systematic literature review. Comput Biol Med 152:106391. https://doi.org/10.1016/j.compbiomed.2022.106391

Garrow C, Kowalewski KF, Li L, Wagner M, Schmidt M, Engelhardt S, Hashimoto D, Kenngott H, Bodenstedt S, Speidel S, Müller B, Nickel F (2020) Machine learning for surgical phase recognition: a systematic review. Ann Surg. https://doi.org/10.1097/SLA.0000000000004425

Golany T, Aides A, Freedman D, Rabani N, Liu Y, Rivlin E, Corrado GS, Matias Y, Khoury W, Kashtan H, Reissman P (2022) Artificial intelligence for phase recognition in complex laparoscopic cholecystectomy. Surg Endosc 36:9215–9223. https://doi.org/10.1007/s00464-022-09405-5

Gui S, Wang Z, Chen J, Zhou X, Zhang C, Cao Y (2024) MT4MTL-KD: a multi-teacher knowledge distillation framework for triplet recognition. IEEE Trans Med Imaging 43(4):1628–1639. https://doi.org/10.1109/TMI.2023.3345736

Guo K, Tao H, Zhu Y, Li B, Fang C, Qian Y, Yang J (2023) Current applications of artificial intelligence-based computer vision in laparoscopic surgery. Laparosc Endosc Robot Surg 6(3):91–96. https://doi.org/10.1016/j.lers.2023.07.001

He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K (2019) The practical implementation of artificial intelligence technologies in medicine. Nat Med 25:30–36. https://doi.org/10.1038/s41591-018-0307-0

Hirsch R, Caron M, Cohen R, Livne A, Shapiro R, Golany T, Goldenberg R, Freedman D, Rivlin E (2023) Self-supervised learning for endoscopic video analysis. In: Medical image computing and computer assisted intervention–MICCAI 2023. Springer Nature Switzerland, Cham, pp 569–578

Healthcare Information and Management Systems Society (2017) HIMSS dictionary of health information technology terms, acronyms, and organizations. CRC Press, Boca Raton

Jin Y, Dou Q, Chen H, Yu L, Qin J, Fu CW, Heng PA (2018) SV-RCNet: workflow recognition from surgical videos using recurrent convolutional network. IEEE Trans Med Imaging 37(5):1114–1126. https://doi.org/10.1109/TMI.2017.2787657

Jin Y, Li H, Dou Q, Chen H, Qin J, Fu CW, Heng PA (2020) Multi-task recurrent convolutional network with correlation loss for surgical video analysis. Med Image Anal 59:101572. https://doi.org/10.1016/j.media.2019.101572

Jin Y, Long Y, Chen C, Zhao Z, Dou Q, Heng PA (2021) Temporal memory relation network for workflow recognition from surgical video. IEEE Trans Med Imaging 40(7):1911–1923. https://doi.org/10.1109/TMI.2021.3069471

Kadkhodamohammadi A, Luengo I, Stoyanov D (2022) PATG: position-aware temporal graph networks for surgical phase recognition on laparoscopic videos. Int J Comput Assist Radiol Surg 17:849–856. https://doi.org/10.1007/s11548-022-02600-8

Kasparick M, Schmitz M, Andersen B, Rockstroh M, Franke S, Schlichting S, Golatowski F, Timmermann D (2018) OR.NET: a service-oriented architecture for safe and dynamic medical device interoperability. Biomed Eng 63:11–30

Kassem H, Alapatt D, Mascagni P, Karargyris A, Padoy N (2023) Federated cycling (FedCy): semi-supervised federated learning of surgical phases. IEEE Trans Med Imaging 42(7):1920–1931. https://doi.org/10.1109/TMI.2022.3222126

Kirtac K, Aydin N, Lavanchy JL, Beldi G, Smit M, Woods MS, Aspart F (2022) Surgical phase recognition: from public datasets to real-world data. Appl Sci. https://doi.org/10.3390/app12178746

Lalys F, Riffaud L, Bouget D, Jannin P (2012) A framework for the recognition of high-level surgical tasks from video images for cataract surgeries. IEEE Trans Biomed Eng 59:966–976

Lea C, Vidal R, Reiter A, Hager GD (2016a) Temporal convolutional networks: a unified approach to action segmentation. In: Hua G, Jégou H (eds) Computer vision—ECCV 2016 workshops. Springer International Publishing, Cham, pp 47–54

Lea C, Vidal R, Hager GD (2016b) Learning convolutional action primitives for fine-grained action recognition. In: 2016 IEEE international conference on robotics and automation (ICRA). pp 1642–1649

Lee SG, Kim GY, Hwang YN, Kwon JY, Kim SM (2024) Adaptive undersampling and short clip-based two-stream CNN-LSTM model for surgical phase recognition on cholecystectomy videos. Biomed Signal Process Control 88:105637. https://doi.org/10.1016/j.bspc.2023.105637

Li L, Li X, Ding S, Fang Z, Xu M, Ren H, Yang S (2022a) SIRNet: fine-grained surgical interaction recognition. IEEE Robot Autom Lett 7(2):4212–4219. https://doi.org/10.1109/LRA.2022.3148454

Li Z, Liu F, Yang W, Peng S, Zhou J (2022b) A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans Neural Netw Learn Syst 33(12):6999–7019. https://doi.org/10.1109/TNNLS.2021.3084827

Article   MathSciNet   Google Scholar  

Li Y, Xia T, Luo H, He B, Jia F (2023) MT-FiST: a multi-task fine-grained spatial-temporal framework for surgical action triplet recognition. IEEE J Biomed Health Inform 27(10):4983–4994. https://doi.org/10.1109/JBHI.2023.3299321

Liu Y, Boels M, García-Peraza-Herrera LC, Vercauteren TKM, Dasgupta P, Granados A, Ourselin S (2023a) LoViT: long video transformer for surgical phase recognition. http://arxiv.org/abs/2305.08989

Liu Y, Huo J, Peng J, Sparks R, Dasgupta P, Granados A, Ourselin S (2023b) Skit: a fast key information video transformer for online surgical phase recognition. In: 2023 IEEE/CVF international conference on computer vision (ICCV). pp 21017–21027

Maier-Hein L, Vedula SS, Speidel S, Navab N, Kikinis R, Park AE, Eisenmann M, Feußner H, Forestier G, Giannarou S, Hashizume M, Katic D, Kenngott H, Kranzfelder M, Malpani A, März K, Neumuth T, Padoy N, Pugh CM, Schoch N, Stoyanov D, Taylor RH, Wagner M, Hager G, Jannin P (2017) Surgical data science for next-generation interventions. Nat Biomed Eng 1:691–696. https://doi.org/10.1038/s41551-017-0132-7

Maier-Hein L, Wagner M, Ross T, Reinke A, Bodenstedt S, Full PM, Hempe H, Filimon DM, Scholz P, Tran TN, Bruno P, Kisilenko A, Müller B, Davitashvili T, Capek M, Tizabi MD, Eisenmann M, Adler TJ, Gröhl J, Schellenberg M, Seidlitz S, Lai TYE, Roethlingshoefer V, Both F, Bittel S, Mengler M, Apitz M, Speidel S, Kenngott H, Müller-Stich BP (2020) Heidelberg colorectal data set for surgical data science in the sensor operating room. Sci Data 8:101

Mascagni P, Alapatt D, Sestini L, Altieri M, Madani A, Watanabe Y, Alseidi A, Redan J, Alfieri S, Costamagna G, Boskoski I, Padoy N, Hashimoto D (2022) Computer vision in surgery: from potential to clinical value. npj Digit Med 5:163. https://doi.org/10.1038/s41746-022-00707-5

Mascagni P, Alapatt D, Lapergola A, Vardazaryan A, Mazellier JP, Dallemagne B, Mutter D, Padoy N (2023) Early-stage clinical evaluation of real-time artificial intelligence assistance for laparoscopic cholecystectomy. Br J Surg 111(1):znad353. https://doi.org/10.1093/bjs/znad353

Mondal SS, Sathish R, Sheet D (2019) Multitask learning of temporal connectionism in convolutional networks using a joint distribution loss function to simultaneously identify tools and phase in surgical videos. http://arxiv.org/abs/1905.08315

Nakawala HC, Bianchi R, Pescatori LE, Cobelli OD, Ferrigno G, Momi ED (2018) “Deep-Onto’’ network for surgical workflow and context recognition. Int J Comput Assist Radiol Surg 14:685–696. https://doi.org/10.1007/s11548-018-1882-8

Neimark D, Bar O, Zohar M, Hager G, Asselmann D (2021) “Train one, classify one, teach one”—cross-surgery transfer learning for surgical step recognition. http://arxiv.org/abs/2102.12308

Nwoye CI, Padoy N (2022) Data splits and metrics for method benchmarking on surgical action triplet datasets. http://arxiv.org/abs/2204.05235

Nwoye CI, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Padoy N (2020) Recognition of instrument-tissue interactions in endoscopic videos via action triplets. In: Medical image computing and computer assisted intervention—MICCAI 2020: 23rd international conference, Lima, Peru, October 4–8, 2020, proceedings, part III. Springer-Verlag, Berlin, Heidelberg, pp 364–374

Nwoye CI, Yu T, Gonzalez C, Seeliger B, Mascagni P, Mutter D, Marescaux J, Padoy N (2022) Rendezvous: attention mechanisms for the recognition of surgical action triplets in endoscopic videos. Med Image Anal 78:102433. https://doi.org/10.1016/j.media.2022.102433

Nwoye CI, Alapatt D, Yu T, Vardazaryan A, Xia F, Zhao Z, Xia T, Jia F, Yang Y, Wang H, Yu D, Zheng G, Duan X, Getty N, Sanchez-Matilla R, Robu M, Zhang L, Chen H, Wang J, Wang L, Zhang B, Gerats B, Raviteja S, Sathish R, Tao R, Kondo S, Pang W, Ren H, Abbing JR, Sarhan MH, Bodenstedt S, Bhasker N, Oliveira B, Torres HR, Ling L, Gaida F, Czempiel T, Vilaça JL, Morais P, Fonseca J, Egging RM, Wijma IN, Qian C, Bian G, Li Z, Balasubramanian V, Sheet D, Luengo I, Zhu Y, Ding S, Aschenbrenner JA, van der Kar NE, Xu M, Islam M, Seenivasan L, Jenke A, Stoyanov D, Mutter D, Mascagni P, Seeliger B, Gonzalez C, Padoy N (2023) Cholectriplet 2021: a benchmark challenge for surgical action triplet recognition. Med Image Anal 86:102803. https://doi.org/10.1016/j.media.2023.102803

Padoy N (2019) Machine and deep learning for workflow recognition during surgery. Minim Invasive Ther Allied Technol 28:82–90

Padoy N, Blum T, Ahmadi SA, Feußner H, Berger MO, Navab N (2012) Statistical modeling and recognition of surgical workflow. Med Image Anal 16:632–641

Pan X, Gao X, Wang H, Zhang W, Mu Y, He X (2022) Temporal-based swin transformer network for workflow recognition of surgical video. Int J Comput Assist Radiol Surg 18:139–147. https://doi.org/10.1007/s11548-022-02785-y

Park M, Oh S, Jeong T, Yu S (2023) Multi-stage temporal convolutional network with moment loss and positional encoding for surgical phase recognition. Diagnostics. https://doi.org/10.3390/diagnostics13010107

Pradeep CS, Sinha N (2021) Spatio-temporal features based surgical phase classification using CNNs. In: 2021 43rd annual international conference of the IEEE engineering in medicine & biology society (EMBC). pp 3332–3335

Qi B, Qin X, Liu J, Xu Y, Chen Y (2019) A deep architecture for surgical workflow recognition with edge information. In: 2019 IEEE international conference on bioinformatics and biomedicine (BIBM). pp 1358–1364

Quellec G, Lamard M, Cochener B, Cazuguel G (2014) Real-time segmentation and recognition of surgical tasks in cataract surgery videos. IEEE Trans Med Imaging 33(12):2352–2360. https://doi.org/10.1109/TMI.2014.2340473

Ramesh S, Dall’Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N (2021) Multi-task temporal convolutional networks for joint recognition of surgical phases and steps in gastric bypass procedures. Int J Comput Assist Radiol Surg 16:1111–1119. https://doi.org/10.1007/s11548-021-02388-z

Ramesh S, Dall’Alba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N (2023a) Trandaugment: temporal random augmentation strategy for surgical activity recognition from videos. Int J Comput Assist Radiol Surg 18:1665–1672. https://doi.org/10.1007/s11548-023-02864-8

Ramesh S, DalľAlba D, Gonzalez C, Yu T, Mascagni P, Mutter D, Marescaux J, Fiorini P, Padoy N (2023b) Weakly supervised temporal convolutional networks for fine-grained surgical activity recognition. IEEE Trans Med Imaging 42(9):2592–2602. https://doi.org/10.1109/TMI.2023.3262847

Ramesh S, Srivastav V, Alapatt D, Yu T, Murali A, Sestini L, Nwoye CI, Hamoud I, Sharma S, Fleurentin A, Exarchakis G, Karargyris A, Padoy N (2023c) Dissecting self-supervised learning methods for surgical computer vision. Med Image Anal 88:102844. https://doi.org/10.1016/j.media.2023.102844

Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031

Rodrigues VF, da Rosa Righi R, da Costa CA, Eskofier B, Maier A (2019) On providing multi-level quality of service for operating rooms of the future. Sensors 19:1–27. https://doi.org/10.3390/s19102303

Sánchez-Matilla R, Robu MR, Grammatikopoulou M, Luengo I, Stoyanov D (2022) Data-centric multi-task surgical phase estimation with sparse scene segmentation. Int J Comput Assist Radiol Surg 17:953–960. https://doi.org/10.1007/s11548-022-02616-0

Sarikaya D, Guru KA, Corso JJ (2018) Joint surgical gesture and task classification with multi-task and multimodal learning. arXiv Preprint. http://arxiv.org/abs/1805.00721

Schoeffmann K, Taschwer M, Sarny S, Münzer B, Primus MJ, Putzgruber D (2018) Cataract-101: video dataset of 101 cataract surgeries. In: Proceedings of the 9th ACM multimedia systems conference, MMSys ’18, New York, NY, USA. Association for Computing Machinery, pp 421–425

Sharma S, Nwoye CI, Mutter D, Padoy N (2022) Rendezvous in time: an attention-based temporal fusion approach for surgical triplet recognition. Int J Comput Assist Radiol Surg 18:1053–1059. https://doi.org/10.1007/s11548-023-02914-1

Shi X, Jin Y, Dou Q, Heng PA (2020) LRTD: long-range temporal dependency based active learning for surgical workflow recognition. Int J Comput Assist Radiol Surg 15:1573–1584

Shi X, Jin Y, Dou Q, Heng PA (2021) Semi-supervised learning with progressive unlabeled data excavation for label-efficient surgical workflow recognition. Med Image Anal 73:102158. https://doi.org/10.1016/j.media.2021.102158

Shi P, Zhao Z, Liu K, Li F (2022) Attention-based spatial-temporal neural network for accurate phase recognition in minimally invasive surgery: feasibility and efficiency verification. J Comput Des Eng 9(2):406–416. https://doi.org/10.1093/jcde/qwac011

Singhal K, Azizi S, Tu T, Mahdavi S, Wei J, Chung H, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Natarajan V (2023) Large language models encode clinical knowledge. Nature 620:1–9. https://doi.org/10.1038/s41586-023-06291-2

Song H, Zhao Z, Liu K, Wu Y, Li F (2024) Anchor-free convolutional neural network application to enhance real-time surgical tool detection in computer-aided surgery. IEEE Trans Med Robot Bion 6(1):73–83. https://doi.org/10.1109/TMRB.2023.3328658

Stauder R, Ostler D, Kranzfelder M, Koller S, Feußner H, Navab N (2016) The TUM LapChole dataset for the M2CAI 2016 workflow challenge. http://arxiv.org/abs/1610.09278

Tao R, Zou X, Zheng G (2023) Last: latent space-constrained transformers for automatic surgical phase recognition and tool presence detection. IEEE Trans Med Imaging 42(11):3256–3268. https://doi.org/10.1109/TMI.2023.3279838

Timoh KN, Huaulmé A, Cleary K, Zaheer MA, Lavoué V, Donoho DA, Jannin P (2023) A systematic review of annotation for surgical process model analysis in minimally invasive surgery based on video. Surg Endosc 37:4298–4314

Topol EJ (2019) High-performance medicine: the convergence of human and artificial intelligence. Nat Med 25:44–56. https://doi.org/10.1038/s41591-018-0300-7

Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N (2016) EndoNet: a deep architecture for recognition tasks on laparoscopic videos. IEEE Trans Med Imaging 36:86–97. https://doi.org/10.1109/TMI.2016.2593957

Twinanda AP, Yengera G, Mutter D, Marescaux J, Padoy N (2018) RSDNet: learning to predict remaining surgery duration from laparoscopic videos without manual annotations. IEEE Trans Med Imaging 38:1069–1078. https://doi.org/10.1109/TMI.2018.2878055

Wagner M, Müller-Stich BP, Kisilenko A, Tran D, Heger P, Mündermann L, Lubotsky DM, Müller B, Davitashvili T, Capek M, Reinke A, Reid C, Yu T, Vardazaryan A, Nwoye CI, Padoy N, Liu X, Lee EJ, Disch C, Meine H, Xia T, Jia F, Kondo S, Reiter W, Jin Y, Long Y, Jiang M, Dou Q, Heng PA, Twick I, Kirtac K, Hosgor E, Bolmgren JL, Stenzel M, von Siemens B, Zhao L, Ge Z, Sun H, Xie D, Guo M, Liu D, Kenngott HG, Nickel F, von Frankenberg M, Mathis-Ullrich F, Kopp-Schneider A, Maier-Hein L, Speidel S, Bodenstedt S (2023) Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark. Med Image Anal 86:102770. https://doi.org/10.1016/j.media.2023.102770

Wang X, Girshick R, Gupta A, He K (2018) Non-local neural networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. pp 7794–7803

Wang H, Ding S, Yang S, Liu C, Yu S, Zheng X (2022) Guided activity prediction for minimally invasive surgery safety improvement in the internet of medical things. IEEE Internet Things J 9(6):4758–4768. https://doi.org/10.1109/JIOT.2021.3108457

Xi N, Meng J, Yuan J (2022) Forest graph convolutional network for surgical action triplet recognition in endoscopic videos. IEEE Trans Circuits Syst Video Technol 32(12):8550–8561. https://doi.org/10.1109/TCSVT.2022.3191838

Xi N, Meng J, Yuan J (2023) Chain-of-look prompting for verb-centric surgical triplet recognition in endoscopic videos. In: Proceedings of the 31st ACM international conference on multimedia, MM ’23, New York, NY, USA. Association for Computing Machinery, pp 5007–5016

Xia T, Jia F (2021) Against spatial-temporal discrepancy: contrastive learning-based network for surgical workflow recognition. Int J Comput Assist Radiol Surg 16:839–848. https://doi.org/10.1007/s11548-021-02382-5

Yamlahi A, Tran TN, Godau P, Schellenberg M, Michael D, Smidt FH, Nölke JH, Adler TJ, Tizabi MD, Nwoye CI, Padoy N, Maier-Hein L (2023) Self-distillation for surgical action recognition. In: Medical image computing and computer assisted intervention—MICCAI 2023: 26th international conference, Vancouver, BC, Canada, October 8-12, 2023, proceedings, part IX. Springer-Verlag, Berlin, Heidelberg, pp 637–646

Yengera G, Mutter D, Marescaux J, Padoy N (2018) Less is more: surgical phase recognition with less annotations through self-supervised pre-training of CNN-LSTM networks. http://arxiv.org/abs/1805.08569

Yi F, Jiang T (2019) Hard frame detection and online mapping for surgical phase recognition. In: Medical image computing and computer assisted intervention—MICCAI 2019: 22nd international conference, Shenzhen, China, October 13-17, 2019, proceedings, part V. Springer-Verlag, Berlin, Heidelberg, pp 449–457

Yi F, Yang Y, Jiang T (2023) Not end-to-end: explore multi-stage architecture for online surgical phase recognition. In: Wang L, Gall J, Chin T-J, Sato I, Chellappa R (eds) Computer vision—ACCV 2022. Springer Nature Switzerland, Cham, pp 417–432

Yu T, Mutter D, Marescaux J, Padoy N (2018) Learning from a tiny dataset of manual annotations: a teacher/student approach for surgical phase recognition. http://arxiv.org/abs/1812.00033

Yuan K, Holden M, Gao S, Lee W (2022) Anticipation for surgical workflow through instrument interaction and recognized signals. Med Image Anal 82:102611. https://doi.org/10.1016/j.media.2022.102611

Yue W, Liao H, Xia Y, Lam V, Luo J, Wang Z (2023) Cascade multi-level transformer network for surgical workflow analysis. IEEE Trans Med Imaging 42(10):2817–2831. https://doi.org/10.1109/TMI.2023.3265354

Zaffino P, Moccia S, Momi ED, Spadea MF (2020) A review on advances in intra-operative imaging for surgery and therapy: imagining the operating room of the future. Ann Biomed Eng 48:2171–2191. https://doi.org/10.1007/s10439-020-02553-6

Zhang Y, Yang Q (2022) A survey on multi-task learning. IEEE Trans Knowl Data Eng 34(12):5586–5609. https://doi.org/10.1109/TKDE.2021.3070203

Zhang B, Abbing JR, Ghanem A, Fer D, Barker J, Abukhalil R, Goel VK, Milletarì F (2021a) Towards accurate surgical workflow recognition with convolutional networks and transformers. Comput Methods Biomech Biomed Eng Imaging Vis 10:349–356. https://doi.org/10.1080/21681163.2021.2002191

Zhang B, Ghanem A, Simes A, Choi H, Yoo A (2021b) Surgical workflow recognition with 3DCNN for sleeve gastrectomy. Int J Comput Assist Radiol Surg 16:2029–2036. https://doi.org/10.1007/s11548-021-02473-3

Zhang B, Ghanem A, Simes A, Choi H, Yoo A, Min A (2021c) SWNet: surgical workflow recognition with deep convolutional network. In: International conference on medical imaging with deep learning

Zhang Y, Bano S, Page AS, Deprest JA, Stoyanov D, Vasconcelos F (2022) Large-scale surgical workflow segmentation for laparoscopic sacrocolpopexy. Int J Comput Assist Radiol Surg 17:467–477

Zhang B, Fung A, Torabi M, Barker J, Foley G, Abukhalil R, Gaddis ML, Petculescu S (2023) C-ECT: Online surgical phase recognition with cross-enhancement causal transformer. In: 2023 IEEE 20th international symposium on biomedical imaging (ISBI). pp 1–5

Zhang B, Sarhan MH, Goel B, Petculescu S, Ghanem A (2024a) SF-TMN: Slowfast temporal modeling network for surgical phase recognition. Int J Comput Assist Radiol Surg 19(5):871–880. https://doi.org/10.1007/s11548-024-03095-1

Zhang J, Barbarisi S, Kadkhodamohammadi A, Stoyanov D, Luengo I (2024b) Self-knowledge distillation for surgical phase recognition. Int J Comput Assist Radiol Surg 19:61–68. https://doi.org/10.1007/s11548-023-02970-7

Zheng M, Ye M, Rafii-Tari H (2022) Automatic biopsy tool presence and episode recognition in robotic bronchoscopy using a multi-task vision transformer network. In: 2022 international conference on robotics and automation (ICRA). pp 7349–7355

Zia A, Hung A, Essa I, Jarc A (2018) Surgical activity recognition in robot-assisted radical prostatectomy using deep learning. In: Frangi AF, Schnabel JA, Davatzikos C, Alberola-López C, Fichtinger G (eds) Medical image computing and computer assisted intervention—MICCAI 2018. Springer International Publishing, Cham, pp 273–280

Zisimopoulos O, Flouty E, Luengo I, Giataganas P, Nehme J, Chow A, Stoyanov D (2018) Deepphase: surgical phase recognition in cataracts videos. In: Frangi AF, Schnabel JA, Davatzikos C, Alberola-López C, Fichtinger G (eds) Medical image computing and computer assisted intervention—MICCAI 2018. Springer International Publishing, Cham, pp 265–272

Download references

Acknowledgements

This research was funded by the National Key Research and Development Program of China under Grant 2019YFB1311300. We thank all research participants for their assistance and participation in this survey.

Author information

Authors and affiliations.

School of Control Science and Engineering, Shandong University, Jinan, 250061, Shandong Province, China

Yunlong Li, Zijian Zhao & Renbo Li

Department of General Surgery, Qilu Hospital of Shandong University, Jinan, 250012, Shandong Province, China


Contributions

Y.L. and Z.Z. conceived and designed the framework of the survey; Y.L. and R.L. participated in the literature search and screening; Y.L. wrote the main manuscript text; Z.Z. wrote and modified the paper; F.L. was responsible for the clinical study and investigation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zijian Zhao.

Ethics declarations

Conflict of interest.

The authors declare no conflict of interest.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Li, Y., Zhao, Z., Li, R. et al. Deep learning for surgical workflow analysis: a survey of progresses, limitations, and trends. Artif Intell Rev 57, 291 (2024). https://doi.org/10.1007/s10462-024-10929-6


Accepted: 28 August 2024

Published: 16 September 2024

DOI: https://doi.org/10.1007/s10462-024-10929-6


  • Deep learning
  • Computer vision
  • Surgical workflow analysis

IMAGES

  1. How to write a literature review: Tips, Format and Significance

  2. 15 Literature Review Examples (2024)

  3. RES 10 Sources & Location of literature review in research / lecture and notes

  4. 2: Sources of literature review

  5. Literature Review: Outline, Strategies, and Examples

  6. Review of Related Literature: Format, Example, & How to Make RRL

VIDEO

  1. M-111(C): Print & Electronic Sources & Literature in Social Sciences By Mr. Shashi Shekhar Kumar

  2. Trade Credit

  3. Research Methods: Lecture 3

  4. He Said, She Said: Proper Use of Citations in Academic Writing

  5. How to find Literature Review for Research

  6. Sources of literature review #bsc nursing #nursing research

COMMENTS

  1. Literature review as a research methodology: An overview and guidelines

    As mentioned previously, there are a number of existing guidelines for literature reviews. Depending on the methodology needed to achieve the purpose of the review, all types can be helpful and appropriate to reach a specific goal (for examples, please see Table 1). These approaches can be qualitative, quantitative, or have a mixed design depending on the phase of the review.

  2. The Literature Review: A Foundation for High-Quality Medical Education

    Purpose and Importance of the Literature Review. An understanding of the current literature is critical for all phases of a research study. Lingard 9 recently invoked the "journal-as-conversation" metaphor as a way of understanding how one's research fits into the larger medical education conversation. As she described it: "Imagine yourself joining a conversation at a social event.

  3. Approaching literature review for academic purposes: The Literature

    A sophisticated literature review (LR) can result in a robust dissertation/thesis by scrutinizing the main problem examined by the academic study; anticipating research hypotheses, methods and results; and maintaining the interest of the audience in how the dissertation/thesis will provide solutions for the current gaps in a particular field.

  4. Importance of a Good Literature Review

    A literature review is not only a summary of key sources, but has an organizational pattern which combines both summary and synthesis, often within specific conceptual categories. A summary is a recap of the important information of the source, but a synthesis is a re-organization, or a reshuffling, of that information in a way that informs how you are planning to investigate a research problem.

  5. Conducting a Literature Review

    Upon completion of the literature review, a researcher should have a solid foundation of knowledge in the area and a good feel for the direction any new research should take. Should any additional questions arise during the course of the research, the researcher will know which experts to consult in order to quickly clear up those questions.

  6. Evaluating Sources & Lit. Reviews

    A good literature review evaluates a wide variety of sources (academic articles, scholarly books, government/NGO reports). It also evaluates literature reviews that study similar topics. This page offers you a list of resources and tips on how to evaluate the sources that you may use to write your review.

  7. Reviewing literature for research: Doing it the right way

    Literature search. Fink has defined research literature review as a "systematic, explicit and reproducible method for identifying, evaluating, and synthesizing the existing body of completed and recorded work produced by researchers, scholars and practitioners." Review of research literature can be summarized into a seven-step process: (i) Selecting research questions/purpose of the ...

  8. Literature Review: The What, Why and How-to Guide

    Example: Predictors and Outcomes of U.S. Quality Maternity Leave: A Review and Conceptual Framework: 10.1177/08948453211037398; Systematic review: "The authors of a systematic review use a specific procedure to search the research literature, select the studies to include in their review, and critically evaluate the studies they find." (p. 139).

  9. How to Write a Literature Review

    Examples of literature reviews. Step 1 - Search for relevant literature. Step 2 - Evaluate and select sources. Step 3 - Identify themes, debates, and gaps. Step 4 - Outline your literature review's structure. Step 5 - Write your literature review.

  10. Literature Review Research

    Literature Review is a comprehensive survey of the works published in a particular field of study or line of research, usually over a specific period of time, in the form of an in-depth, critical bibliographic essay or annotated list in which attention is drawn to the most significant works. Also, we can define a literature review as the collected body of scholarly works related to a topic:

  11. Research Guides: Literature Reviews: What is a Literature Review?

    A literature review is a review and synthesis of existing research on a topic or research question. A literature review is meant to analyze the scholarly literature, make connections across writings and identify strengths, weaknesses, trends, and missing conversations. A literature review should address different aspects of a topic as it ...

  12. Conducting a Literature Review: Why Do A Literature Review?

    Literature review is approached as a process of engaging with the discourse of scholarly communities that will help graduate researchers refine, define, and express their own scholarly vision and voice. This orientation on research as an exploratory practice, rather than merely a series of predetermined steps in a systematic method, allows the ...

  13. Writing a literature review

    Writing a literature review requires a range of skills to gather, sort, evaluate and summarise peer-reviewed published data into a relevant and informative unbiased narrative. Digital access to research papers, academic texts, review articles, reference databases and public data sets are all sources of information that are available to enrich ...

  14. LibGuides: Literature Review: Purpose of a Literature Review

    The purpose of a literature review is to: Provide a foundation of knowledge on a topic; Identify areas of prior scholarship to prevent duplication and give credit to other researchers; Identify inconsistencies: gaps in research, conflicts in previous studies, open questions left from other research;

  15. Advantages and disadvantages of literature review

    Creation of new body of knowledge. One of the key advantages of literature review is that it creates new body of knowledge. Through careful evaluation and critical summarisation, researchers can create a new body of knowledge and enrich the field of study. Answers to a range of questions. Literature reviews help researchers analyse the existing ...

  16. Evaluating Literature Reviews and Sources

    A good literature review evaluates a wide variety of sources (academic articles, scholarly books, government/NGO reports). It also evaluates literature reviews that study similar topics. This page offers you a list of resources and tips on how to evaluate the sources that you may use to write your review.

  17. What is a Literature Review?

    A literature review is a comprehensive summary of previous research on a topic. The literature review surveys scholarly articles, books, and other sources relevant to a particular area of research. The review should enumerate, describe, summarize, objectively evaluate and clarify this previous research. It should give a theoretical base for the ...

  18. Chapter 9 Methods for Literature Reviews

    9.3. Types of Review Articles and Brief Illustrations. EHealth researchers have at their disposal a number of approaches and methods for making sense out of existing literature, all with the purpose of casting current research findings into historical contexts or explaining contradictions that might exist among a set of primary research studies conducted on a particular topic.

  19. Literature Review

    Analyze and synthesize the literature: Analyze each source in depth, identifying the key findings, methodologies, and conclusions. Then, synthesize the information from the sources, identifying patterns and themes in the literature. ... Advantages of Literature Review. There are several advantages to conducting a literature review as part of a ...

  20. What is the purpose of a literature review?

    A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question. It is often written as part of a thesis, dissertation, or research paper, in order to situate your work in relation to existing knowledge.

  21. Primary & Secondary Sources

    The term primary source is used broadly to embody all sources that are original. Primary sources provide first-hand information that is closest to the object of study. Primary sources vary by discipline. In the natural and social sciences, original reports of research found in academic journals detailing the methodology used in the research, in ...

  22. How to Use AI for Literature Review (2024): Complete 7 Step Guide for

    You can also use other AI-powered search tools like Semantic Scholar or Google Scholar to broaden your sources. 3. Generate Summaries and Key Themes with GPT-4o. ... Advantages of Using AI for Literature Review. AI is changing the game for researchers tackling literature reviews. Now, we've got smart tools that can do a lot of the heavy lifting ...

  23. Revolutionizing Neonatal Care: A Comprehensive Review of ...

    Neonatal resuscitation is a critical procedure aimed at ensuring the successful transition of newborns from intrauterine to extrauterine life. Traditionally, this involves immediate clamping and cutting of the umbilical cord, but recent advances have introduced intact cord resuscitation (ICR) as an alternative approach. This review aims to comprehensively analyze ICR, exploring its evolution ...

  24. Systematically Reviewing the Literature: Building the Evidence for

    Systematic reviews that summarize the available information on a topic are an important part of evidence-based health care. There are both research and non-research reasons for undertaking a literature review. It is important to systematically review the literature when one would like to justify the need for a study, to update personal ...

  25. Sources of Sexual Knowledge and Information, and Sexual Attitudes of

    This study sought to synthesise evidence on the sources of sexual knowledge and information and relationship with sexual attitudes of cis men. From a review of existing literature, five categories were obtained from 11 studies and grouped into three syntheses: (1) sources of sexual knowledge and information, (2) sexual attitudes and (3) the relationship between sources of sexual knowledge and ...

  26. Deep learning for surgical workflow analysis: a survey of progresses

    Automatic surgical workflow analysis, which aims to recognize the ongoing surgical events in videos, is fundamental for developing context-aware computer-assisted systems. This paper reviews representative surgical workflow recognition algorithms based on deep learning, outlining their merits, limitations, and future research directions. The literature survey was performed on three large ...

  27. Ten Simple Rules for Writing a Literature Review

    When searching the literature for pertinent papers and reviews, the usual rules apply: be thorough, use different keywords and database sources (e.g., DBLP, Google Scholar, ISI Proceedings, JSTOR Search, Medline, Scopus, Web of Science), and look at who has cited past relevant papers and book chapters.