AI Tool Profile

Bio-SIEVE: Exploring Instruction Tuning Large Language Models for Systematic Review Automation

Medical systematic reviews can be very costly and resource intensive.

Paper and LLMs, Medical and Health, PICO

Website

github.com

Pricing model

Free

Price start

Free

GitHub Link

The GitHub link is https://github.com/ambroser53/bio-sieve

Introduction

The project "Bio-SIEVE" explores the use of large language models (LLMs) to automate literature screening in medical systematic reviews. The study instruction-tunes LLMs to screen abstracts against a review's selection criteria. The best model, Bio-SIEVE, outperforms both ChatGPT and traditional methods and generalizes better across medical domains. The study also investigates multi-task training but finds that the single-task Bio-SIEVE performs better. The models, code, and dataset information are released for reproducibility, and the models, training process, and evaluation on several datasets are documented in detail, highlighting the project's potential for streamlining biomedical systematic reviews.
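To make the screening task concrete, the sketch below shows one way an abstract and a review's selection criteria could be combined into an instruction prompt, and how a free-text answer could be mapped back to an include/exclude decision. The template wording and the names `PROMPT_TEMPLATE`, `build_screening_prompt`, and `parse_decision` are assumptions made for illustration; they are not taken from the Bio-SIEVE code or paper.

```python
from typing import Optional

# Hypothetical template: the wording is illustrative, not Bio-SIEVE's actual prompt.
PROMPT_TEMPLATE = (
    "Instruction: Decide whether the study should be included in or excluded "
    "from the systematic review, given the selection criteria.\n\n"
    "Selection criteria:\n{criteria}\n\n"
    "Title: {title}\n"
    "Abstract: {abstract}\n\n"
    "Answer (Included or Excluded):"
)


def build_screening_prompt(criteria: str, title: str, abstract: str) -> str:
    """Fill the illustrative template with one review's criteria and one candidate study."""
    return PROMPT_TEMPLATE.format(
        criteria=criteria.strip(), title=title.strip(), abstract=abstract.strip()
    )


def parse_decision(generated: str) -> Optional[bool]:
    """Map model output to True (include), False (exclude), or None if unclear."""
    words = generated.strip().lower().split()
    # Only the first word is inspected, with trailing punctuation removed.
    first = words[0].strip(".,:;!") if words else ""
    if first.startswith("includ"):
        return True
    if first.startswith("exclud"):
        return False
    return None
```

Returning `None` for unclear output keeps ambiguous generations visible, so a caller can route them to a human reviewer instead of silently treating them as exclusions.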

Content

The adapter weights for the 4 best models trained as part of this project are available on HuggingFace. Instruct Cochrane consists of 5 main splits; the dataset can be constructed from separate lists of DOIs, as described in data/README.md.

Models are all trained using a modified version of the QLoRA training script (Dettmers et al.). The repository provides an example training script with the parameters necessary to recreate the model from the dataset.

Models are evaluated on four datasets: Test, Subsets, Safety-First and Irrelevancy. Test evaluates performance on the raw Cochrane reviews. Subsets allows comparison with logistic regression baselines: it permits k-fold cross-validation while training per review, simulating the existing active learning methods in the literature. Safety-First better approximates the include/exclude process when only abstracts and titles are available: labels in the test set reflect the final decision after full-text screening, so that decision cannot always be derived from the abstract and title alone. Irrelevancy is based on the subsets, wherein abstracts from completely different reviews are tested to evaluate whether the model can exclude samples far from the decision boundary. Details on using the evaluation scripts can be found in evaluation/README.md.
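As a rough illustration of how include/exclude predictions on these evaluation sets might be scored, the sketch below computes accuracy, precision and recall for the "include" class, plus an exclusion rate suited to an Irrelevancy-style set where every sample should be excluded. The choice of metrics and the function names `screening_metrics` and `exclusion_rate` are assumptions; the project's own procedure is the one documented in evaluation/README.md.

```python
from typing import Dict, Sequence


def screening_metrics(gold: Sequence[bool], pred: Sequence[bool]) -> Dict[str, float]:
    """Accuracy, plus precision and recall for the 'include' class (True = include)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("no examples")
    tp = sum(g and p for g, p in zip(gold, pred))        # correctly included
    fp = sum((not g) and p for g, p in zip(gold, pred))  # wrongly included
    fn = sum(g and (not p) for g, p in zip(gold, pred))  # wrongly excluded
    correct = sum(g == p for g, p in zip(gold, pred))
    return {
        "accuracy": correct / len(gold),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def exclusion_rate(pred: Sequence[bool]) -> float:
    """Share of samples predicted 'exclude' (False)."""
    if not pred:
        raise ValueError("no predictions")
    return sum(not p for p in pred) / len(pred)
```

Recall on the include class is worth tracking separately because a relevant study that is wrongly excluded is lost to the review, whereas a wrongly included one can still be rejected at full-text screening.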

Alternatives & Similar Tools

Med PaLM-Medical Large Language Model from Google Research Freemium

Google's AI Medical Language Model

Replicate-AI model GFPGAN can help restore old photos Paid

Replicate – Run open-source machine learning models with a cloud API

Free Google Gemini: the best largest and most capable AI model Free

Google Gemini, a multimodal AI by DeepMind, processes text, audio, images, and more. Gemini outperforms in AI benchmarks, is optimized for varied devices, and has been tested for safety and bias, adhering to responsible AI practices.

Docus.ai- Diagnose fast with AI, verify with top human doctors Freemium

AI-Powered Health Platform

Video ReTalking-focuses on audio-based lip synchronization for talking head video editing Open Source

Video ReTalking, advanced real-world talking head video according to input audio, producing a high-quality

UniSim-Chat Control Video and Virtual simulation Open Source

Then transplant it to the real world to solve complex problems
